Commit Graph

7 Commits

Author SHA1 Message Date
Nicolas Hillegeer
8a744307e3 internal/cmd/generate-types: manual CSE of m.messageInfo()
messageInfo() looks like this:

    func (ms *messageState) messageInfo() *MessageInfo {
    	mi := ms.LoadMessageInfo()
    	if mi == nil {
    		panic("invalid nil message info; this suggests memory corruption due to a race or shallow copy on the message struct")
    	}
    	return mi
    }

    func (ms *messageState) LoadMessageInfo() *MessageInfo {
    	return (*MessageInfo)(atomic.LoadPointer((*unsafe.Pointer)(unsafe.Pointer(&ms.atomicMessageInfo))))
    }

Which is an atomic load and a predictable branch. On x86, this 64-bit
load is just a MOV. On other platforms, like ARM64, there's actual
atomics involved (LDAR).

Meaning, it's cheap, but not free. Eliminate redundant copies of this
(Common Subexpression Elimination).

The newly added benchmarks improve by (geomean) 2.5%:

    $ benchstat pre post | head -10
    goarch: amd64
    cpu: AMD Ryzen Threadripper PRO 3995WX 64-Cores
                          │     pre     │                post                │
                          │   sec/op    │   sec/op     vs base               │
    Extension/Has/None-12   106.4n ± 2%   104.0n ± 2%  -2.21% (p=0.020 n=10)
    Extension/Has/Set-12    116.4n ± 1%   114.4n ± 2%  -1.76% (p=0.017 n=10)
    Extension/Get/None-12   184.2n ± 1%   181.0n ± 1%  -1.68% (p=0.003 n=10)
    Extension/Get/Set-12    144.5n ± 3%   140.7n ± 2%  -2.63% (p=0.041 n=10)
    Extension/Set-12        227.2n ± 2%   218.6n ± 2%  -3.81% (p=0.000 n=10)
    geomean                 149.6n        145.9n       -2.42%

I didn't test on ARM64, but the difference should be larger due to the
reduced atomics.

Change-Id: I8eebeb6f753425b743368a7f5c7be4d48537e5c3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/575036
Reviewed-by: Michael Stapelberg <stapelberg@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
Commit-Queue: Nicolas Hillegeer <aktau@google.com>
Auto-Submit: Nicolas Hillegeer <aktau@google.com>
2024-04-02 14:35:14 +00:00
Damien Neil
de9682ad16 internal/impl: improve MessageInfo.New performance
Calling the ProtoReflect method of the newly-constructed
message avoids an allocation in MessageInfo.MessageOf in
the common case of a generated message with an optimized
ProtoReflect method.

Benchmark for creating an empty message, darwin/arm64 M1 laptop:

    name                 old time/op    new time/op    delta
    EmptyMessage/New-10    32.1ns ± 2%    23.7ns ± 2%  -26.06%  (p=0.000 n=10+9)

    name                 old alloc/op   new alloc/op   delta
    EmptyMessage/New-10     64.0B ± 0%     48.0B ± 0%  -25.00%  (p=0.000 n=10+10)

    name                 old allocs/op  new allocs/op  delta
    EmptyMessage/New-10      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)

Change-Id: Ifa3c3ffa8edc76f78399306d0f4964eae4aacd28
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/418677
Reviewed-by: Michael Stapelberg <stapelberg@google.com>
Reviewed-by: Lasse Folger <lassefolger@google.com>
2022-07-21 16:01:25 +00:00
Damien Neil
466dd77288 all: fast-path method refactoring
Move all fast-path inputs and outputs into the Input/Output structs.
Collapse all booleans into bitfields.

Change-Id: I79ebfbac9cd1d8ef5ec17c4f955311db007391ca
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/219505
Reviewed-by: Joe Tsai <joetsai@google.com>
2020-02-19 22:01:50 +00:00
Damien Neil
e8e8875f94 proto, runtime/protoiface, internal/impl: add fast-path Merge
Comparing -tags=protoreflect to fast-path:

name                              old time/op    new time/op    delta
pkg:google.golang.org/protobuf/internal/benchmarks goos:linux goarch:amd64
/Clone/google_message1_proto2-12    1.70µs ± 1%    0.30µs ± 1%  -82.64%  (p=0.001 n=7+7)
/Clone/google_message1_proto3-12    1.01µs ± 1%    0.19µs ± 1%  -80.77%  (p=0.000 n=7+8)
/Clone/google_message2-12            818µs ± 8%     141µs ± 6%  -82.78%  (p=0.000 n=8+8)
pkg:google.golang.org/protobuf/internal/benchmarks/micro goos:linux goarch:amd64
EmptyMessage/Clone-12               51.1ns ± 1%    39.3ns ± 3%  -23.03%  (p=0.000 n=7+8)
RepeatedInt32/Clone-12              24.5µs ± 1%     1.1µs ± 3%  -95.64%  (p=0.000 n=8+8)
Required/Clone-12                    978ns ± 1%     132ns ± 2%  -86.46%  (p=0.000 n=8+8)

name                              old alloc/op   new alloc/op   delta
pkg:google.golang.org/protobuf/internal/benchmarks goos:linux goarch:amd64
/Clone/google_message1_proto2-12    1.08kB ± 0%    0.74kB ± 0%  -31.85%  (p=0.000 n=8+8)
/Clone/google_message1_proto3-12      872B ± 0%      544B ± 0%  -37.61%  (p=0.000 n=8+8)
/Clone/google_message2-12            602kB ± 0%     411kB ± 0%  -31.65%  (p=0.000 n=8+8)
pkg:google.golang.org/protobuf/internal/benchmarks/micro goos:linux goarch:amd64
EmptyMessage/Clone-12                96.0B ± 0%     64.0B ± 0%  -33.33%  (p=0.000 n=8+8)
RepeatedInt32/Clone-12              25.4kB ± 0%     3.2kB ± 0%  -87.33%  (p=0.000 n=8+8)
Required/Clone-12                     416B ± 0%      256B ± 0%  -38.46%  (p=0.000 n=8+8)

name                              old allocs/op  new allocs/op  delta
pkg:google.golang.org/protobuf/internal/benchmarks goos:linux goarch:amd64
/Clone/google_message1_proto2-12      52.0 ± 0%      21.0 ± 0%  -59.62%  (p=0.000 n=8+8)
/Clone/google_message1_proto3-12      33.0 ± 0%       3.0 ± 0%  -90.91%  (p=0.000 n=8+8)
/Clone/google_message2-12            22.3k ± 0%      7.5k ± 0%  -66.41%  (p=0.000 n=8+8)
pkg:google.golang.org/protobuf/internal/benchmarks/micro goos:linux goarch:amd64
EmptyMessage/Clone-12                 3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=8+8)
RepeatedInt32/Clone-12               1.51k ± 0%     0.00k ± 0%  -99.80%  (p=0.000 n=8+8)
Required/Clone-12                     51.0 ± 0%      18.0 ± 0%  -64.71%  (p=0.000 n=8+8)

Change-Id: Ife9018097c34cb025dc9c4fdd9a61b2f947853c6
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/219147
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2020-02-14 21:47:10 +00:00
Damien Neil
cadb4ab3b1 internal/impl: refactor validation a bit
Return the size of the field read from the validator, permitting us to
avoid an extra parse when skipping over groups.

Return an UnmarshalOutput from the validator, since it already combines
two of the validator outputs: bytes read and initialization status.

Remove initialization status from the ValidationStatus enum, since it's
covered by the UnmarshalOutput.

Change-Id: I3e684c45d15aa1992d8dc3bde0f608880d34a94b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/217763
Reviewed-by: Joe Tsai <joetsai@google.com>
2020-02-05 05:32:50 +00:00
Damien Neil
f25a6ca994 internal/benchmarks/micro: add validator microbenchmarks
Change-Id: Ice768d90e650bf51d619b8cd5f5d51e6b00c53b6
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/216623
Reviewed-by: Joe Tsai <joetsai@google.com>
2020-01-28 22:46:05 +00:00
Damien Neil
8729675da7 internal/benchmarks/micro: add a place for microbenchmarks
Add a place to put microbenchmarks used to justify performance-related changes.

Change-Id: I6e90a3500594b3f6297cee0b8e321a50d0a556ca
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/216480
Reviewed-by: Joe Tsai <joetsai@google.com>
2020-01-26 22:22:44 +00:00