protobuf-go/internal/cmd
Vanja Pejovic 86bdc4705a internal/impl: preallocate memory when unmarshalling packed repeated fields
This improves wall time and allocated memory. I haven't found any
case (0 values, few values, many values) where this change is a
consistent regression. For fields with thousands of values, this
reduces memory usage by 50% and wall time by 20%.
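
For illustration only, here is a minimal sketch of the general idea, not the actual internal/impl code. It assumes a packed field of varint-encoded uint64 values. The helper names countVarints and appendPackedUint64 are hypothetical. Each varint ends with a byte whose high bit is clear, so counting those bytes gives the element count, and the destination slice can be grown once before decoding.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// countVarints (hypothetical helper) returns the number of varints in b by
// counting terminating bytes, i.e. bytes whose high bit is clear.
func countVarints(b []byte) int {
	n := 0
	for _, c := range b {
		if c < 0x80 {
			n++
		}
	}
	return n
}

// appendPackedUint64 (hypothetical helper) decodes the packed payload b and
// appends the values to dst. It reserves capacity for all elements up front
// so that the append calls in the loop do not have to reallocate.
func appendPackedUint64(dst []uint64, b []byte) ([]uint64, error) {
	if n := countVarints(b); n > 0 && cap(dst)-len(dst) < n {
		grown := make([]uint64, len(dst), len(dst)+n)
		copy(grown, dst)
		dst = grown
	}
	for len(b) > 0 {
		v, n := binary.Uvarint(b)
		if n <= 0 {
			return dst, errors.New("malformed varint in packed field")
		}
		dst = append(dst, v)
		b = b[n:]
	}
	return dst, nil
}

func main() {
	// Build a packed payload from a few sample values.
	var payload []byte
	for _, v := range []uint64{1, 150, 300, 1 << 40} {
		payload = binary.AppendUvarint(payload, v)
	}
	vals, err := appendPackedUint64(nil, payload)
	fmt.Println(vals, err, cap(vals))
}
```

For fixed-width packed types the count would not need a scan, since it follows directly from the payload length divided by the element size.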

Benchmark results from:
go test bench_test.go testmessages_test.go decode_test.go -run=none -bench=BenchmarkDecode/packed -benchmem -count=6 -timeout=0

goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz
                                  │ base        │ fast
                                  │ sec/op      │ sec/op      vs base
repeated0packedAllTypes         432.9n ± 5%  420.1n ± 11%       ~ p=0.39
repeated0packed3AllTypes        431.2n ± 6%  433.8n ±  3%       ~ p=0.69
repeated0packedAllExt           2.748µ ± 6%  2.845µ ±  2%       ~ p=0.06
repeated0length_packedAllTypes  310.0n ± 0%  307.4n ±  1%  -0.84% p=0.00
repeated0length_packed3AllTypes 309.7n ± 1%  309.1n ±  4%       ~ p=0.41
repeated0length_packedAllExt    1.689µ ± 2%  1.732µ ±  5%       ~ p=0.39
packedPackedTypes               308.6n ± 1%  276.3n ±  1% -10.47% p=0.00
packedPackedExt                 2.727µ ± 2%  2.685µ ±  1%  -1.54% p=0.00
packed0lengthPackedTypes        163.4n ± 1%  160.8n ±  4%       ~ p=0.06
packed0lengthPackedExt          1.676µ ± 1%  1.748µ ±  4%  +4.30% p=0.01
geomean                             673.4n       668.3n        -0.75%

                                  │ base         │ fast
                                  │ B/op         │ B/op       vs base
repeated0packedAllTypes         1.328Ki ± 0%  1.281Ki ± 0% -3.53% p=0.00
repeated0packed3AllTypes        1.328Ki ± 0%  1.281Ki ± 0% -3.53% p=0.00
repeated0packedAllExt           5.364Ki ± 0%  5.364Ki ± 0%      ~ p=1.00
repeated0length_packedAllTypes  1.125Ki ± 0%  1.125Ki ± 0%      ~ p=1.00
repeated0length_packed3AllTypes 1.125Ki ± 0%  1.125Ki ± 0%      ~ p=1.00
repeated0length_packedAllExt    4.208Ki ± 0%  4.208Ki ± 0%      ~ p=1.00
packedPackedTypes                 592.0 ± 0%    544.0 ± 0% -8.11% p=0.00
packedPackedExt                 5.364Ki ± 0%  5.364Ki ± 0%      ~ p=1.00
packed0lengthPackedTypes          384.0 ± 0%    384.0 ± 0%      ~ p=1.00
packed0lengthPackedExt          4.208Ki ± 0%  4.208Ki ± 0%      ~ p=1.00
geomean                             1.735Ki       1.708Ki      -1.55%

                                  │ base       │ fast
                                  │ allocs/op  │ allocs/op   vs base
repeated0packedAllTypes         21.00 ± 0% 15.00 ± 0%  -28.57% p=0.002
repeated0packed3AllTypes        21.00 ± 0% 15.00 ± 0%  -28.57% p=0.002
repeated0packedAllExt           131.0 ± 0% 131.0 ± 0%        ~ p=1.000
repeated0length_packedAllTypes  1.000 ± 0% 1.000 ± 0%        ~ p=1.000
repeated0length_packed3AllTypes 1.000 ± 0% 1.000 ± 0%        ~ p=1.000
repeated0length_packedAllExt    33.00 ± 0% 33.00 ± 0%        ~ p=1.000
packedPackedTypes               21.00 ± 0% 15.00 ± 0%  -28.57% p=0.002
packedPackedExt                 131.0 ± 0% 131.0 ± 0%        ~ p=1.000
packed0lengthPackedTypes        1.000 ± 0% 1.000 ± 0%        ~ p=1.000
packed0lengthPackedExt          33.00 ± 0% 33.00 ± 0%        ~ p=1.000
geomean                             13.30      12.02        -9.60%

Change-Id: I622dd2055c3ca936f948f86ae8434387f42f8d8e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/534196
Reviewed-by: Michael Stapelberg <stapelberg@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
2023-10-12 06:52:24 +00:00
generate-corpus
generate-protos types/descriptorpb: regenerate using latest protobuf v22.0 release 2023-02-22 09:33:03 +00:00
generate-types internal/impl: preallocate memory when unmarshalling packed repeated fields 2023-10-12 06:52:24 +00:00
pbdump all: remove shorthand import aliases 2022-05-24 20:05:50 +00:00