11 Commits

Author SHA1 Message Date
Damien Neil
adbbc8ec47 internal/impl: inline some small varint decoding
Inline decoding of 1- and 2-byte varints in generated unmarshal
functions.

name                             old time/op  new time/op  delta
EmptyMessage/Wire/Unmarshal      40.2ns ± 2%  40.1ns ± 1%     ~     (p=0.355 n=37+37)
EmptyMessage/Wire/Unmarshal-12   7.12ns ± 1%  6.87ns ± 1%   -3.49%  (p=0.000 n=37+39)
RepeatedInt32/Wire/Unmarshal     6.46µs ± 1%  5.78µs ± 1%  -10.65%  (p=0.000 n=35+33)
RepeatedInt32/Wire/Unmarshal-12  1.05µs ± 2%  0.98µs ± 2%   -6.79%  (p=0.000 n=33+40)
Required/Wire/Unmarshal           251ns ± 1%   216ns ± 1%  -13.69%  (p=0.000 n=38+36)
Required/Wire/Unmarshal-12       42.4ns ± 1%  37.7ns ± 2%  -11.02%  (p=0.000 n=37+39)

Change-Id: Iecfc38fcae00979b89a093368821cca7f2357578
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/216421
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2020-01-27 19:09:55 +00:00
Damien Neil
f0831e87e2 internal/impl: change unmarshal func return to unmarshalOptions
The fast-path unmarshal funcs return the number of bytes consumed.

Change these functions to return an unmarshalOutput struct instead, to
make it easier to add to the results. This is groundwork for allowing
the fast-path unmarshaler to indicate when the unmarshaled message is
known to be initialized.

Change-Id: Ia8c44731a88f5be969a55cd98ea26282f412c7ae
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/215720
Reviewed-by: Joe Tsai <joetsai@google.com>
2020-01-22 00:22:58 +00:00
Damien Neil
2c0824b512 internal/impl: fix size for zero-length packed extensions
The size calculation for packed repeated extension fields was
considering a zero-length list as encoding to a zero-length
wire.BytesType field, rather than being omitted entirely.

Change-Id: I7d4424a21ca8afd4fa81391caede49cadb4e2505
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/212297
Reviewed-by: Joe Tsai <joetsai@google.com>
2019-12-20 22:08:18 +00:00
Damien Neil
5366f825ad proto: consistently use non-nil, zero-length []bytes for empty bytes strings
The fast-path decoder decodes zero-length repeated bytes values as
non-nil, zero-length []bytes. Do the same in the reflection decoder.

This isn't really a correctness issue, since there's no ambiguity about what a
nil entry in a [][]byte means. Still a good idea for consistency, and
retains v1 behavior.

Change-Id: Icd2cb726d14ff1f2b9f142e65756777a359971f3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/210257
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-12-09 17:33:50 +00:00
Damien Neil
4b3a82f6b1 internal/impl: clean up Value codecs
Remove the Go type from consideration when creating Value codecs, as it
is unnecessary. Value codecs convert between wire form and Values,
while Converters convert between Values and the Go type.

Change-Id: Iaa4bc7db81ad0a29dabd42c2229e6f33a0c91c67
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/193457
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-09-05 16:31:05 +00:00
Damien Neil
68b81c3117 internal/impl: store extension values as Values
Change the storage type of ExtensionField from interface{} to
protoreflect.Value.

Replace the codec functions operating on interface{}s with ones
operating on Values.

Values are potentially more efficient, since they can represent
non-pointer types without allocation. This also reduces the number of
types used to represent field values.

Additionally, this change lays groundwork for changing the
user-visible representation of repeated extension fields from
*[]T to []T. The storage type for extension fields must support mutation
(thus *[]T currently); changing the storage type to a Value permits this
without the need to introduce yet another view on field values.

Change-Id: Ida336be14112bb940f655236eb58df21bf312525
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/192218
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-09-03 20:58:28 +00:00
Damien Neil
8003f08e51 proto, internal/impl: zero-length proto2 bytes fields should be non-nil
Fix decoding of zero-length bytes fields to produce a non-nil []byte.

Change-Id: Ifb7791a47df81091700f7226523371d1386fb1ad
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/188765
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-08-05 21:44:30 +00:00
Joe Tsai
c51e2e0293 all: support enforce_utf8 override
In 2014, when proto3 was being developed, there were a number of early
adopters of the new syntax. Before the finalization of proto3 when
it was released in open-source in July 2016, a decision was made to
strictly validate strings in proto3. However, some of the early adopters
were already using invalid UTF-8 with string fields.
The google.protobuf.FieldOptions.enforce_utf8 option only exists to support
those grandfathered users where they can opt-out of the validation logic.
Practical use of that option in open source is impossible even if a user
specifies the proto1_legacy build tag since it requires a hacked
variant of descriptor.proto that is not externally available.

This CL supports enforce_utf8 by modifiyng internal/filedesc to
expose the flag if it detects it in the raw descriptor.
We add an strs.EnforceUTF8 function as a centralized place to determine
whether to perform validation. Validation opt-out is supported
only in builds with legacy support.

We implement support for validating UTF-8 in all proto3 string fields,
even if they are backed by a Go []byte.

Change-Id: I9c0628b84909bc7181125f09db730c80d490e485
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186002
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-15 19:53:05 +00:00
Damien Neil
7492a09da9 internal/impl: support packed extensions
Change-Id: I5a9e22f1c98f5db9caae1681775017da5aa67394
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185541
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-11 18:07:16 +00:00
Damien Neil
e91877de26 internal/impl: add fast-path unmarshal
Benchmarks run with:
  go test ./benchmarks/ -bench=Wire  -benchtime=500ms -benchmem -count=8

Fast-path vs. parent commit:

  name                                      old time/op    new time/op    delta
  Wire/Unmarshal/google_message1_proto2-12    1.35µs ± 2%    0.45µs ± 4%  -67.01%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12    1.07µs ± 1%    0.31µs ± 1%  -71.04%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            691µs ± 2%     188µs ± 2%  -72.78%  (p=0.000 n=7+8)

  name                                      old allocs/op  new allocs/op  delta
  Wire/Unmarshal/google_message1_proto2-12      60.0 ± 0%      25.0 ± 0%  -58.33%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12      42.0 ± 0%       7.0 ± 0%  -83.33%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            28.6k ± 0%      8.5k ± 0%  -70.34%  (p=0.000 n=8+8)

Fast-path vs. -v1:

  name                                      old time/op    new time/op    delta
  Wire/Unmarshal/google_message1_proto2-12     702ns ± 1%     445ns ± 4%   -36.58%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12     604ns ± 1%     311ns ± 1%   -48.54%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            179µs ± 3%     188µs ± 2%    +5.30%  (p=0.000 n=7+8)

  name                                      old allocs/op  new allocs/op  delta
  Wire/Unmarshal/google_message1_proto2-12      26.0 ± 0%      25.0 ± 0%    -3.85%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12      8.00 ± 0%      7.00 ± 0%   -12.50%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            8.49k ± 0%     8.49k ± 0%    -0.01%  (p=0.000 n=8+8)

Change-Id: I6247ac3fd66a63d9acb902cbd192094ee3d151c3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185147
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-09 19:56:42 +00:00
Damien Neil
edf7bdda31 internal/impl: rename encode->codec in various places
The code organization is simpler if we keep the functions encoding and
decoding a particular type (e.g., maps) together rather than split
across files. Rename various "encode" files to "codec" in preparation
for adding fast-path decoding.

Change-Id: If1e271da99d31533ffefc19b1fc847936fa9484a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185241
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-08 20:12:42 +00:00