636 Commits

Author SHA1 Message Date
Joe Tsai
831b8f59b4 reflect/protoregistry: add conflict override
The ignoreConflict function provides the ability to ignore certain conflicts.
By default, all conflicts are ignored with a log message produced instead.

Change-Id: I67fe56eef492e12421e5c8cb8d618dc2a46c82ed
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186658
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-18 17:47:11 +00:00
Joe Tsai
43761bdfe7 cmd/protoc-gen-go: update deprecation warning
Change-Id: Ie1a854bf6f47d4ca9e941b6ccc64dc24ff32bd19
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186657
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-18 02:26:18 +00:00
Joe Tsai
1ac7b53cc5 internal/filedesc: print warnings on registration conflicts
Rather than panicking at init time due to registration failures,
print a warning to stderr. Historically, the Go protobuf implementation
has not been strict about registration conflicts, which has led users
to unknowningly tolerating conflicts that may or may not expose
themselvs as a bug.

Registration conlicts now produce a log message:
<<<
2019/07/17 17:36:42 WARNING: proto: file "path/to/example.proto" is already registered
	previously from: "example.com/company/example_proto"
	currently from:  "example.com/user/example_proto"
A future release will panic on registration conflicts.

>>>

Change-Id: I2d583f04977c8bc8cb6bbd33d239277690bbec54
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186181
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-18 02:19:58 +00:00
Joe Tsai
f647c82cc3 internal/impl: expose MessageInfo
The v2 MessageInfo is needed by v1 to be able to access the OneofWrappers
and Exporter function. The ProtoMessageInfo method can be deleted
once v1 is entirely implemented in terms of v2.

Change-Id: Iabb1b429af5210faffc6477f52b5020b3aa1fb50
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186577
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-17 23:01:13 +00:00
Joe Tsai
5ae10aa9f0 encoding: unify MessageSet extension handling logic
This CL unifies common MessageSet logic in prototext and protojson
into the messageset package. While we are at it, also enable
MessageSet support only if the proto1_legacy build flag is enabled.

Change-Id: I1a7d475e8bb1dad61ecd286df45e4239e5bef072
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185898
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-15 21:07:58 +00:00
Joe Tsai
af57087245 reflect/protoregistry: provide more informative errors for conflicts
The v2 implementation strictly enforces that there are no conflicts at
all in the protobuf namespace unlike the prior v1 implementation.
This change is almost certainly going to cause loud failures for users
that were unknowingly tolerating registration conflicts.

We modify internal/filedesc to be able to record the Go package path
that the file descriptor is declared within. This information is used
by reflect/protoregistry to print both the previous Go package that
registered some declaration, and current Go package that is attempting
to register some declaration.

Change-Id: Ib5eb21c1c98495afc51aa08bd4404bd9d64b5b57
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186177
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-15 20:39:24 +00:00
Joe Tsai
c51e2e0293 all: support enforce_utf8 override
In 2014, when proto3 was being developed, there were a number of early
adopters of the new syntax. Before the finalization of proto3 when
it was released in open-source in July 2016, a decision was made to
strictly validate strings in proto3. However, some of the early adopters
were already using invalid UTF-8 with string fields.
The google.protobuf.FieldOptions.enforce_utf8 option only exists to support
those grandfathered users where they can opt-out of the validation logic.
Practical use of that option in open source is impossible even if a user
specifies the proto1_legacy build tag since it requires a hacked
variant of descriptor.proto that is not externally available.

This CL supports enforce_utf8 by modifiyng internal/filedesc to
expose the flag if it detects it in the raw descriptor.
We add an strs.EnforceUTF8 function as a centralized place to determine
whether to perform validation. Validation opt-out is supported
only in builds with legacy support.

We implement support for validating UTF-8 in all proto3 string fields,
even if they are backed by a Go []byte.

Change-Id: I9c0628b84909bc7181125f09db730c80d490e485
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186002
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-15 19:53:05 +00:00
Damien Neil
302cb325fb proto: support message_set_wire_format
MessageSets are a deprecated proto1 feature, long since superseded by
extensions. Add disabled-by-default support behind flags.Proto1Legacy.

Change-Id: I7d3ace07f3b0efd59673034f3dc633b908345a88
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185538
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-15 19:32:30 +00:00
Joe Tsai
d47ea19d2f encoding: support weak fields in text and JSON unmarshaling
If the message for a weak field is linked in,
we treat it as if it were identical to a normal known field.
However, if the weak field is not linked in,
we treat it as if the field were not known.

Change-Id: I576d911deec98e13211304024a6353734d055465
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185457
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-15 19:09:08 +00:00
Joe Tsai
3d8e369c4e all: implement proto1 weak fields
This implements generation of and reflection support for weak fields.
Weak fields are a proto1 feature where the "weak" option can be specified
on a singular message field. A weak reference results in generated code
that does not directly link in the dependency containing the weak message.

Weak field support is not added to any of the serialization logic.

Change-Id: I08ccfa72bc80b2ffb6af527a1677a0a81dcf33fb
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185399
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-15 18:44:12 +00:00
Joe Tsai
e182c917f0 reflect/protoreflect: add FileDescriptor.SourceLocations
This adds minimal support for preserving the source context information.

Change-Id: I4b3cac9690b7469ecb4e5434251a809be4d7894c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183157
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-13 00:15:59 +00:00
Joe Tsai
d421150c3b reflect/protoreflect: add Enum.Type and Message.Type
CL/174938 removed these methods in favor of a method that returned
only the descriptors. This CL adds back in the Type methods alongside
the Descriptor methods.

In a vast majority of protobuf usages, only the descriptor information
is needed. However, there is a small percentage that legitimately needs
the Go type information. We should provide both, but document that the
descriptor-only information is preferred.

Change-Id: Ia0a098997fb1bd009994940ae8ea5257ccd87cae
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184578
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-12 23:43:21 +00:00
Joe Tsai
f8b855d768 runtime/protoiface: API adjustments
The following adjustments were made:
* The pragma.NoUnkeyedLiterals is moved to be the first field.
This is done to keep the options struct smaller. Even if the last
field is zero-length, Go GC implementation details forces the struct
to be padded at the end.
* Methods are documented as always treating AllowPartial as true.
* Added a support flag for UnmarshalOptions.DiscardUnknown.

Change-Id: I1f75d226542ab2bb0123d9cea143c7060df226d8
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185998
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-12 23:12:26 +00:00
Joe Tsai
0f81b38d61 runtime/protoiface: move and rename XXX_Methods
This CL moves and renames the protoreflect.ProtoMessage.XXX_Methods
to protoreflect.Message.ProtoMethods.

Since one needs to obtain a protoreflect.Message now to get at
the fast-path methods, we modify the method signatures to take
in a protoreflect.Message instead of protoreflect.ProtoMessage.
Doing so also avoids the wrapper hack that was formerly done on
impl.messageReflectWrapper.

After this change the new protoc-gen-go no longer generates
any XXX fields or methods. All internal fields and methods are truly
hidden from the end-user.

name                                     old time/op    new time/op    delta
Wire/Unmarshal/google_message1_proto2-4    1.50µs ±10%    1.50µs ± 2%    ~     (p=0.483 n=10+9)
Wire/Unmarshal/google_message1_proto3-4    1.06µs ± 6%    1.06µs ± 4%    ~     (p=0.814 n=9+9)
Wire/Unmarshal/google_message2-4            734µs ±22%     689µs ±13%    ~     (p=0.133 n=10+9)
Wire/Marshal/google_message1_proto2-4       790ns ±46%     652ns ± 8%    ~     (p=0.590 n=10+9)
Wire/Marshal/google_message1_proto3-4       872ns ± 4%     857ns ± 3%    ~     (p=0.168 n=9+9)
Wire/Marshal/google_message2-4              232µs ±16%     221µs ± 3%  -4.75%  (p=0.014 n=9+9)
Wire/Size/google_message1_proto2-4          164ns ± 2%     167ns ± 4%  +1.87%  (p=0.046 n=9+10)
Wire/Size/google_message1_proto3-4          240ns ± 9%     229ns ± 1%  -4.81%  (p=0.000 n=9+8)
Wire/Size/google_message2-4                58.9µs ± 9%    59.6µs ± 2%  +1.23%  (p=0.040 n=9+9)

name                                     old alloc/op   new alloc/op   delta
Wire/Unmarshal/google_message1_proto2-4      912B ± 0%      912B ± 0%    ~     (all equal)
Wire/Unmarshal/google_message1_proto3-4      688B ± 0%      688B ± 0%    ~     (all equal)
Wire/Unmarshal/google_message2-4            470kB ± 0%     470kB ± 0%    ~     (p=0.215 n=10+10)
Wire/Marshal/google_message1_proto2-4        240B ± 0%      240B ± 0%    ~     (all equal)
Wire/Marshal/google_message1_proto3-4        224B ± 0%      224B ± 0%    ~     (all equal)
Wire/Marshal/google_message2-4             90.1kB ± 0%    90.1kB ± 0%    ~     (all equal)
Wire/Size/google_message1_proto2-4          0.00B          0.00B         ~     (all equal)
Wire/Size/google_message1_proto3-4          0.00B          0.00B         ~     (all equal)
Wire/Size/google_message2-4                 0.00B          0.00B         ~     (all equal)

name                                     old allocs/op  new allocs/op  delta
Wire/Unmarshal/google_message1_proto2-4      24.0 ± 0%      24.0 ± 0%    ~     (all equal)
Wire/Unmarshal/google_message1_proto3-4      6.00 ± 0%      6.00 ± 0%    ~     (all equal)
Wire/Unmarshal/google_message2-4            8.49k ± 0%     8.49k ± 0%    ~     (all equal)
Wire/Marshal/google_message1_proto2-4        1.00 ± 0%      1.00 ± 0%    ~     (all equal)
Wire/Marshal/google_message1_proto3-4        1.00 ± 0%      1.00 ± 0%    ~     (all equal)
Wire/Marshal/google_message2-4               1.00 ± 0%      1.00 ± 0%    ~     (all equal)
Wire/Size/google_message1_proto2-4           0.00           0.00         ~     (all equal)
Wire/Size/google_message1_proto3-4           0.00           0.00         ~     (all equal)
Wire/Size/google_message2-4                  0.00           0.00         ~     (all equal)

Change-Id: Ibf3263ad0f293326695c22020a92a6b938ef4f65
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185697
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-12 19:31:58 +00:00
Damien Neil
a8a2cea3e7 proto: move T->*T wrappers from internal/scalar to proto
Usage of these is pervasive in code which works with proto2, and proto2
will be with us for a long, long time to come. Move them to the proto
package.

Change-Id: I1b2e57429fd5a8f107a848a4492d20c27f304bd7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185543
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-12 17:35:01 +00:00
Joe Tsai
6c28674cea proto: fix merge semantics for oneof message
The proper semantics for a message field within a oneof
when unmarshaling is to merge into an existing message,
rather than replacing it.

Change-Id: I7c08f6e4fa958c6ee6241e9083f7311515a97e15
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185957
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-12 16:20:52 +00:00
Damien Neil
7492a09da9 internal/impl: support packed extensions
Change-Id: I5a9e22f1c98f5db9caae1681775017da5aa67394
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185541
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-11 18:07:16 +00:00
Joe Tsai
3e6a39b3f1 internal/impl: avoid preserving presence on proto3 bytes field
When setting a proto3 bytes field to an empty, non-nil bytes slice,
just store a nil slice in the underderlying storage.
This is done to avoid presenting the illusion to the user that
presence is preserved for proto3.

Updates golang/protobuf#896

Change-Id: I1b97bedd547d336863c65d9418d8f07edf69ccd6
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185577
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-10 19:52:51 +00:00
Joe Tsai
82760ceffa internal/impl: add MessageState to every generated message
We define MessageState, which is essentially an atomically set *MessageInfo.
By nesting this as the first field in every generated message, we can
implement the reflective methods on a *MessageState when obtained by
unsafe casting a concrete message pointer as a *MessageState.
The MessageInfo held by MessageState provides additional Go type information
to interpret the memory that comes after the contents of the MessageState.

Since we are nesting a MessageState in every message,
the memory use of every message instance grows by 8B.

On average, the body of ProtoReflect grows from 133B to 202B (+50%).
However, this is offset by XXX_Methods, which is 108B and
will be removed in a future CL. Taking into account the eventual removal
of XXX_Methods, this is a net reduction of 25%.

name          old time/op    new time/op    delta
Name/Value-4    70.3ns ± 2%    17.5ns ± 6%   -75.08%  (p=0.000 n=10+10)
Name/Nil-4      70.6ns ± 3%    33.4ns ± 2%   -52.66%  (p=0.000 n=10+10)

name          old alloc/op   new alloc/op   delta
Name/Value-4     16.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
Name/Nil-4       16.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)

name          old allocs/op  new allocs/op  delta
Name/Value-4      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
Name/Nil-4        1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)

Change-Id: I92bd58dc681c57c92612fd5ba7fc066aea34e95a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185460
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-10 19:44:24 +00:00
Joe Tsai
36dc22ddb8 encoding: use strs.UnsafeString to avoid duplicated code
The strs.UnsafeString casts a []byte as a string.
This allows us to avoid duplicated functionality.

Change-Id: I9930b94bae35eac0f98c0fa62963b300bc8d7e49
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185459
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-10 07:01:20 +00:00
Joe Tsai
ace50e2b5e internal/encoding/tag: support marshaling weak fields
We already support unmarshaling weak fields, support the other direction.

Change-Id: I514e6b7b18cf9a3567fb1775a1e389fe64f22d22
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185340
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-09 22:25:24 +00:00
Joe Tsai
e5900a6a90 internal/encoding/tag: fix handling of JSON name
When decoding, only treat the name as being explicitly set if the
name differs from what the JSON name would have been had it been
automatically derived from the protobuf field name.

Change-Id: Ida9256eafe7af20c7a06be2d4fb298be44276104
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185398
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-09 22:16:29 +00:00
Joe Tsai
97a87391b1 internal/strs: unify string manipulation functionality
Create a new internal/strs package that unifies common functionality:
* Since protobuf itself pseudo-specifies at least 4 different camel-case
and snake-case conversion functions, we define all variants in one place.
* We move the internal/filedesc.nameBuilder function to this package.
We simplify its implementation to not depend on a strings.Builder fork
under the hood since the semantics we desire is simpler than what
strings.Builder provides.
* We use strs.Builder in reflect/protodesc in its construction of all
the full names. This is perfect use case of strs.Builder since all
full names within a file descriptor share the same lifetime.
* Add an UnsafeString and UnsafeBytes cast function that will be useful
in the near future for optimizing encoding/prototext and encoding/protojson.

Change-Id: I2cf07cbaf6f72e5f9fd6ae3d37b0d46f6af2ad59
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185198
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-09 19:57:23 +00:00
Damien Neil
e91877de26 internal/impl: add fast-path unmarshal
Benchmarks run with:
  go test ./benchmarks/ -bench=Wire  -benchtime=500ms -benchmem -count=8

Fast-path vs. parent commit:

  name                                      old time/op    new time/op    delta
  Wire/Unmarshal/google_message1_proto2-12    1.35µs ± 2%    0.45µs ± 4%  -67.01%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12    1.07µs ± 1%    0.31µs ± 1%  -71.04%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            691µs ± 2%     188µs ± 2%  -72.78%  (p=0.000 n=7+8)

  name                                      old allocs/op  new allocs/op  delta
  Wire/Unmarshal/google_message1_proto2-12      60.0 ± 0%      25.0 ± 0%  -58.33%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12      42.0 ± 0%       7.0 ± 0%  -83.33%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            28.6k ± 0%      8.5k ± 0%  -70.34%  (p=0.000 n=8+8)

Fast-path vs. -v1:

  name                                      old time/op    new time/op    delta
  Wire/Unmarshal/google_message1_proto2-12     702ns ± 1%     445ns ± 4%   -36.58%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12     604ns ± 1%     311ns ± 1%   -48.54%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            179µs ± 3%     188µs ± 2%    +5.30%  (p=0.000 n=7+8)

  name                                      old allocs/op  new allocs/op  delta
  Wire/Unmarshal/google_message1_proto2-12      26.0 ± 0%      25.0 ± 0%    -3.85%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12      8.00 ± 0%      7.00 ± 0%   -12.50%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            8.49k ± 0%     8.49k ± 0%    -0.01%  (p=0.000 n=8+8)

Change-Id: I6247ac3fd66a63d9acb902cbd192094ee3d151c3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185147
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-09 19:56:42 +00:00
Damien Neil
3d0706ac2e proto, internal/impl: make IsInitialized more consistent
Make the fast-path and slow-path versions of IsInitialized report
exactly the same errors: An errors.RequiredNotSet containing the
full name of one of the unset required fields.

Bugfix: Fast-path IsInitialized on a nil message reports an error only
when the message directly contains required fields.

Bugfix: Include fast-path IsInitialized in legacy messageIfaceWrapper.

Fixes golang/protobuf#887

Change-Id: Ia5e4b386f8c23f6f855d995f4a098b1338acbae3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185397
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-09 19:49:22 +00:00
Joe Tsai
6ceeaab1ba cmd/protoc-gen-go: remove MessageSet hackery
The encoding/prototext and encoding/protojson are implemented entirely
in terms of protobuf reflection, which side-steps this information.
Remove the hacks in the generator to special-case MessageSet.

Change-Id: I708c4636b77672545a103b7ab686f103b9dfc514
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185240
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-08 21:17:17 +00:00
Joe Tsai
c0e4bb2054 cmd/protoc-gen-go: unexport implementation-specific XXX fields
We modify protoc-gen-go to stop generating exported XXX fields.
The unsafe implementation is unaffected by this change since unsafe
can access fields regardless of visibility. However, for the purego
implementation, we need to respect Go visibility rules as enforced
by the reflect package.

We work around this by generating a exporter function that given
a reference to the message and the field to export, returns a reference
to the unexported field value. This exporter function is protected by
a constant such that it is not linked into the final binary in non-purego
build environment.

Updates golang/protobuf#276

Change-Id: Idf5c1f158973fa1c61187ff41440acb21c5dac94
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185141
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-08 20:45:09 +00:00
Joe Tsai
09912274cc cmd/protoc-gen-go: remove generation of XXX_OneofWrappers
Associate the oneof wrapper types with a message by conveying that
information to the associated MessageInfo.

Change-Id: Iabfca593850e1d6a89498a37eacbf22dbb73bd20
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185239
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-08 20:36:12 +00:00
Joe Tsai
9aeddcdcf0 internal/encoding/wire: fix trivial misspelling
Change-Id: I71f1715145ebc58aadec1d5ecb5f1900998f65a0
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184277
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-08 20:14:46 +00:00
Damien Neil
8c457bb057 internal/impl: rename encode_field.go->codec_field.go
The code organization is simpler if we keep the functions encoding and
decoding a particular type (e.g., maps) together rather than split
across files.

This rename is happening in a separate CL from cl/185241 to preserve
rename history. (Git gets confused when you rename a->b and b->c in the
same commit.)

Change-Id: Idfbb3ff8cf0db149c68d650f89ff3fb8ac833322
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184942
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-08 20:13:10 +00:00
Damien Neil
edf7bdda31 internal/impl: rename encode->codec in various places
The code organization is simpler if we keep the functions encoding and
decoding a particular type (e.g., maps) together rather than split
across files. Rename various "encode" files to "codec" in preparation
for adding fast-path decoding.

Change-Id: If1e271da99d31533ffefc19b1fc847936fa9484a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185241
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-08 20:12:42 +00:00
Joe Tsai
a804450cf6 all: fix stale comments
Change-Id: I4bfef75bc74c8d876a3926635bea12bbbaf4993e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185238
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-08 18:33:38 +00:00
Joe Tsai
8a4c3d18b1 internal/filedesc: avoid deep-copying the options
The protoreflect.Descriptor.Options method is currently documented as
returning a reference to the options, where the user must not mutate
the returned message. This changes internal/filedesc to avoid returning
a copy of the options by caching the first unmarshal.

See golang/protobuf#877

Change-Id: I15701d33fbda7535b21b2add72628b02992c373f
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185197
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
2019-07-08 17:26:50 +00:00
Joe Tsai
93fd968b71 internal/impl: centralize the logic for finding XXX fields
Change-Id: I177fbeeb474eeb86e4a47229ac5c48cb7191e95b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185140
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-08 17:14:37 +00:00
Joe Tsai
15076350e8 reflect/protodesc: enforce strict validation
Hyrum's Law dictates that if we do not prevent naughty behavior,
people will rely on it. If we do not validate that the provided
file descriptor is correct today, it will be near impossible
to add proper validation checks later on.

The logic added validates that the provided file descriptor is
correct according to the same semantics as protoc,
which was reversed engineered to derive the set of rules implemented here.
The rules are unfortunately complicated because protobuf is a language
full of many non-orthogonal features. While our logic is complicated,
it is still 1/7th the size of the equivalent C++ code!

Change-Id: I6acc5dc3bd2e4c6bea6cd9e81214f8104402602a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184837
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-03 20:46:51 +00:00
Joe Tsai
b2107fbd8d reflect/protodesc: robustify dependency resolution implementation
Overview of changes:
* Add an option that specifies whether to replace unresolvable references
with a placeholder instead of producing an error. Since the prior behavior
produced placeholders (not always), we default to that behavior for now,
but will enable strict resolving in a future CL.
* The option is not yet exported because there is concern about what the
public API should look like. This will be exposed in a future CL.
* Unlike before, we now permit placeholders for unresolvable enum values.
* We implement relative name resolution logic.
* We handle the case where the type is unknown, but type_name is specified.
In such a case, we populate both FieldDescriptor.{Enum,Message} and leave
the FieldDescriptor.Kind with the zero value. If the type_name happened
to resolve, we use that to determine the type.
* If a placeholder is used to represent a relative name,
the FullName reports an invalid full name with a "*." prefix.

Change-Id: Ifa8c750423c488fb9324eec4d033a2f251505fda
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184317
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-03 19:17:36 +00:00
Joe Tsai
8d30bbeede all: remove dependency on proto v1
This does not remove all dependencies,
but all of the cases where it can now be implemented in terms of v2.

Change-Id: Idc5b0273f0d35c284bf2141eb9cce998692ceb15
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184878
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-03 04:59:17 +00:00
Joe Tsai
32e8a52cbf internal/impl: fix race in aberrant message logic
Previously, when aberrantLoadMessageDesc returned it was guaranteed
to have initialized the current message through the use of the done signal.
However, this does not guarantee that the descriptor for a cylic reference
has also finished initialization.

Rather than add more complicated logic to wait until all cyclic references
have finished initializing, just add a global lock for the entire
aberrantLoadMessageDesc function.

This slows down performance, but is easier to reason about.

Change-Id: I4cdae8b955f71ee40fa6979f5a8d548d9749042c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184657
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-02 18:39:06 +00:00
Joe Tsai
28e61ba70b internal/impl: drop XXX_MessageName aberrant support
The aberrant support logic only has access to the Go type
information, and not a concrete value. However, the XXX_MessageName
method exists on some hacky dynamic proto implementations where
it is only valid to call on a concrete value, not just newly created
instance of the given type.

However, from the perspective of the support logic, it is impossible
to distinguish between dynamic messages and hand-crafted custom messages.
Thus, just drop support for XXX_MessageName. We won't get the full name
of the message right, but oh well, what can we do.

Change-Id: Icc272861e11a355639fb82a991ca2854a9edc0c7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184557
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-01 21:51:21 +00:00
Joe Tsai
851185dae3 internal/impl: support aberrant enums and messages
Aberrant messages are hand-crafted messages that happen to work because
they use the same struct tags that generated code emits.
This happens to work in v1, but is unspecified behavior and entirely outside
the compatibility promise.

Support for this was added early on in the history of the v2 implementation,
but entirely untested. It was removed in CL/182360 to reduce the
technical debt of the legacy implementation. Unfortunately, sufficient number
of targets do rely on this aberrant support, so it is being added back.

The logic being added is essentially the same thing as the previous logic,
but ported to use internal/filedesc instead of the now deleted
internal/prototype package.

Change-Id: Ib5cab3e90480825b9615db358044ce05a14b05bd
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184517
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-01 21:18:15 +00:00
Damien Neil
4ae30bbb21 internal/impl: refactor fast-path
Move data used by the fast-path implementations into a substructure of
MessageInfo and initialize it separately.

Change-Id: Ib855ee8ea5cb0379528b52ba0e191319aa5e2dff
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184077
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-27 20:08:19 +00:00
Damien Neil
a9940822d4 all: remove protoreflect.Message.Len
Len looks like it should be O(1), but the need to check for
non-zero-length repeated fields makes it at minimum O(n) where n is
the number of repeated fields. In practice, it's O(n) where n is the
number of fields altogether.

The Len function is not especially useful, easily duplicated with Range
and a counter, and can be surprisingly inefficient. Drop it.

Change-Id: I24b27433217e131e842bd18dd58475bcdf62ef97
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183678
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-25 21:59:46 +00:00
Herbie Ong
20aefe9c5e encoding/prototext: fix parsing of field names
Previous code tries to do all-lowercase match all non-extension field
names to match group field names, which is incorrect. Fix to check for
only group field type and make sure that the format is correct as well.

Fixes golang/protobuf#878.

Fix typo in text proto string in internal/impl/message_test.go that
wasn't caught before due to above issue.

Change-Id: Ief952907306435ed76a095e96e29fcc9c0027b73
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183737
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-25 21:57:50 +00:00
Joe Tsai
e407ee162b reflect/protoregistry: remove Files.Find{Enum,Message,Extension,Service}ByName
This is a breaking change.

The replacement is the Files.FindDescriptorByName method,
which is more flexible as it handles all descriptor types.

Change-Id: I2ccd544a7630396a2428b1d41f836c5246070912
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183700
Reviewed-by: Damien Neil <dneil@google.com>
2019-06-24 23:47:12 +00:00
Joe Tsai
424139789a reflect/protoreflect: register all desciptors in Files
This change makes it such that Files now functionally registers all
descriptors in a file (not just enums, messages, extensions, and services),
but also including enum values, messages fields/oneofs, and service methods.

The ability to look up any descriptor by full name is needed to:
1) properly detect namespace conflicts on enum values
2) properly implement the relative name lookup logic in reflect/protodesc

The approach taken:
1) Assumes that a FileDescriptor has no internal name conflicts.
This will (in a future CL) be guaranteed by reflect/protodesc and
is guaranteed today by protoc for generated descriptors.
2) Observes that the only declarations that can possibly conflict
with another file are top-level declarations (i.e., enums, enum values,
messages, extensions, and services). Enum values are annoying
since they live in the same scope as the parent enum, rather than
being under the enum.

For the internal data structure of Files, we only register the top-level
declarations. This is the bare minimum needed to detect whether the file
being registered has any namespace conflicts with previously registered files.

We shift the effort to lookups, where we now need to peel off the end fragments
of a full name until we find a match in the internal registry. If a match
is found, we may need to descend into that declaration to find a nested
declaration by name.

For initialization, we modify internal/filedesc to initialize the
enum values for all top-level enums. This performance cost is offsetted
by the fact that Files.Register now avoids internally registering
nested enums, messages, and extensions.

For lookup, the cost has shifted from O(1) to O(N),
where N is the number of segments in the full name.
Top-level descriptors still have O(1) lookup times.
Nested descriptors have O(M) lookup times,
where M is the level of nesting within a single file.

Change-Id: I950163423431f04a503b6201ddcc20a62ccba017
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183697
Reviewed-by: Damien Neil <dneil@google.com>
2019-06-24 22:35:47 +00:00
Damien Neil
5322bdb290 internal/impl: add fast-path for IsInitialized
This currently returns uninformative errors from the fast path and then
consults the slow, reflection-based path only when an error is detected.
Perhaps it's worth going through the effort of producing better errors
directly on the fast path.

Change-Id: I68536e9438010dbd97dbaff4f47b78430221d94b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/171462
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-24 17:42:23 +00:00
Damien Neil
cedb595154 proto, internal/impl: avoid string->[]byte conversions
We trust the compiler to optimize away the string->[]byte conversion in
code like:

	b = wire.AppendBytes(b, []byte(s))

In testing (go 1.12.5 linux/amd64), this optimization is not happening.
Perhaps newer versions of the compiler will optimize this, but we
shouldn't rely on it; avoid unnecessary conversions.

Benchmark differences vs https://golang.org/cl/171462:

  name                                   old time/op    new time/op    delta
  Wire/Marshal/google_message1_proto2-6     310ns ± 2%     189ns ± 3%  -39.20%  (p=0.000 n=8+8)
  Wire/Marshal/google_message1_proto3-6     389ns ± 8%     261ns ± 2%  -33.03%  (p=0.000 n=8+8)
  Wire/Marshal/google_message2-6            103µs ±11%      59µs ± 4%  -42.17%  (p=0.000 n=8+8)

  name                                   old alloc/op   new alloc/op   delta
  Wire/Marshal/google_message1_proto2-6      592B ± 0%      240B ± 0%  -59.46%  (p=0.000 n=8+8)
  Wire/Marshal/google_message1_proto3-6      576B ± 0%      224B ± 0%  -61.11%  (p=0.000 n=8+8)
  Wire/Marshal/google_message2-6            196kB ± 0%      90kB ± 0%  -54.05%  (p=0.000 n=8+8)

  name                                   old allocs/op  new allocs/op  delta
  Wire/Marshal/google_message1_proto2-6      5.00 ± 0%      1.00 ± 0%  -80.00%  (p=0.000 n=8+8)
  Wire/Marshal/google_message1_proto3-6      5.00 ± 0%      1.00 ± 0%  -80.00%  (p=0.000 n=8+8)
  Wire/Marshal/google_message2-6            1.66k ± 0%     0.00k ± 0%  -99.94%  (p=0.000 n=8+8)

Change-Id: Idab7634b8c86604dffa46895ba2e61be38c9bd9c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183380
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-24 17:41:45 +00:00
Damien Neil
8c86fc5e7d all: remove non-fatal UTF-8 validation errors (and non-fatal in general)
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).

The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.

Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.

Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-20 20:55:13 +00:00
Damien Neil
a80229e4ed proto: add benchmark using protobuf repo test data
The primary (cross-language) protobuf repository contains benchmark data
sets. Add benchmarks using this data. (A version of this benchmark exists
in the protobuf repository, but it uses the v1 API and isn't trivial to
get working.)

Fetch the small benchmark datasets from the
github.com/protocolbuffers/protobuf repo by default. Add a
download_benchdata.bash script which fetches the larger datasets as
well.

Generate necessary packages under internal/testprotos/benchmarks.

To run:

  go run ./proto -bench=BenchmarkData

Usual caveats about benchmarking apply: While these benchmarks use
realistic data, isolated microbenchmarking of proto operations is not
necessarily representitive of performance in production systems.

Change-Id: I58d107554baf104568c86997b5ad50be8b2a5790
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183297
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-20 20:38:50 +00:00
Joe Tsai
7ea76d2a4c cmd/protoc-gen-go: un-deprecate enum maps
If we were starting from scratch, we would not have added the enum maps.
However, they already exist and see a fair amount of usage.
The effort to remove them is not worth it. Thus, remove the deprecation
warning since they are here to stay.

Note that the generated code does not refer to the generated enum maps.
One day, the linker should be able to elide them if unused by the user.
However, https://golang.org/issue/2559 would need to be resolved first.

Change-Id: Ia8b9b1812b5d8462ca2fa1d543170e4a09ff9e4f
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183177
Reviewed-by: Damien Neil <dneil@google.com>
2019-06-20 19:36:37 +00:00