This is a result of the discussion in [1]. Before editions, a group defined a multiple things:
* a type
* a field
* an encoding scheme
With editions this has changed and groups no longer exist and the different parts have to be defined individually. Most importantly, the field and the type also had the same name (usually and CamelCase name). To keep compatibility with proto2 groups, [2] introduced a concept of group-like fields and adjusted the Text/JSON parsers to accept the type name instead of the field name for such fields. This means you can convert from proto2 groups to editions without changing the semantics.
Furthermore, to avoid suprises with group-like fields (e.g. when a user by coincident specified a field that is group-like) protobuf decided that group-like fields should always accept the type and the field name for group like fields. This also allows us to eventually emit the field name rather than the type name for group like fields in the future.
This change implements this decision in Go.
[1] https://github.com/protocolbuffers/protobuf/issues/16239
[2] https://go.dev/cl/575916
Change-Id: I701c4cd228d2e0867b2a87771b6c6331459c4910
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/582755
Reviewed-by: Lasse Folger <lassefolger@google.com>
Reviewed-by: Mike Kruskal <mkruskal@google.com>
Commit-Queue: Michael Stapelberg <stapelberg@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Stapelberg <stapelberg@google.com>
Auto-Submit: Michael Stapelberg <stapelberg@google.com>
This brings go into conformance with other implementations. Group-like message fields with delimited encoding will continue to use the type name for text-format, but everything else will use the field name.
Change-Id: Ib6d07f19ccfa853ce0370392c89fd24fb7148793
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/575896
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Stapelberg <stapelberg@google.com>
Commit-Queue: Michael Stapelberg <stapelberg@google.com>
If `DiscardUnknown` was enabled, previously this syntax was rejected
with an error "unexpected token: ]":
unknown_field: [
{}
]
This is because the parser short-circuited after parsing the {}, and
didn't handle the closing square brace. Now it no longer short-circuits.
Change-Id: I6230b56766632752a5cc8822b17ed8262873d6cc
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/530616
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Michael Stapelberg <stapelberg@google.com>
Fix a panic when parsing the incomplete negative number "- ".
Fixesgolang/protobuf#1530
Change-Id: Iba5e8ee68d5f7255c28f1a74f31beee36c9ed847
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/475995
Reviewed-by: Lasse Folger <lassefolger@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
The text format specification[1] indicates that whitespace and comments
may appear after a minus sign and before the subsequent numeric component
in negative number literals. But the Go implementation does not allow
this.
This brings the Go implementation info conformance with this aspect.
Fixesgolang/protobuf#1526
[1] https://protobuf.dev/reference/protobuf/textformat-spec/#parsing
Change-Id: I3996c89ee9d37cf2b7502fc6736d6e2ed6dbcf43
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/473015
Reviewed-by: Lasse Folger <lassefolger@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Inside decoder.skipValue we should not be calling skipValue again
since we had already read the value earlier. The only possible
composite type in the context of a list is another message,
which is already handled in the case above.
Change-Id: If40da2d369e0a64a64ba9b961377331231158fe2
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/317430
Trust: Joe Tsai <joetsai@digital-static.net>
Trust: Herbie Ong <herbie@google.com>
Reviewed-by: Herbie Ong <herbie@google.com>
Centralize the MessageSet extension resolution logic in the registry.
This avoids needless replication of this exact logic in multiple places
(for JSON and text) and elsewhere.
Change-Id: I70bfea899e295e8c589f418965bf0dd099f93628
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/240077
Reviewed-by: Herbie Ong <herbie@google.com>
The following changes are made:
* Permit invalid UTF-8 in proto2. This goes against specified behavior,
but matches functional behavior in wire marshaling (not just for Go,
but also in the other major language implementations as well).
* The Format function is specified as ignoring errors since its intended
purpose is to surface information to the human user even if it's not
exactly parsible back into a message. As such, add an unexported
allowInvalidUTF8 option that is specially used by Format.
* Add an EmitASCII option that forces the formatting of
strings and bytes to always be encoded as ASCII.
This ensures that the entire output is always ASCII as well.
Note that we do not replicate this behavior for protojson since:
* The JSON format fundamentally has a stricter and well-specified
grammar for exactly what is valid/invalid, while the text format
has not had a well-specified grammar for the longest time,
leading to all sorts of weird usages due to Hyrum's law.
* This is to ease migration from the legacy implementation,
which did permit invalid UTF-8 in proto2.
* The EmitASCII option relies on the ability to always escape
Unicode characters using ASCII escape sequences, but this is not
possible in JSON since the grammar only has an escape sequence defined
for Unicode characters \u0000 to \uffff, inclusive.
However, Unicode v12.0.0 defines characters up to \U0010FFFF,
which is beyond what the JSON grammar provides escape sequences for.
Change-Id: I2b524a904e9ec59f9ed5500e299613bc27c31a14
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/233077
Reviewed-by: Herbie Ong <herbie@google.com>
In the upcoming 3.12.x release of protoc, the proto3 language will be
amended to support true presence for scalars. This CL adds support
to both the generator and runtime to support these semantics.
Newly added public API:
protogen.Plugin.SupportedFeatures
protoreflect.FieldDescriptor.HasPresence
protoreflect.FieldDescriptor.HasOptionalKeyword
protoreflect.OneofDescriptor.IsSynthetic
Change-Id: I7c86bf66d0ae56642109beb5f2132184593747ad
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/230698
Reviewed-by: Damien Neil <dneil@google.com>
The encoding/testprotos and reflect/protoregistry/testprotos are
accessible by other modules. Move them under internal/testprotos
to dissuade programmers who are too lazy to use their own test protos
when they need one.
Change-Id: I3dbfbce74e68ef033ec252bed076861cb47dd21e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/214341
Reviewed-by: Damien Neil <dneil@google.com>
Change tests which use private types registries to use the global one.
Except in cases where we want to explicitly test that the private
registry is used, it's simpler to use the global registry.
Change-Id: I998fb463b6beef91c7f5ce2ca2083251ae24d1db
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/199897
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Herbie Ong <herbie@google.com>
This CL adds support for discarding unknown fields from the input.
We add support for parsing and resolving field numbers, so that
the DiscardUnknown option can ignore all unresolvable fields.
We continue to reject known fields identified by field number
since there are a number of edge cases that a difficult to resolve.
Change-Id: I5c88b7bae8656ce20e85e4b5c92d8564a5ff8bb6
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/195779
Reviewed-by: Herbie Ong <herbie@google.com>
Rename build tag "proto1_legacy" -> "protolegacy"
to be consistent with the "protoreflect" tag.
Rename flag constant "Proto1Legacy" -> "ProtoLegacy" since
it covers more than simply proto1 legacy features.
For example, it covers alpha-features of proto3 that
were eventually removed from the final proto3 release.
Change-Id: I0f4fcbadd4b5a61c87645e2e5be11d187e59157c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189345
Reviewed-by: Damien Neil <dneil@google.com>
Change protoiface.ExtensionDescV1 to implement protoreflect.ExtensionType.
ExtensionDescV1's Name field conflicts with the Descriptor Name method,
so change the protoreflect.{Message,Enum,Extension}Type types to no
longer implement the corresponding Descriptor interface. This also leads
to a clearer distinction between the two types.
Introduce a protoreflect.ExtensionTypeDescriptor type which bridges
between ExtensionType and ExtensionDescriptor.
Add extension accessor functions to the proto package:
proto.{Has,Clear,Get,Set}Extension. These functions take a
protoreflect.ExtensionType parameter, which allows writing the
same function call using either the old or new API:
proto.GetExtension(message, somepb.E_ExtensionFoo)
Fixesgolang/protobuf#908
Change-Id: Ibc65d12a46666297849114fd3aefbc4a597d9f08
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189199
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This CL unifies common MessageSet logic in prototext and protojson
into the messageset package. While we are at it, also enable
MessageSet support only if the proto1_legacy build flag is enabled.
Change-Id: I1a7d475e8bb1dad61ecd286df45e4239e5bef072
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185898
Reviewed-by: Damien Neil <dneil@google.com>
If the message for a weak field is linked in,
we treat it as if it were identical to a normal known field.
However, if the weak field is not linked in,
we treat it as if the field were not known.
Change-Id: I576d911deec98e13211304024a6353734d055465
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185457
Reviewed-by: Herbie Ong <herbie@google.com>
Usage of these is pervasive in code which works with proto2, and proto2
will be with us for a long, long time to come. Move them to the proto
package.
Change-Id: I1b2e57429fd5a8f107a848a4492d20c27f304bd7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185543
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This does not remove all dependencies,
but all of the cases where it can now be implemented in terms of v2.
Change-Id: Idc5b0273f0d35c284bf2141eb9cce998692ceb15
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184878
Reviewed-by: Herbie Ong <herbie@google.com>
Previous code tries to do all-lowercase match all non-extension field
names to match group field names, which is incorrect. Fix to check for
only group field type and make sure that the format is correct as well.
Fixesgolang/protobuf#878.
Fix typo in text proto string in internal/impl/message_test.go that
wasn't caught before due to above issue.
Change-Id: Ief952907306435ed76a095e96e29fcc9c0027b73
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183737
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).
The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.
Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.
Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The wire representation of the contents of an Any is a `bytes` field
containing the the wire encoding of the contained message. The in-memory
representation is just these bytes.
The wire (un)marshaler has no special handling for Any values, and
happily accepts whatever bytes are present.
The text and JSON marshalers will unmarshal the bytes in an Any,
and the unmarshalers will marshal the Any to produce the in-memory
representation. This makes them stricter than the wire (un)marshaler:
Marshaling an Any which contains invalid data to text/JSON fails, while
marshaling it to wire format does not. This does make some sense, since
the Any already contains wire-format data; validation is performed at an
earlier time when producing that data.
This change brings the text and JSON (un)marshal functions a bit more
into alignment with the wire format by never performing checks for
required fields on the contents of an Any.
This has the advantage of consistently performing checks for required
fields exactly once per marshal/unmarshal operation, and over the same
data no matter which encoding is used. It also eliminates the one case
where a required field check is considered "non-fatal", generating an
error but not terminating the (un)marshal operation.
Change-Id: I70c62419d37ea0a07cb73c3ee2d26c0b0bec724b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182982
Reviewed-by: Herbie Ong <herbie@google.com>
Rename each generated protobuf package such that the base of the
Go package path is always equal to the Go package name to follow
proper Go package naming conventions.
The Go package name is derived from the .proto source file name by
replacing ".proto" with "pb" and stripping all underscores.
Change-Id: Iea05d1b5d94b1b2821ae10276ab771bb2df93c0e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177380
Reviewed-by: Damien Neil <dneil@google.com>
While it is general convention that the receiver being mutated
is the first argument, the standard library specifically goes against
this convention when it comes to the Unmarshal function.
Switch the ordering of the Unmarshal function to match the Go stdlib.
Change-Id: I893346680233ef9fec7104415a54a0a7ae353378
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177258
Reviewed-by: Damien Neil <dneil@google.com>
Rename encoding/*pb to follow the convention of prefixing package names
with 'proto':
google.golang.org/protobuf/encoding/protojson
google.golang.org/protobuf/encoding/prototext
Move protogen under a compiler/ directory, just in case we ever do add
more compiler-related packages.
google.golang.org/protobuf/compiler/protogen
Change-Id: I31010cb5cabcea8274fffcac468477b58b56e8eb
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177178
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>