Commit Graph

7 Commits

Author SHA1 Message Date
Herbie Ong
20aefe9c5e encoding/prototext: fix parsing of field names
Previous code tries to do all-lowercase match all non-extension field
names to match group field names, which is incorrect. Fix to check for
only group field type and make sure that the format is correct as well.

Fixes golang/protobuf#878.

Fix typo in text proto string in internal/impl/message_test.go that
wasn't caught before due to above issue.

Change-Id: Ief952907306435ed76a095e96e29fcc9c0027b73
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183737
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-25 21:57:50 +00:00
Damien Neil
8c86fc5e7d all: remove non-fatal UTF-8 validation errors (and non-fatal in general)
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).

The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.

Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.

Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-20 20:55:13 +00:00
Damien Neil
0c9f0a9c5d encoding/protojson, encoding/prototext: always allow partial Any
The wire representation of the contents of an Any is a `bytes` field
containing the the wire encoding of the contained message. The in-memory
representation is just these bytes.

The wire (un)marshaler has no special handling for Any values, and
happily accepts whatever bytes are present.

The text and JSON marshalers will unmarshal the bytes in an Any,
and the unmarshalers will marshal the Any to produce the in-memory
representation. This makes them stricter than the wire (un)marshaler:
Marshaling an Any which contains invalid data to text/JSON fails, while
marshaling it to wire format does not. This does make some sense, since
the Any already contains wire-format data; validation is performed at an
earlier time when producing that data.

This change brings the text and JSON (un)marshal functions a bit more
into alignment with the wire format by never performing checks for
required fields on the contents of an Any.

This has the advantage of consistently performing checks for required
fields exactly once per marshal/unmarshal operation, and over the same
data no matter which encoding is used. It also eliminates the one case
where a required field check is considered "non-fatal", generating an
error but not terminating the (un)marshal operation.

Change-Id: I70c62419d37ea0a07cb73c3ee2d26c0b0bec724b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182982
Reviewed-by: Herbie Ong <herbie@google.com>
2019-06-20 00:12:13 +00:00
Joe Tsai
378c1329de reflect/protoreflect: add alternative message reflection API
Added API:
	Message.Len
	Message.Range
	Message.Has
	Message.Clear
	Message.Get
	Message.Set
	Message.Mutable
	Message.NewMessage
	Message.WhichOneof
	Message.GetUnknown
	Message.SetUnknown

Deprecated API (to be removed in subsequent CL):
	Message.KnownFields
	Message.UnknownFields

The primary difference with the new API is that the top-level
Message methods are keyed by FieldDescriptor rather than FieldNumber
with the following semantics:
* For known fields, the FieldDescriptor must exactly match the
field descriptor known by the message.
* For extension fields, the FieldDescriptor must implement ExtensionType,
where ContainingMessage.FullName matches the message name, and
the field number is within the message's extension range.
When setting an extension field, it automatically stores
the extension type information.
* Extension fields are always considered nullable,
implying that repeated extension fields are nullable.
That is, you can distinguish between a unpopulated list and an empty list.
* Message.Get always returns a valid Value even if unpopulated.
The behavior is already well-defined for scalars, but for unpopulated
composite types, it now returns an empty read-only version of it.

Change-Id: Ia120630b4db221aeaaf743d0f64160e1a61a0f61
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/175458
Reviewed-by: Damien Neil <dneil@google.com>
2019-06-17 17:33:24 +00:00
Joe Tsai
1c28304cc5 proto, encoding/protojson, encoding/prototext: use Resolver interface
Instead of accepting a concrete protoregistry.Types type,
accept an interface that provides the necessary functionality
to perform the serialization.

The advantages of this approach:
* There is no need for complex logic to allow a Parent or custom
Resolver on the protoregistry.Types type.
* Users can pass their own custom resolver implementations directly
to the serialization functions.
* This is a more principled approach to plumbing custom resolvers
than the previous approach of overloading behavior on the concrete
Types type.

The disadvantages of this approach:
* A pointer to a concrete type is 8B, while an interface is 16B.
However, the expansion of the {Marshal,Unmarshal}Options structs
should be a concern solved separately from how to plumb custom resolvers.
* The resolver interfaces as defined today may be insufficient to
provide functionality needed in the future if protobuf expands its
feature set. For example, let's suppose the Any message permits
directly representing a enum by name. This would require the ability
to lookup an enum by name. To support that hypothetical need,
we can document that the serializers type-assert the provided Resolver
to a EnumTypeResolver and use that if possible. There is some loss
of type safety with this approach, but provides a clear path forward.

Change-Id: I81ca80e59335d36be6b43d57ec8e17abfdfa3bad
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177044
Reviewed-by: Damien Neil <dneil@google.com>
2019-05-22 19:38:45 +00:00
Joe Tsai
cdb7773569 encoding: switch ordering of Unmarshal arguments
While it is general convention that the receiver being mutated
is the first argument, the standard library specifically goes against
this convention when it comes to the Unmarshal function.

Switch the ordering of the Unmarshal function to match the Go stdlib.

Change-Id: I893346680233ef9fec7104415a54a0a7ae353378
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177258
Reviewed-by: Damien Neil <dneil@google.com>
2019-05-14 23:18:07 +00:00
Damien Neil
5c5b531562 protogen, encoding/jsonpb, encoding/textpb: rename packages
Rename encoding/*pb to follow the convention of prefixing package names
with 'proto':

	google.golang.org/protobuf/encoding/protojson
	google.golang.org/protobuf/encoding/prototext

Move protogen under a compiler/ directory, just in case we ever do add
more compiler-related packages.

	google.golang.org/protobuf/compiler/protogen

Change-Id: I31010cb5cabcea8274fffcac468477b58b56e8eb
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177178
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-05-14 20:33:22 +00:00