11 Commits

Author SHA1 Message Date
Joe Tsai
cd4a31e202 encoding/prototext: add MarshalOptions.EmitUnknown
This changes text marshaling to avoid unknown fields by default
and instead adds an option so that unknown fields be emitted.
This ensures that the default marshal/unknown can round-trip.

Change-Id: I85c84ba6ab7916d538ec6bfd4e9d399a8fcba14e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/195778
Reviewed-by: Herbie Ong <herbie@google.com>
2019-09-17 02:56:29 +00:00
Herbie Ong
9e356dea53 encoding/prototext: document unstable marshal output
Fixes golang/protobuf#920.

Change-Id: I04c12de9a662eb67994fc7eeceee1af4a9efee55
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/188937
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-09-07 00:56:21 +00:00
Joe Tsai
1799d1111a all: rename tag and flags for legacy support
Rename build tag "proto1_legacy" -> "protolegacy"
to be consistent with the "protoreflect" tag.

Rename flag constant "Proto1Legacy" -> "ProtoLegacy" since
it covers more than simply proto1 legacy features.
For example, it covers alpha-features of proto3 that
were eventually removed from the final proto3 release.

Change-Id: I0f4fcbadd4b5a61c87645e2e5be11d187e59157c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189345
Reviewed-by: Damien Neil <dneil@google.com>
2019-08-08 20:49:00 +00:00
Joe Tsai
5ae10aa9f0 encoding: unify MessageSet extension handling logic
This CL unifies common MessageSet logic in prototext and protojson
into the messageset package. While we are at it, also enable
MessageSet support only if the proto1_legacy build flag is enabled.

Change-Id: I1a7d475e8bb1dad61ecd286df45e4239e5bef072
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185898
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-15 21:07:58 +00:00
Joe Tsai
d421150c3b reflect/protoreflect: add Enum.Type and Message.Type
CL/174938 removed these methods in favor of a method that returned
only the descriptors. This CL adds back in the Type methods alongside
the Descriptor methods.

In a vast majority of protobuf usages, only the descriptor information
is needed. However, there is a small percentage that legitimately needs
the Go type information. We should provide both, but document that the
descriptor-only information is preferred.

Change-Id: Ia0a098997fb1bd009994940ae8ea5257ccd87cae
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184578
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-12 23:43:21 +00:00
Damien Neil
8c86fc5e7d all: remove non-fatal UTF-8 validation errors (and non-fatal in general)
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).

The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.

Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.

Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-20 20:55:13 +00:00
Damien Neil
0c9f0a9c5d encoding/protojson, encoding/prototext: always allow partial Any
The wire representation of the contents of an Any is a `bytes` field
containing the the wire encoding of the contained message. The in-memory
representation is just these bytes.

The wire (un)marshaler has no special handling for Any values, and
happily accepts whatever bytes are present.

The text and JSON marshalers will unmarshal the bytes in an Any,
and the unmarshalers will marshal the Any to produce the in-memory
representation. This makes them stricter than the wire (un)marshaler:
Marshaling an Any which contains invalid data to text/JSON fails, while
marshaling it to wire format does not. This does make some sense, since
the Any already contains wire-format data; validation is performed at an
earlier time when producing that data.

This change brings the text and JSON (un)marshal functions a bit more
into alignment with the wire format by never performing checks for
required fields on the contents of an Any.

This has the advantage of consistently performing checks for required
fields exactly once per marshal/unmarshal operation, and over the same
data no matter which encoding is used. It also eliminates the one case
where a required field check is considered "non-fatal", generating an
error but not terminating the (un)marshal operation.

Change-Id: I70c62419d37ea0a07cb73c3ee2d26c0b0bec724b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182982
Reviewed-by: Herbie Ong <herbie@google.com>
2019-06-20 00:12:13 +00:00
Damien Neil
95faac20ee encoding/protojson, encoding/prototext: set wire unmarshal resolver
The text and JSON encodings for the google.protobuf.Any well-known type
require a call to proto.Unmarshal. Plumb through the resolver from the
UnmarshalOptions.

Change-Id: Iccc1a9d56acd9dd214f2b289216bd50acc2ef074
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182980
Reviewed-by: Herbie Ong <herbie@google.com>
2019-06-19 17:54:09 +00:00
Joe Tsai
378c1329de reflect/protoreflect: add alternative message reflection API
Added API:
	Message.Len
	Message.Range
	Message.Has
	Message.Clear
	Message.Get
	Message.Set
	Message.Mutable
	Message.NewMessage
	Message.WhichOneof
	Message.GetUnknown
	Message.SetUnknown

Deprecated API (to be removed in subsequent CL):
	Message.KnownFields
	Message.UnknownFields

The primary difference with the new API is that the top-level
Message methods are keyed by FieldDescriptor rather than FieldNumber
with the following semantics:
* For known fields, the FieldDescriptor must exactly match the
field descriptor known by the message.
* For extension fields, the FieldDescriptor must implement ExtensionType,
where ContainingMessage.FullName matches the message name, and
the field number is within the message's extension range.
When setting an extension field, it automatically stores
the extension type information.
* Extension fields are always considered nullable,
implying that repeated extension fields are nullable.
That is, you can distinguish between a unpopulated list and an empty list.
* Message.Get always returns a valid Value even if unpopulated.
The behavior is already well-defined for scalars, but for unpopulated
composite types, it now returns an empty read-only version of it.

Change-Id: Ia120630b4db221aeaaf743d0f64160e1a61a0f61
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/175458
Reviewed-by: Damien Neil <dneil@google.com>
2019-06-17 17:33:24 +00:00
Joe Tsai
1c28304cc5 proto, encoding/protojson, encoding/prototext: use Resolver interface
Instead of accepting a concrete protoregistry.Types type,
accept an interface that provides the necessary functionality
to perform the serialization.

The advantages of this approach:
* There is no need for complex logic to allow a Parent or custom
Resolver on the protoregistry.Types type.
* Users can pass their own custom resolver implementations directly
to the serialization functions.
* This is a more principled approach to plumbing custom resolvers
than the previous approach of overloading behavior on the concrete
Types type.

The disadvantages of this approach:
* A pointer to a concrete type is 8B, while an interface is 16B.
However, the expansion of the {Marshal,Unmarshal}Options structs
should be a concern solved separately from how to plumb custom resolvers.
* The resolver interfaces as defined today may be insufficient to
provide functionality needed in the future if protobuf expands its
feature set. For example, let's suppose the Any message permits
directly representing a enum by name. This would require the ability
to lookup an enum by name. To support that hypothetical need,
we can document that the serializers type-assert the provided Resolver
to a EnumTypeResolver and use that if possible. There is some loss
of type safety with this approach, but provides a clear path forward.

Change-Id: I81ca80e59335d36be6b43d57ec8e17abfdfa3bad
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177044
Reviewed-by: Damien Neil <dneil@google.com>
2019-05-22 19:38:45 +00:00
Damien Neil
5c5b531562 protogen, encoding/jsonpb, encoding/textpb: rename packages
Rename encoding/*pb to follow the convention of prefixing package names
with 'proto':

	google.golang.org/protobuf/encoding/protojson
	google.golang.org/protobuf/encoding/prototext

Move protogen under a compiler/ directory, just in case we ever do add
more compiler-related packages.

	google.golang.org/protobuf/compiler/protogen

Change-Id: I31010cb5cabcea8274fffcac468477b58b56e8eb
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177178
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-05-14 20:33:22 +00:00