For golang/protobuf#1657
Change-Id: I7b2b0c30506706015ce278e6054439c9ad9ef727
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/634815
TryBot-Bypass: Michael Stapelberg <stapelberg@google.com>
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Damien Neil <dneil@google.com>
Also adds better benchmark cases for large message where some fields are
actually populated.
This change was previously done in Google internal cl/660848520.
Change-Id: I682aae0c9c2850bfe7638de29ab743ad7d7b119a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/609035
Reviewed-by: Christian Höppner <hoeppi@google.com>
Reviewed-by: Cassondra Foesch <cfoesch@gmail.com>
Reviewed-by: Michael Stapelberg <stapelberg@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
+ This change introduce a default and configurable depth limit for
proto.Unmarshal. If a message is nested deeper than the limit,
unmarshaling will fail. There are two ways to nest messages. Either by
having fields which are message types itself or by using groups.
+ The default limit is 10,000 for now. This might change in the future
to align it with other language implementation (C++ and Java use 100
as limit).
+ If pure groups (groups that don't contain message fields) are nested
deeper than the default limit the unmarshaling fails with:
proto: cannot parse invalid wire-format data
+ Note: the configured limit does not apply to pure groups.
+ This change is introduced to improve security and robustness. Because
unmarshaling is implemented using recursion it can lead to stack overflows
for certain inputs. The introduced limit protects against this.
+ A secondary motivation for this limit is the alignment with other
languages. Protocol buffers are a language interoperability mechanism
and thus either all implementations should accept the input or all
implementation should reject the input.
Change-Id: I14bdb44d06e4bd1aa90d6336c2cf6446003b2037
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/385854
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Trust: Damien Neil <dneil@google.com>
Reviewed-by: Nicolas Hillegeer <aktau@google.com>
Reviewed-by: Chressie Himpel <chressie@google.com>
Remove named input argument to be consistent with other methods.
Change-Id: I22c48abf76e007a1319bfb037bbb2b7bfb66cee7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/220688
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
An Is prefix implies it returns a boolean.
A Check prefix better suggests that it could return an error.
Change-Id: I6ffcb32099a944c656c07654c294a0980efb2d0e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/220338
Reviewed-by: Damien Neil <dneil@google.com>
The IsInitialized flag is unused and unnecessary as the error returned
inherently reports whether a message is uninitialized if it is non-nil.
Also cleanup the documentation in protoiface of stale references.
Change-Id: Id75772f4f75f8d7a2a9fbe772d57f7d69ae9fb9d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/220337
Reviewed-by: Damien Neil <dneil@google.com>
Move all fast-path inputs and outputs into the Input/Output structs.
Collapse all booleans into bitfields.
Change-Id: I79ebfbac9cd1d8ef5ec17c4f955311db007391ca
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/219505
Reviewed-by: Joe Tsai <joetsai@google.com>
Add a test to confirm that extensions are lazily decoded when we expect.
Drop the UnmarshalDefaultResolver flag. I added it thinking for some
reason that internal/impl couldn't depend on protoregistry; since it can
(and does), it's simpler to just test if the resolver is the expected
value.
Use a default set of options when lazily unmarshaling extensions.
Change-Id: Ied7666ffdc3bf90630260a80c9568d9a945048bc
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/218038
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Historically, extensions have been placed in the unknown fields section
of the unmarshaled message and decoded lazily on demand. The current
unmarshal implementation decodes extensions eagerly at unmarshal time,
permitting errors to be immediately reported and correctly detecting
unset required fields in extension values.
Add support for validated lazy extension decoding, where the extension
value is fully validated at initial unmarshal time but the fully
unmarshaled message is only created lazily.
Make this behavior conditional on the protolegacy flag for now.
Change-Id: I9d742496a4bd4dafea83fca8619cd6e8d7e65bc3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/216764
Reviewed-by: Joe Tsai <joetsai@google.com>
Change the representation of option flags in protoiface from bools to a
bitfield. This brings the representation of options in protoiface in
sync with that in internal/impl.
This change has several benefits:
1. We will probably find that we need to add more option flags over time.
Converting to the more efficient representation of these flags as high
in the call stack as possible minimizes the performance implication of
the struct growing.
2. On a similar note, this avoids the need to convert from the compact
representation to the larger one when passing from internal/impl to
proto, since the {Marshal,Unmarshal}State methods take the compact form.
3. This removes unused options from protoiface. Instead of documenting
that AllowPartial is always set, we can just not include an AllowPartial
flag in the protoiface options.
4. Conversely, this provides a way to add option flags to protoiface
that we don't want to expose in the proto package.
name old time/op new time/op delta
EmptyMessage/Wire/Marshal-12 11.1ns ± 7% 10.1ns ± 1% -9.35% (p=0.000 n=8+8)
EmptyMessage/Wire/Unmarshal-12 7.07ns ± 0% 6.74ns ± 1% -4.58% (p=0.000 n=8+8)
EmptyMessage/Wire/Validate-12 4.30ns ± 1% 3.80ns ± 8% -11.45% (p=0.000 n=7+8)
RepeatedInt32/Wire/Marshal-12 1.17µs ± 1% 1.21µs ± 7% +4.09% (p=0.000 n=8+8)
RepeatedInt32/Wire/Unmarshal-12 938ns ± 0% 942ns ± 3% ~ (p=0.178 n=7+8)
RepeatedInt32/Wire/Validate-12 521ns ± 4% 543ns ± 7% ~ (p=0.157 n=7+8)
Required/Wire/Marshal-12 97.2ns ± 1% 95.3ns ± 1% -1.98% (p=0.001 n=7+7)
Required/Wire/Unmarshal-12 41.0ns ± 9% 38.6ns ± 3% -5.73% (p=0.048 n=8+8)
Required/Wire/Validate-12 25.4ns ±11% 21.4ns ± 3% -15.62% (p=0.000 n=8+7)
Change-Id: I3ac1b00ab36cfdf61316ec087a5dd20d9248e4f6
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/216760
Reviewed-by: Joe Tsai <joetsai@google.com>
Add a fast check for required fields to the fast path unmarshal.
This is best-effort and will fail to detect some initialized
messages: Messages with more than 64 required fields, messages
split across multiple tags, possibly other cases.
In the cases where it works (which is most of them in practice),
this permits us to skip the IsInitialized check.
Change-Id: I6b70953a333033a5e64fb7ca37a59786cb0f75a0
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/215878
Reviewed-by: Joe Tsai <joetsai@google.com>
Add functions to the proto package which plumb through the fast-path state.
As a sample use case: A followup CL adds an Initialized field to
protoiface.UnmarshalOutput, permitting the unmarshaller to report back
when it can confirm that a message is fully initialized. We want to
preserve that information when an unmarshal operation threads through
the proto package (such as when unmarshaling extensions).
To allow these functions to be added as methods of MarshalOptions and
UnmarshalOptions rather than top-level functions, separate the options
from the input structs.
Also update options passed to fast-path methods to set AllowPartial and
Merge to reflect the expected behavior of those methods. (Always allow
partial, never merge.)
Change-Id: I482477b0c9340793be533e75a86d0bb88708716a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/215877
Reviewed-by: Joe Tsai <joetsai@google.com>
We may want to make changes to the inputs and outputs of the fast-path
functions in the future. For example, we likely want to add the ability
for the fast-path unmarshal to report back whether the unmarshaled
message is known to be initialized.
Change the signatures of these functions to take in and return struct
types which can be extended with whatever fields we want in the future.
Change-Id: Idead360785df730283a4630ea405265b72482e62
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/215719
Reviewed-by: Joe Tsai <joetsai@google.com>
Promote the fast-path magic ProtoMethods method to first-class citizen
of the protoreflect.Message interface.
To avoid polluting the protoreflect package with the various types
required by this method, make the necessary protoiface types unnamed and
duplicate them in protoreflect.
Updates golang/protobuf#1022.
Change-Id: I9595bae40b3bc7536d727fb6f99b3bce8f73da87
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/215718
Reviewed-by: Joe Tsai <joetsai@google.com>
Implementations of the legacy Marshaler type have no way to efficiently
compute the size of the message. Rather than generating an inefficient
fast-path Size method which marshals the message and examines the
length of the result, don't generate a fast-path at all.
Drop the requirement that a fast-path MarshalAppend requires a
corresponding Size.
Avoids O(N^2) behavior when marshaling a legacy Marshaler that
recursively calls proto.Marshal.
Change-Id: I4793cf32275d08f29c8e1a1a44a193d9a5724058
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/213443
Reviewed-by: Joe Tsai <joetsai@google.com>
We change Unmarshal to reset a message by default.
* We add a Merge option to UnmarshalOptions for explicit merging.
* We speed up Reset by checking for the Reset method.
* Remove TODOs in prototext and protojson about reset behavior.
Fixesgolang/protobuf#890
Change-Id: Ibd8963c741053f564acf061fbdb846699942109c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/195457
Reviewed-by: Damien Neil <dneil@google.com>
Change protoV1.ExtensionDesc to directly implement ExtensionType
rather than delegating to one.
Unify the previous types protoiface.ExtensionDescV1 and
filetype.Extension in impl.ExtensionInfo. The protoV1.ExtensionDesc
type becomes an alias to ExtensionInfo.
This gives us:
- Just one implementation of ExtensionType.
- Generated foopb.E_Ext vars are canonical ExtensionTypes.
- Generated foopb.E_Ext vars are also v1.ExtensionDescs for backwards
compatibility.
- Conversion between legacy and modern representations happens
transparently when lazily initializing an ExtensionInfo.
Overall, a simplification for users of generated code, since they can
mostly ignore the ExtensionDesc/ExtentionType distinction and use the
same value in either the old or new API.
This is change 3/5 in a series of commits changing protoV1.ExtensionDesc
to directly implement protoreflect.ExtensionType.
1. [v2] Add protoimpl.ExtensionInfo as an alias for
protoiface.ExtensionDescV1.
2. [v1] Update references to protoimpl.ExtensionInfo to use
protoiface.ExtensionInfo.
3. [v2] Create protoimpl.ExtensionInfo (an alias to a new type in
the impl package) and remove protoiface.ExtensionDescV1.
4. [v1] Remove unneeded explicit conversions between ExtensionDesc and
ExtensionType (since the former now directly implements the latter).
5. [v2] Remove stub conversion functions.
Change-Id: I96ee890541ec11b2412e1a72c9d7b96e4d7f66b4
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189563
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Change protoiface.ExtensionDescV1 to implement protoreflect.ExtensionType.
ExtensionDescV1's Name field conflicts with the Descriptor Name method,
so change the protoreflect.{Message,Enum,Extension}Type types to no
longer implement the corresponding Descriptor interface. This also leads
to a clearer distinction between the two types.
Introduce a protoreflect.ExtensionTypeDescriptor type which bridges
between ExtensionType and ExtensionDescriptor.
Add extension accessor functions to the proto package:
proto.{Has,Clear,Get,Set}Extension. These functions take a
protoreflect.ExtensionType parameter, which allows writing the
same function call using either the old or new API:
proto.GetExtension(message, somepb.E_ExtensionFoo)
Fixesgolang/protobuf#908
Change-Id: Ibc65d12a46666297849114fd3aefbc4a597d9f08
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189199
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The following adjustments were made:
* The pragma.NoUnkeyedLiterals is moved to be the first field.
This is done to keep the options struct smaller. Even if the last
field is zero-length, Go GC implementation details forces the struct
to be padded at the end.
* Methods are documented as always treating AllowPartial as true.
* Added a support flag for UnmarshalOptions.DiscardUnknown.
Change-Id: I1f75d226542ab2bb0123d9cea143c7060df226d8
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185998
Reviewed-by: Damien Neil <dneil@google.com>
Contrary to the WARNING, this package should be stable.
However, it still should not be imported by the end-user.
Change-Id: I05b4b9ebb1e0d28ab626f8c51fbe827b5acf237e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/178545
Reviewed-by: Damien Neil <dneil@google.com>
Instead of accepting a concrete protoregistry.Types type,
accept an interface that provides the necessary functionality
to perform the serialization.
The advantages of this approach:
* There is no need for complex logic to allow a Parent or custom
Resolver on the protoregistry.Types type.
* Users can pass their own custom resolver implementations directly
to the serialization functions.
* This is a more principled approach to plumbing custom resolvers
than the previous approach of overloading behavior on the concrete
Types type.
The disadvantages of this approach:
* A pointer to a concrete type is 8B, while an interface is 16B.
However, the expansion of the {Marshal,Unmarshal}Options structs
should be a concern solved separately from how to plumb custom resolvers.
* The resolver interfaces as defined today may be insufficient to
provide functionality needed in the future if protobuf expands its
feature set. For example, let's suppose the Any message permits
directly representing a enum by name. This would require the ability
to lookup an enum by name. To support that hypothetical need,
we can document that the serializers type-assert the provided Resolver
to a EnumTypeResolver and use that if possible. There is some loss
of type safety with this approach, but provides a clear path forward.
Change-Id: I81ca80e59335d36be6b43d57ec8e17abfdfa3bad
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177044
Reviewed-by: Damien Neil <dneil@google.com>
Temporarily remove go.mod, since we can't generate an accurate one until
the corresponding v1 change is submitted.
Change-Id: I1e1ad97f2b455e33f61ffaeb8676289795e47e72
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177000
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Added methods:
Enum.Descriptor
Message.Descriptor
EnumType.Descriptor
MessageType.Descriptor
ExtensionType.Descriptor
Message.New
All functionality is switched over to use those methods instead of
implicitly relying on the fact that {Enum,Message}Type implicitly
implement the associated descriptor interface.
This CL does not yet remove {Enum,Message}.Type or prevent
{Enum,Message,Extension}Type from implementating a descriptor.
That is a subsequent CL.
The Message.New method is also added to replace functionality
that will be lost when the Type methods are removed.
Change-Id: I7fefde1673bbd40bfdac489aca05cec9a6c98eb1
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/174918
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Herbie Ong <herbie@google.com>
CL/172399 switches the v1 code to eagerly unmarshal extensions.
This CL does the equivalent for v2.
For the test, we simply switch from protoV1.Equal to protoV2.Equal,
since the v2 equal does not magically unmarshal raw extensions.
Change-Id: I6f64455b0a75bbc9a9a82108558641a29bd2b982
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/175838
Reviewed-by: Damien Neil <dneil@google.com>
Using an option instead of a separate method has several useful properties:
It makes it explicit whether the fast-path AppendMarshal is expected to use
cached sizes or not.
It properly plumbs the decision to use cached sizes through the call stack.
Consider the case where message A includes B includes C: If A and C support
cached sizes but B does not, we would like to use the size cache in all
messages which support it. Placing this decision in the options allows this
to work properly with no additional effort.
Placing this option in MarshalOptions permits users to request use of
existing cached sizes. This is a two-edged sword: There are places where
this ability can permit substantial efficiencies, but this is also an
exceedingly sharp-edged API. I believe that on balance the benefits
outweigh the risks, especially since the prerequisites for using
cached sizes are intuitively obvious. (You must have called Size, and
you must not have changed the message.)
This CL adds a Size method to MarshalOptions, rather than adding a SizeOptions
type. Future additions to MarshalOptions may affect the size of the encoded
output (e.g., an option to skip encoding unknown fields) and using the same
options for both Marshal and Size makes it easier to use a consistent
configuration for each.
Change-Id: I6adbb55b717dd03d39f067e1d0b7381945000976
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/171119
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Remove the Reflection field from MarshalOptions and UnmarshalOptions.
Disable the fast path and use the reflection-based implementation when
the 'protoreflect' build tag is set.
Change-Id: Ic674e3af67501de27fb03ec2712fbed40eae7fef
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/170896
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Move all checks for required fields into a proto.IsInitialized function.
Initial testing makes me confident that we can provide a fast-path
implementation of IsInitialized which will perform more than
acceptably. (In the degenerate-but-common case where a message
transitively contains no required fields, this check can be nearly
zero cost.)
Unifying checks into a single function provides consistent behavior
between the wire, text, and json codecs.
Performing the check after decoding eliminates the wire decoder bug
where a split message is incorrectly seen as missing required fields.
Performing the check after decoding also provides consistent and
arguably more correct behavior when the target message was partially
prepopulated.
Change-Id: I9478b7bebb263af00c0d9f66a1f26e31ff553522
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/170787
Reviewed-by: Herbie Ong <herbie@google.com>
Allow message implementations to provide optimized versions of standard
operations. Generated messages now include a ProtoReflectMethods method,
returning a protoiface.Methods struct containing pointers to assorted
optional functions.
The Methods struct also includes a Flags field indicating support for
optional features such as deterministic marshaling.
Implementation of the fast paths (and tests) will come in later CLs.
Change-Id: Idd1beed0ecf43ec5e5e7b8da2ee1e08d3ce32213
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/170340
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
As a goal, v2 should not depend on v1. As another step towards that end,
we move all the types that used to be in the v1 protoapi package over to v2.
For now, we place MessageV1, ExtensionRangeV1, and ExtensionDescV1
in runtime/protoiface since these are types that generated messages will
probably have to reference forever. An alternative location could be
reflect/protoreflect, but it seems unfortunate to have to dirty the
namespace of that package with these types.
We move ExtensionFieldV1, ExtensionFieldsV1, and ExtensionFieldsOf
to internal/impl, since these are related to the implementation of a
generated message.
Since moving these types from v1 to v2 implies that the v1 protoapi
package is useless, we update all usages of v1 protoapi in the v2
repository to point to the relevant v2 type or functionality.
CL/168538 is the corresponding change to alter v1.
There will be a temporary build failure as it is not possible
to submit CL/168519 and CL/168538 atomically.
Change-Id: Ide4025c1b6af5b7f0696f4b65b988b4d10a50f0b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/168519
Reviewed-by: Herbie Ong <herbie@google.com>