The encoding/prototext and encoding/protojson packages are implemented
entirely in terms of protobuf reflection, which side-steps this information.
Remove the hacks in the generator to special-case MessageSet.
Change-Id: I708c4636b77672545a103b7ab686f103b9dfc514
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185240
Reviewed-by: Herbie Ong <herbie@google.com>
We modify protoc-gen-go to stop generating exported XXX fields.
The unsafe implementation is unaffected by this change since unsafe
can access fields regardless of visibility. However, for the purego
implementation, we need to respect Go visibility rules as enforced
by the reflect package.
We work around this by generating an exporter function that, given
a reference to the message and the field to export, returns a reference
to the unexported field value. This exporter function is protected by
a constant such that it is not linked into the final binary in non-purego
build environments.
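A minimal sketch of the exporter idea follows. The Message type, the
purego constant, and the field indices are hypothetical stand-ins and not
the code that protoc-gen-go emits; the sketch only shows how a function
defined in the same package can hand out references to unexported fields.

```go
package main

import "fmt"

// purego is a hypothetical build-time constant. When it is false, the
// compiler can treat the rest of exporter as dead code.
const purego = true

// Message stands in for a generated message with unexported internal fields.
type Message struct {
	Name          string
	sizeCache     int32
	unknownFields []byte
}

// exporter returns a pointer to the i-th unexported field of a *Message.
// It returns nil if v is not a *Message or the index is unknown.
func exporter(v interface{}, i int) interface{} {
	if !purego {
		return nil
	}
	m, ok := v.(*Message)
	if !ok {
		return nil
	}
	switch i {
	case 0:
		return &m.sizeCache
	case 1:
		return &m.unknownFields
	default:
		return nil
	}
}

func main() {
	m := &Message{Name: "example"}
	// Code outside the package would receive exporter as a function value
	// and write through the returned reference.
	*exporter(m, 0).(*int32) = 42
	fmt.Println(m.sizeCache)
}
```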
Updates golang/protobuf#276
Change-Id: Idf5c1f158973fa1c61187ff41440acb21c5dac94
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185141
Reviewed-by: Damien Neil <dneil@google.com>
Associate the oneof wrapper types with a message by conveying that
information to the associated MessageInfo.
Change-Id: Iabfca593850e1d6a89498a37eacbf22dbb73bd20
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185239
Reviewed-by: Damien Neil <dneil@google.com>
This test verifies that the initialization logic does not accidentally
execute the lazy initialization logic that is intended to only evaluate
upon first use.
Change-Id: I5e9dea17ce88081c78195af0e86dd38223689f63
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184259
Reviewed-by: Damien Neil <dneil@google.com>
The code organization is simpler if we keep the functions encoding and
decoding a particular type (e.g., maps) together rather than split
across files.
This rename is happening in a separate CL from cl/185241 to preserve
rename history. (Git gets confused when you rename a->b and b->c in the
same commit.)
Change-Id: Idfbb3ff8cf0db149c68d650f89ff3fb8ac833322
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184942
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The code organization is simpler if we keep the functions encoding and
decoding a particular type (e.g., maps) together rather than split
across files. Rename various "encode" files to "codec" in preparation
for adding fast-path decoding.
Change-Id: If1e271da99d31533ffefc19b1fc847936fa9484a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185241
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The protoreflect.Descriptor.Options method is currently documented as
returning a reference to the options, where the user must not mutate
the returned message. This changes internal/filedesc to avoid returning
a copy of the options by caching the first unmarshal.
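As a rough illustration of caching the first unmarshal, the sketch below
decodes raw option bytes at most once with sync.Once and returns the same
pointer on every call. The descriptor and Options types and the toy decode
step are assumptions made for the example, not the internal/filedesc code.

```go
package main

import (
	"fmt"
	"sync"
)

// Options is a hypothetical stand-in for a decoded options message.
type Options struct{ Deprecated bool }

// descriptor holds raw option bytes and decodes them at most once.
type descriptor struct {
	raw   []byte
	once  sync.Once
	cache *Options
	calls int // counts decodes, for demonstration only
}

// decode is a toy decoder: a first byte of 1 means "deprecated".
func (d *descriptor) decode() *Options {
	d.calls++
	return &Options{Deprecated: len(d.raw) > 0 && d.raw[0] == 1}
}

// Options returns a reference to the cached options.
// Callers must not mutate the returned value.
func (d *descriptor) Options() *Options {
	d.once.Do(func() { d.cache = d.decode() })
	return d.cache
}

func main() {
	d := &descriptor{raw: []byte{1}}
	a, b := d.Options(), d.Options()
	fmt.Println(a == b, d.calls)
}
```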
See golang/protobuf#877
Change-Id: I15701d33fbda7535b21b2add72628b02992c373f
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185197
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Hyrum's Law dictates that if we do not prevent naughty behavior,
people will rely on it. If we do not validate that the provided
file descriptor is correct today, it will be near impossible
to add proper validation checks later on.
The logic added validates that the provided file descriptor is
correct according to the same semantics as protoc,
which was reverse engineered to derive the set of rules implemented here.
The rules are unfortunately complicated because protobuf is a language
full of many non-orthogonal features. While our logic is complicated,
it is still 1/7th the size of the equivalent C++ code!
Change-Id: I6acc5dc3bd2e4c6bea6cd9e81214f8104402602a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184837
Reviewed-by: Damien Neil <dneil@google.com>
Overview of changes:
* Add an option that specifies whether to replace unresolvable references
with a placeholder instead of producing an error. Since the prior behavior
produced placeholders (not always), we default to that behavior for now,
but will enable strict resolving in a future CL.
* The option is not yet exported because there is concern about what the
public API should look like. This will be exposed in a future CL.
* Unlike before, we now permit placeholders for unresolvable enum values.
* We implement relative name resolution logic.
* We handle the case where the type is unknown, but type_name is specified.
In such a case, we populate both FieldDescriptor.{Enum,Message} and leave
the FieldDescriptor.Kind with the zero value. If the type_name happened
to resolve, we use that to determine the type.
* If a placeholder is used to represent a relative name,
the FullName reports an invalid full name with a "*." prefix.
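The relative lookup and the "*." placeholder can be sketched roughly as
follows. This is a simplified illustration that searches from the innermost
scope outward; it does not reproduce every protoc rule, and the resolve
function and the known-name set are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// resolve looks up name relative to scope, searching from the innermost
// scope outward. A leading dot marks name as already fully qualified.
// If a relative name cannot be resolved, the result is a placeholder
// name carrying a "*." prefix.
func resolve(known map[string]bool, scope, name string) (full string, found bool) {
	if strings.HasPrefix(name, ".") {
		full = name[1:]
		return full, known[full]
	}
	for {
		candidate := name
		if scope != "" {
			candidate = scope + "." + name
		}
		if known[candidate] {
			return candidate, true
		}
		if scope == "" {
			return "*." + name, false
		}
		// Drop the last segment of the scope and retry one level out.
		if i := strings.LastIndexByte(scope, '.'); i >= 0 {
			scope = scope[:i]
		} else {
			scope = ""
		}
	}
}

func main() {
	known := map[string]bool{
		"pkg.Outer":       true,
		"pkg.Outer.Inner": true,
		"pkg.Enum":        true,
	}
	fmt.Println(resolve(known, "pkg.Outer", "Inner"))
	fmt.Println(resolve(known, "pkg.Outer", "Enum"))
	fmt.Println(resolve(known, "pkg.Outer", "Missing"))
}
```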
Change-Id: Ifa8c750423c488fb9324eec4d033a2f251505fda
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184317
Reviewed-by: Damien Neil <dneil@google.com>
There is little performance benefit to aliasing the input since we copy
every field except the options. Thus, just go all the way and copy the
options as well and document this as such.
Change-Id: If6ca5ce0ee03c9f76e528023b6056ad99d3ca209
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184879
Reviewed-by: Damien Neil <dneil@google.com>
This does not remove all dependencies,
only those in the cases that can now be implemented in terms of v2.
Change-Id: Idc5b0273f0d35c284bf2141eb9cce998692ceb15
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184878
Reviewed-by: Herbie Ong <herbie@google.com>
Previously, when aberrantLoadMessageDesc returned it was guaranteed
to have initialized the current message through the use of the done signal.
However, this does not guarantee that the descriptor for a cyclic reference
has also finished initialization.
Rather than add more complicated logic to wait until all cyclic references
have finished initializing, just add a global lock for the entire
aberrantLoadMessageDesc function.
This slows down performance, but is easier to reason about.
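The shape of the fix, sketched with hypothetical names (desc, load,
loadLocked, and a toy dependency graph): a single global mutex covers the
whole recursive load, so no other goroutine can observe a descriptor whose
cyclic dependencies are still being initialized.

```go
package main

import (
	"fmt"
	"sync"
)

// desc is a hypothetical descriptor that may refer to other descriptors.
type desc struct {
	name string
	deps []*desc
}

var (
	mu    sync.Mutex               // held for the entire load
	cache = map[string]*desc{}     // finished or in-progress descriptors
	graph = map[string][]string{   // toy dependency graph with a cycle
		"A": {"B"},
		"B": {"A"},
	}
)

// load returns a descriptor whose whole dependency graph, cycles included,
// is initialized by the time the lock is released.
func load(name string) *desc {
	mu.Lock()
	defer mu.Unlock()
	return loadLocked(name)
}

// loadLocked does the recursive work. The caller must hold mu.
func loadLocked(name string) *desc {
	if d, ok := cache[name]; ok {
		return d // possibly in progress, but only visible under the lock
	}
	d := &desc{name: name}
	cache[name] = d // register before recursing so cycles terminate
	for _, dep := range graph[name] {
		d.deps = append(d.deps, loadLocked(dep))
	}
	return d
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			a := load("A")
			fmt.Println(a.deps[0].deps[0] == a)
		}()
	}
	wg.Wait()
}
```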
Change-Id: I4cdae8b955f71ee40fa6979f5a8d548d9749042c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184657
Reviewed-by: Damien Neil <dneil@google.com>
This fixes a bug introduced by CL/182360.
Overview of the problem:
* CL/182360 removes the internal/prototype package, such that
protodesc was re-implemented using internal/filedesc.
* As a result of that change, resolving internal dependencies became
the responsibility of protodesc.
* Dependency resolution used the following two-pass algorithm:
1) first pass derives the full name of all declarations
2) second pass fully initializes each descriptor declaration,
now being able to resolve local dependencies from the previous step.
* When the second pass looks up a local dependency, it is guaranteed to
find it, but it is not guaranteed that the dependency has been initialized
(since it may appear later on). This is problematic for default enum values
since it implies that the enum dependency may not be sufficiently
initialized to be able to query its set of values, leading to panics.
* CL/182360 recognized the problem and attempted to enforce an initialization
ordering where nested enums were always initialized before the body of the
message declaration itself.
* However, that ordering fails to enforce that enum declarations outside
the parent tree are initialized beforehand. For example, referring to an
enum value that is declared within a sibling of the parent message.
* This CL fixes the problem with a three-pass algorithm:
1) first pass derives the full name *and* fully initializes the
entire descriptor *except* for dependency references (i.e., type_name).
2) second pass only resolves dependency references,
where we do not need to worry about initialization ordering.
3) third pass validates the descriptors are well-formed.
This can now depend on all information being fully initialized.
* While a lot of code moves, this change is actually very mechanical.
Other than splitting things apart, no logic is introduced or removed.
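The three passes can be sketched on a toy model in which fields refer to
enums by name and may carry a default enum value. All types and the build
function here are hypothetical; the point is only that pass 1 finishes
initializing every declaration before pass 2 performs any lookup, so
declaration order no longer matters.

```go
package main

import "fmt"

// rawEnum and rawField are hypothetical raw declarations.
type rawEnum struct {
	Name   string
	Values []string
}

type rawField struct {
	Name, TypeName, Default string
}

type enumDesc struct {
	name   string
	values map[string]bool
}

type fieldDesc struct {
	name, typeName, def string
	enum                *enumDesc
}

func build(enums []rawEnum, fields []rawField) ([]*fieldDesc, error) {
	// Pass 1: initialize everything except dependency references.
	byName := make(map[string]*enumDesc)
	for _, e := range enums {
		d := &enumDesc{name: e.Name, values: make(map[string]bool)}
		for _, v := range e.Values {
			d.values[v] = true
		}
		byName[e.Name] = d
	}
	var out []*fieldDesc
	for _, f := range fields {
		out = append(out, &fieldDesc{name: f.Name, typeName: f.TypeName, def: f.Default})
	}
	// Pass 2: resolve dependency references only.
	for _, f := range out {
		e, ok := byName[f.typeName]
		if !ok {
			return nil, fmt.Errorf("field %s: unknown type %q", f.name, f.typeName)
		}
		f.enum = e
	}
	// Pass 3: validate, relying on everything being fully initialized.
	for _, f := range out {
		if f.def != "" && !f.enum.values[f.def] {
			return nil, fmt.Errorf("field %s: invalid default %q", f.name, f.def)
		}
	}
	return out, nil
}

func main() {
	fields := []rawField{{Name: "color", TypeName: "Color", Default: "RED"}}
	enums := []rawEnum{{Name: "Color", Values: []string{"RED", "GREEN"}}}
	out, err := build(enums, fields)
	fmt.Println(len(out), err)
}
```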
Change-Id: Ia91d4aade8f6187c19d704d43ae96b3b9d276792
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184297
Reviewed-by: Damien Neil <dneil@google.com>
The aberrant support logic only has access to the Go type
information, and not a concrete value. However, the XXX_MessageName
method exists on some hacky dynamic proto implementations where
it is only valid to call on a concrete value, not just a newly created
instance of the given type.
However, from the perspective of the support logic, it is impossible
to distinguish between dynamic messages and hand-crafted custom messages.
Thus, just drop support for XXX_MessageName. We won't get the full name
of the message right, but oh well, what can we do.
Change-Id: Icc272861e11a355639fb82a991ca2854a9edc0c7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184557
Reviewed-by: Damien Neil <dneil@google.com>
Aberrant messages are hand-crafted messages that happen to work because
they use the same struct tags that generated code emits.
This happens to work in v1, but is unspecified behavior and entirely outside
the compatibility promise.
Support for this was added early on in the history of the v2 implementation,
but entirely untested. It was removed in CL/182360 to reduce the
technical debt of the legacy implementation. Unfortunately, a sufficient number
of targets do rely on this aberrant support, so it is being added back.
The logic being added is essentially the same thing as the previous logic,
but ported to use internal/filedesc instead of the now deleted
internal/prototype package.
Change-Id: Ib5cab3e90480825b9615db358044ce05a14b05bd
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184517
Reviewed-by: Damien Neil <dneil@google.com>
Move data used by the fast-path implementations into a substructure of
MessageInfo and initialize it separately.
Change-Id: Ib855ee8ea5cb0379528b52ba0e191319aa5e2dff
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184077
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Len looks like it should be O(1), but the need to check for
non-zero-length repeated fields makes it at minimum O(n) where n is
the number of repeated fields. In practice, it's O(n) where n is the
number of fields altogether.
The Len function is not especially useful, easily duplicated with Range
and a counter, and can be surprisingly inefficient. Drop it.
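For callers that did rely on Len, the replacement is a few lines. The
message type and its Range method below are hypothetical stand-ins for the
reflective interface; the helper makes the linear cost visible at the call
site.

```go
package main

import "fmt"

// message is a hypothetical stand-in for a reflective message whose
// populated fields are visited by Range.
type message struct {
	fields map[string]interface{}
}

// Range calls f for each populated field until f returns false.
func (m *message) Range(f func(name string, v interface{}) bool) {
	for name, v := range m.fields {
		if !f(name, v) {
			return
		}
	}
}

// countFields does what Len did, using Range and a counter.
func countFields(m *message) int {
	n := 0
	m.Range(func(string, interface{}) bool {
		n++
		return true
	})
	return n
}

func main() {
	m := &message{fields: map[string]interface{}{"id": 1, "name": "x"}}
	fmt.Println(countFields(m))
}
```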
Change-Id: I24b27433217e131e842bd18dd58475bcdf62ef97
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183678
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The previous code tried to match group field names by doing an
all-lowercase match against all non-extension field names, which is
incorrect. Fix it to check only fields of group type and to make sure
that the format is correct as well.
Fixes golang/protobuf#878.
Fix typo in text proto string in internal/impl/message_test.go that
wasn't caught before due to above issue.
Change-Id: Ief952907306435ed76a095e96e29fcc9c0027b73
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183737
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This is a breaking change.
The replacement is the Files.FindDescriptorByName method,
which is more flexible as it handles all descriptor types.
Change-Id: I2ccd544a7630396a2428b1d41f836c5246070912
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183700
Reviewed-by: Damien Neil <dneil@google.com>
Preserving the unknown enum in the String method helps errors
produced by reflect/protodesc be more informative.
Change-Id: I8efb09cb3c744bf4483b310053df7686da540387
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183699
Reviewed-by: Damien Neil <dneil@google.com>
This change makes it such that Files now functionally registers all
descriptors in a file: not just enums, messages, extensions, and services,
but also enum values, message fields/oneofs, and service methods.
The ability to look up any descriptor by full name is needed to:
1) properly detect namespace conflicts on enum values
2) properly implement the relative name lookup logic in reflect/protodesc
The approach taken:
1) Assumes that a FileDescriptor has no internal name conflicts.
This will (in a future CL) be guaranteed by reflect/protodesc and
is guaranteed today by protoc for generated descriptors.
2) Observes that the only declarations that can possibly conflict
with another file are top-level declarations (i.e., enums, enum values,
messages, extensions, and services). Enum values are annoying
since they live in the same scope as the parent enum, rather than
being under the enum.
For the internal data structure of Files, we only register the top-level
declarations. This is the bare minimum needed to detect whether the file
being registered has any namespace conflicts with previously registered files.
We shift the effort to lookups, where we now need to peel off the end fragments
of a full name until we find a match in the internal registry. If a match
is found, we may need to descend into that declaration to find a nested
declaration by name.
For initialization, we modify internal/filedesc to initialize the
enum values for all top-level enums. This performance cost is offset
by the fact that Files.Register now avoids internally registering
nested enums, messages, and extensions.
For lookup, the cost has shifted from O(1) to O(N),
where N is the number of segments in the full name.
Top-level descriptors still have O(1) lookup times.
Nested descriptors have O(M) lookup times,
where M is the level of nesting within a single file.
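A sketch of the lookup, under the assumption that only top-level
declarations are stored in the registry and that each declaration keeps its
nested declarations by short name. The node and registry types are
hypothetical and much simpler than the real Files implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical declaration with nested declarations by short name.
type node struct {
	fullName string
	children map[string]*node
}

// registry holds only top-level declarations, keyed by full name.
type registry map[string]*node

// find peels trailing segments off name until a registered top-level
// declaration matches, then descends through the peeled segments.
func (r registry) find(name string) *node {
	prefix, rest := name, []string(nil)
	for prefix != "" {
		if n, ok := r[prefix]; ok {
			for _, seg := range rest {
				if n = n.children[seg]; n == nil {
					return nil
				}
			}
			return n
		}
		i := strings.LastIndexByte(prefix, '.')
		if i < 0 {
			return nil
		}
		// Move the last segment from prefix to the front of rest.
		rest = append([]string{prefix[i+1:]}, rest...)
		prefix = prefix[:i]
	}
	return nil
}

func main() {
	inner := &node{fullName: "pkg.Outer.Inner"}
	outer := &node{fullName: "pkg.Outer", children: map[string]*node{"Inner": inner}}
	r := registry{"pkg.Outer": outer}
	fmt.Println(r.find("pkg.Outer.Inner").fullName)
}
```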
Change-Id: I950163423431f04a503b6201ddcc20a62ccba017
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183697
Reviewed-by: Damien Neil <dneil@google.com>
The dynamicpb package permits creating Message values from a
MessageDescriptor.
Change-Id: Ice429ae45a0835dffb5a7ec8c0bd2c1df7aac8a2
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/174960
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Move the benchmarks using the common protobuf datasets out of proto/ and
into their own directory. Add benchmarks for text and JSON.
Move initialization out of the Benchmark function to avoid including it
in CPU/memory profiles.
We could put benchmarks in each individual package (proto, prototext,
etc.), but the need for common infrastructure around managing the test
data makes it simpler to keep the benchmarks together. Also, it's nice
to have a one-stop overview of performance.
Change-Id: I17c37efb91b2413fc43ab1b4c35bff2e1330bc0a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183245
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This currently returns uninformative errors from the fast path and then
consults the slow, reflection-based path only when an error is detected.
Perhaps it's worth going through the effort of producing better errors
directly on the fast path.
Change-Id: I68536e9438010dbd97dbaff4f47b78430221d94b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/171462
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).
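The check itself is small. The sketch below uses the standard library's
utf8.ValidString and aborts with no partial output; the marshalString
function and the error value are hypothetical and stand in for the real
encoder.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

var errInvalidUTF8 = errors.New("string field contains invalid UTF-8")

// marshalString returns the bytes for a string field, or an error if s is
// not valid UTF-8. No partial output is returned on failure.
func marshalString(s string) ([]byte, error) {
	if !utf8.ValidString(s) {
		return nil, errInvalidUTF8
	}
	return []byte(s), nil
}

func main() {
	_, err := marshalString("caf\xc3\xa9") // valid UTF-8
	fmt.Println(err)
	_, err = marshalString("\xff") // a lone 0xff byte is never valid UTF-8
	fmt.Println(err)
}
```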
The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.
Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.
Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The primary (cross-language) protobuf repository contains benchmark data
sets. Add benchmarks using this data. (A version of this benchmark exists
in the protobuf repository, but it uses the v1 API and isn't trivial to
get working.)
Fetch the small benchmark datasets from the
github.com/protocolbuffers/protobuf repo by default. Add a
download_benchdata.bash script which fetches the larger datasets as
well.
Generate necessary packages under internal/testprotos/benchmarks.
To run:
go test ./proto -bench=BenchmarkData
Usual caveats about benchmarking apply: While these benchmarks use
realistic data, isolated microbenchmarking of proto operations is not
necessarily representative of performance in production systems.
Change-Id: I58d107554baf104568c86997b5ad50be8b2a5790
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183297
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Fix a bug where using Message.Range alone does not clear
memory references for empty, but allocated lists and maps.
Thus, we iterate over every known field and explicitly call Clear.
Change-Id: I9c1847d4056cf66f3199947150f3140d0783444a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183197
Reviewed-by: Damien Neil <dneil@google.com>
If we were starting from scratch, we would not have added the enum maps.
However, they already exist and see a fair amount of usage.
The effort to remove them is not worth it. Thus, remove the deprecation
warning since they are here to stay.
Note that the generated code does not refer to the generated enum maps.
One day, the linker should be able to elide them if unused by the user.
However, https://golang.org/issue/2559 would need to be resolved first.
Change-Id: Ia8b9b1812b5d8462ca2fa1d543170e4a09ff9e4f
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183177
Reviewed-by: Damien Neil <dneil@google.com>
The internal/fileinit package is split apart into two packages:
* internal/filedesc constructs descriptors from the raw proto.
It is very similar to the previous internal/fileinit package.
* internal/filetype wraps descriptors with Go type information
Overview:
* The internal/fileinit package will be deleted in a future CL.
It is kept around since the v1 repo currently depends on it.
* The internal/prototype package is deleted. All former usages of it
are now using internal/filedesc instead. Most significantly,
the reflect/protodesc package was almost entirely re-written.
* The internal/impl package drops support for messages that do not
have a Descriptor method (pre-2016). This removes a significant amount
of technical debt.
filedesc.Builder to parse raw descriptors.
* The internal/encoding/defval package now handles enum values by name.
Change-Id: I3957bcc8588a70470fd6c7de1122216b80615ab7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182360
Reviewed-by: Damien Neil <dneil@google.com>
The wire representation of the contents of an Any is a `bytes` field
containing the wire encoding of the contained message. The in-memory
representation is just these bytes.
The wire (un)marshaler has no special handling for Any values, and
happily accepts whatever bytes are present.
The text and JSON marshalers will unmarshal the bytes in an Any,
and the unmarshalers will marshal the Any to produce the in-memory
representation. This makes them stricter than the wire (un)marshaler:
Marshaling an Any which contains invalid data to text/JSON fails, while
marshaling it to wire format does not. This does make some sense, since
the Any already contains wire-format data; validation is performed at an
earlier time when producing that data.
This change brings the text and JSON (un)marshal functions a bit more
into alignment with the wire format by never performing checks for
required fields on the contents of an Any.
This has the advantage of consistently performing checks for required
fields exactly once per marshal/unmarshal operation, and over the same
data no matter which encoding is used. It also eliminates the one case
where a required field check is considered "non-fatal", generating an
error but not terminating the (un)marshal operation.
Change-Id: I70c62419d37ea0a07cb73c3ee2d26c0b0bec724b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182982
Reviewed-by: Herbie Ong <herbie@google.com>
The text and JSON encodings for the google.protobuf.Any well-known type
require a call to proto.Unmarshal. Plumb through the resolver from the
UnmarshalOptions.
Change-Id: Iccc1a9d56acd9dd214f2b289216bd50acc2ef074
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182980
Reviewed-by: Herbie Ong <herbie@google.com>
The change to make protodesc.NewFile take an interface rather than a
concrete type means that NewFile(f, nil) now causes a panic. (A nil
*protoregistry.Files is valid.)
Fix this panic by using a default, empty registry when NewFile's second
parameter is nil.
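The underlying Go behavior can be shown in isolation. Calling a method on a
nil pointer is fine if the method tolerates a nil receiver, but calling a
method on a nil interface value panics. The Resolver interface, the Files
type, and newFile below are simplified, hypothetical stand-ins for the
real API.

```go
package main

import "fmt"

// Resolver is a hypothetical interface for looking up files by path.
type Resolver interface {
	FindFileByPath(path string) (string, error)
}

// Files is a registry whose methods tolerate a nil receiver.
type Files struct{ byPath map[string]string }

func (r *Files) FindFileByPath(path string) (string, error) {
	if r == nil {
		return "", fmt.Errorf("%s: not found", path)
	}
	if f, ok := r.byPath[path]; ok {
		return f, nil
	}
	return "", fmt.Errorf("%s: not found", path)
}

// newFile resolves each import through r. A nil interface value cannot
// have methods called on it, so it is replaced with an empty registry.
func newFile(imports []string, r Resolver) error {
	if r == nil {
		r = (*Files)(nil)
	}
	for _, path := range imports {
		if _, err := r.FindFileByPath(path); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fmt.Println(newFile(nil, nil))
	fmt.Println(newFile([]string{"a.proto"}, nil))
}
```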
Change-Id: I70a1f0759e7ea5b57fba5b6123ee85188f4d560c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182979
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
CL/177044 switches the serialization APIs to take in a resolver interface.
This does the moral equivalent for protodesc.
This is technically a breaking change since the signature of NewFile changes.
However, it is unlikely that anything is affected by this.
Change-Id: I7b44d5c3d5570a17c052add4d229550e4a0ad163
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182638
Reviewed-by: Damien Neil <dneil@google.com>
This is a breaking change in light of new methods added in CL/175458.
This removes:
Message.KnownFields: equivalent functionality has been hoisted up
to the Message interface itself.
Message.UnknownFields: equivalent functionality is via
the Message.{Set,Get}Unknown methods.
Change-Id: Ia08b26894d2b45033a6ad6616258ff0fb9f8b7a4
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182597
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This is a breaking change in light of new methods added in CL/174918.
This removes:
Enum.Type: replacement is Descriptor
Message.Type: replacement is Descriptor and New
Change-Id: Iaa5328795407c8401ef14ed038bd5ace19d8e03b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/174938
Reviewed-by: Damien Neil <dneil@google.com>
This is a breaking change in light of new methods added in CL/176977.
This removes:
FieldDescriptor.Oneof: replacement is ContainingOneof
FieldDescriptor.Extendee: replacement is IsExtension and ContainingMessage
Change-Id: I82008e46fb3b80de8e8a0ac42afc54e8c4b67411
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/176942
Reviewed-by: Damien Neil <dneil@google.com>
This is a breaking change.
The equivalent replacement logic is to trivially check whether the
parent descriptor is not nil.
Change-Id: I5c89c1d9f29f9e6f721bbfbcf7774188d8f0086a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/175987
Reviewed-by: Damien Neil <dneil@google.com>
This is a breaking change in light of API added in CL/182497.
This removes:
Files.RangeFilesByPath: replacement is Files.FindFileByPath
Change-Id: I47bf59b37c355844984661056212953853a0db51
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182537
Reviewed-by: Damien Neil <dneil@google.com>
Previously, we liberally permitted multiple files to be registered that
have the same path. However, doing so causes complexity in various places
that need to assume that file paths are unique. Since unique paths are
the intention of the proto language, we strictly enforce that now.
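In sketch form, with a hypothetical files type rather than the real
registry, strict enforcement means registration fails on a duplicate path,
which in turn lets a lookup by path return a single result.

```go
package main

import "fmt"

// files is a hypothetical registry keyed by file path.
type files struct {
	byPath map[string]string // path -> package name
}

// register adds a file and rejects a path that is already registered.
func (r *files) register(path, pkg string) error {
	if _, ok := r.byPath[path]; ok {
		return fmt.Errorf("file %q is already registered", path)
	}
	if r.byPath == nil {
		r.byPath = make(map[string]string)
	}
	r.byPath[path] = pkg
	return nil
}

// findFileByPath can return a single result because paths are unique.
func (r *files) findFileByPath(path string) (string, bool) {
	pkg, ok := r.byPath[path]
	return pkg, ok
}

func main() {
	var r files
	fmt.Println(r.register("a.proto", "pkg.a"))
	fmt.Println(r.register("a.proto", "pkg.b"))
}
```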
Change-Id: Ie8fdd57c824c9809a51859cf20c4bc477b6871be
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182497
Reviewed-by: Damien Neil <dneil@google.com>