Commit Graph

15 Commits

Author SHA1 Message Date
Joe Tsai
afb455eaf8 cmd/protoc-gen-go: correlate v1 ExtensionDesc with v2 ExtensionType
Unfortunately a good amount of code uses pointer comparisons on the
v1 ExtensionDesc to determine exactly which extension field is set,
rather than checking whether the extension descriptor semantically
describes the field that they are interested in.

To preserve this behavior in v1, we need a 1:1 mapping between
a v2 ExtensionType and a specific v1 ExtensionDesc.
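
The difference between the two kinds of check can be sketched as follows. This is a minimal illustration, not the real API: ExtDesc, samePointer, and sameField are hypothetical stand-ins for the v1 ExtensionDesc and for caller code.

```go
package main

import "fmt"

// ExtDesc is a hypothetical, pared-down stand-in for the v1 ExtensionDesc.
type ExtDesc struct {
	ExtendedType string // full name of the message being extended
	Field        int32  // extension field number
	Name         string
}

// samePointer is the identity check that much existing code relies on.
func samePointer(a, b *ExtDesc) bool { return a == b }

// sameField is the semantic check: same extended message and field number.
func sameField(a, b *ExtDesc) bool {
	return a.ExtendedType == b.ExtendedType && a.Field == b.Field
}

func main() {
	generated := &ExtDesc{ExtendedType: "example.Base", Field: 100, Name: "example.ext"}
	rebuilt := &ExtDesc{ExtendedType: "example.Base", Field: 100, Name: "example.ext"}

	fmt.Println(samePointer(generated, rebuilt)) // false: distinct allocations
	fmt.Println(sameField(generated, rebuilt))   // true: same extension field
}
```

Code written in the samePointer style only keeps working if every lookup hands back the same descriptor value, which is why a 1:1 mapping is needed.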

Change-Id: I852b3cefb4585bd656e48e5adad6cc28795d02df
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/167759
Reviewed-by: Damien Neil <dneil@google.com>
2019-03-15 01:24:42 +00:00
Joe Tsai
990b9f5919 internal/prototype: move from reflect/prototype
The prototype package was initially used by generated reflection support,
but has now been replaced by internal/fileinit.
Eventually, this functionality should be deleted and re-written in terms
of other components in the repo.

Usages that prototype currently provides (but should be moved) are:
* Constructing standalone messages and enums, which is behavior we should
provide in reflect/protodesc. The google.protobuf.{Enum,Type} are well-known
proto messages designed for this purpose.
* Constructing placeholder files, enums, and messages.
* Constructing protoreflect.{Message,Enum,Extension}Types, which are protobuf
descriptors with associated Go type information.

Change-Id: Id7dbefff952682781b439aa555508c59b2629f9e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/167383
Reviewed-by: Damien Neil <dneil@google.com>
2019-03-13 20:17:00 +00:00
Joe Tsai
4989810018 cmd/protoc-gen-go: always generate support for reflection
Now that binary bloat concerns have been addressed, remove the flag
to control whether to generate support for reflection.

Change-Id: Ia0d11183707572caaf91d2f01dfa77e3aac0a417
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/167140
Reviewed-by: Damien Neil <dneil@google.com>
2019-03-12 23:04:05 +00:00
Damien Neil
0fc224513b cmd/protoc-gen-go: enforce init order within packages
Ensure that the init funcs for files within a Go package run in the
dependency order of the source .proto files. That is, if a.proto and b.proto
are part of the same Go package, and a.proto imports b.proto, then b.pb.go's
init funcs must run before a.pb.go's.
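
One way to derive such an order is a depth-first walk over the import graph. The sketch below is only an illustration under that assumption; initOrder and its map-based input are hypothetical and are not taken from the generator.

```go
package main

import (
	"fmt"
	"sort"
)

// initOrder returns the files so that every file appears after all of the
// files it imports. imports maps a file name to the files it imports.
// It assumes the import graph is acyclic.
func initOrder(imports map[string][]string) []string {
	names := make([]string, 0, len(imports))
	for name := range imports {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic output

	var order []string
	done := make(map[string]bool)
	var visit func(name string)
	visit = func(name string) {
		if done[name] {
			return
		}
		done[name] = true
		for _, dep := range imports[name] {
			visit(dep) // dependencies are emitted first
		}
		order = append(order, name)
	}
	for _, name := range names {
		visit(name)
	}
	return order
}

func main() {
	order := initOrder(map[string][]string{
		"a.proto": {"b.proto"},
		"b.proto": nil,
	})
	fmt.Println(order) // [b.proto a.proto]
}
```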

Change-Id: I0e86ff22e5c4cab9df7a73fe4805390fadd34b0d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/166419
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Herbie Ong <herbie@google.com>
2019-03-10 22:19:14 +00:00
Damien Neil
6bb8dec7f6 cmd/protoc-gen-go: change some arrays to slices to save bytes
Using arrays in the generated reflection information causes unnecessary
eq and hash functions to be added to the package. Change to slices
to reduce bloat.
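
The underlying language rule is that Go arrays of comparable elements are themselves comparable, while slices are not. A small sketch, with fieldInfo, fieldArray, and fieldSlice as hypothetical stand-ins for the generated data:

```go
package main

import "fmt"

// fieldInfo is a hypothetical record standing in for generated reflection data.
type fieldInfo struct {
	Name   string
	Number int32
}

// As an array, the type is comparable: == is defined and it can be a map key,
// so the toolchain may need equality and hash helpers for it.
type fieldArray [2]fieldInfo

// As a slice, the type is not comparable (except against nil),
// so no such helpers are needed.
type fieldSlice []fieldInfo

func main() {
	a := fieldArray{{"x", 1}, {"y", 2}}
	b := fieldArray{{"x", 1}, {"y", 2}}
	fmt.Println(a == b) // true: element-wise comparison

	s := fieldSlice{{"x", 1}, {"y", 2}}
	fmt.Println(len(s), s != nil) // 2 true
	// s == s would not compile: slices can only be compared to nil.
}
```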

Change-Id: I1a4f6d59021644d93dd6c24679b9233141e89a75
Reviewed-on: https://go-review.googlesource.com/c/164640
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-03-01 23:53:03 +00:00
Damien Neil
8012b444ee internal/fileinit: generate reflect data structures from raw descriptors
This CL takes a significantly different approach to generating support
for protobuf reflection. The previous approach involved generating a
large number of Go literals to represent the reflection information.
While that approach was correct, it resulted in too much binary bloat.

The approach taken here initializes the reflection information from
the raw descriptor proto, which is a relatively dense representation
of the protobuf reflection information. In order to keep initialization
cost low, several measures were taken:
* At program init, the bare minimum is parsed in order to initialize
naming information for enums, messages, extensions, and services declared
in the file. This is done because those top-level declarations are often
relevant for registration.
* Only upon first use are most of the other data structures for protobuf
reflection actually initialized.
* Instead of using proto.Unmarshal, a hand-written unmarshaler is used.
This allows us to avoid a dependency on the descriptor proto. It is also
preferable because the API for the descriptor proto is fundamentally
non-performant, since it requires an allocation for every primitive field.
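
The split between eager and deferred work can be sketched with sync.Once. This is an assumption about the general pattern, not the contents of internal/fileinit; fileDesc, lazyInit, and Fields are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// fileDesc is a hypothetical descriptor that is cheap to create at program
// init and only does the expensive work on first use.
type fileDesc struct {
	raw  []byte // raw serialized descriptor (placeholder bytes here)
	name string // eagerly available, e.g. for registration

	once   sync.Once
	fields []string // lazily populated
	parses int      // counts how often the expensive step ran
}

// lazyInit runs the expensive parse at most once, even under concurrent use.
func (fd *fileDesc) lazyInit() {
	fd.once.Do(func() {
		fd.parses++
		// Stand-in for fully decoding fd.raw.
		fd.fields = []string{"id", "name"}
	})
}

// Fields triggers full initialization on the first call.
func (fd *fileDesc) Fields() []string {
	fd.lazyInit()
	return fd.fields
}

func main() {
	fd := &fileDesc{raw: []byte{0x0a}, name: "example.proto"}
	fmt.Println(fd.name, fd.parses) // example.proto 0
	fd.Fields()
	fd.Fields()
	fmt.Println(fd.parses) // 1
}
```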

At a high level, the new implementation lives in internal/fileinit.

Several changes were made to other parts of the repository:
* cmd/protoc-gen-go:
  * Stop compressing the raw descriptors. While compression does reduce
the size of the descriptors by approximately 2x, it is a premature
optimization since the descriptors themselves are around 1% of the total
binary bloat that is due to generated protobufs.
  * Seeding protobuf reflection from the raw descriptor significantly
simplifies the generator implementation since it is no longer responsible
for constructing a tree of Go literals to represent the same information.
  * We remove the generation of the shadow types and instead call
protoimpl.MessageType.MessageOf. Unfortunately, this incurs an allocation
for every call to ProtoReflect since we need to allocate a tuple that wraps
a pointer to the message value, and a pointer to message type.
* internal/impl:
  * We add a MessageType.GoType field and make it required that it is
set prior to first use. This is done so that we can avoid calling
MessageType.init except for when it is actually needed. This allows code
to call (*FooMessage)(nil).ProtoReflect().Type() without fearing that the
init code will run, possibly triggering a recursive deadlock (where the
init code depends on getting the Type of some dependency which may be
declared within the same file).
* internal/cmd/generate-types:
  * The code to generate reflect/prototype/protofile_list_gen.go was copied
and altered to generate internal/fileinit/desc_list_gen.go.

At a high level, this CL adds significant technical complexity.
However, this is offset by several possible future changes:
* The prototype package can be drastically simplified. We can probably
reimplement internal/legacy to use internal/fileinit instead, allowing us
to drop another dependency on the prototype package. As a result, we can
probably delete most of the constructor types in that package.
* With the prototype package significantly pruned, and the fact that generated
code no longer depends on that package, we can consider merging
what's left of prototype into protodesc.

Change-Id: I6090f023f2e1b6afaf62bd3ae883566242e30715
Reviewed-on: https://go-review.googlesource.com/c/158539
Reviewed-by: Herbie Ong <herbie@google.com>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-01-30 01:33:46 +00:00
Joe Tsai
3bc7d6f5cd reflect: switch MessageType.New to return Message
Most usages of New actually prefer to interact with the reflective view
rather than the native Go type. Thus, change New to return that instead.
This parallels reflect.New, which returns the reflective view
(i.e., reflect.Value) instead of native type (i.e., interface{}).
We make the equivalent change to KnownFields.NewMessage, List.NewMessage,
and Map.NewMessage for consistency.
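
For reference, the standard library behavior being paralleled looks like this. Foo and newViaReflect are hypothetical names used only for the illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// Foo is a hypothetical native Go type.
type Foo struct{ ID int32 }

// newViaReflect creates a Foo through reflection, sets its ID through the
// reflective view, and then steps back to the native type.
func newViaReflect(id int64) *Foo {
	v := reflect.New(reflect.TypeOf(Foo{})) // reflect.Value holding a *Foo
	v.Elem().FieldByName("ID").SetInt(id)   // operate on the reflective view
	return v.Interface().(*Foo)             // explicit conversion to the native type
}

func main() {
	v := reflect.New(reflect.TypeOf(Foo{}))
	fmt.Println(v.Kind())              // ptr
	fmt.Println(newViaReflect(42).ID) // 42
}
```

reflect.New hands back the reflective view, and callers that want the native value take one extra explicit step, which mirrors the shape of the change described here.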

Since this is a subtle change where the type system will not always
catch the changed type, this change was made by both changing the type
and renaming the function to NewXXX and manually looking at every usage
of the function to ensure that the usage correctly operates
on either the native Go type or the reflective view of the type.
After the entire codebase was cleaned up, a rename was performed to convert
NewXXX back to New.

Change-Id: I153fef627b4bf0a427e4039ce0aaec52e20c7950
Reviewed-on: https://go-review.googlesource.com/c/157077
Reviewed-by: Damien Neil <dneil@google.com>
2019-01-09 20:29:29 +00:00
Damien Neil
a8593bae57 reflect/protoreflect: drop the ProtoEnum type
Drop the protoreflect.ProtoEnum type (containing a single method
returning a protoreflect.Enum) and make generated enum types
directly implement protoreflect.Enum instead.

Messages have a two-level type split (ProtoMessage and Message) to
minimize conflicts between reflection methods and field names. Enums
need no such split, since enums do not have fields and therefore have
no source of conflicts.
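
A rough sketch of why the split matters for messages but not for enums. The interfaces below are hypothetical, heavily reduced versions of the real ones, and Shape and Color are invented examples.

```go
package main

import "fmt"

// Enum is a hypothetical, reduced reflective interface for enums.
type Enum interface{ Number() int32 }

// Message is a hypothetical, reduced reflective interface for messages.
type Message interface{ Type() string }

// ProtoMessage is the first level of the split: the generated struct only
// carries this one method.
type ProtoMessage interface{ ProtoReflect() Message }

// Color has no fields, so it can implement Enum directly.
type Color int32

func (c Color) Number() int32 { return int32(c) }

// Shape has a field named Type. Go forbids a field and a method with the
// same name on one type, so Type() must live on a separate type.
type Shape struct{ Type string }

type shapeReflect struct{ m *Shape }

func (r shapeReflect) Type() string { return "example.Shape" }

func (m *Shape) ProtoReflect() Message { return shapeReflect{m} }

func main() {
	var e Enum = Color(2)
	var m ProtoMessage = &Shape{Type: "circle"}
	fmt.Println(e.Number(), m.ProtoReflect().Type()) // 2 example.Shape
}
```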

Change-Id: I2b6222e9404253e6bfef2217859e1b760ffcd29b
Reviewed-on: https://go-review.googlesource.com/c/156902
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
2019-01-09 00:40:35 +00:00
Damien Neil
232ea15589 reflect/prototype: hoist semantic options into builders
Add fields to the Message and Field builder structs which hold the value
of MessageOptions.map_entry, FieldOptions.packed, and FieldOptions.weak
options. Remove all access to the contents of options messages from the
prototype package.

Change IsPacked to always return false for unpackable field types,
which is consistent with the equivalent C++ API.
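
The rule can be sketched as below. Field, Kind, and packable are hypothetical names and the kind list is abbreviated; the only protobuf fact assumed is that packing applies to repeated numeric scalar fields and never to string, bytes, message, or group fields.

```go
package main

import "fmt"

// Kind is a hypothetical, abbreviated list of field kinds.
type Kind int

const (
	Int32Kind Kind = iota
	BoolKind
	EnumKind
	DoubleKind
	StringKind
	BytesKind
	MessageKind
	GroupKind
)

// packable reports whether a repeated field of this kind may use the
// packed encoding. Length-delimited and group kinds may not.
func packable(k Kind) bool {
	switch k {
	case StringKind, BytesKind, MessageKind, GroupKind:
		return false
	}
	return true
}

// Field is a hypothetical builder-style struct holding the hoisted option.
type Field struct {
	Kind         Kind
	Repeated     bool
	PackedOption bool // value of FieldOptions.packed
}

// IsPacked is always false for unpackable kinds, whatever the option says.
func (f Field) IsPacked() bool {
	return f.Repeated && f.PackedOption && packable(f.Kind)
}

func main() {
	ints := Field{Kind: Int32Kind, Repeated: true, PackedOption: true}
	strs := Field{Kind: StringKind, Repeated: true, PackedOption: true}
	fmt.Println(ints.IsPacked(), strs.IsPacked()) // true false
}
```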

This change helps avoid dependency cycles between prototype and the
options messages. (Previously this was resolved by accessing options
with reflection, but just breaking the dependency from prototype to the
options message is cleaner and simpler.)

Change-Id: I756aefe2e04cfa8fea31eaaaa0b5a99d4ac9e851
Reviewed-on: https://go-review.googlesource.com/c/153517
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2018-12-11 20:25:45 +00:00
Damien Neil
97e7f57dbb reflect/protoreflect: replace Mutable with NewMessage
Remove the Mutable methods from KnownFields, List, and Map, replacing
them with methods which return a new, empty message value without adding
that value to the collection.

The new API is simpler, since it clearly applies only to message values,
and more orthogonal, since it provides a way to create a value without
mutating the collection. This latter point is particularly useful in
map deserialization, where the key may be unknown at the time the value
is deserialized.
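
The map case can be sketched like this: the value is created first and only stored once the key is known. Map, Message, entryField, and decodeEntry are hypothetical and much simpler than the real interfaces.

```go
package main

import "fmt"

// Message is a hypothetical message value with a single text field.
type Message struct{ Text string }

// Map is a hypothetical map of string keys to message values.
type Map struct{ entries map[string]*Message }

// NewMessage returns a new, empty message without adding it to the map.
func (m *Map) NewMessage() *Message { return &Message{} }

// Set stores v under key k.
func (m *Map) Set(k string, v *Message) {
	if m.entries == nil {
		m.entries = make(map[string]*Message)
	}
	m.entries[k] = v
}

// entryField is one decoded field of a map entry, in wire order.
type entryField struct {
	isKey bool
	data  string
}

// decodeEntry handles entries whose value may arrive before the key.
func decodeEntry(m *Map, fields []entryField) {
	var key string
	val := m.NewMessage() // created before the key is known
	for _, f := range fields {
		if f.isKey {
			key = f.data
		} else {
			val.Text = f.data
		}
	}
	m.Set(key, val) // the map is mutated only once, at the end
}

func main() {
	var m Map
	decodeEntry(&m, []entryField{{isKey: false, data: "hello"}, {isKey: true, data: "k1"}})
	fmt.Println(len(m.entries), m.entries["k1"].Text) // 1 hello
}
```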

Drop the Mutable interface, since it is no longer necessary.

Change-Id: Ic5f3d06a2aa331a5d5cd2b4e670a3dba4a74f77c
Reviewed-on: https://go-review.googlesource.com/c/153278
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2018-12-10 21:17:16 +00:00
Joe Tsai
bce82b8e0d reflect/protoreflect: add HasJSONName, ReservedRanges, and ReservedNames
These properties of descriptors are currently missing, which makes it impossible
to convert a FileDescriptorProto into one of the structured Go representations
and convert it back to a proto message without loss of information.

Furthermore, ReservedRanges and ReservedNames have semantic importance
to text serialization.

Change-Id: Ic33c30020ad51912b143156b95f47a4fb8da3503
Reviewed-on: https://go-review.googlesource.com/c/153019
Reviewed-by: Damien Neil <dneil@google.com>
2018-12-07 20:10:15 +00:00
Joe Tsai
a4cbffe4bc reflect/prototype: add registration hooks for options
Add pseudo-hidden functions to register the concrete Go type used for
the optional types. Also, augment protoc-gen-go to specially generate the
descriptor proto to register these types.

This change does not add validation logic yet to ensure that the correct option
types are passed to the API.

Change-Id: I5decc897e14b4bf570a61cf17b57a066a2a0f9d7
Reviewed-on: https://go-review.googlesource.com/c/153017
Reviewed-by: Damien Neil <dneil@google.com>
2018-12-07 19:38:57 +00:00
Joe Tsai
9667c4816d cmd/protoc-gen-go: reduce technical debt
The following TODOs were addressed:
* Consistently collect all enums, messages, and extensions in a breadth-first order.
The practical effect of this is that the declaration order in a Go file may change.
This simplifies reflection generation, which relies on consistent ordering.
* Removal of placeholder declarations (e.g., "var _ = proto.Marshal") since
protogen is intelligent about including imports as necessary.
* Always generate a default variable or constant for explicit empty strings.
The practical effect of this is the addition of new declarations in some cases.
However, it simplifies our logic such that it matches the protobuf data model.
* Generate the registration statements in a consistent order.
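
A breadth-first collection of nested declarations might look like the sketch below. This is an assumption about what "breadth-first" means here; Message and collectBreadthFirst are hypothetical and not taken from the generator.

```go
package main

import "fmt"

// Message is a hypothetical declaration with nested declarations.
type Message struct {
	Name   string
	Nested []*Message
}

// collectBreadthFirst lists all top-level messages first, then their direct
// children, then grandchildren, and so on.
func collectBreadthFirst(top []*Message) []*Message {
	queue := append([]*Message(nil), top...)
	for i := 0; i < len(queue); i++ {
		queue = append(queue, queue[i].Nested...)
	}
	return queue
}

// names extracts the names in order, for printing.
func names(ms []*Message) []string {
	out := make([]string, len(ms))
	for i, m := range ms {
		out[i] = m.Name
	}
	return out
}

func main() {
	a := &Message{Name: "A", Nested: []*Message{
		{Name: "A.X", Nested: []*Message{{Name: "A.X.P"}}},
		{Name: "A.Y"},
	}}
	b := &Message{Name: "B", Nested: []*Message{{Name: "B.Z"}}}

	fmt.Println(names(collectBreadthFirst([]*Message{a, b})))
	// [A B A.X A.Y B.Z A.X.P]
}
```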

Change-Id: I627bb72589432bb65d53b50965ea88e5f7983977
Reviewed-on: https://go-review.googlesource.com/c/152778
Reviewed-by: Damien Neil <dneil@google.com>
2018-12-07 03:13:48 +00:00
Joe Tsai
24ceb2b095 cmd/protoc-gen-go: generate descriptor and plugin with reflection
In CL/152020, we checked in pre-generated versions of descriptor and plugin.
This CL makes it such that they are generated by protoc-gen-go.

We modify protoc-gen-go to avoid reflection support by default
since the binary size increase is still an issue to investigate.
Reflection support is temporarily enabled by setting a special
PROTOC_GEN_GO_ENABLE_REFLECT environment variable.

Reflection support is always enabled for descriptor and plugin.
Furthermore, we change descriptor to depend on the protoapi package
instead of the proto package. The reason we do not switch to protoapi
for all generated protos is because we still depend on v1 proto
for the table-driven InternalMessageInfo type. Dropping it from descriptor
is semantically correct, but does incur slight performance cost.
It does not seem appropriate to drop it for all generated messages.

We could move InternalMessageInfo to protoapi, but the logic behind that
is significant.

Change-Id: I5c3fff7f6eab1a5a2399049d42fa6bf42d4c93f9
Reviewed-on: https://go-review.googlesource.com/c/152547
Reviewed-by: Damien Neil <dneil@google.com>
2018-12-06 17:47:27 +00:00
Joe Tsai
b6405bd782 cmd/protoc-gen-go: add support for protobuf reflection
Implement support in protoc-gen-go for generating messages and enums
that satisfy the v2 protobuf reflection interfaces. Specifically, the following
are added:
* top-level variable representing the file descriptor
* ProtoReflect method on enums (to implement protoV2.Enum)
* ProtoReflect method on messages (to implement protoV2.Message)

The following are not supported yet:
* resolving transitive dependencies for file imports
* Extension descriptors
* Service descriptors

The implementation approach creates a single array for all the message and enum
declarations and references sections of that array to complete dependencies.
Since protobuf declarations can form a graph (a message may depend on itself),
it is difficult to construct a graph as a single literal. One way is to use
placeholder descriptors, but that is not efficient as it requires encoding
the full name of each dependent enum and message and then later resolving it;
thus, both expanding the binary size and also increasing initialization cost.
Instead, we add a prototype.{Enum,Message}.Reference method to obtain a
descriptor reference for the purpose of satisfying dependencies.
As such, nested declarations and dependencies are populated in an init function.
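
The general shape of that approach can be sketched as follows. MessageDesc and FieldDesc are hypothetical types, and plain pointers into the array stand in for the Reference method; this is not the generated code itself.

```go
package main

import "fmt"

// MessageDesc and FieldDesc are hypothetical, reduced descriptor types.
type MessageDesc struct {
	Name   string
	Fields []FieldDesc
}

type FieldDesc struct {
	Name    string
	Message *MessageDesc // dependency on another (or the same) message
}

// All declarations live in one array. Writing &messages[0] inside this
// literal would be rejected by the compiler as an initialization cycle.
var messages = [2]MessageDesc{
	{Name: "Node"},
	{Name: "Tree"},
}

// Dependencies are filled in afterwards, once the array exists.
func init() {
	messages[0].Fields = []FieldDesc{{Name: "next", Message: &messages[0]}} // self-reference
	messages[1].Fields = []FieldDesc{{Name: "root", Message: &messages[0]}}
}

func main() {
	next := messages[0].Fields[0].Message
	fmt.Println(next.Name, next == &messages[0]) // Node true
}
```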

Other changes to support the implementation:
* Added a protoimpl package to expose the MessageType type and also the
MessageTypeOf and EnumTypeOf helper functions.
* Added a protogen.File.GoIdent field to provide a suggested variable name
for the file descriptor.
* Added prototype.{Enum,Message}.Reference that provides a descriptor reference
for the purpose of satisfying cyclic dependencies.
* Added protoreflect.{Syntax,Cardinality,Kind}.GoString to obtain a Go source
identifier that represents the given constant.

Change-Id: I9455764882dee6ad10f251901e7d419091e8bf1d
Reviewed-on: https://go-review.googlesource.com/c/150074
Reviewed-by: Damien Neil <dneil@google.com>
2018-12-05 19:32:12 +00:00