Commit Graph

13 Commits

Author SHA1 Message Date
Joe Tsai
5d72cc2d37 cmd/protoc-gen-go: lazily GZIP-encode the raw descriptor
This reduces the init-time cost slightly since the GZIP'd
raw descriptor is constructed lazily on demand.

Change-Id: I482c6a2201b8786e425d7dee5612fdfd60ab1500
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/169917
Reviewed-by: Herbie Ong <herbie@google.com>
2019-04-01 20:24:54 +00:00
Joe Tsai
35ec98fdcb cmd/protoc-gen-go: generate for v2-only dependencies
This removes yet another set of dependencies of v2 on v1.
The only remaining dependencies are in the _test.go files,
primarily for proto.Equal.

Changes made:
* cmd/protoc-gen-go no longer generates any functionality that depends
on the v1 package, and instead only depends on v2.
* internal/fileinit.FileBuilder.MessageOutputTypes is switched from
protoreflect.MessageType to protoimpl.MessageType since the
implementation must be fully initialized before registration occurs.
* The test for internal/legacy/file_test.go is switched to a legacy_test
package to avoid a cyclic dependency.
This requires Load{Enum,Message,File}Desc to be exported.

Change-Id: I43e2fe64cff4eea204258ce11e791aca5eb6e569
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/169298
Reviewed-by: Damien Neil <dneil@google.com>
2019-03-26 17:03:31 +00:00
Joe Tsai
559d47f1da cmd/protoc-gen-go: fix init order for v1 registration
The v1 registration leaks the message types out to the proto package.
When doing that, it must ensure that the reflection data structures
for those types are properly initialized first. We achieve that by
doing v1 registration at the end of the reflection init function.

Change-Id: If6df18df59d05bad50ff39c2eff6beb19e7466cc
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/168348
Reviewed-by: Damien Neil <dneil@google.com>
2019-03-20 04:16:33 +00:00
Joe Tsai
8e506a8704 cmd/protoc-gen-go: rely on protoimpl for basic helpers
The EnumName, UnmarshalJSONEnum, and CompressGZIP helpers currently live
in v1 protoapi, which would cause all generated messages to depend on v1.
In an effort to break the dependency of v2 on v1, we move these helper
functions to v2 (rewritten to take advantage of protobuf reflection).

These helpers are unfortunate, but we cannot eliminate the functionality
that they implement since they are exposed in the publicly generated API.

Since EnumName does not rely on the enum maps, it removes another dependency
on those variables. Eventually, we can get to the point where these variables
(though declared) are not linked into the binary if the user does not use them.

Also, we rely on the v1 proto package for registration instead of v1 protoapi.
This may re-introduce a cyclic dependency on descriptor proto again in the
future, but the better approach is to just start registering with v2.

Change-Id: Id755585a7a1df14e4a6a2dfa650df221a3c153fb
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/167921
Reviewed-by: Damien Neil <dneil@google.com>
2019-03-18 18:50:16 +00:00
Damien Neil
0fc224513b cmd/protoc-gen-go: enforce init order within packages
Ensure that the init funcs for files within a Go package run in the
dependency order of the source .proto files. That is, if a.proto and b.proto
are part of the same Go package, and a.proto imports b.proto, then b.pb.go's
init funcs must run before a.pb.go's.

Change-Id: I0e86ff22e5c4cab9df7a73fe4805390fadd34b0d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/166419
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Herbie Ong <herbie@google.com>
2019-03-10 22:19:14 +00:00
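
One way to get the ordering described in 0fc224513b is for each file's init function to call the init functions of its dependencies first, guarded so each runs once. The sketch below assumes that approach; the CL may enforce the order differently, and all names are hypothetical.

```go
package main

import "fmt"

// initOrder records the order in which the hypothetical per-file init
// functions actually did their work.
var initOrder []string

var aDone, bDone bool

// fileAInit stands in for the init of a.pb.go, where a.proto imports b.proto.
func fileAInit() {
	if aDone {
		return
	}
	aDone = true
	fileBInit() // initialize the dependency before this file
	initOrder = append(initOrder, "a.proto")
}

// fileBInit stands in for the init of b.pb.go.
func fileBInit() {
	if bDone {
		return
	}
	bDone = true
	initOrder = append(initOrder, "b.proto")
}

// Go may run these init funcs in either file order; the guards and the
// explicit call in fileAInit make the result independent of that order.
func init() { fileAInit() }
func init() { fileBInit() }

func main() {
	fmt.Println(initOrder)
}
```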
Damien Neil
6bb8dec7f6 cmd/protoc-gen-go: change some arrays to slices to save bytes
Using arrays in the generated reflection information causes unnecessary
eq and hash functions to be added to the package. Change to slices
to reduce bloat.

Change-Id: I1a4f6d59021644d93dd6c24679b9233141e89a75
Reviewed-on: https://go-review.googlesource.com/c/164640
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-03-01 23:53:03 +00:00
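
The difference behind 6bb8dec7f6 is that Go array types are comparable, so they need equality and hashing support, while slice types are not. A small sketch, with hypothetical type names:

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical stand-ins for a table in the generated reflection data.
type depIndexesArray [3]int32
type depIndexesSlice []int32

// isComparable reports whether values of v's type support ==.
func isComparable(v interface{}) bool {
	return reflect.TypeOf(v).Comparable()
}

func main() {
	fmt.Println(isComparable(depIndexesArray{})) // array types support ==
	fmt.Println(isComparable(depIndexesSlice{})) // slice types do not
}
```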
Joe Tsai
4069211bcd protogen: use full path for generated file variable name
Use the full path (including the extension) for the generation of
the per-file variable name. Several reasons for this:

* The current logic is buggy in the case where pathType == pathTypeImport
since the prefix variable will be mangled with the Go import path.

* The extension is technically part of the path.
Thus, "path/to/foo.proto" and "path/to/foo.protodevel" are two
distinctly different imports.

* Style-wise, it subjectively looks better. Rather than being a mixture
of camelCase and snake_case, it is all snake_case for the common case:
	before: ProtoFile_google_protobuf_any
	after:  File_google_protobuf_any_proto

* Since the extension is almost always ".proto", this results in a
suffix of "_proto", which provides an additional layer of protection
against possible name conflicts. The previous approach could possibly
have a conflict between "Foo.proto" and a message named ProtoFile
with a sub-message called Foo.

Also, use the per-file variable name for the raw descriptor variables
instead of the hashed version.

Change-Id: Ic91e326b7593e5985cee6ececc60539c27fe32fe
Reviewed-on: https://go-review.googlesource.com/c/164379
Reviewed-by: Damien Neil <dneil@google.com>
2019-03-01 00:13:31 +00:00
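
The naming rule from 4069211bcd can be sketched as follows. The function fileVarName is hypothetical, and the assumption that every character other than an ASCII letter or digit maps to an underscore is inferred only from the before/after example in the message.

```go
package main

import (
	"fmt"
	"strings"
)

// fileVarName derives a per-file variable name from the full .proto path,
// including the extension. Characters other than ASCII letters and digits
// are replaced with '_' (an assumption, see above).
func fileVarName(path string) string {
	var b strings.Builder
	b.WriteString("File_")
	for _, r := range path {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
			b.WriteRune(r)
		default:
			b.WriteByte('_')
		}
	}
	return b.String()
}

func main() {
	fmt.Println(fileVarName("google/protobuf/any.proto"))
	fmt.Println(fileVarName("path/to/foo.protodevel"))
}
```

Because the extension is kept, "path/to/foo.proto" and "path/to/foo.protodevel" yield different variable names.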
Joe Tsai
cf81e67b53 cmd/protoc-gen-go: use protoapi.CompressGZIP
Calling a helper function directly should reduce binary bloat slightly.

Change-Id: I6068dc4cd00c8d90d2e6e6d99633b81388bc8781
Reviewed-on: https://go-review.googlesource.com/c/164679
Reviewed-by: Damien Neil <dneil@google.com>
2019-02-28 22:28:32 +00:00
Damien Neil
71c6603a26 protogen: use full filename in per-file vars
For a file "foo/bar.proto", put the FileDescriptor in "ProtoFile_foo_bar"
rather than "Bar_fileDescriptor".

Avoid name clashes when a package contains "a/foo.proto" and "b/foo.proto".

Don't camelcase the filename: These vars weren't fully camelcased to begin
with, and leaving the filename relatively unchanged is clearer and more
predictable.

Move "ProtoFile" from the end of the var name to the start, so that vars
will sort better in packages with multiple descriptors.

These changes do add a chance of name collision when the input filename
begins with an uppercase letter: Foo.proto becomes "ProtoFile_Foo", which
could be the result of camelcasing "proto_file.foo". The readability
benefits seem worth it.

Change-Id: If27d3a0d7b5bf3535aa1607a8579eb057c74d2dc
Reviewed-on: https://go-review.googlesource.com/c/163199
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Herbie Ong <herbie@google.com>
2019-02-21 22:37:58 +00:00
Damien Neil
8012b444ee internal/fileinit: generate reflect data structures from raw descriptors
This CL takes a significantly different approach to generating support
for protobuf reflection. The previous approach involved generating a
large number of Go literals to represent the reflection information.
While that approach was correct, it resulted in too much binary bloat.

The approach taken here initializes the reflection information from
the raw descriptor proto, which is a relatively dense representation
of the protobuf reflection information. In order to keep initialization
cost low, several measures were taken:
* At program init, the bare minimum is parsed in order to initialize
naming information for enums, messages, extensions, and services declared
in the file. This is done because those top-level declarations are often
relevant for registration.
* Only upon first use are most of the other data structures for protobuf
reflection actually initialized.
* Instead of using proto.Unmarshal, a hand-written unmarshaler is used.
This avoids a dependency on the descriptor proto and also avoids its API,
which is fundamentally non-performant since it requires an allocation
for every primitive field.

At a high-level, the new implementation lives in internal/fileinit.

Several changes were made to other parts of the repository:
* cmd/protoc-gen-go:
  * Stop compressing the raw descriptors. While compression does reduce
the size of the descriptors by approximately 2x, it is a premature
optimization since the descriptors themselves are around 1% of the total
binary bloat that is due to generated protobufs.
  * Seeding protobuf reflection from the raw descriptor significantly
simplifies the generator implementation since it is no longer responsible
for constructing a tree of Go literals to represent the same information.
  * We remove the generation of the shadow types and instead call
protoimpl.MessageType.MessageOf. Unfortunately, this incurs an allocation
for every call to ProtoReflect since we need to allocate a tuple that wraps
a pointer to the message value, and a pointer to message type.
* internal/impl:
  * We add a MessageType.GoType field and make it required that it is
set prior to first use. This is done so that we can avoid calling
MessageType.init except for when it is actually needed. This allows code
to call (*FooMessage)(nil).ProtoReflect().Type() without fearing that the
init code will run, possibly triggering a recursive deadlock (where the
init code depends on getting the Type of some dependency which may be
declared within the same file).
* internal/cmd/generate-types:
  * The code to generate reflect/prototype/protofile_list_gen.go was copied
and altered to generate internal/fileinit.desc_list_gen.go.

At a high-level this CL adds significant technical complexity.
However, this is offset by several possible future changes:
* The prototype package can be drastically simplified. We can probably
reimplement internal/legacy to use internal/fileinit instead, allowing us
to drop another dependency on the prototype package. As a result, we can
probably delete most of the constructor types in that package.
* With the prototype package significantly pruned, and the fact that generated
code no longer depends on that package, we can consider merging
what's left of prototype into protodesc.

Change-Id: I6090f023f2e1b6afaf62bd3ae883566242e30715
Reviewed-on: https://go-review.googlesource.com/c/158539
Reviewed-by: Herbie Ong <herbie@google.com>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-01-30 01:33:46 +00:00
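
To illustrate the hand-written unmarshaler idea from 8012b444ee, here is a minimal sketch that scans a serialized FileDescriptorProto and extracts only the top-level message names, without proto.Unmarshal. It is not the internal/fileinit implementation. The function names are hypothetical; the field numbers (message_type = 4 in FileDescriptorProto, name = 1 in DescriptorProto) come from descriptor.proto.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errTruncated = errors.New("truncated descriptor")

// messageNames returns the names of the top-level messages declared in a
// serialized FileDescriptorProto, skipping everything else.
func messageNames(raw []byte) ([]string, error) {
	var names []string
	err := scanFields(raw, func(num uint64, val []byte) error {
		if num != 4 { // FileDescriptorProto.message_type
			return nil
		}
		return scanFields(val, func(num uint64, val []byte) error {
			if num == 1 { // DescriptorProto.name
				names = append(names, string(val))
			}
			return nil
		})
	})
	return names, err
}

// scanFields walks protobuf wire-format fields in b, calling visit for each
// length-delimited field and skipping varint and fixed-width fields.
func scanFields(b []byte, visit func(num uint64, val []byte) error) error {
	for len(b) > 0 {
		tag, n := binary.Uvarint(b)
		if n <= 0 {
			return errTruncated
		}
		b = b[n:]
		switch tag & 7 {
		case 0: // varint
			_, n := binary.Uvarint(b)
			if n <= 0 {
				return errTruncated
			}
			b = b[n:]
		case 1: // fixed64
			if len(b) < 8 {
				return errTruncated
			}
			b = b[8:]
		case 2: // length-delimited
			size, n := binary.Uvarint(b)
			if n <= 0 || size > uint64(len(b)-n) {
				return errTruncated
			}
			end := n + int(size)
			if err := visit(tag>>3, b[n:end]); err != nil {
				return err
			}
			b = b[end:]
		case 5: // fixed32
			if len(b) < 4 {
				return errTruncated
			}
			b = b[4:]
		default:
			return errors.New("unsupported wire type")
		}
	}
	return nil
}

func main() {
	// Hand-encoded descriptor: name "a.proto", messages "Foo" and "Bar".
	raw := []byte{
		0x0a, 0x07, 'a', '.', 'p', 'r', 'o', 't', 'o',
		0x22, 0x05, 0x0a, 0x03, 'F', 'o', 'o',
		0x22, 0x05, 0x0a, 0x03, 'B', 'a', 'r',
	}
	fmt.Println(messageNames(raw))
}
```

Scanning this way copies only the name strings it keeps, in contrast to the per-field allocations the message attributes to the descriptor proto API.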
Joe Tsai
5681bb2587 protogen: use _protoFile suffix for file descriptor variable
A "_ProtoFile" suffix can potentially conflict with a sub-message named
"ProtoFile" nested within a message that matches the camel-cased
form of the basename of the .proto source file.

Avoid unlikely conflicts and rename this to use a "_protoFile" suffix,
which can never conflict except with an enum value that is also named
"protoFile" (which is a violation of the style guide).

Change-Id: Ie9d22f9f741a63021b8f76906b20c6c2f599885b
Reviewed-on: https://go-review.googlesource.com/c/157218
Reviewed-by: Damien Neil <dneil@google.com>
2019-01-14 20:23:59 +00:00
Joe Tsai
3bc7d6f5cd reflect: switch MessageType.New to return Message
Most usages of New actually prefer to interact with the reflective view
rather than the native Go type. Thus, change New to return that instead.
This parallels reflect.New, which returns the reflective view
(i.e., reflect.Value) instead of native type (i.e., interface{}).
We make the equivalent change to KnownFields.NewMessage, List.NewMessage,
and Map.NewMessage for consistency.

Since this is a subtle change where the type system will not always
catch the changed type, this change was made by both changing the type
and renaming the function to NewXXX and manually looking at every usage
of the function to ensure that the usage correctly operates
on either the native Go type or the reflective view of the type.
After the entire codebase was cleaned up, a rename was performed to convert
NewXXX back to New.

Change-Id: I153fef627b4bf0a427e4039ce0aaec52e20c7950
Reviewed-on: https://go-review.googlesource.com/c/157077
Reviewed-by: Damien Neil <dneil@google.com>
2019-01-09 20:29:29 +00:00
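
The reflect.New parallel drawn in 3bc7d6f5cd can be seen in a short standard-library sketch. The point type and newPoint function are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

type point struct{ X, Y int }

// newPoint returns the reflective view (a reflect.Value holding a *point)
// rather than the native Go value.
func newPoint() reflect.Value {
	return reflect.New(reflect.TypeOf(point{}))
}

func main() {
	v := newPoint()
	v.Elem().Field(0).SetInt(3) // work through the reflective view

	p := v.Interface().(*point) // convert to the native Go type on demand
	fmt.Println(p.X)
}
```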
Joe Tsai
d6966a4431 protogen: fix oneof name mangling regression
The generator currently uses an unintuitive and stateful algorithm
for name generation where it "fixes" name conflicts by appending "_"
to the end of the new name.

PR#657 refactored the generator code and noticed that the above
algorithm was not properly taking into account that a Get method is
generated for parent oneofs, fixing it in the same PR. While this is
more correct, this breaks users (see #780) since it means that the
generation of names can change.

This PR changes the name mangling logic to be as it was previously.
This does mean that some new proto files may be unbuildable,
but that is arguably better than breaking existing proto files.

Change-Id: I2e354f4bb5d9c2b562fa2faa9149e949e2d86a0f
Reviewed-on: https://go-review.googlesource.com/c/156877
Reviewed-by: Damien Neil <dneil@google.com>
2019-01-09 07:23:28 +00:00
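
The stateful conflict resolution described in d6966a4431 can be sketched as a name set that appends "_" until a name is unused. This is a hypothetical illustration, not the generator's code, and it leaves out the question of which names (such as oneof Get methods) are registered.

```go
package main

import "fmt"

// nameSet tracks names that have already been allocated.
type nameSet map[string]bool

// unique returns name with "_" appended as many times as needed to avoid
// every previously allocated name, then records the result.
func (s nameSet) unique(name string) string {
	for s[name] {
		name += "_"
	}
	s[name] = true
	return name
}

func main() {
	s := nameSet{}
	fmt.Println(s.unique("GetFoo"))
	fmt.Println(s.unique("GetFoo"))
	fmt.Println(s.unique("GetFoo"))
}
```

Because the result depends on which names were allocated earlier, registering one additional name can change the names generated after it, which is the kind of breakage the message describes.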