14 Commits

Author SHA1 Message Date
Damien Neil
d30e561d9e proto: add MarshalState, UnmarshalState
Add functions to the proto package which plumb through the fast-path state.

As a sample use case: A followup CL adds an Initialized field to
protoiface.UnmarshalOutput, permitting the unmarshaller to report back
when it can confirm that a message is fully initialized. We want to
preserve that information when an unmarshal operation threads through
the proto package (such as when unmarshaling extensions).

To allow these functions to be added as methods of MarshalOptions and
UnmarshalOptions rather than top-level functions, separate the options
from the input structs.

Also update options passed to fast-path methods to set AllowPartial and
Merge to reflect the expected behavior of those methods. (Always allow
partial, never merge.)

Change-Id: I482477b0c9340793be533e75a86d0bb88708716a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/215877
Reviewed-by: Joe Tsai <joetsai@google.com>
2020-01-22 20:52:17 +00:00
Damien Neil
61781dd92f all: abstract fast-path marshal and unmarshal inputs and outputs
We may want to make changes to the inputs and outputs of the fast-path
functions in the future. For example, we likely want to add the ability
for the fast-path unmarshal to report back whether the unmarshaled
message is known to be initialized.

Change the signatures of these functions to take in and return struct
types which can be extended with whatever fields we want in the future.

Change-Id: Idead360785df730283a4630ea405265b72482e62
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/215719
Reviewed-by: Joe Tsai <joetsai@google.com>
2020-01-22 00:22:46 +00:00
Damien Neil
ce413af0b3 internal/impl: faster oneof marshaling
Change size, marshal, and isinit operations on oneofs to look up the
currently-set oneof type in a map rather than testing for each possible
oneof field in turn.

Significantly improves oneof encoding speed for oneofs with a
substantial number of fields:

  go test ./proto -bench=./oneof.*string.*test.TestAll -benchmem -count=8 -cpu=1

  name                                        old time/op    new time/op    delta
  Encode/oneof_(string)_(*test.TestAllTypes)     911ns ± 1%     397ns ± 3%  -56.45%  (p=0.000 n=8+7)
  Decode/oneof_(string)_(*test.TestAllTypes)     899ns ± 1%     922ns ± 1%   +2.49%  (p=0.001 n=7+7)

Fixes golang/protobuf#950

Change-Id: I9393a87975ce09011d885a8af4a63a639ea8452f
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/210281
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-12-09 22:00:58 +00:00
Damien Neil
79571e90e2 internal/impl: improve extension fast path performance
Stash fast-path information for extensions on the ExtensionInfo. In
the usual case where an ExtensionType's underlying implementation is
an *ExtensionInfo, fetching the fast-path information becomes a type
assertion rather than a mutex-guarded map access.

Maintain a global sync.Map for the case where an ExtensionType isn't an
*ExtensionInfo.

Substantially improves performance for fast-path operations on
extensions:

Encode/MessageSet_type_id_before_message_content-12      267ns ± 1%   185ns ± 1%  -30.44%  (p=0.001 n=7+7)
Encode/basic_scalar_types_(*test.TestAllExtensions)-12  1.94µs ± 1%  0.40µs ± 1%  -79.32%  (p=0.000 n=8+7)

Change-Id: If048b521deb3665a090ea3d0a178c61691d4201e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/210540
Reviewed-by: Joe Tsai <joetsai@google.com>
2019-12-09 18:59:42 +00:00
Damien Neil
ce3384cd34 proto, internal/impl: store unknown MessageSet items in non-mset format
In the v1 implementation, unknown MessageSet items are stored in a
message's unknown fields section in non-MessageSet format. For example,
consider a MessageSet containing an item with type_id T and value V.
If the type_id is not resolvable, the item will be placed in the unknown
fields as a bytes-valued field with number T and contents V. This
conversion is then reversed when marshaling a MessageSet containing
unknown fields.

Preserve this behavior in v2.

One consequence of this change is that actual unknown fields in a
MessageSet (any field other than 1) are now discarded. This matches
the previous behavior.

Change-Id: I3d913613f84e0ae82481078dbc91cb25628651cc
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/205697
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-11-11 19:40:27 +00:00
Damien Neil
68b81c3117 internal/impl: store extension values as Values
Change the storage type of ExtensionField from interface{} to
protoreflect.Value.

Replace the codec functions operating on interface{}s with ones
operating on Values.

Values are potentially more efficient, since they can represent
non-pointer types without allocation. This also reduces the number of
types used to represent field values.

Additionally, this change lays groundwork for changing the
user-visible representation of repeated extension fields from
*[]T to []T. The storage type for extension fields must support mutation
(thus *[]T currently); changing the storage type to a Value permits this
without the need to introduce yet another view on field values.

Change-Id: Ida336be14112bb940f655236eb58df21bf312525
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/192218
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-09-03 20:58:28 +00:00
Joe Tsai
f8b855d768 runtime/protoiface: API adjustments
The following adjustments were made:
* The pragma.NoUnkeyedLiterals is moved to be the first field.
This is done to keep the options struct smaller. Even if the last
field is zero-length, Go GC implementation details forces the struct
to be padded at the end.
* Methods are documented as always treating AllowPartial as true.
* Added a support flag for UnmarshalOptions.DiscardUnknown.

Change-Id: I1f75d226542ab2bb0123d9cea143c7060df226d8
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185998
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-12 23:12:26 +00:00
Joe Tsai
0f81b38d61 runtime/protoiface: move and rename XXX_Methods
This CL moves and renames the protoreflect.ProtoMessage.XXX_Methods
to protoreflect.Message.ProtoMethods.

Since one needs to obtain a protoreflect.Message now to get at
the fast-path methods, we modify the method signatures to take
in a protoreflect.Message instead of protoreflect.ProtoMessage.
Doing so also avoids the wrapper hack that was formerly done on
impl.messageReflectWrapper.

After this change the new protoc-gen-go no longer generates
any XXX fields or methods. All internal fields and methods are truly
hidden from the end-user.

name                                     old time/op    new time/op    delta
Wire/Unmarshal/google_message1_proto2-4    1.50µs ±10%    1.50µs ± 2%    ~     (p=0.483 n=10+9)
Wire/Unmarshal/google_message1_proto3-4    1.06µs ± 6%    1.06µs ± 4%    ~     (p=0.814 n=9+9)
Wire/Unmarshal/google_message2-4            734µs ±22%     689µs ±13%    ~     (p=0.133 n=10+9)
Wire/Marshal/google_message1_proto2-4       790ns ±46%     652ns ± 8%    ~     (p=0.590 n=10+9)
Wire/Marshal/google_message1_proto3-4       872ns ± 4%     857ns ± 3%    ~     (p=0.168 n=9+9)
Wire/Marshal/google_message2-4              232µs ±16%     221µs ± 3%  -4.75%  (p=0.014 n=9+9)
Wire/Size/google_message1_proto2-4          164ns ± 2%     167ns ± 4%  +1.87%  (p=0.046 n=9+10)
Wire/Size/google_message1_proto3-4          240ns ± 9%     229ns ± 1%  -4.81%  (p=0.000 n=9+8)
Wire/Size/google_message2-4                58.9µs ± 9%    59.6µs ± 2%  +1.23%  (p=0.040 n=9+9)

name                                     old alloc/op   new alloc/op   delta
Wire/Unmarshal/google_message1_proto2-4      912B ± 0%      912B ± 0%    ~     (all equal)
Wire/Unmarshal/google_message1_proto3-4      688B ± 0%      688B ± 0%    ~     (all equal)
Wire/Unmarshal/google_message2-4            470kB ± 0%     470kB ± 0%    ~     (p=0.215 n=10+10)
Wire/Marshal/google_message1_proto2-4        240B ± 0%      240B ± 0%    ~     (all equal)
Wire/Marshal/google_message1_proto3-4        224B ± 0%      224B ± 0%    ~     (all equal)
Wire/Marshal/google_message2-4             90.1kB ± 0%    90.1kB ± 0%    ~     (all equal)
Wire/Size/google_message1_proto2-4          0.00B          0.00B         ~     (all equal)
Wire/Size/google_message1_proto3-4          0.00B          0.00B         ~     (all equal)
Wire/Size/google_message2-4                 0.00B          0.00B         ~     (all equal)

name                                     old allocs/op  new allocs/op  delta
Wire/Unmarshal/google_message1_proto2-4      24.0 ± 0%      24.0 ± 0%    ~     (all equal)
Wire/Unmarshal/google_message1_proto3-4      6.00 ± 0%      6.00 ± 0%    ~     (all equal)
Wire/Unmarshal/google_message2-4            8.49k ± 0%     8.49k ± 0%    ~     (all equal)
Wire/Marshal/google_message1_proto2-4        1.00 ± 0%      1.00 ± 0%    ~     (all equal)
Wire/Marshal/google_message1_proto3-4        1.00 ± 0%      1.00 ± 0%    ~     (all equal)
Wire/Marshal/google_message2-4               1.00 ± 0%      1.00 ± 0%    ~     (all equal)
Wire/Size/google_message1_proto2-4           0.00           0.00         ~     (all equal)
Wire/Size/google_message1_proto3-4           0.00           0.00         ~     (all equal)
Wire/Size/google_message2-4                  0.00           0.00         ~     (all equal)

Change-Id: Ibf3263ad0f293326695c22020a92a6b938ef4f65
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185697
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-12 19:31:58 +00:00
Damien Neil
e91877de26 internal/impl: add fast-path unmarshal
Benchmarks run with:
  go test ./benchmarks/ -bench=Wire  -benchtime=500ms -benchmem -count=8

Fast-path vs. parent commit:

  name                                      old time/op    new time/op    delta
  Wire/Unmarshal/google_message1_proto2-12    1.35µs ± 2%    0.45µs ± 4%  -67.01%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12    1.07µs ± 1%    0.31µs ± 1%  -71.04%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            691µs ± 2%     188µs ± 2%  -72.78%  (p=0.000 n=7+8)

  name                                      old allocs/op  new allocs/op  delta
  Wire/Unmarshal/google_message1_proto2-12      60.0 ± 0%      25.0 ± 0%  -58.33%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12      42.0 ± 0%       7.0 ± 0%  -83.33%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            28.6k ± 0%      8.5k ± 0%  -70.34%  (p=0.000 n=8+8)

Fast-path vs. -v1:

  name                                      old time/op    new time/op    delta
  Wire/Unmarshal/google_message1_proto2-12     702ns ± 1%     445ns ± 4%   -36.58%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12     604ns ± 1%     311ns ± 1%   -48.54%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            179µs ± 3%     188µs ± 2%    +5.30%  (p=0.000 n=7+8)

  name                                      old allocs/op  new allocs/op  delta
  Wire/Unmarshal/google_message1_proto2-12      26.0 ± 0%      25.0 ± 0%    -3.85%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message1_proto3-12      8.00 ± 0%      7.00 ± 0%   -12.50%  (p=0.000 n=8+8)
  Wire/Unmarshal/google_message2-12            8.49k ± 0%     8.49k ± 0%    -0.01%  (p=0.000 n=8+8)

Change-Id: I6247ac3fd66a63d9acb902cbd192094ee3d151c3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185147
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-09 19:56:42 +00:00
Damien Neil
4ae30bbb21 internal/impl: refactor fast-path
Move data used by the fast-path implementations into a substructure of
MessageInfo and initialize it separately.

Change-Id: Ib855ee8ea5cb0379528b52ba0e191319aa5e2dff
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184077
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-27 20:08:19 +00:00
Damien Neil
8c86fc5e7d all: remove non-fatal UTF-8 validation errors (and non-fatal in general)
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).

The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.

Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.

Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-20 20:55:13 +00:00
Joe Tsai
89d49632e5 internal/impl: abstract away ExtensionDescV1 as the underlying descriptor
Add ExtensionField.{SetType,GetType} to hide the fact that the underlying
descriptor is actually an ExtensionDescV1.

Change-Id: I1d0595484ced0a88d2df0852a732fdf0fe9aa232
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/180538
Reviewed-by: Damien Neil <dneil@google.com>
2019-06-05 19:53:14 +00:00
Joe Tsai
4fe9663f4c internal/impl: rename MessageType as MessageInfo
The name MessageType is easily confused with protoreflect.MessageType.
Rename it as MessageInfo, which follows the pattern set by v1,
where the equivalent data structure is called InternalMessageInfo.

Change-Id: I535956e1f7c6e9b07e9585e889d5e93388d0d2ce
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/178478
Reviewed-by: Damien Neil <dneil@google.com>
2019-05-22 14:54:35 +00:00
Damien Neil
c37adefdac internal/impl: add fast-path marshal implementation
This is a port of the v1 table marshaler, with some substantial
cleanup and refactoring.

Benchstat results from the protobuf reference benchmark data comparing the
v1 package with v2, with AllowPartial:true set for the new package. This
is not an apples-to-apples comparison, since v1 doesn't have a way to
disable required field checks.  Required field checks in v2 package
currently go through reflection, which performs terribly; my initial
experimentation indicates that fast-path required field checks will
not add a large amount of cost; these results are incomplete but not
wholly inaccurate.

name                                           old time/op  new time/op  delta
/dataset.google_message3_1.pb/Marshal-12        219ms ± 1%   232ms ± 1%   +5.85%  (p=0.004 n=6+5)
/dataset.google_message2.pb/Marshal-12          261µs ± 3%   248µs ± 1%   -5.14%  (p=0.002 n=6+6)
/dataset.google_message1_proto2.pb/Marshal-12   681ns ± 2%   637ns ± 3%   -6.53%  (p=0.002 n=6+6)
/dataset.google_message1_proto3.pb/Marshal-12  1.10µs ± 8%  0.99µs ± 3%   -9.63%  (p=0.002 n=6+6)
/dataset.google_message3_3.pb/Marshal-12       44.2ms ± 3%  35.2ms ± 1%  -20.28%  (p=0.004 n=6+5)
/dataset.google_message4.pb/Marshal-12         91.4ms ± 2%  94.9ms ± 2%   +3.78%  (p=0.002 n=6+6)
/dataset.google_message3_2.pb/Marshal-12       78.7ms ± 6%  80.8ms ± 4%     ~     (p=0.310 n=6+6)
/dataset.google_message3_4.pb/Marshal-12       10.6ms ± 3%  10.6ms ± 8%     ~     (p=0.662 n=5+6)
/dataset.google_message3_5.pb/Marshal-12        675ms ± 4%   510ms ± 2%  -24.40%  (p=0.002 n=6+6)
/dataset.google_message3_1.pb/Marshal           219ms ± 1%   236ms ± 7%   +8.06%  (p=0.004 n=5+6)
/dataset.google_message2.pb/Marshal             257µs ± 1%   250µs ± 3%     ~     (p=0.052 n=5+6)
/dataset.google_message1_proto2.pb/Marshal      685ns ± 1%   628ns ± 1%   -8.41%  (p=0.008 n=5+5)
/dataset.google_message1_proto3.pb/Marshal     1.08µs ± 1%  0.98µs ± 2%   -9.31%  (p=0.004 n=5+6)
/dataset.google_message3_3.pb/Marshal          43.7ms ± 1%  35.1ms ± 1%  -19.76%  (p=0.002 n=6+6)
/dataset.google_message4.pb/Marshal            93.4ms ± 4%  94.9ms ± 2%     ~     (p=0.180 n=6+6)
/dataset.google_message3_2.pb/Marshal           105ms ± 2%    98ms ± 7%   -6.81%  (p=0.009 n=5+6)
/dataset.google_message3_4.pb/Marshal          16.3ms ± 6%  15.7ms ± 3%   -3.44%  (p=0.041 n=6+6)
/dataset.google_message3_5.pb/Marshal           676ms ± 4%   504ms ± 2%  -25.50%  (p=0.004 n=6+5)

Change-Id: I72cc4597117f4cf5d236ef505777d49dd4a5f75d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/171020
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-05-16 22:13:43 +00:00