Commit Graph

19 Commits

Author SHA1 Message Date
Joe Tsai
96a44732e0 proto: distinguish between invalid and empty messages in Equal
The v1 proto.Equal function treats (*Message)(nil) and new(Message)
as being different, while v2 proto.Equal treated them as equal since
a typed nil pointer is functionally an empty message since the
protobuf data model has no concept of presence as a first-class
property of messages.

Unfortunately, a significant amount of code depends on this distinction
that it would be difficult to migrate users from v1 to v2 unless we
preserved similar semantics in the v2 proto.Equal.

Also, double down on these semantics for protocmp.Transform.

Fixes #965

Change-Id: I21e78ba6251401a0ac0ccf495188093973cd7f3f
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/213238
Reviewed-by: Damien Neil <dneil@google.com>
2020-01-06 19:37:38 +00:00
Damien Neil
d0b074956d proto: rearrange test messages
Move the test inputs for the wire marshaler and unmarshaler out of
decode_test.go and into a new file. Consolidate some tests for invalid
messages (UTF-8 validation failures, field numbers out of range) into
a single list of invalid messages. Break out the no-enforce-utf8 test
into a separate file, since it is both complicated and conditional on
legacy support.

Change-Id: Ide80fa9d3aec2b6d42a57e6f9265358aa5e661a7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/211557
Reviewed-by: Joe Tsai <joetsai@google.com>
2019-12-16 21:49:56 +00:00
Damien Neil
01c0e8d680 proto, internal/impl: make wire output more consistent with v1
The v1 wire marshaler sorts fields as follows:
  - All extensions, sorted by field number.
  - All non-oneof fields, sorted by field number.
  - All oneof fields, in indeterminate order.

We already make some steps toward supporting this ordering: The
fast path encoder places extensions in sorted order at the start
of the message.

This commit moves oneof fields to the end of the message, makes the
reflection-based encoder use this ordering when deterministic marshaling
is enabled, and adds a test to catch unintentional changes to the
ordering.

Users SHOULD NOT depend on stability of the marshal output. It is
subject to change over time. Without deterministic marshaling enabled,
it is subject to change over calls to Marshal.

Change-Id: I6cfd89090d790a3bb50785f32b94d2781d7d08db
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/206800
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-11-12 20:59:03 +00:00
Damien Neil
1e5516a4c2 proto: improve slice growth in MarshalAppend
When allocating more space for the destination message in MarshalAppend,
use the same slice growth algorithm as the Go runtime's append rather
than allocating precisely the desired space.

Change-Id: If6033f6f7abdca473bc5188c4d3938ce57d3bdd2
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/197758
Reviewed-by: Joe Tsai <joetsai@google.com>
2019-09-28 01:00:09 +00:00
Joe Tsai
9b22b9382e internal/impl: treat a nil oneof wrapper as if it were unset
The old implementation had the behavior where a nil wrapper value:
	m := new(foopb.Message)
	m.OneofField = (*foopb.Message_OneofUint32)(nil)
was functionally equivalent to it being directly set to nil:
	m := new(foopb.Message)
	m.OneofField = nil
preserve this semantic in both the table-drive implementation
and the reflection implementation.

Change-Id: Ie44d51e044d4822e61d0e646fbc44aa8d9b90c1f
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189559
Reviewed-by: Herbie Ong <herbie@google.com>
2019-08-16 00:24:53 +00:00
Joe Tsai
1799d1111a all: rename tag and flags for legacy support
Rename build tag "proto1_legacy" -> "protolegacy"
to be consistent with the "protoreflect" tag.

Rename flag constant "Proto1Legacy" -> "ProtoLegacy" since
it covers more than simply proto1 legacy features.
For example, it covers alpha-features of proto3 that
were eventually removed from the final proto3 release.

Change-Id: I0f4fcbadd4b5a61c87645e2e5be11d187e59157c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189345
Reviewed-by: Damien Neil <dneil@google.com>
2019-08-08 20:49:00 +00:00
Joe Tsai
c51e2e0293 all: support enforce_utf8 override
In 2014, when proto3 was being developed, there were a number of early
adopters of the new syntax. Before the finalization of proto3 when
it was released in open-source in July 2016, a decision was made to
strictly validate strings in proto3. However, some of the early adopters
were already using invalid UTF-8 with string fields.
The google.protobuf.FieldOptions.enforce_utf8 option only exists to support
those grandfathered users where they can opt-out of the validation logic.
Practical use of that option in open source is impossible even if a user
specifies the proto1_legacy build tag since it requires a hacked
variant of descriptor.proto that is not externally available.

This CL supports enforce_utf8 by modifiyng internal/filedesc to
expose the flag if it detects it in the raw descriptor.
We add an strs.EnforceUTF8 function as a centralized place to determine
whether to perform validation. Validation opt-out is supported
only in builds with legacy support.

We implement support for validating UTF-8 in all proto3 string fields,
even if they are backed by a Go []byte.

Change-Id: I9c0628b84909bc7181125f09db730c80d490e485
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186002
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-15 19:53:05 +00:00
Joe Tsai
09cef32bce proto: cleanup invalid extension exception
The invalidExtensions flag no longer seems necessary. Tests pass without it.

Change-Id: Ieb35e26912b047718ccbfcdc926625aec1cd8c87
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185937
Reviewed-by: Damien Neil <dneil@google.com>
2019-07-12 16:21:09 +00:00
Joe Tsai
8d30bbeede all: remove dependency on proto v1
This does not remove all dependencies,
but all of the cases where it can now be implemented in terms of v2.

Change-Id: Idc5b0273f0d35c284bf2141eb9cce998692ceb15
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184878
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-03 04:59:17 +00:00
Damien Neil
8c86fc5e7d all: remove non-fatal UTF-8 validation errors (and non-fatal in general)
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).

The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.

Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.

Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-20 20:55:13 +00:00
Joe Tsai
378c1329de reflect/protoreflect: add alternative message reflection API
Added API:
	Message.Len
	Message.Range
	Message.Has
	Message.Clear
	Message.Get
	Message.Set
	Message.Mutable
	Message.NewMessage
	Message.WhichOneof
	Message.GetUnknown
	Message.SetUnknown

Deprecated API (to be removed in subsequent CL):
	Message.KnownFields
	Message.UnknownFields

The primary difference with the new API is that the top-level
Message methods are keyed by FieldDescriptor rather than FieldNumber
with the following semantics:
* For known fields, the FieldDescriptor must exactly match the
field descriptor known by the message.
* For extension fields, the FieldDescriptor must implement ExtensionType,
where ContainingMessage.FullName matches the message name, and
the field number is within the message's extension range.
When setting an extension field, it automatically stores
the extension type information.
* Extension fields are always considered nullable,
implying that repeated extension fields are nullable.
That is, you can distinguish between a unpopulated list and an empty list.
* Message.Get always returns a valid Value even if unpopulated.
The behavior is already well-defined for scalars, but for unpopulated
composite types, it now returns an empty read-only version of it.

Change-Id: Ia120630b4db221aeaaf743d0f64160e1a61a0f61
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/175458
Reviewed-by: Damien Neil <dneil@google.com>
2019-06-17 17:33:24 +00:00
Damien Neil
c37adefdac internal/impl: add fast-path marshal implementation
This is a port of the v1 table marshaler, with some substantial
cleanup and refactoring.

Benchstat results from the protobuf reference benchmark data comparing the
v1 package with v2, with AllowPartial:true set for the new package. This
is not an apples-to-apples comparison, since v1 doesn't have a way to
disable required field checks.  Required field checks in v2 package
currently go through reflection, which performs terribly; my initial
experimentation indicates that fast-path required field checks will
not add a large amount of cost; these results are incomplete but not
wholly inaccurate.

name                                           old time/op  new time/op  delta
/dataset.google_message3_1.pb/Marshal-12        219ms ± 1%   232ms ± 1%   +5.85%  (p=0.004 n=6+5)
/dataset.google_message2.pb/Marshal-12          261µs ± 3%   248µs ± 1%   -5.14%  (p=0.002 n=6+6)
/dataset.google_message1_proto2.pb/Marshal-12   681ns ± 2%   637ns ± 3%   -6.53%  (p=0.002 n=6+6)
/dataset.google_message1_proto3.pb/Marshal-12  1.10µs ± 8%  0.99µs ± 3%   -9.63%  (p=0.002 n=6+6)
/dataset.google_message3_3.pb/Marshal-12       44.2ms ± 3%  35.2ms ± 1%  -20.28%  (p=0.004 n=6+5)
/dataset.google_message4.pb/Marshal-12         91.4ms ± 2%  94.9ms ± 2%   +3.78%  (p=0.002 n=6+6)
/dataset.google_message3_2.pb/Marshal-12       78.7ms ± 6%  80.8ms ± 4%     ~     (p=0.310 n=6+6)
/dataset.google_message3_4.pb/Marshal-12       10.6ms ± 3%  10.6ms ± 8%     ~     (p=0.662 n=5+6)
/dataset.google_message3_5.pb/Marshal-12        675ms ± 4%   510ms ± 2%  -24.40%  (p=0.002 n=6+6)
/dataset.google_message3_1.pb/Marshal           219ms ± 1%   236ms ± 7%   +8.06%  (p=0.004 n=5+6)
/dataset.google_message2.pb/Marshal             257µs ± 1%   250µs ± 3%     ~     (p=0.052 n=5+6)
/dataset.google_message1_proto2.pb/Marshal      685ns ± 1%   628ns ± 1%   -8.41%  (p=0.008 n=5+5)
/dataset.google_message1_proto3.pb/Marshal     1.08µs ± 1%  0.98µs ± 2%   -9.31%  (p=0.004 n=5+6)
/dataset.google_message3_3.pb/Marshal          43.7ms ± 1%  35.1ms ± 1%  -19.76%  (p=0.002 n=6+6)
/dataset.google_message4.pb/Marshal            93.4ms ± 4%  94.9ms ± 2%     ~     (p=0.180 n=6+6)
/dataset.google_message3_2.pb/Marshal           105ms ± 2%    98ms ± 7%   -6.81%  (p=0.009 n=5+6)
/dataset.google_message3_4.pb/Marshal          16.3ms ± 6%  15.7ms ± 3%   -3.44%  (p=0.041 n=6+6)
/dataset.google_message3_5.pb/Marshal           676ms ± 4%   504ms ± 2%  -25.50%  (p=0.004 n=6+5)

Change-Id: I72cc4597117f4cf5d236ef505777d49dd4a5f75d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/171020
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-05-16 22:13:43 +00:00
Damien Neil
e89e6244e0 all: change module to google.golang.org/protobuf
Temporarily remove go.mod, since we can't generate an accurate one until
the corresponding v1 change is submitted.

Change-Id: I1e1ad97f2b455e33f61ffaeb8676289795e47e72
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177000
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-05-14 17:28:29 +00:00
Damien Neil
e6f060fdac proto: add Equal
Add support for basic equality comparison of messages.

Messages are equal if they have the same type and marshal to the
same bytes with deterministic serialization, with some exceptions:

 - Messages with different registered extensions are unequal.
 - NaN is not equal to itself.

Unlike the v1 Equal, a nil message is equal to an empty message of
the same type.

Change-Id: Ibabdadd8c767b801051b8241aeae1ba077e58121
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/174277
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-04-29 23:34:17 +00:00
Damien Neil
bc310b58c6 proto: validate UTF-8 in proto3 strings
Change-Id: I6a495730c3f438e7b2c4ca86edade7d6f25aa47d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/171700
Reviewed-by: Herbie Ong <herbie@google.com>
2019-04-12 17:43:10 +00:00
Damien Neil
3016b73382 proto: fix MarshalAppend fast path
Was overwriting the output buffer rather than appending to it.

Change-Id: I6ffb72a440f464f4259cfebc42c1dc75b73fb5ae
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/171117
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-04-08 04:31:27 +00:00
Damien Neil
96c229ab14 proto: check for required fields in encoding/decoding
Change-Id: I0555a92e0399782f075b1dcd248e880dd48c7d6d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/170579
Reviewed-by: Herbie Ong <herbie@google.com>
2019-04-04 17:56:20 +00:00
Damien Neil
61e93c70a2 proto: add generic Size
Change-Id: I4ed123f4a9747fb4aba392bc5b9608d294bacc4d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/169697
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-03-27 17:25:13 +00:00
Damien Neil
99f24c33b0 proto: wire encoding support
Add proto.Marshal.

Change-Id: If7254bb4c4cbbee782a2a163762f9fcf98e7ab08
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/167388
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-03-14 19:12:27 +00:00