For large positive integers, add check for number of decimal digits
before converting number to plain integer w/o exponent.
If exponent value is large, previous implementation may end up
constructing a large string with lots of zeroes that is not useful as it
will fail later on when called with strconv.Parse{Uint,Int} anyways.
Fixesgolang/protobuf#1002.
Change-Id: I65bfad304401e076743853d7501786b7231b083b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/213717
Reviewed-by: Damien Neil <dneil@google.com>
The JSONName constructor returns a struct value which shallow copies
a sync.Once within it; this is a dubious pattern.
Instead, add a jsonName.Init method to initialize the value.
Change-Id: I190a7239b1b62a8041ee7e4e09c0fe37b64ff623
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/213237
Reviewed-by: Damien Neil <dneil@google.com>
The deprecated messageset format permits extension fields with numbers
greater than the usual maximum (1<<29-1). To support this, the
internal/encoding/wire package has disabled field number validation when
legacy support is enabled.
We shouldn't skip validating all field numbers for validity just because
we support larger ones in messagesets.
This change drops range validation from the wire package (other than
checking that numbers fit in an int32) and adds it to the wire
unmarshalers instead. This gives us validation where we care
about it (when unmarshaling a wire-format message) and allows for
best-effort handling of out-of-range numbers everywhere else.
Fixesgolang/protobuf#996
Change-Id: I4e11b8a8aa177dd60e89723570af074a317c2451
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/210290
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
In the v1 implementation, unknown MessageSet items are stored in a
message's unknown fields section in non-MessageSet format. For example,
consider a MessageSet containing an item with type_id T and value V.
If the type_id is not resolvable, the item will be placed in the unknown
fields as a bytes-valued field with number T and contents V. This
conversion is then reversed when marshaling a MessageSet containing
unknown fields.
Preserve this behavior in v2.
One consequence of this change is that actual unknown fields in a
MessageSet (any field other than 1) are now discarded. This matches
the previous behavior.
Change-Id: I3d913613f84e0ae82481078dbc91cb25628651cc
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/205697
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This is more consistent with the indent documentation:
If indent is a non-empty string, it causes every entry in a List or Message
to be preceded by the indent and trailed by a newline.
Since an empty message has no entries, there should be no newlines.
Change-Id: I5d57165aaf94ca6b184bb35bf05d5d68f5ee9dd5
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/194877
Reviewed-by: Herbie Ong <herbie@google.com>
This is meant to deter users from doing byte for byte comparison.
Change-Id: If005d2dc1eba45eaa4254171d2f247820db109e4
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/194037
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Simply move logic into similar code block. Maintains the same logic.
Change-Id: I7b5a3f3d57f6102c7919cdc03dd105f08d21aca3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/194039
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Descriptor methods generally return a Descriptor with no Go type
information. ExtensionType's Descriptor is an exception, returning an
ExtensionTypeDescriptor containing both the proto descriptor and a
reference back to the ExtensionType. The pure descriptor is accessed
by xt.Descriptor().Descriptor().
Rename ExtensionType's Descriptor method to TypeDescriptor to make it
clear that it behaves a bit differently.
Change 1/2: Add the TypeDescriptor method and deprecate Descriptor.
Change-Id: I1806095044d35a474d60f94d2a28bdf528f12238
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/192139
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Rename build tag "proto1_legacy" -> "protolegacy"
to be consistent with the "protoreflect" tag.
Rename flag constant "Proto1Legacy" -> "ProtoLegacy" since
it covers more than simply proto1 legacy features.
For example, it covers alpha-features of proto3 that
were eventually removed from the final proto3 release.
Change-Id: I0f4fcbadd4b5a61c87645e2e5be11d187e59157c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189345
Reviewed-by: Damien Neil <dneil@google.com>
Change protoiface.ExtensionDescV1 to implement protoreflect.ExtensionType.
ExtensionDescV1's Name field conflicts with the Descriptor Name method,
so change the protoreflect.{Message,Enum,Extension}Type types to no
longer implement the corresponding Descriptor interface. This also leads
to a clearer distinction between the two types.
Introduce a protoreflect.ExtensionTypeDescriptor type which bridges
between ExtensionType and ExtensionDescriptor.
Add extension accessor functions to the proto package:
proto.{Has,Clear,Get,Set}Extension. These functions take a
protoreflect.ExtensionType parameter, which allows writing the
same function call using either the old or new API:
proto.GetExtension(message, somepb.E_ExtensionFoo)
Fixesgolang/protobuf#908
Change-Id: Ibc65d12a46666297849114fd3aefbc4a597d9f08
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/189199
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Golden test output doesn't match when math.NaN() has different bits from
the test's NaNs. Drop the NaN-related tests as too fiddly to be worth
keeping.
Change-Id: I89cf961273c2afab3b6b9f6c63878816314e9f43
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186639
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This CL unifies common MessageSet logic in prototext and protojson
into the messageset package. While we are at it, also enable
MessageSet support only if the proto1_legacy build flag is enabled.
Change-Id: I1a7d475e8bb1dad61ecd286df45e4239e5bef072
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185898
Reviewed-by: Damien Neil <dneil@google.com>
MessageSets are a deprecated proto1 feature, long since superseded by
extensions. Add disabled-by-default support behind flags.Proto1Legacy.
Change-Id: I7d3ace07f3b0efd59673034f3dc633b908345a88
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185538
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This implements generation of and reflection support for weak fields.
Weak fields are a proto1 feature where the "weak" option can be specified
on a singular message field. A weak reference results in generated code
that does not directly link in the dependency containing the weak message.
Weak field support is not added to any of the serialization logic.
Change-Id: I08ccfa72bc80b2ffb6af527a1677a0a81dcf33fb
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185399
Reviewed-by: Damien Neil <dneil@google.com>
The strs.UnsafeString casts a []byte as a string.
This allows us to avoid duplicated functionality.
Change-Id: I9930b94bae35eac0f98c0fa62963b300bc8d7e49
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185459
Reviewed-by: Herbie Ong <herbie@google.com>
We already support unmarshaling weak fields, support the other direction.
Change-Id: I514e6b7b18cf9a3567fb1775a1e389fe64f22d22
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185340
Reviewed-by: Herbie Ong <herbie@google.com>
When decoding, only treat the name as being explicitly set if the
name differs from what the JSON name would have been had it been
automatically derived from the protobuf field name.
Change-Id: Ida9256eafe7af20c7a06be2d4fb298be44276104
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185398
Reviewed-by: Herbie Ong <herbie@google.com>
Hyrum's Law dictates that if we do not prevent naughty behavior,
people will rely on it. If we do not validate that the provided
file descriptor is correct today, it will be near impossible
to add proper validation checks later on.
The logic added validates that the provided file descriptor is
correct according to the same semantics as protoc,
which was reversed engineered to derive the set of rules implemented here.
The rules are unfortunately complicated because protobuf is a language
full of many non-orthogonal features. While our logic is complicated,
it is still 1/7th the size of the equivalent C++ code!
Change-Id: I6acc5dc3bd2e4c6bea6cd9e81214f8104402602a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184837
Reviewed-by: Damien Neil <dneil@google.com>
Overview of changes:
* Add an option that specifies whether to replace unresolvable references
with a placeholder instead of producing an error. Since the prior behavior
produced placeholders (not always), we default to that behavior for now,
but will enable strict resolving in a future CL.
* The option is not yet exported because there is concern about what the
public API should look like. This will be exposed in a future CL.
* Unlike before, we now permit placeholders for unresolvable enum values.
* We implement relative name resolution logic.
* We handle the case where the type is unknown, but type_name is specified.
In such a case, we populate both FieldDescriptor.{Enum,Message} and leave
the FieldDescriptor.Kind with the zero value. If the type_name happened
to resolve, we use that to determine the type.
* If a placeholder is used to represent a relative name,
the FullName reports an invalid full name with a "*." prefix.
Change-Id: Ifa8c750423c488fb9324eec4d033a2f251505fda
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/184317
Reviewed-by: Damien Neil <dneil@google.com>
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).
The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.
Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.
Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The internal/fileinit package is split apart into two packages:
* internal/filedesc constructs descriptors from the raw proto.
It is very similar to the previous internal/fileinit package.
* internal/filetype wraps descriptors with Go type information
Overview:
* The internal/fileinit package will be deleted in a future CL.
It is kept around since the v1 repo currently depends on it.
* The internal/prototype package is deleted. All former usages of it
are now using internal/filedesc instead. Most significantly,
the reflect/protodesc package was almost entirely re-written.
* The internal/impl package drops support for messages that do not
have a Descriptor method (pre-2016). This removes a significant amount
of technical debt.
filedesc.Builder to parse raw descriptors.
* The internal/encoding/defval package now handles enum values by name.
Change-Id: I3957bcc8588a70470fd6c7de1122216b80615ab7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/182360
Reviewed-by: Damien Neil <dneil@google.com>
Temporarily remove go.mod, since we can't generate an accurate one until
the corresponding v1 change is submitted.
Change-Id: I1e1ad97f2b455e33f61ffaeb8676289795e47e72
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177000
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Added API:
FieldDescriptor.IsExtension
FieldDescriptor.IsList
FieldDescriptor.MapKey
FieldDescriptor.MapValue
FieldDescriptor.ContainingOneof
FieldDescriptor.ContainingMessage
Deprecated API (to be removed in subsequent CL):
FieldDescriptor.Oneof
FieldDescriptor.Extendee
These methods help cleanup several common usage patterns.
Change-Id: I9a3ffabc2edb2173c536509b22f330f98bba7cf3
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/176977
Reviewed-by: Damien Neil <dneil@google.com>
The protobuf documentation explicitly specifies (1<<29)-1 as the maximum
field number, but the C++ implementation itself has a special-case where it
allows field numbers up to MaxInt32 for MessageSet fields, but continues
to apply the former limit in all non-MessageSet cases.
To avoid complicated branching logic, we use the larger limit for all cases
if MessageSet is supported, otherwise, we impose the documented limit.
Change-Id: I710a2a21aa3beba161c3e6ca2f2ed9a266920a5a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/175817
Reviewed-by: Damien Neil <dneil@google.com>
Change use of regexp for matching literals true,false,null to simple
bytes comparison. Small gain from doing this.
Remove computing for position in Value as that is only needed in error
messages. In order to preserve ability to compute for position later,
store the original input in Value instead of just the slice containing
the value, however, need to also store the start index and size of the
parsed value.
Using benchmark in encoding/bench_test.go now shows faster time and less
memory usage than V1.
name old time/op new time/op delta
JSONEncode-4 30.3ms ± 1% 10.3ms ± 1% -66.02% (p=0.000 n=9+8)
JSONDecode-4 54.4ms ± 3% 18.9ms ± 2% -65.33% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
JSONEncode-4 10.3MB ± 0% 3.9MB ± 0% -61.74% (p=0.000 n=10+9)
JSONDecode-4 19.0MB ± 0% 3.6MB ± 0% -81.29% (p=0.000 n=10+9)
name old allocs/op new allocs/op delta
JSONEncode-4 465k ± 0% 64k ± 0% -86.30% (p=0.000 n=10+8)
JSONDecode-4 289k ± 0% 163k ± 0% -43.69% (p=0.000 n=10+9)
Change-Id: I0a3108d675d6442674facb065aaebd14051f6c5d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/172662
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The protobuf type system uses the word "descriptor" instead of "type".
We should avoid the "type" verbage when we aren't talking about Go types.
The old names are temporarily kept around for compatibility reasons.
Change-Id: Icc99c913528ead011f7a74aa8399d9c5ec6dc56e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/172238
Reviewed-by: Herbie Ong <herbie@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Previous calls to indexNeedEscape with a type conversion from []byte
to string incurs allocation.
Make 2 different calls instead, one for string and one for bytes.
Type converting string to []byte does not incur extra allocation,
however, the benchmark results still show it to be slower by ~3% for
textpb and 6+% for jsonpb, hence decided to go with 2 separate calls
instead.
Results over current head:
name old time/op new time/op delta
TextEncode-4 18.1ms ± 2% 18.3ms ± 2% ~ (p=0.065 n=10+9)
TextDecode-4 233ms ± 3% 102ms ± 1% -56.34% (p=0.000 n=9+10)
JSONEncode-4 10.4ms ± 2% 10.5ms ± 0% +0.56% (p=0.019 n=9+9)
JSONDecode-4 870ms ± 2% 354ms ± 4% -59.33% (p=0.000 n=9+10)
name old alloc/op new alloc/op delta
TextEncode-4 28.9MB ± 0% 28.9MB ± 0% +0.00% (p=0.000 n=10+9)
TextDecode-4 1.16GB ± 0% 0.03GB ± 0% -97.44% (p=0.000 n=9+10)
JSONEncode-4 3.94MB ± 0% 3.94MB ± 0% +0.00% (p=0.000 n=10+10)
JSONDecode-4 3.35GB ± 0% 0.01GB ± 0% -99.83% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
TextEncode-4 73.5k ± 0% 73.5k ± 0% ~ (all equal)
TextDecode-4 278k ± 0% 255k ± 0% -8.26% (p=0.000 n=9+10)
JSONEncode-4 63.8k ± 0% 63.8k ± 0% ~ (all equal)
JSONDecode-4 247k ± 0% 210k ± 0% -14.92% (p=0.000 n=10+10)
Change-Id: Ibc64e9a7827ec1fffa213eb79f60497950203700
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/172239
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Stop adding options to legacy MessageDescriptors and FieldDescriptors.
We would synthesize options messages containing semantic options
(FieldOptions.is_packed, FieldOptions.is_weak, MessageOptions.map_entry).
This information is already contained in the protoreflect descriptors,
so there is no real need to define a correct options message.
If we do want to include options in legacy descriptors, then we should
get the original value out of the FileDescriptorProto, since it may
include additional extensions or other information.
This change completely removes the dependency from internal/legacy to
descriptor.proto.
Change-Id: Ib6bbe4ca6e0fe7ae501f3e9b11d5fa0222808410
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/171458
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Also added json.Decoder.Clone API for unmarshaling Any to look
ahead remaining bytes for @type field.
Change-Id: I2f803743534dfb64f9092d716805b115faa5975a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/170102
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Per Joe's suggestion, remove producing numberParts when parsing a JSON
number to produce corresponding Value. This saves having to store it
inside Value as well. Only produce numberParts for calls to
Value.{Int,Uint} call.
numberParts is only used for producing integers and removing the logic to
produce numberParts improves overall decoding speed for floats, and shows no
change for integers.
name old time/op new time/op delta
Float-4 559ns ± 0% 288ns ± 0% ~ (p=1.000 n=1+1)
Int-4 471ns ± 0% 466ns ± 0% ~ (p=1.000 n=1+1)
Change-Id: I21bf304ca67dda8d41a4ea0022dcbefd51058c1c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/168781
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Unmarshaling of scalar, messages, repeated, and maps.
Need to further improve on error messages for consistency, some error
messages contain the position info while some currently do not. There
are cases where position info is wrong as well when a value is decoded
in another pass, e.g. numbers in string value, or map keys.
Change-Id: I6f9e903c499b5e87fb258dbdada7434389fc7522
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/166338
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The prototype package was initially used by generated reflection support,
but has now been replaced by internal/fileinit.
Eventually, this functionality should be deleted and re-written in terms
of other components in the repo.
Usages that prototype currently provides (but should be moved) are:
* Constructing standalone messages and enums, which is behavior we should
provide in reflect/protodesc. The google.protobuf.{Enum,Type} are well-known
proto messages designed for this purpose.
* Constructing placeholder files, enums, and messages.
* Consructing protoreflect.{Message,Enum,Extension}Types, which are protobuf
descriptors with associated Go type information.
Change-Id: Id7dbefff952682781b439aa555508c59b2629f9e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/167383
Reviewed-by: Damien Neil <dneil@google.com>
Collapse Value.Float32 and Value.Float64 into single API to keep it
consistent with Value.{Int,Uint}.
Change-Id: I07737e72715fe3cc3f6bcad579cf5d6cfe3757d5
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/167317
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Previous decoder decodes a JSON number into a float64, which lacks
64-bit integer precision.
I attempted to retrofit it with storing the raw bytes and parsed out
number parts, see golang.org/cl/164377. While that is possible, the
encoding logic for Value is not symmetrical with the decoding logic and
can be confusing since both utilizes the same Value struct.
Joe and I decided that it would be better to rewrite the JSON encoder
and decoder to be token-based instead, removing the need for sharing a
model type plus making it more efficient.
Change-Id: Ic0601428a824be4e20141623409ab4d92b6167c7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/165677
Reviewed-by: Damien Neil <dneil@google.com>
We're rewriting internal/encoding/json. So, make a copy of it first in
order not to break encoding/jsonpb package.
Change-Id: I8b63c468d3f432102d2af4db22a7549998ce3876
Reviewed-on: https://go-review.googlesource.com/c/164642
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
This test examines the result of converting math.NaN() to a fixed byte
string. Change it to use a specific NaN value instead, since the value
returned by math.NaN is specified only as being a NaN, not a specific
one.
Use specific float32 and float64 NaN values, since the result of
converting a float64 NaN to a float32 can and does vary.
Fixes test failure on ARM.
Change-Id: Ia1517fdba768cdd88e5ee5f5af37f0b481e651b4
Reviewed-on: https://go-review.googlesource.com/c/162117
Reviewed-by: Herbie Ong <herbie@google.com>
When encoding/textpb marshals out float32 values, it was previously
formatting it as float64 bitsize since both float types are stored as
float64 and internal/encoding/text only has one Float type. A
consequence of this is that the output may display a different value
than expected, e.g. 1.02 becomes 1.0199999809265137.
This CL splits Float type into Float32 and Float64 to keep track of
which bitsize to use when formatting. Values of both types are still
stored as float64 to keep the logic simple.
Decoding will always use Float64, but users can ask for a float32 value
from it.
Change-Id: Iea5b14b283fec2236a0c3946fac34d4d79b95274
Reviewed-on: https://go-review.googlesource.com/c/158497
Reviewed-by: Damien Neil <dneil@google.com>