Commit Graph

22 Commits

Author SHA1 Message Date
Kir Kolyshkin
bf9455640d all: fix typos
Brought to you by codespell v2.1.0, using the command

	codespell -S .cache,vendor -L ot,ba,fo,unparseable -w

Note that the misspelled "unparseable" comes from the
github.com/protocolbuffers/protobuf, where it is explicitly ignored
(see [1] and some explanation at [2]), so we ignore it here, too.

[1] https://github.com/protocolbuffers/protobuf/pull/7752
[2] https://github.com/protocolbuffers/protobuf/pull/7751#discussion_r460170422

Change-Id: Ie1ca705db4f11df8ec8b22fdc22b6a6ee667ae5b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/406845
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Lasse Folger <lassefolger@google.com>
2022-05-19 09:32:38 +00:00
cybrcodr
f1ac97a4c3 internal/encoding/json: fix comments
Change-Id: Ia1414d1a6c4edcacc141f0c927ad1f9f0012843a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/285552
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Trust: Herbie Ong <herbie@google.com>
2021-01-21 23:58:14 +00:00
Joe Tsai
8cbef3ff2d all: fix golint violations
Change-Id: I35d9f6842ec2e9b36c14672a05c4381441bda87a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/224582
Reviewed-by: Herbie Ong <herbie@google.com>
2020-03-21 00:04:20 +00:00
Joe Tsai
e0daf31d84 all: trivial formatting changes
Changes:
* import grouping for third-party dependencies
* import grouping for generated protobufs
* blank space removal

Change-Id: I2950b0606bb2064046d79a23a78b05c23147cbfe
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/221017
Reviewed-by: Damien Neil <dneil@google.com>
2020-02-25 21:59:54 +00:00
Herbie Ong
d2ece139c6 encoding/protojson: refactor to follow prototext pattern
All unmarshaling error messages now contain line number and column
information, except for the following errors:
- `unexpected EOF`
- `no support for proto1 MessageSets`
- `required fields X not set`

Changes to internal/encoding/json:
- Moved encoding funcs in string.go and number.go into encode.go.
- Separated out encoding kind constants from decoding ones.
- Renamed file string.go to decode_string.go.
- Renamed file number.go to decode_number.go.
- Renamed Type struct to Kind.
- Renamed Value struct to Token.
- Token accessor methods no longer return error.
  Name, Bool, ParsedString will panic if called on the wrong kind.
  Float, Int, Uint has ok bool result to check against.
- Changed Peek to return Token and error.

Changes to encoding/protojson:
- Updated internal/encoding/json API calls.
- Added line info on most unmarshaling error messages and kept
  description simple and consistent.

Change-Id: Ie50456694f2214c5c4fafd2c9b9239680da0deec
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/218978
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2020-02-11 22:59:08 +00:00
Herbie Ong
886c32637f internal/encoding/json: add tests for negative zeros.
Updates golang/protobuf#1011.

Copied tests from http://golang.org/cl/217501.

Change-Id: I58ea1111beccee9691929b062eb87a1a752f81e0
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/217578
Reviewed-by: Damien Neil <dneil@google.com>
2020-02-03 23:03:51 +00:00
Herbie Ong
b02b6d1da5 internal/encoding/json: fix performance cliff when decoding large integers that will go out of range.
For large positive integers, add check for number of decimal digits
before converting number to plain integer w/o exponent.

If exponent value is large, previous implementation may end up
constructing a large string with lots of zeroes that is not useful as it
will fail later on when called with strconv.Parse{Uint,Int} anyways.

Fixes golang/protobuf#1002.

Change-Id: I65bfad304401e076743853d7501786b7231b083b
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/213717
Reviewed-by: Damien Neil <dneil@google.com>
2020-01-08 18:44:43 +00:00
Damien Neil
5ba0c29655 internal/encoding/json: fix crash in parsing
Fuzzer-detected crash when parsing: {""

Change-Id: I019c667f48e6a1237858b5abf7d34f43593fb3b6
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/212357
Reviewed-by: Herbie Ong <herbie@google.com>
2019-12-22 04:14:01 +00:00
Herbie Ong
582ab3de42 encoding/protojson: add random whitespaces in encoding output
This is meant to deter users from doing byte for byte comparison.

Change-Id: If005d2dc1eba45eaa4254171d2f247820db109e4
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/194037
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-09-07 00:48:34 +00:00
Joe Tsai
36dc22ddb8 encoding: use strs.UnsafeString to avoid duplicated code
The strs.UnsafeString casts a []byte as a string.
This allows us to avoid duplicated functionality.

Change-Id: I9930b94bae35eac0f98c0fa62963b300bc8d7e49
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185459
Reviewed-by: Herbie Ong <herbie@google.com>
2019-07-10 07:01:20 +00:00
Damien Neil
8c86fc5e7d all: remove non-fatal UTF-8 validation errors (and non-fatal in general)
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).

The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.

Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.

Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-20 20:55:13 +00:00
Damien Neil
e89e6244e0 all: change module to google.golang.org/protobuf
Temporarily remove go.mod, since we can't generate an accurate one until
the corresponding v1 change is submitted.

Change-Id: I1e1ad97f2b455e33f61ffaeb8676289795e47e72
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177000
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-05-14 17:28:29 +00:00
Herbie Ong
decef41dcc internal/encoding/json: improve decoding speed and memory allocation
Change use of regexp for matching literals true,false,null to simple
bytes comparison. Small gain from doing this.

Remove computing for position in Value as that is only needed in error
messages. In order to preserve ability to compute for position later,
store the original input in Value instead of just the slice containing
the value, however, need to also store the start index and size of the
parsed value.

Using benchmark in encoding/bench_test.go now shows faster time and less
memory usage than V1.

name          old time/op    new time/op    delta
JSONEncode-4    30.3ms ± 1%    10.3ms ± 1%  -66.02%  (p=0.000 n=9+8)
JSONDecode-4    54.4ms ± 3%    18.9ms ± 2%  -65.33%  (p=0.000 n=10+10)

name          old alloc/op   new alloc/op   delta
JSONEncode-4    10.3MB ± 0%     3.9MB ± 0%  -61.74%  (p=0.000 n=10+9)
JSONDecode-4    19.0MB ± 0%     3.6MB ± 0%  -81.29%  (p=0.000 n=10+9)

name          old allocs/op  new allocs/op  delta
JSONEncode-4      465k ± 0%       64k ± 0%  -86.30%  (p=0.000 n=10+8)
JSONDecode-4      289k ± 0%      163k ± 0%  -43.69%  (p=0.000 n=10+9)

Change-Id: I0a3108d675d6442674facb065aaebd14051f6c5d
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/172662
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-04-29 22:13:25 +00:00
Herbie Ong
1e09691415 internal/encoding/{json,text}: improve string parsing
Previous calls to indexNeedEscape with a type conversion from []byte
to string incurs allocation.

Make 2 different calls instead, one for string and one for bytes.

Type converting string to []byte does not incur extra allocation,
however, the benchmark results still show it to be slower by ~3% for
textpb and 6+% for jsonpb, hence decided to go with 2 separate calls
instead.

Results over current head:
name          old time/op    new time/op    delta
TextEncode-4    18.1ms ± 2%    18.3ms ± 2%     ~     (p=0.065 n=10+9)
TextDecode-4     233ms ± 3%     102ms ± 1%  -56.34%  (p=0.000 n=9+10)
JSONEncode-4    10.4ms ± 2%    10.5ms ± 0%   +0.56%  (p=0.019 n=9+9)
JSONDecode-4     870ms ± 2%     354ms ± 4%  -59.33%  (p=0.000 n=9+10)

name          old alloc/op   new alloc/op   delta
TextEncode-4    28.9MB ± 0%    28.9MB ± 0%   +0.00%  (p=0.000 n=10+9)
TextDecode-4    1.16GB ± 0%    0.03GB ± 0%  -97.44%  (p=0.000 n=9+10)
JSONEncode-4    3.94MB ± 0%    3.94MB ± 0%   +0.00%  (p=0.000 n=10+10)
JSONDecode-4    3.35GB ± 0%    0.01GB ± 0%  -99.83%  (p=0.000 n=10+10)

name          old allocs/op  new allocs/op  delta
TextEncode-4     73.5k ± 0%     73.5k ± 0%     ~     (all equal)
TextDecode-4      278k ± 0%      255k ± 0%   -8.26%  (p=0.000 n=9+10)
JSONEncode-4     63.8k ± 0%     63.8k ± 0%     ~     (all equal)
JSONDecode-4      247k ± 0%      210k ± 0%  -14.92%  (p=0.000 n=10+10)

Change-Id: Ibc64e9a7827ec1fffa213eb79f60497950203700
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/172239
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-04-17 00:26:13 +00:00
Herbie Ong
8fa64d98d8 internal/encoding/json: improve Value.Int,Uint by reducing allocations
parseNumber does not need to construct new slices for numberParts, it
simply needs to reference the correct subset from the input.

normalizeToString may need to allocate but only if there's a positive
exponent.

name      old time/op    new time/op    delta
Float-4      308ns ± 0%     291ns ± 0%   ~     (p=1.000 n=1+1)
Int-4        498ns ± 0%     341ns ± 0%   ~     (p=1.000 n=1+1)
String-4     262ns ± 0%     250ns ± 0%   ~     (p=1.000 n=1+1)
Bool-4       212ns ± 0%     210ns ± 0%   ~     (p=1.000 n=1+1)

name      old alloc/op   new alloc/op   delta
Float-4      48.0B ± 0%     48.0B ± 0%   ~     (all equal)
Int-4         160B ± 0%       99B ± 0%   ~     (p=1.000 n=1+1)
String-4      176B ± 0%      176B ± 0%   ~     (all equal)
Bool-4       0.00B          0.00B        ~     (all equal)

name      old allocs/op  new allocs/op  delta
Float-4       1.00 ± 0%      1.00 ± 0%   ~     (all equal)
Int-4         9.00 ± 0%      4.00 ± 0%   ~     (p=1.000 n=1+1)
String-4      3.00 ± 0%      3.00 ± 0%   ~     (all equal)
Bool-4        0.00           0.00        ~     (all equal)

Change-Id: If083e18a5914b15e794d34722cbb6539cbd73a53
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/170788
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-04-06 04:16:05 +00:00
Herbie Ong
8ac9dd257b encoding/jsonpb: add support for unmarshaling Any
Also added json.Decoder.Clone API for unmarshaling Any to look
ahead remaining bytes for @type field.

Change-Id: I2f803743534dfb64f9092d716805b115faa5975a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/170102
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-04-02 04:49:03 +00:00
Herbie Ong
670d808a6d internal/encoding/json: improve allocation of Value for JSON strings
Substitute interface Value.value field with per type field boo and str
instead.

name      old time/op    new time/op    delta
String-4     286ns ± 0%     254ns ± 0%   ~     (p=1.000 n=1+1)
Bool-4       209ns ± 0%     211ns ± 0%   ~     (p=1.000 n=1+1)

name      old alloc/op   new alloc/op   delta
String-4      192B ± 0%      176B ± 0%   ~     (p=1.000 n=1+1)
Bool-4       0.00B          0.00B        ~     (all equal)

name      old allocs/op  new allocs/op  delta
String-4      4.00 ± 0%      3.00 ± 0%   ~     (p=1.000 n=1+1)
Bool-4        0.00           0.00        ~     (all equal)

Change-Id: Ib0167d22e60d63c221c303b79c75b9e96d432fe7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/170277
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-04-01 20:47:36 +00:00
Herbie Ong
a3421952ac internal/encoding/json: improve decoding of JSON numbers for floats
Per Joe's suggestion, remove producing numberParts when parsing a JSON
number to produce corresponding Value. This saves having to store it
inside Value as well. Only produce numberParts for calls to
Value.{Int,Uint} call.

numberParts is only used for producing integers and removing the logic to
produce numberParts improves overall decoding speed for floats, and shows no
change for integers.

name     old time/op  new time/op  delta
Float-4   559ns ± 0%   288ns ± 0%   ~     (p=1.000 n=1+1)
Int-4     471ns ± 0%   466ns ± 0%   ~     (p=1.000 n=1+1)

Change-Id: I21bf304ca67dda8d41a4ea0022dcbefd51058c1c
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/168781
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-03-23 22:02:55 +00:00
Herbie Ong
c96a79da29 encoding/jsonpb: add support for basic unmarshaling
Unmarshaling of scalar, messages, repeated, and maps.

Need to further improve on error messages for consistency, some error
messages contain the position info while some currently do not.  There
are cases where position info is wrong as well when a value is decoded
in another pass, e.g. numbers in string value, or map keys.

Change-Id: I6f9e903c499b5e87fb258dbdada7434389fc7522
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/166338
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-03-15 18:53:18 +00:00
Herbie Ong
d3f8f2d412 internal/encoding/json: rewrite to a token-based encoder and decoder
Previous decoder decodes a JSON number into a float64, which lacks
64-bit integer precision.

I attempted to retrofit it with storing the raw bytes and parsed out
number parts, see golang.org/cl/164377.  While that is possible, the
encoding logic for Value is not symmetrical with the decoding logic and
can be confusing since both utilizes the same Value struct.

Joe and I decided that it would be better to rewrite the JSON encoder
and decoder to be token-based instead, removing the need for sharing a
model type plus making it more efficient.

Change-Id: Ic0601428a824be4e20141623409ab4d92b6167c7
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/165677
Reviewed-by: Damien Neil <dneil@google.com>
2019-03-11 21:53:21 +00:00
Joe Tsai
01ab29648e go.mod: rename google.golang.org/proto as github.com/golang/protobuf/v2
This change was created by running:
	git ls-files | xargs sed -i "s|google.golang.org/proto|github.com/golang/protobuf/v2|g"

This change is *not* an endorsement of "github.com/golang/protobuf/v2" as the
final import path when the v2 API is eventually released as stable.
We continue to reserve the right to make breaking changes as we see fit.

This change enables us to host the v2 API on a repository that is go-gettable
(since go.googlesource.com is not a known host by the "go get" tool;
and google.golang.org/proto was just a stub URL that is not currently served).
Thus, we can start work on a forked version of the v1 API that explores
what it would take to implement v1 in terms of v2 in a backwards compatible way.

Change-Id: Ia3ebc41ac4238af62ee140200d3158b53ac9ec48
Reviewed-on: https://go-review.googlesource.com/136736
Reviewed-by: Damien Neil <dneil@google.com>
2018-09-24 16:11:50 +00:00
Joe Tsai
879b18d902 internal/encoding/json: initial commit of JSON parser/serializer
Package json provides a parser and serializer for the JSON format.
This focuses on the grammar of the format and is agnostic towards specific
semantics of protobuf types.

High-level API:
	func Marshal(v Value, indent string) ([]byte, error)
	func Unmarshal(b []byte) (Value, error)
	type Type uint8
	    const Null Type ...
	type Value struct{ ... }
	    func ValueOf(v interface{}) Value
		func (v Value) Type() Type
		func (v Value) Bool() bool
		func (v Value) Number() float64
		func (v Value) String() string
		func (v Value) Array() []Value
		func (v Value) Object() [][2]Value
		func (v Value) Raw() []byte

Change-Id: I26422f6b3881ef1a11b8aa95160645b1384b27b8
Reviewed-on: https://go-review.googlesource.com/127824
Reviewed-by: Herbie Ong <herbie@google.com>
2018-08-07 22:40:28 +00:00