12 Commits

Author SHA1 Message Date
Kir Kolyshkin
563f06fbeb internal/encoding/text/decode: limit errId length
Avoid returning very long errors by limiting the length of what errId
returns to 32 bytes (a value chosen to keep the error short yet still
useful).

Append ellipsis to the returned value to denote that it was truncated.
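
A minimal sketch of the truncation described above, not the code from this
change: truncateID and maxIDLen are hypothetical names, and only the 32-byte
limit and the trailing ellipsis come from the message.

```go
package main

import "fmt"

// maxIDLen mirrors the 32-byte limit named in the commit message.
const maxIDLen = 32

// truncateID is a hypothetical helper: it returns id unchanged when it is
// short enough, otherwise the first maxIDLen bytes followed by an ellipsis.
// The cut is made on bytes, so it may split a multi-byte UTF-8 sequence.
func truncateID(id []byte) []byte {
	if len(id) <= maxIDLen {
		return id
	}
	out := make([]byte, 0, maxIDLen+len("…"))
	out = append(out, id[:maxIDLen]...)
	return append(out, "…"...)
}

func main() {
	fmt.Printf("%s\n", truncateID([]byte("short")))
	fmt.Printf("%s\n", truncateID([]byte("0123456789012345678901234567890123456789")))
}
```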

Change-Id: I232d5192a2d9ad675daa0be0fe0c8518489c2953
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/406694
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Lasse Folger <lassefolger@google.com>
2022-05-17 20:42:32 +00:00
Kir Kolyshkin
a0482351ba internal/encoding/text/decode: stop using regexp
This eliminates the last user of the regexp package, which should save
about 130K from the resulting stripped binary importing this package
(unless, of course, regexp is brought in directly or via another
dependency).

Added some new cases to TestDecoder to test the new function.
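
For illustration only, a hand-written scan of the kind that can stand in for
a regexp match might look like the sketch below. The accepted character set
and the names isIDByte and leadingID are assumptions, not the code from this
change. Returning a subslice of the input is what keeps such a scan free of
allocations.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isIDByte reports whether c may appear in an identifier-like token.
// The character set here is an assumption made for this sketch.
func isIDByte(c byte) bool {
	switch {
	case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
		return true
	case c == '-', c == '+', c == '.', c == '_':
		return true
	}
	return false
}

// leadingID returns the leading run of identifier bytes in seq. If seq does
// not start with one, it returns the first rune instead, so the caller always
// has something to show in an error message. No regexp, no allocation.
func leadingID(seq []byte) []byte {
	i := 0
	for i < len(seq) && isIDByte(seq[i]) {
		i++
	}
	if i > 0 {
		return seq[:i]
	}
	if len(seq) == 0 {
		return nil
	}
	_, size := utf8.DecodeRune(seq)
	return seq[:size]
}

func main() {
	fmt.Printf("%s\n", leadingID([]byte("foo.bar: 1")))
	fmt.Printf("%s\n", leadingID([]byte("{x")))
}
```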

Benchmark (not included) shows the following results, comparing to
old implementation using regexp.Find:

name     old time/op    new time/op    delta
ErrId-4    1.93µs ± 1%    0.21µs ± 1%   -89.20%  (p=0.002 n=6+6)

name     old alloc/op   new alloc/op   delta
ErrId-4      128B ± 0%        0B       -100.00%  (p=0.002 n=6+6)

name     old allocs/op  new allocs/op  delta
ErrId-4      13.0 ± 0%       0.0       -100.00%  (p=0.002 n=6+6)

Change-Id: I5569a47580f41cc60f92c444e8d43bb3f26faa4e
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/402774
Reviewed-by: Cassondra Foesch <cfoesch@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Lasse Folger <lassefolger@google.com>
2022-05-16 20:59:57 +00:00
Emmanuel T Odeke
26e8bcb3c7 all: remove unnecessary string([]byte) conversion in fmt.Sprintf with %s
Change-Id: I64aab811cbcbfa410817894f1cd1d83f88f27bf6
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/365874
Reviewed-by: Damien Neil <dneil@google.com>
Trust: Damien Neil <dneil@google.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
2021-11-29 18:55:28 +00:00
Herbie Ong
952a08d7c4 encoding/prototext: make unexpected EOF error into proto.Error
Also fixed/added comments on exported vars/funcs.

Change-Id: I6c42b2afb90058e026a5310598bb3ebfcd01b989
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/218357
Reviewed-by: Damien Neil <dneil@google.com>
2020-02-07 19:00:45 +00:00
Herbie Ong
4e6b903e61 internal/encoding/text: fix eof crash when parsing list of scalars
Need to check for EOF and return proper error.

Bug caught by fuzz test: https://oss-fuzz.com/testcase-detail/6258064955277312.

Change-Id: I63d5c12c301f2ddefc9a0813c13abef40d745e91
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/218258
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2020-02-06 19:20:11 +00:00
Herbie Ong
2eb18f0e62 internal/encoding/text: fix error construction in parseTypeName
Fuzz test caught the following issue --
https://oss-fuzz.com/testcase-detail/6288731021770752

Change-Id: Idcbce7953b465d1b83c01b1d123c9d43907d402a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/218037
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2020-02-05 23:01:15 +00:00
Herbie Ong
9b3d97c473 encoding/prototext: rewrite of internal/encoding/text
* Fixes golang/protobuf#842. Unmarshal can now parse singular or
  repeated message fields without the field separator.
* Fixes golang/protobuf#1011. Handles negative 0 properly.
* For unknown fields with fixed 32-bit and 64-bit wire types, output is
  now in hex format with a 0x prefix, similar to the C++ lib output. The
  previous Go implementation simply output these as decimal numbers (%d).
* All parsing errors, except for unexpected EOF, should now contain line
  and column number info.
* Fixed the following conformance-related features:
  * Parse nan,inf,-inf,infinity,-infinity as case-insensitive.
  * Interpret float32 overflows as inf or -inf.
  * Parse large int-like number as proto float.
* Discard unknown map field if DiscardUnknown=true.
* Allow whitespaces/comments in Any type URL and extension field names per spec.
* Improves performance and memory usage. It is now as fast and efficient as
  protojson, if not better on most benchmarks.

name                                     old time/op    new time/op    delta
Text/Unmarshal/google_message1_proto2-4    14.1µs ±43%     8.7µs ±12%  -38.27%  (p=0.000 n=10+10)
Text/Unmarshal/google_message1_proto3-4    11.6µs ±18%     7.7µs ± 9%  -33.69%  (p=0.000 n=10+10)
Text/Unmarshal/google_message2-4           6.20ms ±27%    4.10ms ± 5%  -33.95%  (p=0.000 n=10+10)
Text/Marshal/google_message1_proto2-4      12.8µs ± 6%    10.3µs ±23%  -19.54%  (p=0.000 n=9+10)
Text/Marshal/google_message1_proto3-4      11.9µs ±16%     8.6µs ±10%  -27.45%  (p=0.000 n=10+10)
Text/Marshal/google_message2-4             5.59ms ± 5%    5.30ms ±22%     ~     (p=0.356 n=9+10)
JSON/Unmarshal/google_message1_proto2-4    12.3µs ±61%    13.9µs ±26%     ~     (p=0.190 n=10+10)
JSON/Unmarshal/google_message1_proto3-4    7.51µs ± 6%    7.86µs ± 1%   +4.66%  (p=0.010 n=10+9)
JSON/Unmarshal/google_message2-4           3.74ms ± 2%    3.94ms ± 2%   +5.32%  (p=0.000 n=10+10)
JSON/Marshal/google_message1_proto2-4      9.90µs ±12%    9.95µs ± 4%     ~     (p=0.315 n=9+10)
JSON/Marshal/google_message1_proto3-4      7.55µs ± 4%    7.93µs ± 3%   +4.98%  (p=0.000 n=10+10)
JSON/Marshal/google_message2-4             4.29ms ± 5%    4.49ms ± 2%   +4.53%  (p=0.001 n=10+10)

name                                     old alloc/op   new alloc/op   delta
Text/Unmarshal/google_message1_proto2-4    12.5kB ± 0%     2.0kB ± 0%  -83.87%  (p=0.000 n=10+10)
Text/Unmarshal/google_message1_proto3-4    12.2kB ± 0%     1.8kB ± 0%  -85.33%  (p=0.000 n=10+10)
Text/Unmarshal/google_message2-4           5.35MB ± 0%    0.89MB ± 0%  -83.28%  (p=0.000 n=10+9)
Text/Marshal/google_message1_proto2-4      12.0kB ± 0%     1.4kB ± 0%  -88.15%  (p=0.000 n=10+10)
Text/Marshal/google_message1_proto3-4      12.4kB ± 0%     1.9kB ± 0%  -84.91%  (p=0.000 n=10+10)
Text/Marshal/google_message2-4             5.64MB ± 0%    1.02MB ± 0%  -81.85%  (p=0.000 n=10+9)
JSON/Unmarshal/google_message1_proto2-4    2.29kB ± 0%    2.29kB ± 0%     ~     (all equal)
JSON/Unmarshal/google_message1_proto3-4    2.08kB ± 0%    2.08kB ± 0%     ~     (all equal)
JSON/Unmarshal/google_message2-4            899kB ± 0%     899kB ± 0%     ~     (p=1.000 n=10+10)
JSON/Marshal/google_message1_proto2-4      1.46kB ± 0%    1.46kB ± 0%     ~     (all equal)
JSON/Marshal/google_message1_proto3-4      1.36kB ± 0%    1.36kB ± 0%     ~     (all equal)
JSON/Marshal/google_message2-4             1.19MB ± 0%    1.19MB ± 0%     ~     (p=0.197 n=10+10)

name                                     old allocs/op  new allocs/op  delta
Text/Unmarshal/google_message1_proto2-4       133 ± 0%        89 ± 0%  -33.08%  (p=0.000 n=10+10)
Text/Unmarshal/google_message1_proto3-4       108 ± 0%        67 ± 0%  -37.96%  (p=0.000 n=10+10)
Text/Unmarshal/google_message2-4            60.0k ± 0%     38.7k ± 0%  -35.52%  (p=0.000 n=10+10)
Text/Marshal/google_message1_proto2-4        65.0 ± 0%      25.0 ± 0%  -61.54%  (p=0.000 n=10+10)
Text/Marshal/google_message1_proto3-4        59.0 ± 0%      22.0 ± 0%  -62.71%  (p=0.000 n=10+10)
Text/Marshal/google_message2-4              27.4k ± 0%      7.3k ± 0%  -73.39%  (p=0.000 n=10+10)
JSON/Unmarshal/google_message1_proto2-4      95.0 ± 0%      95.0 ± 0%     ~     (all equal)
JSON/Unmarshal/google_message1_proto3-4      74.0 ± 0%      74.0 ± 0%     ~     (all equal)
JSON/Unmarshal/google_message2-4            36.3k ± 0%     36.3k ± 0%     ~     (all equal)
JSON/Marshal/google_message1_proto2-4        27.0 ± 0%      27.0 ± 0%     ~     (all equal)
JSON/Marshal/google_message1_proto3-4        30.0 ± 0%      30.0 ± 0%     ~     (all equal)
JSON/Marshal/google_message2-4              11.3k ± 0%     11.3k ± 0%     ~     (p=1.000 n=10+10)

Change-Id: I377925facde5535f06333b6f25e9c9b358dc062f
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/204602
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2020-02-05 02:11:08 +00:00
Herbie Ong
a3369c5dc2 internal/encoding/text: replace use of regular expression in decoding
Improve performance by replacing use of regular expressions with direct
parsing code.

Compared to latest version:

name                                     old time/op    new time/op    delta
Text/Unmarshal/google_message1_proto2-4    21.8µs ± 5%    14.0µs ± 9%  -35.69%  (p=0.000 n=10+9)
Text/Unmarshal/google_message1_proto3-4    19.6µs ± 4%    13.8µs ±10%  -29.47%  (p=0.000 n=10+10)
Text/Unmarshal/google_message2-4           13.4ms ± 4%     4.9ms ± 4%  -63.44%  (p=0.000 n=10+10)
Text/Marshal/google_message1_proto2-4      13.8µs ± 2%    14.1µs ± 4%   +2.42%  (p=0.011 n=9+10)
Text/Marshal/google_message1_proto3-4      11.6µs ± 2%    11.8µs ± 8%     ~     (p=0.573 n=8+10)
Text/Marshal/google_message2-4             8.01ms ±48%    5.97ms ± 5%  -25.44%  (p=0.000 n=10+10)

name                                     old alloc/op   new alloc/op   delta
Text/Unmarshal/google_message1_proto2-4    13.0kB ± 0%    12.6kB ± 0%   -3.40%  (p=0.000 n=10+10)
Text/Unmarshal/google_message1_proto3-4    13.0kB ± 0%    12.5kB ± 0%   -3.50%  (p=0.000 n=10+10)
Text/Unmarshal/google_message2-4           5.67MB ± 0%    5.50MB ± 0%   -3.13%  (p=0.000 n=10+10)
Text/Marshal/google_message1_proto2-4      12.0kB ± 0%    12.1kB ± 0%   +0.02%  (p=0.000 n=10+10)
Text/Marshal/google_message1_proto3-4      11.7kB ± 0%    11.7kB ± 0%   +0.01%  (p=0.000 n=10+10)
Text/Marshal/google_message2-4             5.68MB ± 0%    5.68MB ± 0%   +0.01%  (p=0.000 n=10+10)

name                                     old allocs/op  new allocs/op  delta
Text/Unmarshal/google_message1_proto2-4       142 ± 0%       142 ± 0%     ~     (all equal)
Text/Unmarshal/google_message1_proto3-4       156 ± 0%       156 ± 0%     ~     (all equal)
Text/Unmarshal/google_message2-4            70.1k ± 0%     65.4k ± 0%   -6.76%  (p=0.000 n=10+10)
Text/Marshal/google_message1_proto2-4        91.0 ± 0%      91.0 ± 0%     ~     (all equal)
Text/Marshal/google_message1_proto3-4        80.0 ± 0%      80.0 ± 0%     ~     (all equal)
Text/Marshal/google_message2-4              36.4k ± 0%     36.4k ± 0%     ~     (all equal)

Change-Id: Ia5d3c16e9e33961aae03bac0d53fcfc5b1943d2a
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/173360
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-07-23 22:08:16 +00:00
Damien Neil
8c86fc5e7d all: remove non-fatal UTF-8 validation errors (and non-fatal in general)
Immediately abort (un)marshal operations when encountering invalid UTF-8
data in proto3 strings. No other proto implementation supports non-UTF-8
data in proto3 strings (and many reject it in proto2 strings as well).
Producing invalid output is an interoperability threat (other
implementations won't be able to read it).

The case where existing string data is found to contain non-UTF8 data is
better handled by changing the field to the `bytes` type, which (aside
from UTF-8 validation) is wire-compatible with `string`.
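
As a rough sketch of the fail-fast behavior (appendString and errInvalidUTF8
are hypothetical names, not identifiers from this module), a marshal step
can validate a string and stop at the first invalid one:

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errInvalidUTF8 is a hypothetical sentinel error for this sketch.
var errInvalidUTF8 = errors.New("string field contains invalid UTF-8")

// appendString appends s to dst only if s is valid UTF-8. On invalid input
// it returns dst untouched along with a fatal error, so no invalid output
// is ever produced.
func appendString(dst []byte, s string) ([]byte, error) {
	if !utf8.ValidString(s) {
		return dst, errInvalidUTF8
	}
	return append(dst, s...), nil
}

func main() {
	out, err := appendString(nil, "héllo")
	fmt.Printf("%q %v\n", out, err)
	out, err = appendString(nil, "bad\xff")
	fmt.Printf("%q %v\n", out, err)
}
```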

Remove the errors.NonFatal type, since there are no remaining cases
where it is needed. "Non-fatal" errors which produce results and a
non-nil error are problematic because they compose poorly; the better
approach is to take an option like AllowPartial indicating which
conditions to check for.
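
The option-based alternative can be sketched as follows. The options type
and its check method are hypothetical and exist only to show the shape: the
caller states up front which condition to skip, and the function then either
succeeds fully or fails with a nil result.

```go
package main

import "fmt"

// options is a hypothetical options struct for this sketch.
type options struct {
	// AllowPartial disables the check for missing required fields.
	AllowPartial bool
}

// check returns an error if any required field is absent from fields,
// unless the caller opted out through AllowPartial.
func (o options) check(fields map[string]string, required []string) error {
	if o.AllowPartial {
		return nil
	}
	for _, name := range required {
		if _, ok := fields[name]; !ok {
			return fmt.Errorf("required field %q not set", name)
		}
	}
	return nil
}

func main() {
	fields := map[string]string{"id": "1"}
	required := []string{"id", "name"}
	fmt.Println(options{}.check(fields, required))
	fmt.Println(options{AllowPartial: true}.check(fields, required))
}
```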

Change-Id: I9d189ec6ffda7b5d96d094aa1b290af2e3f23736
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/183098
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-06-20 20:55:13 +00:00
Damien Neil
e89e6244e0 all: change module to google.golang.org/protobuf
Temporarily remove go.mod, since we can't generate an accurate one until
the corresponding v1 change is submitted.

Change-Id: I1e1ad97f2b455e33f61ffaeb8676289795e47e72
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/177000
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2019-05-14 17:28:29 +00:00
Joe Tsai
01ab29648e go.mod: rename google.golang.org/proto as github.com/golang/protobuf/v2
This change was created by running:
	git ls-files | xargs sed -i "s|google.golang.org/proto|github.com/golang/protobuf/v2|g"

This change is *not* an endorsement of "github.com/golang/protobuf/v2" as the
final import path when the v2 API is eventually released as stable.
We continue to reserve the right to make breaking changes as we see fit.

This change enables us to host the v2 API on a repository that is go-gettable
(since go.googlesource.com is not a host known to the "go get" tool;
and google.golang.org/proto was just a stub URL that is not currently served).
Thus, we can start work on a forked version of the v1 API that explores
what it would take to implement v1 in terms of v2 in a backwards compatible way.

Change-Id: Ia3ebc41ac4238af62ee140200d3158b53ac9ec48
Reviewed-on: https://go-review.googlesource.com/136736
Reviewed-by: Damien Neil <dneil@google.com>
2018-09-24 16:11:50 +00:00
Joe Tsai
27c2a76c85 internal/encoding/text: initial commit of proto text format parser/serializer
Package text provides a parser and serializer for the proto text format.
This focuses on the grammar of the format and is agnostic towards specific
semantics of protobuf types.

High-level API:
	func Marshal(v Value, indent string, delims [2]byte, outputASCII bool) ([]byte, error)
	func Unmarshal(b []byte) (Value, error)
	type Type uint8
		const Bool Type ...
	type Value struct{ ... }
		func ValueOf(v interface{}) Value
		func (v Value) Type() Type
		func (v Value) Bool() (x bool, ok bool)
		func (v Value) Int(b64 bool) (x int64, ok bool)
		func (v Value) Uint(b64 bool) (x uint64, ok bool)
		func (v Value) Float(b64 bool) (x float64, ok bool)
		func (v Value) Name() (protoreflect.Name, bool)
		func (v Value) String() string
		func (v Value) List() []Value
		func (v Value) Message() [][2]Value
		func (v Value) Raw() []byte

Change-Id: I4a78ec4474c160d0de4d32120651edd931ea2c1e
Reviewed-on: https://go-review.googlesource.com/127455
Reviewed-by: Herbie Ong <herbie@google.com>
2018-08-07 22:44:06 +00:00