protoc: fix consistency with parsing very large decimal numbers #10555

jhump · 2022-09-13T20:37:00Z

I think the most controversial part of this would be the error messages, which I've updated to include the actual allowed range of values for integer literals (since this change makes the parser accept out-of-range integer literals, but interprets them as if they were float/double literals).
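
To make the described behavior concrete, here is a minimal sketch of the fallback, written against the C++ standard library rather than protoc's tokenizer. The `Literal` struct and `ParseDecimalLiteral` are hypothetical names used only for illustration, and the sketch assumes the input is a non-empty string of decimal digits.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical result type for illustration; not part of protoc.
struct Literal {
  bool is_integer;     // true if the text fit in uint64
  uint64_t int_value;  // meaningful only when is_integer is true
  double float_value;  // always set
};

// Hypothetical helper: tries to read a decimal literal as uint64 and,
// if it overflows, falls back to reading the same text as a double.
Literal ParseDecimalLiteral(const std::string& text) {
  errno = 0;
  char* end = nullptr;
  unsigned long long v = std::strtoull(text.c_str(), &end, 10);
  if (errno != ERANGE && end != text.c_str() && *end == '\0') {
    return {true, static_cast<uint64_t>(v), static_cast<double>(v)};
  }
  // Out of uint64 range: keep the value, but only as a floating-point number.
  return {false, 0, std::strtod(text.c_str(), nullptr)};
}
```

With this sketch, `18446744073709551615` (2^64-1) is still an integer, while `18446744073709551616` (2^64) is only usable where a float/double is accepted.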

I re-generated code from protos in the third commit in order to make tests pass. I think that means the main branch is actually broken now because of stale generated code? I can remove that commit, since I understand it may be noise for this CL, but I included it to get a successful run of tests.

jhump · 2022-09-14T19:37:53Z

I realized there is no new test to confirm that options and defaults allow very large decimal literals for float/double fields. Adding that now.

jhump · 2022-09-14T20:26:07Z

@fowles, I think this is ready for another look. I think I addressed your comments, and I've added a new test, too.

jhump · 2022-09-15T00:20:11Z

BTW, I rebased to more recent master and was able to remove the commit that was just re-generating files from protos. So the diff is a little leaner/less noisy now.

fowles

only nits remaining, I will approve and kick the CI systems on the next round

jhump · 2022-09-15T14:52:32Z

@fowles, I see you approved this, but I realized one more thing missing from this before it's ready to merge -- there's a code path that could accidentally trigger a fatal error, if the literal is an octal or hex integer that is >2^64-1. I think it's an easy fix as well as easy to test.

jhump · 2022-09-15T15:27:40Z

@fowles, see last commit and let me know what you think.

jhump · 2022-09-15T15:29:09Z

src/google/protobuf/compiler/parser.cc

+    } else if (!io::Tokenizer::TryParseFloat(input_->current().text, output)) {
+      // out of int range, and not valid float? 🤷
      AddError("Integer out of range.");
      // We still return true because we did, in fact, parse a number.


This is "just in case". I can't think of any actual scenario, other than an octal or hex literal (handled above), where ParseFloat would fail. But I also really don't want this inadvertently raising an exception.

jhump · 2022-09-15T15:29:44Z

src/google/protobuf/compiler/parser.cc

+    } else if (input_->current().text[0] == '0') {
+      // octal or hexadecimal; don't bother parsing as float
+      AddError("Integer out of range.");
+      // We still return true because we did, in fact, parse a number.


I put this here, instead of just relying on the branch below, because I think an octal literal would likely be successfully parsed by ParseFloat, but incorrectly interpreted as decimal.
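
The concern can be illustrated with the C++ standard library alone. This is a sketch, not the tokenizer's code: `AsOctal` and `AsFloat` are hypothetical helpers, and `std::strtod` stands in for whatever float parser is used.

```cpp
#include <cstdlib>

// Hypothetical helper: reads the text as an octal (base 8) integer.
unsigned long long AsOctal(const char* text) {
  return std::strtoull(text, nullptr, 8);
}

// Hypothetical helper: reads the same text as a floating-point number.
// A leading zero has no special meaning here, so the digits are decimal.
double AsFloat(const char* text) {
  return std::strtod(text, nullptr);
}
```

For the text `0777`, `AsOctal` yields 511 while `AsFloat` yields 777.0, so silently falling through to a float parse would change the value of an octal literal.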

fowles · 2022-09-15T15:36:10Z

src/google/protobuf/io/tokenizer.cc

@@ -1002,9 +1002,19 @@ bool Tokenizer::ParseInteger(const std::string& text, uint64_t max_value,
 }

 double Tokenizer::ParseFloat(const std::string& text) {
+  double result;
+  GOOGLE_LOG_IF(DFATAL,
+         !TryParseFloat(text, &result))


I hate side effects in LOG statements. Do you mind switching this to the more straightforward:

if (!TryParseFloat(...)) { LOG(DFATAL) << ... }
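
A self-contained sketch of that shape follows. `TryParseFloatSketch` and `ParseFloatSketch` are hypothetical stand-ins built on `std::strtod`, and `std::cerr` stands in for the logging macro; none of this is the actual tokenizer code. The result is initialized up front so the return value is defined even when parsing fails.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical stand-in for a "try" parser: returns false instead of
// aborting, and writes to *result only when the whole text was consumed.
bool TryParseFloatSketch(const std::string& text, double* result) {
  char* end = nullptr;
  double value = std::strtod(text.c_str(), &end);
  if (end == text.c_str() || *end != '\0') return false;
  *result = value;
  return true;
}

// Hypothetical wrapper: the call with the side effect sits in an ordinary
// conditional, and logging happens only inside the failure branch.
double ParseFloatSketch(const std::string& text) {
  double result = 0.0;  // defined value to return if parsing fails
  if (!TryParseFloatSketch(text, &result)) {
    std::cerr << "not a valid float literal: " << text << '\n';
  }
  return result;
}
```

Keeping the parse call outside the logging expression means the write to `result` does not depend on how a logging macro treats its condition.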

fowles · 2022-09-15T18:43:42Z

Thanks for the PR. Sorry about the many rounds of back and forth. Running CI one last time and then I will merge!

jhump · 2022-09-15T19:12:22Z

> Thanks for the PR. Sorry about the many rounds of back and forth

No worries! I totally understand the desire for high code quality, and I know that I don't write C++ often enough to get everything right the first time :)

jhump mentioned this pull request Sep 14, 2022

protoc handles very large negative integer literals inconsistently #10554

Closed

fowles added protoc release notes: yes labels Sep 14, 2022

fowles self-requested a review September 14, 2022 13:53

fowles reviewed Sep 14, 2022

jhump force-pushed the jh/fix-consistency-with-very-large-decimal-numbers branch from 5047015 to 7467f75 Compare September 14, 2022 19:34

acozzette added the kokoro:run label Sep 14, 2022

protobuf-kokoro removed the kokoro:run label Sep 14, 2022

fowles reviewed Sep 14, 2022

jhump added 5 commits September 14, 2022 20:17

allow excessively large int literal values (that would otherwise overflow uint64 or underflow int64) to be used as float/double values

2270d3f

add allowed ranges to error messages

87f24e4

change format of int range in error message; use macro to make DRY

4e54ec2

add test to verify parsing of extremely large decimal integers to double values

35dd193

use template instead of macro

4c69337

jhump force-pushed the jh/fix-consistency-with-very-large-decimal-numbers branch from 93e26b4 to 4c69337 Compare September 15, 2022 00:19

fowles reviewed Sep 15, 2022

address latest review comments

7702355

fowles reviewed Sep 15, 2022

fowles added the kokoro:run label Sep 15, 2022

protobuf-kokoro removed the kokoro:run label Sep 15, 2022

put helpers into anon namespace

0bc90b1

fowles approved these changes Sep 15, 2022

fowles added the kokoro:run label Sep 15, 2022

protobuf-kokoro removed the kokoro:run label Sep 15, 2022

avoid possible exception; error if octal or hex literal that is too large

f82be68

mkruskal-google added the kokoro:run label Sep 15, 2022

protobuf-kokoro removed the kokoro:run label Sep 15, 2022

fowles reviewed Sep 15, 2022

use normal conditional

d6acffb

fowles reviewed Sep 15, 2022

initialize var to avoid undefined return val

0362a12

fowles approved these changes Sep 15, 2022

fowles added the kokoro:run label Sep 15, 2022

protobuf-kokoro removed the kokoro:run label Sep 15, 2022

fowles reviewed Sep 15, 2022

oops, fix name: LOG -> GOOGLE_LOG

7e745c4

fowles added the kokoro:run label Sep 15, 2022

protobuf-kokoro removed the kokoro:run label Sep 15, 2022

fowles merged commit c4644b7 into protocolbuffers:main Sep 15, 2022

This was referenced Sep 16, 2022

don't treat overflowing octal int as if decimal float bufbuild/protocompile#32

Merged

corrections to the spec bufbuild/protobuf.com#3

Merged

bithium pushed a commit to bithium/protobuf that referenced this pull request Sep 4, 2023

Merge pull request protocolbuffers#10555 from jhump/jh/fix-consistency-with-very-large-decimal-numbers

355f171

protoc: fix consistency with parsing very large decimal numbers
