Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protoc: fix consistency with parsing very large decimal numbers #10555

Conversation

jhump
Copy link
Contributor

@jhump jhump commented Sep 13, 2022

Fixes #10554.

I think the most controversial part of this would be the error messages, which I've updated to include the actual allowed range of values for integer literals (since this change makes the parser accept out-of-range integer literals, but they are interpreted as if float/double literals).

I re-generated code from protos in the third commit in order to make tests pass. I think that means the main branch is actually broken now because of stale generated code? I can remove that commit, since I understand it may be noise for this CL, but I included it to get a successful run of tests.

src/google/protobuf/descriptor.cc Outdated Show resolved Hide resolved
src/google/protobuf/descriptor.cc Outdated Show resolved Hide resolved
@jhump jhump force-pushed the jh/fix-consistency-with-very-large-decimal-numbers branch from 5047015 to 7467f75 Compare September 14, 2022 19:34
@jhump
Copy link
Contributor Author

jhump commented Sep 14, 2022

I realized there is no new test to confirm that options and defaults allow very large decimal literals for float/double fields Adding that now.

@jhump
Copy link
Contributor Author

jhump commented Sep 14, 2022

@fowles, I think this is ready for another look. I think I addressed your comments, and I've added a new test, too.

@jhump jhump force-pushed the jh/fix-consistency-with-very-large-decimal-numbers branch from 93e26b4 to 4c69337 Compare September 15, 2022 00:19
@jhump
Copy link
Contributor Author

jhump commented Sep 15, 2022

BTW, I rebased to more recent master and was able to remove the commit that was just re-generating files from protos. So the diff is a little leaner/less noisy now.

Copy link
Contributor

@fowles fowles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only nits remaining, I will approve and kick the CI systems on the next round

src/google/protobuf/descriptor.cc Outdated Show resolved Hide resolved
src/google/protobuf/descriptor.cc Outdated Show resolved Hide resolved
@jhump
Copy link
Contributor Author

jhump commented Sep 15, 2022

@fowles, I see you approved this, but I realized one more thing missing from this before it's ready to merge -- there's a code path that could accidentally trigger a fatal error, if the literal is an octal or hex integer that is >2^64-1. I think it's an easy fix as well as easy to test.

@jhump
Copy link
Contributor Author

jhump commented Sep 15, 2022

@fowles, see last commit and let me know what you think.

Comment on lines +317 to 320
} else if (!io::Tokenizer::TryParseFloat(input_->current().text, output)) {
// out of int range, and not valid float? 🤷
AddError("Integer out of range.");
// We still return true because we did, in fact, parse a number.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is "just in case". I can't think of any actual scenario, other octal or hex literal (handled above), where ParseFloat would fail. But I also really don't want this inadvertently raising an exception.

Comment on lines +313 to +316
} else if (input_->current().text[0] == '0') {
// octal or hexadecimal; don't bother parsing as float
AddError("Integer out of range.");
// We still return true because we did, in fact, parse a number.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put this here, instead of just relying on the branch below, because I think an octal literal would likely be successfully parsed by ParseFloat, but incorrectly interpreted as decimal.

@@ -1002,9 +1002,19 @@ bool Tokenizer::ParseInteger(const std::string& text, uint64_t max_value,
}

double Tokenizer::ParseFloat(const std::string& text) {
double result;
GOOGLE_LOG_IF(DFATAL,
!TryParseFloat(text, &result))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate side effected in LOG statements, do you mind switching this to the more straight forward:

if (!TryParseFloat(...)) {
  LOG(DFATAL) << ...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@fowles
Copy link
Contributor

fowles commented Sep 15, 2022

Thanks for the PR. Sorry about the many rounds of back and forth. Running CI one last time and then I will merge!

@jhump
Copy link
Contributor Author

jhump commented Sep 15, 2022

Thanks for the PR. Sorry about the many rounds of back and forth

No worries! I totally understand the desire for high code quality, and I know that I don't write C++ often enough to get everything right the first time :)

@fowles fowles merged commit c4644b7 into protocolbuffers:main Sep 15, 2022
bithium pushed a commit to bithium/protobuf that referenced this pull request Sep 4, 2023
…y-with-very-large-decimal-numbers

protoc: fix consistency with parsing very large decimal numbers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

protoc handles very large negative integer literals inconsistently
5 participants