Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ast: New parser implementation #2219

Merged
merged 1 commit into from
Mar 26, 2020

Conversation

tsandall
Copy link
Member

This commit replaces the existing PEG generated parser with a parser
implemented by hand. The new parser is more efficient (avoiding old
problems with pathological input cases like {{{{{{{{{}}}}}}}} and
deeply-nested composites in general) and offers better opportunities
for improved error reporting (which has been improved already but
there is still room to grow.)

During the test process of implementing the new parser, we identified
a few issues that were present in the old parser. Those issues are
fixed by this commit.

Fixes #1251
Fixes #501
Fixes #2198
Fixes #2199
Fixes #2200
Fixes #2201
Fixes #2202
Fixes #2203

Co-authored-by: Torin Sandall torinsandall@gmail.com
Co-authored-by: Patrick East east.patrick@gmail.com

Signed-off-by: Torin Sandall torinsandall@gmail.com
Signed-off-by: Patrick East east.patrick@gmail.com

@tsandall tsandall requested a review from patrick-east March 23, 2020 19:55
@tsandall
Copy link
Member Author

cc @jaspervdj-luminal @srenatus

patrick-east
patrick-east previously approved these changes Mar 23, 2020
Copy link
Contributor

@patrick-east patrick-east left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Just glanced at stuff, I'm curious about the benchmarks. Regardless, having better error messages seems like a good enough reason already, though.

CHANGELOG.md Outdated
The new parser contains a small number of backwards incompatible changes that
correct questionable behaviour from the old parser. These changes affect
a very small number of actual policies and we feel confident in the decision to
break backwards compatiblity here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Suggested change
break backwards compatiblity here.
break backwards compatibility here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

return isDecimal(ch) || ch >= utf8.RuneSelf && unicode.IsDigit(ch)
}

func isDecimal(ch rune) bool { return '0' <= ch && ch <= '9' }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This passes gofmt? I wouldn't have thought so 💡

@@ -1,311 +0,0 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 Do we have a valid grammar somewhere? Could be useful to library authors processing Rego code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@jaspervdj-luminal jaspervdj-luminal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I'm gonna try to run this against our stash of rego code and dig into the parser a bit more, but yeah 💯 from what I can tell right now.

name: "Number",
},
&ruleRefExpr{
pos: position{line: 198, col: 20, offs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This use of defer is incredibly smart.

This commit replaces the existing PEG generated parser with a parser
implemented by hand. The new parser is more efficient (avoiding old
problems with pathological input cases like {{{{{{{{{}}}}}}}} and
deeply-nested composites in general) and offers better opportunities
for improved error reporting (which has been improved already but
there is still room to grow.)

During the test process of implementing the new parser, we identified
a few issues that were present in the old parser. Those issues are
fixed by this commit.

Fixes open-policy-agent#1251
Fixes open-policy-agent#501
Fixes open-policy-agent#2198
Fixes open-policy-agent#2199
Fixes open-policy-agent#2200
Fixes open-policy-agent#2201
Fixes open-policy-agent#2202
Fixes open-policy-agent#2203

Co-authored-by: Torin Sandall <torinsandall@gmail.com>
Co-authored-by: Patrick East <east.patrick@gmail.com>

Signed-off-by: Torin Sandall <torinsandall@gmail.com>
Signed-off-by: Patrick East <east.patrick@gmail.com>
@tsandall
Copy link
Member Author

I've updated the changelog to include the benchstat output against master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment