-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The $
prefix is ambiguous
#347
Comments
This is a valid issue, and I'm surprised it hasn't come up before. I don't have a better solution that disallowing @Mingun any insights? |
Yes, I think we should forbid that symbol in identifiers (only at start or everywhere if |
I think top = fb
fb = $foo$bar
foo$bar = foo $bar
$foo$bar = 'baz'
foo = 'foo'
bar = 'b' 'a'+ 'r' this matches "foobaaar", but not "baz". |
* main: (21 commits) Update CHANGELOG.md Update version number & rebuild Update dependencies Update test/unit/compiler/passes/report-infinite-repetition.spec.js Fixes peggyjs#357. Do not allow infinite recursion in repetition delimiter. Update changelog Allow extra semicolons between rules. Fix an error in the code generator for "repeated" node Update changelog Fixes peggyjs#329 Update changelog Fixes peggyjs#359. Clarifies documentation about reserved words. Fix more HTML indentation. Test that the generated parser also works without errors Remove use of expect.to.not.throw() Add Rene Saarsoo to AUTHORS Typo in test description Add test to ensure special non-reserved keywords are allowed Comment out unnecessary reserved words Fixes peggyjs#347. Makes $ invalid as an identifier start character. ...
Steps to reproduce: Parse the following grammar:
Expected result: The validity of this Peggy grammar is decided by whether rule names can start with
$
.Actual result: It is not a valid Peggy grammar because rule
Part
is not defined.Because Peggy allows the dollar sign
$
to start rule names AND has a$
prefix operator, one of the following must be true:$
is always parsed as an operator. In this case, a rule cannot reliably be matched simply by writing its name. There is also no way to match$Part
without returning the text rather than the match. This is the current implementation.$
is always parsed as part of the identifier. In this case, to return the matched text of a nonterminal, a space is required before the nonterminal to disambiguate between$ Part
(return text of rulePart
) and$Part
(return match of rule$Part
), or the expression needs to be wrapped in parentheses. This makes it inconsistent with the operator's behaviour before terminals.$
is parsed as part of the identifier if that identifier exists. In this case, adding a rule calledPart
would break existing references to a rule called$Part
or vice versa. It would also it becomes ambiguous what a$
does before a nonterminal and require an extra parsing pass to determine how it is interpreted.All those options seem undesirable to me. Instead, I'd advocate for either disallowing
$
at the start of rule names, or changing the$
operator to something that cannot appear at the start of an identifier. Unfortunately, any alteration to this behaviour would be a breaking change. So if it's decided this won't be fixed, I understand why that's the case.The text was updated successfully, but these errors were encountered: