Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The $ prefix is ambiguous #347

Closed
gamesaucer opened this issue Feb 22, 2023 · 3 comments · Fixed by #356
Closed

The $ prefix is ambiguous #347

gamesaucer opened this issue Feb 22, 2023 · 3 comments · Fixed by #356

Comments

@gamesaucer
Copy link

Steps to reproduce: Parse the following grammar:

$Start = $Part
$Part = .

Expected result: The validity of this Peggy grammar is decided by whether rule names can start with $.
Actual result: It is not a valid Peggy grammar because rule Part is not defined.

Because Peggy allows the dollar sign $ to start rule names AND has a $ prefix operator, one of the following must be true:

  1. The first $ is always parsed as an operator. In this case, a rule cannot reliably be matched simply by writing its name. There is also no way to match $Part without returning the text rather than the match. This is the current implementation.
  2. The first $ is always parsed as part of the identifier. In this case, to return the matched text of a nonterminal, a space is required before the nonterminal to disambiguate between $ Part (return text of rule Part) and $Part (return match of rule $Part), or the expression needs to be wrapped in parentheses. This makes it inconsistent with the operator's behaviour before terminals.
  3. The $ is parsed as part of the identifier if that identifier exists. In this case, adding a rule called Part would break existing references to a rule called $Part or vice versa. It would also it becomes ambiguous what a $ does before a nonterminal and require an extra parsing pass to determine how it is interpreted.

All those options seem undesirable to me. Instead, I'd advocate for either disallowing $ at the start of rule names, or changing the $ operator to something that cannot appear at the start of an identifier. Unfortunately, any alteration to this behaviour would be a breaking change. So if it's decided this won't be fixed, I understand why that's the case.

@hildjj
Copy link
Contributor

hildjj commented Feb 22, 2023

This is a valid issue, and I'm surprised it hasn't come up before. I don't have a better solution that disallowing $ as a start character in identifiers.

@Mingun any insights?

@Mingun
Copy link
Member

Mingun commented Feb 22, 2023

Yes, I think we should forbid that symbol in identifiers (only at start or everywhere if something$something could be treated ambiguously). It's amazing how it hasn't been noticed before

@hildjj
Copy link
Contributor

hildjj commented Feb 24, 2023

I think $ is fine in the middle of identifier.

top = fb

fb = $foo$bar

foo$bar = foo $bar

$foo$bar = 'baz'

foo = 'foo'

bar = 'b' 'a'+ 'r'

this matches "foobaaar", but not "baz".

hildjj added a commit to hildjj/peggy that referenced this issue Mar 5, 2023
* main: (21 commits)
  Update CHANGELOG.md
  Update version number & rebuild
  Update dependencies
  Update test/unit/compiler/passes/report-infinite-repetition.spec.js
  Fixes peggyjs#357.  Do not allow infinite recursion in repetition delimiter.
  Update changelog
  Allow extra semicolons between rules.
  Fix an error in the code generator for "repeated" node
  Update changelog
  Fixes peggyjs#329
  Update changelog
  Fixes peggyjs#359.  Clarifies documentation about reserved words.
  Fix more HTML indentation.
  Test that the generated parser also works without errors
  Remove use of expect.to.not.throw()
  Add Rene Saarsoo to AUTHORS
  Typo in test description
  Add test to ensure special non-reserved keywords are allowed
  Comment out unnecessary reserved words
  Fixes peggyjs#347.  Makes $ invalid as an identifier start character.
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants