Allow comma and caret in symbols #43

Closed
goodmami opened this issue Jan 1, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@goodmami
Owner

goodmami commented Jan 1, 2020

The grammar (now documented at https://penman.readthedocs.io/en/latest/notation.html) disallows commas in symbols (specifically, in the NameChar production). The restriction exists to avoid problems when parsing triples (e.g., instance(a, alphabet)), where the comma delimits the source and target of the triple.

The problem is that commas do turn up in symbols, at least in the output of some parsers such as CAMR, for example in numbers:

:quant (x26 / 900,000)))

Furthermore, the ambiguity when parsing triples does not arise in practice, because the source of a triple should always be a variable, which is even less likely to contain a comma. Since there is no other reason to forbid commas, it should be safe to allow them as long as triple parsing always splits on the first comma and treats any later commas as part of the target.
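To make the "split on the first comma" idea concrete, here is a minimal sketch. It is not Penman's actual parser: the function name split_triple is hypothetical, and it works on a raw string rather than on lexer tokens.

```python
def split_triple(text):
    """Split 'role(source, target)' into parts, using only the first comma.

    Hypothetical helper for illustration; not part of the penman API.
    """
    role, paren, body = text.strip().partition("(")
    if not paren or not body.endswith(")"):
        raise ValueError(f"not a triple: {text!r}")
    # str.partition stops at the first comma, so later commas stay in the target.
    source, comma, target = body[:-1].partition(",")
    if not comma:
        raise ValueError(f"missing comma delimiter: {text!r}")
    return role.strip(), source.strip(), target.strip()


print(split_triple("instance(x26, 900,000)"))
```

Because the source is expected to be a variable, everything after the first comma can be treated as the target, so 900,000 survives intact.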

@goodmami
Owner Author

goodmami commented Jan 2, 2020

The regex-based lexer makes it impractical to use a separate comma-less symbol pattern just for triples, because such a pattern would exclude commas everywhere, even when they occur on the target side. So for triples, the parser needs to anticipate several patterns and handle each one separately:

  • role(a,b) (a,b is the symbol)
  • role(a, b) (a, is the symbol)
  • role(a , b) (a is the symbol)
  • role(a ,b) (the delimiting comma must be split off the front of ,b, because role(a, ,b) would be valid if commas are allowed in symbols)
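The four cases above could be handled along these lines. This is only a sketch under assumptions: the lexer is assumed to hand over the whitespace-separated symbols found between the parentheses (so a comma may be attached to either neighbor or stand alone), and source_and_target is a hypothetical name, not a function in Penman.

```python
def source_and_target(tokens):
    """Recover (source, target) from the symbols lexed inside role(...).

    Hypothetical helper; assumes `tokens` are whitespace-separated symbols
    that may contain commas, e.g. ["a,", "b"] for role(a, b).
    """
    if not tokens:
        raise ValueError("empty triple")
    first, rest = tokens[0], tokens[1:]
    if "," in first:
        # role(a,b) and role(a, b): the delimiter is inside the first symbol.
        source, _, remainder = first.partition(",")
    elif rest and rest[0].startswith(","):
        # role(a , b) and role(a ,b): the delimiter starts the second symbol.
        source, remainder, rest = first, rest[0][1:], rest[1:]
    else:
        raise ValueError("no comma delimiter found")
    parts = ([remainder] if remainder else []) + rest
    if not source or len(parts) != 1:
        raise ValueError("expected exactly one source and one target")
    return source, parts[0]


for tokens in (["a,b"], ["a,", "b"], ["a", ",", "b"], ["a", ",b"]):
    print(tokens, "->", source_and_target(tokens))
```

All four inputs yield the pair a, b, while ["a,", ",b"] (that is, role(a, ,b)) yields the target ,b, since only the first comma is consumed as the delimiter.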

@goodmami goodmami changed the title from "Allow comma in symbols" to "Allow comma and caret in symbols" Jan 2, 2020
@goodmami
Owner Author

goodmami commented Jan 2, 2020

Similarly, the caret ^ is only used for triple conjunctions, so it could theoretically appear in a PENMAN symbol. I have not seen it in practice, though, unlike commas.
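As a rough illustration of why a caret inside a symbol need not clash with conjunctions, the sketch below splits a conjunction only on carets that sit outside parentheses. The name split_conjunction is hypothetical, and the sketch assumes that symbols never contain parentheses and that role names contain no carets.

```python
def split_conjunction(text):
    """Split 'triple ^ triple ^ ...' on carets outside parentheses.

    Hypothetical helper; assumes parentheses only ever delimit triples.
    """
    triples, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "^" and depth == 0:
            # A caret at depth 0 separates two triples.
            triples.append(text[start:i].strip())
            start = i + 1
    triples.append(text[start:].strip())
    return triples


print(split_conjunction("instance(a, x^y) ^ ARG0(a, b)"))
```

The caret in x^y is left alone because it occurs inside the parentheses of a triple.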

@goodmami goodmami added the enhancement New feature or request label Jan 3, 2020