Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separator DSL methods lookahead issue. #391

Closed
ghost opened this issue Mar 12, 2017 · 8 comments
Closed

Separator DSL methods lookahead issue. #391

ghost opened this issue Mar 12, 2017 · 8 comments
Labels

Comments

@ghost
Copy link

ghost commented Mar 12, 2017

@bd82
Copy link
Member

bd82 commented Mar 12, 2017

It seems that MANY_SEP has somewhat changed.

Yes it has in master but this changes have not yet been released to npm.
This is the relevant issue with all the details:
#367

I'll reply to your grammar issue separately.

@bd82
Copy link
Member

bd82 commented Mar 12, 2017

Did I do something wrong?

I'm not sure yet, did you try without MANY_SEP?
Try using regular MANY to define the separator grammar.

https://github.com/SAP/chevrotain/blob/master/examples/grammars/json/json.js#L65

@bd82
Copy link
Member

bd82 commented Mar 12, 2017

There are no token sequences that even contain the Comma.
There should not be, the Comma is optional as there may not be any iterations and only if there are more than one iteration will it even appear, so the Comma (an optional Token) cannot be used to distinguish between alternatives.

At what point does this error message appear?
It looks like it is from some alternation (OR) which needs to decide between FunctionLiteral
and something else?

I'm guessing a single left parenthesis is not enough to distinguish between those two alternatives.
Perhaps there is an ambiguity here that is unresolved.

If this is indeed the case a larger example is needed to debug, preferably one with only the grammar (no embedded actions) as it makes it much easier to debug.

@bd82
Copy link
Member

bd82 commented Mar 13, 2017

First sneaky issue.

This is a invalid argument type issue.
Two way to avoid these kinds of errors:

  1. Use TypeScript, Its like babel but with a type system and Chevrotain includes type definitions
    So such an error would fail at compile time.
  2. Wait for me to implement "Dev" mode which will include runtime type checks for better error messages.

@bd82
Copy link
Member

bd82 commented Mar 13, 2017

MANY_SEP bug.

This looks like a bug 😞 . I don't see a problem with the code you have written
the versions should be equivalent.
Thanks for spotting it!

There different handling for lookahead calculation for MANY and MANY_SEP
So that may be the cause of the bug.

Is it possible to link to the full code so I can debug this?
It may be difficult to reproduce otherwise as it seems to rely on requiring > 1 token lookahead.

BTW what is the input token vector which raises the error?

@bd82
Copy link
Member

bd82 commented Mar 13, 2017

Optional Types

I'd love to have optional types in ECMAScript. When it would become available I would
strongly re-evaluate the advantages of using TypeScript for developing Chevrotain,
however I need the optional type system now 😄

Infinite loops

I'm very interested in those edge cases. Those are very hard for end users to figure out,
heck they are even hard for me to debug sometimes and I wrote most of the code 😄 ...
If you find new such cases please make a small pull request (that will never be merged) which reproduces the problem to your repository, that way there is something reproducible to checkout and debug in a later time.

Contributing

I'd be down for debugging those. 😄

I'm always happy to receive source code contributions.
However those infinite loops sometimes require diving very deep into Chevrotain and its data
structures and algorithms. If you are interested in contributing something maybe start with something smaller first?

https://github.com/SAP/chevrotain/issues?q=is%3Aissue+is%3Aopen+label%3ASimple
I think that implementing the DevMode's runtime type checks should be fairly simple.

@bd82
Copy link
Member

bd82 commented Mar 13, 2017

Link to last commit to reproduce MANY_SEP bug
https://github.com/kdex/chi/commit/2671cea045e86bfbfae0b4d5a3b12fcee2350610

@bd82 bd82 added the Bug 🪲 label Mar 13, 2017
@bd82
Copy link
Member

bd82 commented Mar 17, 2017

O.k.

The Scenario

At this point we need to decide between a three alternatives:
https://github.com/kdex/chi/blob/2671cea045e86bfbfae0b4d5a3b12fcee2350610/src/Parser.js#L145

this.OR([
   {ALT: () => this.SUBRULE(this.Literal)}, 
   {ALT: () => this.SUBRULE(this.Identifier)}, 
   {ALT: () => this.SUBRULE(this.ParenthesisExpression)
}]);

Alternatives 1 and 3 can both start with "("
Interestingly they can also both start with ["(" , "Ident"],
Or even ["(" , "Ident", ")"].

So several tokens must be looked ahead to distinguish.
In the case there is more than one argument as in the input ("(a,b) => {...}") the comma is actually used as distinguishing token. if there is only one argument the "fat arrow" is used to distinguish.

The Bug

The bug is that the separator token is not taken into account when computing the lookahead.
You can see in Chevrotain's code that the handling for Repetition and RepetionWithSeparator
is the same (but it should not be!) when calculating the lookahead. So the separator part is just ignored...
https://github.com/SAP/chevrotain/blob/master/src/parse/grammar/interpreter.ts#L252..L257

The Solution

Do not ignore the separator in the lookahead paths calculation.

Why this bug has not surfaced before.

It is a pretty strange edge case in which a separator of an optional repetition is required
to distinguish between alternatives...

bd82 pushed a commit that referenced this issue Jun 12, 2017
bd82 pushed a commit that referenced this issue Jun 12, 2017
@bd82 bd82 changed the title Has MANY_SEP changed? Separator DSL methods lookahead issue.. Jun 12, 2017
@bd82 bd82 changed the title Separator DSL methods lookahead issue.. Separator DSL methods lookahead issue. Jun 12, 2017
bd82 pushed a commit that referenced this issue Jun 12, 2017
bd82 pushed a commit that referenced this issue Jun 12, 2017
* Lookahead calculation does not take into account the separator in X_SEP
  DSL Methods.

* Possible Incorrect lookahead calculation in case of optional productions.

Fixes #391.
bd82 pushed a commit that referenced this issue Jun 12, 2017
* Lookahead calculation does not take into account the separator in X_SEP
  DSL Methods.

* Possible Incorrect lookahead calculation in case of optional productions.

Fixes #391.
bd82 pushed a commit that referenced this issue Jun 12, 2017
* Lookahead calculation does not take into account the separator in X_SEP
  DSL Methods.

* Possible Incorrect lookahead calculation in case of optional productions.

Fixes #391.
@bd82 bd82 closed this as completed in #491 Jun 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant