Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

formatError for undefined tokens #114

Merged
merged 5 commits into from
Feb 27, 2019
Merged

formatError for undefined tokens #114

merged 5 commits into from
Feb 27, 2019

Conversation

tjvr
Copy link
Collaborator

@tjvr tjvr commented Jan 2, 2019

It's fairly natural to write code of the form:

  while (tok = lexer.next()) {
    try {
      parser.eat(tok)
    } catch (err) {
      throw new Error(lexer.formatError(tok, "Syntax error"))
    }
  }

  try {
    var program = parser.result()
  } catch (err) {
    throw new Error(lexer.formatError(tok, "Unexpected EOF")) // Not allowed!
  }
  return program

The second formatError call is not valid, because tok will be undefined here. Moo uses undefined to indicate that there are no more tokens, i.e. we've reached the end of the buffer.

There's no way to get Moo to format an error at the end of the file, after the last token, without manually constructing an EOF token. I propose letting formatError accept undefined, and silently interpret it as an EOF token.

Alternatively, we could introduce a lexer.makeEOF() method which returns this end-of-file token directly.

@tjvr tjvr requested a review from nathan January 3, 2019 12:04
@nathan
Copy link
Collaborator

nathan commented Jan 6, 2019

I don't really have a problem with formatError() interpreting null/undefined as EOF, but it seems confusing and unreadable to write code that uses tok outside of its logical scope to mean the constant undefined. Even though the example above uses a while loop, it reads as a for loop:

for (let tok; tok = lexer.next();) {
  try {
    parser.eat(tok)
  } catch (err) {
    throw new Error(lexer.formatError(tok, "Syntax error"))
  }
}

which makes using it outside of the loop unintuitive and odd. I think it makes more sense to write the second call to formatError() as

lexer.formatError(null, "Unexpected EOF")

or simply

lexer.formatError("Unexpected EOF")

Additionally, perhaps such a call should use the current lexer position rather than always use EOF; it would be odd to call formatError in the middle of the token stream and have the result point to its end.

@tjvr
Copy link
Collaborator Author

tjvr commented Jan 7, 2019

Agreed on all points! Thanks 😊

Sent with GitHawk

moo.js Outdated
@@ -544,7 +544,24 @@
}
}

Lexer.prototype.makeEOF = function(type) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is makeEOF useful for something other than formatError (to which the user can just pass null or undefined)? It seems more like an implementation detail.

Furthermore, aren't line and col inconsistent with offset when the lexer isn't actually at EOF? Should you be allowed to call makeEOF in that scenario?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't line and col inconsistent with offset when the lexer isn't actually at EOF?

That's a great point; I'm not sure what I was thinking.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some styles of parser where it's useful to have an EOF token; I think that was the idea here. However I haven't yet needed makeEOF() in practice, so I'm happy to remove it.

@nathan
Copy link
Collaborator

nathan commented Feb 26, 2019

Sounds good. I think my comment above may still be relevant:

perhaps such a call should use the current lexer position rather than always use EOF; it would be odd to call formatError in the middle of the token stream and have the result point to its end.

@nathan
Copy link
Collaborator

nathan commented Feb 27, 2019

Awesome, thanks! LGTM

@tjvr tjvr merged commit f2f501d into master Feb 27, 2019
@nathan nathan deleted the format-error-eof branch February 27, 2019 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants