formatError for undefined tokens #114

tjvr · 2019-01-02T01:07:46Z

It's fairly natural to write code of the form:

  while (tok = lexer.next()) {
    try {
      parser.eat(tok)
    } catch (err) {
      throw new Error(lexer.formatError(tok, "Syntax error"))
    }
  }

  try {
    var program = parser.result()
  } catch (err) {
    throw new Error(lexer.formatError(tok, "Unexpected EOF")) // Not allowed!
  }
  return program

The second formatError call is not valid, because tok will be undefined here. Moo uses undefined to indicate that there are no more tokens, i.e. we've reached the end of the buffer.

There's no way to get Moo to format an error at the end of the file, after the last token, without manually constructing an EOF token. I propose letting formatError accept undefined, and silently interpret it as an EOF token.

Alternatively, we could introduce a lexer.makeEOF() method which returns this end-of-file token directly.
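As a rough illustration of the first proposal, here is a minimal sketch. `ToyLexer` is a hypothetical stand-in written for this example, not Moo's implementation; it only shows what it could mean for `formatError` to treat a missing token as end of input. The message format is likewise an assumption.

```javascript
// Hypothetical stand-in for a lexer; NOT Moo's implementation.
class ToyLexer {
  constructor(input) {
    this.buffer = input
    this.index = 0
    this.line = 1
    this.col = 1
  }

  // Consume one character, keeping line/col in step with index.
  advance() {
    if (this.buffer[this.index] === '\n') {
      this.line++
      this.col = 1
    } else {
      this.col++
    }
    this.index++
  }

  // Return the next whitespace-separated word, or undefined at end of input.
  next() {
    while (this.index < this.buffer.length && /\s/.test(this.buffer[this.index])) {
      this.advance()
    }
    if (this.index >= this.buffer.length) return undefined
    const token = { text: '', offset: this.index, line: this.line, col: this.col }
    while (this.index < this.buffer.length && !/\s/.test(this.buffer[this.index])) {
      token.text += this.buffer[this.index]
      this.advance()
    }
    return token
  }

  // Proposed behaviour: a null/undefined token is read as "end of input".
  // This is only accurate once the lexer has consumed the whole buffer.
  formatError(token, message) {
    if (token == null) {
      token = { text: '', offset: this.buffer.length, line: this.line, col: this.col }
    }
    return `${message} at line ${token.line} col ${token.col}`
  }
}

const lexer = new ToyLexer('a b\nc')
let tok
while ((tok = lexer.next())) {
  // a real caller would feed tok to the parser here
}
// tok is undefined now, yet the call is accepted.
const eofMessage = lexer.formatError(tok, 'Unexpected EOF')
```

With this toy input the lexer stops just past `c`, so the error is reported on line 2.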

nathan · 2019-01-06T17:54:47Z

I don't really have a problem with formatError() interpreting null/undefined as EOF, but it seems confusing and unreadable to write code that uses tok outside of its logical scope to mean the constant undefined. Even though the example above uses a while loop, it reads as a for loop:

for (let tok; tok = lexer.next();) {
  try {
    parser.eat(tok)
  } catch (err) {
    throw new Error(lexer.formatError(tok, "Syntax error"))
  }
}

which makes using it outside of the loop unintuitive and odd. I think it makes more sense to write the second call to formatError() as

lexer.formatError(null, "Unexpected EOF")

or simply

lexer.formatError("Unexpected EOF")

Additionally, perhaps such a call should use the current lexer position rather than always use EOF; it would be odd to call formatError in the middle of the token stream and have the result point to its end.
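The difference between "always EOF" and "current lexer position" can be sketched with a toy example. `PositionLexer` below is hypothetical and tracks nothing but position; it is not Moo's code, and the message format is an assumption.

```javascript
// Hypothetical toy lexer, not Moo's code: it tracks only its position.
class PositionLexer {
  constructor(input) {
    this.buffer = input
    this.index = 0
    this.line = 1
    this.col = 1
  }

  // Consume one character and keep line/col in step with index.
  step() {
    if (this.buffer[this.index] === '\n') {
      this.line++
      this.col = 1
    } else {
      this.col++
    }
    this.index++
  }

  // With no token, report wherever the lexer currently is,
  // rather than jumping to the end of the buffer.
  formatError(token, message) {
    const at = token == null ? { line: this.line, col: this.col } : token
    return `${message} at line ${at.line} col ${at.col}`
  }
}

const lexer = new PositionLexer('ab\ncd\nef')
for (let i = 0; i < 4; i++) lexer.step() // stop after "ab\nc", mid-stream
const midStream = lexer.formatError(null, 'Unexpected input')
```

Here the error points at line 2, where the lexer stopped, whereas an always-EOF reading would point at the end of line 3.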

tjvr · 2019-01-07T19:17:49Z

Agreed on all points! Thanks 😊

Sent with GitHawk

nathan · 2019-02-25T17:08:41Z

moo.js

@@ -544,7 +544,24 @@
    }
  }

+  Lexer.prototype.makeEOF = function(type) {


Is makeEOF useful for something other than formatError (to which the user can just pass null or undefined)? It seems more like an implementation detail.

Furthermore, aren't line and col inconsistent with offset when the lexer isn't actually at EOF? Should you be allowed to call makeEOF in that scenario?

tjvr · 2019-02-25

> aren't line and col inconsistent with offset when the lexer isn't actually at EOF?

That's a great point; I'm not sure what I was thinking.

There are some styles of parser where it's useful to have an EOF token; I think that was the idea here. However, I haven't yet needed makeEOF() in practice, so I'm happy to remove it.
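The inconsistency raised in this thread can be shown with a small sketch. The body of makeEOF is not shown in the diff above, so `makeToyEOF` is a hypothetical function built on one assumption: the EOF token takes its offset from the end of the buffer but its line and col from the lexer's current state. The helper `positionAt` is also invented for this example.

```javascript
// Hypothetical toy, NOT the makeEOF under review (its body is not shown above).
// Assumption: offset comes from the end of the buffer, line/col from current state.
function makeToyEOF(state) {
  return {
    type: 'eof',
    text: '',
    offset: state.buffer.length,
    line: state.line,
    col: state.col,
  }
}

// Recompute line/col from an offset by scanning the buffer, as ground truth.
function positionAt(buffer, offset) {
  let line = 1
  let col = 1
  for (let i = 0; i < offset; i++) {
    if (buffer[i] === '\n') {
      line++
      col = 1
    } else {
      col++
    }
  }
  return { line, col }
}

const buffer = 'ab\ncd\nef'
// Lexer state after consuming only "ab": still on line 1.
const midState = { buffer, index: 2, line: 1, col: 3 }
const eof = makeToyEOF(midState)
const actual = positionAt(buffer, eof.offset)
// True only if the token's line/col match what its offset implies.
const consistent = eof.line === actual.line && eof.col === actual.col
```

Under that assumption the token claims line 1 while its offset lies on line 3, so `consistent` is false whenever the lexer is not actually at the end of the buffer.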

nathan · 2019-02-26T01:43:13Z

Sounds good. I think my comment above may still be relevant:

> perhaps such a call should use the current lexer position rather than always use EOF; it would be odd to call formatError in the middle of the token stream and have the result point to its end.

nathan · 2019-02-27T00:54:18Z

Awesome, thanks! LGTM

tjvr requested a review from nathan January 3, 2019 12:04

tjvr added 2 commits January 3, 2019 13:11

formatError for undefined tokens

672de82

Expose makeEOF()

9cc0593

tjvr force-pushed the format-error-eof branch from 6322e78 to 9cc0593 Compare January 3, 2019 13:11

tjvr added the enhancement label Jan 10, 2019

nathan reviewed Feb 25, 2019

Remove makeEOF

496b498

tjvr added 2 commits February 26, 2019 23:10

Fix formatError for null token not at EOF

1e52e3d

Simplify control flow

5c47290

nathan approved these changes Feb 27, 2019

tjvr merged commit f2f501d into master Feb 27, 2019

nathan deleted the format-error-eof branch February 27, 2019 16:03
