-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve error reporting #459
Conversation
formatError. Display better error message when parser isn't expected further input.
offending line. Output line numbers. Better error output when no more input is expected. Display friendly error report even on lexer errors.
This looks very nice, any progress getting it merged? |
@shmish111 I haven't heard any news, I guess there isn't a lot of momentum there to get this merged. |
Is the moo PR merged? Also does this mean you're storing lines somewhere? Is there a memory concern for if someone's parsing something non-line-oriented, where you might be storing the entire input stream? Can we keep ^ instead of the unicode arrow? Can the commented code lying around (e.g. line 300) be removed? Merge conflicts? |
I would love to see this merged. |
Would also love to see this merged. This would unlock a key use case for us-- giving users specific feedback on errors parsing their input in our equation DSL. |
@airportyh Do you think this can be de-conflicted without too much trouble? |
@kach There's a bunch of context that I need to re-enter my head. I'll revisit this next week. |
@kach I merged the conflicts from master. The moo PR (no-context/moo#129) is still pending though, we won't get the display of the 5 lines code without it if using moo. If using Nearley by itself, you still get the display. Both the StreamLexer in Nearley and moo already buffer all of the input, so I don't think the code introduces a new big problem. In moo, the authors requested that I do not use String.split on the entire buffer because they worry the buffer might be large, and I changed the code there. I think it's really a moot point because if you are already storing the entire buffer, you code will not work for very large inputs anyway. Nearley's model is to feed a smaller chunk at a time, which means the buffer will contain only a chunk at a time, and as long as that's not large, it's not a problem. |
I'm so happy to see that work is being done on this! |
Thank you for the rapid turnaround! Can we replace the unicode up-arrow with a |
@kach removed the special character. As for the change in the package-lock.json, I think it has to do with me having a different version of npm when doing |
No worries — thank you for your quick updates! I'm reviewing this right now, I swear we're going to get this PR merged before its birthday… :) |
test/parser.test.js:85, what's going on there? It leads to stuff printed to console during tests… |
Another thing that confuses me (and it's possible my morning brain is just not getting it) is that if I type in
|
@kach Oops, that try/catch/log was to debug the exception. I've removed it. The "unexpected end of input" message from your code snippet seems to come from node. I am not totally sure I follow what you are saying in the second comment. I tried to reproduce what you see, but no luck. But I'll try to answer. I think it is regarding giving a better error when there is no more expected input. The current code in there does not generate an error when end of file is reached. The way it works is if a token is read, and the parse table has no more expectant states (states where the rule is expecting a terminal next), then it throws an error. The problem is that we don't treat end-of-file as a token, which means when it reaches the end of the input, there is nothing to trigger the error. For example, if you make a file with an incomplete parse:
and run it through nearleyc, you'd get:
Because the parser is in the middle of an incomplete parse, and therefore does not have any results yet. It might be worth considering emitting an end-of-file token at the end (this is traditionally done as I remember from college), or equivalently, have a method on the parser to tell it that we have received all of the input: |
Line 436 still has a unicode arrow! (Also the associated test.) Also commented code on 299. Again thanks for your patience, I've been really busy these past few days and haven't found time to look at PRs outside of weekends. |
@kach no worries, those are fixed. |
Merged! Thanks so much for the code and your patience. It's pushed to npm (2.19.5), let's keep an eye on the issues in the next few days in case anyone reports anything. But it should be fine. :-) |
Thank you @kach! |
This PR continues the work for #451:
I am planning on sending a PR to moo.js also (no-context/moo#129), to attach the errored token to its error object, so that the error report can display it. The code in this PR uses that token if it exists.
Code context and line numbers:
Custom lexer error:
Parser is not expecting more input: