Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #1982 - skips building errors / CST during backtracking #1983

Closed
wants to merge 5 commits into from

Conversation

matthew-dean
Copy link
Contributor

@matthew-dean matthew-dean commented Aug 27, 2023

Related to #1982.

I looked at the benchmark_web section, but I couldn't get that to work, and I don't think any of the examples have backtracking anyway.

I'm going to try this with my local parser for performance improvements.

@matthew-dean matthew-dean changed the title (Partially) fixes #1982 - skips building errors during backtracking Fixes #1982 - skips building errors / CST during backtracking Aug 27, 2023
@matthew-dean
Copy link
Contributor Author

I wonder actually if this would be considered a breaking change, based on this discussion. That is, if someone has modified / overridden backtracking, and is expecting errors to be built, this might break that?

@matthew-dean
Copy link
Contributor Author

I will close this as I got a local solution that does the exact same thing just extends the CstParser

@bd82
Copy link
Member

bd82 commented Aug 30, 2023

I will close this as I got a local solution that does the exact same thing just extends the CstParser

Why?
I have not looked into this yet but from the title it seems like a good generic approach.

I hope to evaluate this on the weekend....

@matthew-dean
Copy link
Contributor Author

matthew-dean commented Aug 30, 2023

@bd82 Oh! Well, it does work! It basically does the following when backtracking:

  1. Skips cloning (or restoring errors)
  2. Skips saving errors, which involves serialization, so a tiny optimization there.
  3. Skips creating or modifying a CSTNode

To be honest, part of the reason I closed it is because part of the huge slow-down was not just backtracking, but in poor ALT performance of chevrotain-allstar when using recursive ALT blocks.

But, when doing the above 3 steps, then what you're left with is just, essentially, a structured lookahead.

If, though, in addition to this, a successful BACKTRACK skipped a matching ALT (an ALT with the same SUBRULE) when successful (which IMO is somewhat trivial to implement), then you could remove the part in the documentation that strongly recommends against backtracking, because IMO it would have almost no performance hit whatsoever. But you seemed to be against this in this discussion.

(Note: these approaches might need to be somewhat exclusive, because you cannot avoid building a CST if you want to then cache it and return it with a matching SUBRULE.)

So, because you seemed somewhat against making BACKTRACK performant, I figured maybe I should close this. But if the current files in this PR or the above approach seems compelling to you, let me know! I'm happy to make backtracking super-fast and useful, but if you're philosophically against backtracking, I wasn't sure I should invest much time into it.

@bd82
Copy link
Member

bd82 commented Aug 31, 2023

but if you're philosophically against backtracking, I wasn't sure I should invest much time into it.

I prefer to avoid backtracking where possible, but if there is an easy win to improve performance we could consider it.
If I remember correctly the Java Parser is stuck on an old version of Chevrotain

@bd82 bd82 reopened this Aug 31, 2023
@jtkiesel
Copy link

jtkiesel commented Nov 2, 2023

A certain set of Java files currently takes ~2500ms to be parsed by the Java Parser on Chevrotain 6.5.0 when run on my machine. After upgrading to Chevrotain 11.0.3, those same files took ~4300ms to parse (~72% slower). I made the changes proposed in this PR to the Java Parser directly, and parsing those files now only takes ~2750ms (only ~10% slower than on Chevrotain 6.5.0, and ~56% faster than on Chevrotain 11.0.3 without these changes).

In order to avoid having to copy/paste ~200 lines of code from Chevrotain to the Java Parser, it would be great if these changes could be merged into Chevrotain itself. Let me know if there is anything I can do to help this PR along!

@bd82
Copy link
Member

bd82 commented Jan 10, 2024

I wonder:

  1. Could I remove built-in back-tracking support in Chevrotain?
  2. Could back-tracking be implemented as an optional plugin (same as LL-star lookahead).

Basically It was always recommended to avoid back-tracking to avoid back-tracking
due to both performance reasons and it did not always align smoothly with other features.

Nowadays it seems that many of the use cases for back-tracking can be filled via: LL-* lookahead.

@jtkiesel What scenarios in the Java-Parser still require backtracking? could you point to one of the simpler ones?

@bd82
Copy link
Member

bd82 commented Jan 10, 2024

Backtracking as a Plugin details

It seems many of the changes (at least in the PR) are either:

  1. at the start of a method
  2. at the end of a method
  3. conditionally disable a sub-method call from a method.

It seems to me that most cases would be easy to implement (without copy/pasta) in an extending class

@matthew-dean @jtkiesel WDYT?
WDYT.

I have to admit I did not yet have a look at the existing backtracking related code in Chevrotain
Only the PR code.

@bd82 bd82 closed this Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants