Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't Panic! 50% plus performance improvement #4192

Merged
merged 2 commits into from
Mar 18, 2023
Merged

Conversation

jimidle
Copy link
Collaborator

@jimidle jimidle commented Mar 18, 2023

feat: Another 50%+ performance improvement

Prior to this change, the runtime and generated code was using panic() and recover()
to perform error checking and reporting. This is extremely expensive and just not the
way to do it.

This change now uses goto, and explicit checking for error state after selected calls
into the runtime. This has greatly improved parser performance. Using the test code
provided by a recent performance issue report, the parse is now twice as fast as the
issue raised was hoping for.

This change may require code changes by users if they have added Go code in to their
parsers and have declared variables at random points in actions. They merely have to
declare the variables at the start of the rule. This is a minor price to pay for the massive
performance improvement.

@parrt If you can merge this when you see it, I have many more performance improvements
to send your way.

@kaby76 If you have time to run any of your performance tests, I would appreciate your feedback sir.

… flowcontrol - 50% performance improvement

  - Prior to this change, a recognition error was tracked by performing a panic(),
    which the generated code for rules would then use recover() to discover. However,
    recover() is not like catch(){} in Java and is expensive to run even if there is
    no panic() to find on the execution stack. Eliminating this and doing a simple check
    at the end of rule execution brings with it a massive performance improvement up to
    50% of CPU execution time. Now that collections and context caching is working correctly
    this is a significant improvement in execution time.

Signed-off-by: Jim.Idle <jimi@idle.ws>
Prior to this change, the runtime and generated code was using panic() and recover()
to perform error checking and reporting. This is extremely expensive and just not the
way to do it.

This change now uses goto, and explicit checking for error state after selected calls
into the runtime. This has greatly improved parser performance. Using the test code
provided by a recent performance issue report, the parse is now twoice as fast as the
issue raised was hoping for.

Signed-off-by: Jim.Idle <jimi@idle.ws>
@parrt
Copy link
Member

parrt commented Mar 18, 2023

Seems OK to me. It looks like this would only improve performance when there's recognition Errors however right?

@parrt parrt merged commit adbe946 into antlr:dev Mar 18, 2023
@jimidle
Copy link
Collaborator Author

jimidle commented Mar 19, 2023 via email

@kaby76
Copy link
Contributor

kaby76 commented Mar 19, 2023

I'll be testing this today. K

@jimidle jimidle deleted the feature/nopanic branch March 19, 2023 06:33
@kaby76
Copy link
Contributor

kaby76 commented Mar 19, 2023

I guess the tool has changed:

$ make
bash build.sh
go: added github.com/antlr/antlr4/runtime/Go/antlr/v4 v4.0.0-20230318180256-adbe946416f5
go: added golang.org/x/exp v0.0.0-20220722155223-a9213eeb770e
# example.com/myparser/parser
parser\java_parser.go:1495:67: not enough arguments in call to p.GetInterpreter().AdaptivePredict
        have (antlr.TokenStream, number, antlr.ParserRuleContext)
        want (*antlr.BaseParser, antlr.TokenStream, int, antlr.ParserRuleContext)

This is going to take a little time to work out. The scripts not only need to use a specific repo/branch of the runtime, but it will also need to generate a tool specific branch of Antlr because the template was updated. Currently, the perf scripts work against what is published, which limits things a bit.

@parrt
Copy link
Member

parrt commented Mar 19, 2023

yay! back to the 80s with goto. love it.

@jimidle
Copy link
Collaborator Author

jimidle commented Mar 19, 2023 via email

@kaby76
Copy link
Contributor

kaby76 commented Mar 19, 2023

before.tar.gz -- Generated Go parser driver with "master" reference.

after.tar.gz -- Generated Go parser driver with "dev" reference.

I've been adjust my scripts to work with any repo/branch as source in my driver generating code.

In the meanwhile, I am just gathering some data for a sanity check, to make sure I am referencing the right version of the code. Unfortunately, I do not see any performance increase with the "dev" branch Go runtime over version 4.12. In fact, I now see some grammars/input no longer terminating (killed after 5 minutes vs. before completed in 55s). grammars-v4/sql/mysql/Positive-Technologies, examples/bitrix_queries_cut.sql. I am using the published 4.12 version vs. the "dev" version. Please check. The two .tar.gz files are generated parsers for Windows Msys. (I never use ridiculous cmd or Powershell if I can help it.) I see similar behavior over on Ubuntu. I will check to make sure there are no diffs in the git clone images between using a git on the command line versus what go thinks it did over in its cache area.

@jimidle
Copy link
Collaborator Author

jimidle commented Mar 20, 2023 via email

@kaby76
Copy link
Contributor

kaby76 commented Mar 20, 2023

I tried grammars-v4/java/java first, and saw a slight perf slowdown for the dev-branch, too. That's when I tried the MySQL grammar. I'll try some other grammars.

@jimidle
Copy link
Collaborator Author

jimidle commented Mar 20, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants