Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizations for the generated parser #2358

Conversation

austin-schick
Copy link
Contributor

Changes made

  • Use a while loop and break to simulate the C parser's goto behavior, instead of creating and calling a done() function
  • Stop running PyBytes_FromStringAndSize when tokenizing. I verified that .bytes was never read during execution. But it's possible that this wasn't a valid change.
  • Add the nodejs profiling script to source control
  • Small improvements to scripts and tests

Performance improvements

Here's are the times I see on my laptop for the datetime tests:

First PEG parser:                      193ms
Generated PEG parser (before this PR): 130ms
Generated PEG parser (after this PR):   88ms
Legacy parser:                          55ms

And here's an additional datapoint from my pyperformance tests:

First PEG parser:                      2.8s
Generated PEG parser (before this PR): 2.6s
Generated PEG parser (after this PR):  2.1s
Legacy parser:                         1.2s

So I think we're sitting around 1.7x slower than the legacy parser after these improvements.

@@ -15,6 +15,11 @@
var _b_ = __BRYTHON__.builtins

const Load = new $B.ast.Load()
const NULL = undefined;
const ENDMARKER = 0,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was necessary to get the nodejs profiler running, but didn't seem necessary in the browser. I didn't have the time to dig into exactly why that discrepancy exists.

@PierreQuentel
Copy link
Contributor

Once again impressive work Austin !

I am currently working on a separate branch generated_peg_parser and making good progress to pass all the tests in the built-in test suite. I am not a GIT expert, is it possible to apply this PR to the branch ?

@austin-schick austin-schick changed the base branch from master to generated_peg_parser January 22, 2024 23:03
@austin-schick
Copy link
Contributor Author

Thank you, and yes! I updated this PR to merge into the generated_peg_parser branch. It looks like there are a few merge conflicts -- I should be able to look into those this week.

@austin-schick
Copy link
Contributor Author

austin-schick commented Jan 29, 2024

Merge conflicts fixed! I verified that datetime still parses, but I didn't run the full test suite on this branch. What's your current process for running the built-in test suite on the generated parser?

@PierreQuentel PierreQuentel merged commit 66b9efa into brython-dev:generated_peg_parser Jan 29, 2024
@PierreQuentel
Copy link
Contributor

I ran the test suite with the page /tests/parse_tests/index_generated_parser.html.

After fixing a bug in pegen.js, I have merged your PR. The whole test suite passes, slightly slower than with the legacy parser, but the difference is negligible compared to the benefits of the new parser.

The next step is to test all the scripts in the gallery, the documentation etc. before I can merge the branch into master.

Thanks once again Austin, this is a major step forward for Brython !

@austin-schick
Copy link
Contributor Author

Very exciting! I'm happy to have been able to help out some. Thanks for all your work on this Pierre!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants