perf: Skip the first goto call in full construction (-35%) #33
This should be the last PR :) It does complicate things a bit more but it really helps when querying for the next state of However, if |
(force-pushed from 46fd1e8 to f4ccc69)
Another thing that could have improved things further would be to actively update |
(*) Second to last PR, it is possible to improve (and complicate) the |
Thanks for this! I'm going to let this sit for at least a week to allow any additional changes you might have planned to accrue here in order to minimize churn and context switching. |
This ended up not having that large of an impact so I scrapped it. Instead I realized that The only idea I have left to try is to re-apply the above idea, as it could be more significant after this improvement, but I don't have those sources accessible atm so that will have to wait.
This is probably less significant when a larger number of needles is used but should still be a decent speed improvement.

```
name                     old ns/iter  new ns/iter  diff ns/iter  diff %   speedup
bench_construction       131,148      122,956      -8,192        -6.25%   x 1.07
bench_full_construction  309,493      200,595      -108,898      -35.19%  x 1.54
```
```
name                     old ns/iter  new ns/iter  diff ns/iter  diff %   speedup
bench_construction       167,415      25,804       -141,611      -84.59%  x 6.49
bench_full_construction  275,170      133,253      -141,917      -51.57%  x 2.07
```
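For anyone reading the tables above, the `diff %` and `speedup` columns can be re-derived from the `old ns/iter` and `new ns/iter` columns. The following is only a small sketch of that arithmetic; the function names `speedup` and `diff_percent` are made up for illustration and are not part of any benchmark tool.

```rust
/// Speedup factor as shown in the `speedup` column: old time / new time.
fn speedup(old_ns: f64, new_ns: f64) -> f64 {
    old_ns / new_ns
}

/// Relative change as shown in the `diff %` column: (new - old) / old * 100.
fn diff_percent(old_ns: f64, new_ns: f64) -> f64 {
    (new_ns - old_ns) / old_ns * 100.0
}

fn main() {
    // (name, old ns/iter, new ns/iter), copied from the tables above.
    let rows = [
        ("bench_construction", 131_148.0, 122_956.0),
        ("bench_full_construction", 309_493.0, 200_595.0),
        ("bench_construction", 167_415.0, 25_804.0),
        ("bench_full_construction", 275_170.0, 133_253.0),
    ];
    for &(name, old, new) in rows.iter() {
        println!(
            "{:<24} {:>8.2}%  x {:.2}",
            name,
            diff_percent(old, new),
            speedup(old, new)
        );
    }
}
```

Note that a -35% change in time corresponds to a 1.54x speedup, not 1.35x, because the speedup is the ratio of the old time to the new time.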
Since the previous commits were so significant, I find it safest to add an explicit function for this so we don't rely on LLVM doing the optimization for dense states.
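To illustrate the general idea of an explicit dense-state path (this is not the code from this PR), here is a minimal sketch. It assumes a state stores its transitions either as a dense 256-entry table or as a sparse list of pairs. `Transitions`, `goto`, `goto_dense` and `FAIL` are hypothetical names chosen for the example. The generic `goto` relies on the compiler to simplify the `match` when the caller only ever has dense states, while `goto_dense` makes the direct table lookup explicit.

```rust
/// Hypothetical sentinel meaning "no transition for this byte".
const FAIL: usize = 0;

/// Hypothetical transition storage for one automaton state.
enum Transitions {
    /// One entry per possible byte value (256 entries).
    Dense(Vec<usize>),
    /// Only the bytes that have a transition, as (byte, next state) pairs.
    Sparse(Vec<(u8, usize)>),
}

impl Transitions {
    /// Generic lookup: branches on the representation for every call.
    fn goto(&self, byte: u8) -> usize {
        match self {
            Transitions::Dense(table) => table[byte as usize],
            Transitions::Sparse(pairs) => pairs
                .iter()
                .find(|&&(b, _)| b == byte)
                .map(|&(_, next)| next)
                .unwrap_or(FAIL),
        }
    }
}

/// Explicit dense path: the caller already knows the state is dense,
/// so the lookup is a single index with no branch on the representation.
fn goto_dense(table: &[usize], byte: u8) -> usize {
    table[byte as usize]
}

fn main() {
    let mut table = vec![FAIL; 256];
    table[b'a' as usize] = 1;

    let dense = Transitions::Dense(table.clone());
    let sparse = Transitions::Sparse(vec![(b'a', 1)]);

    // All three lookups agree on the same transitions.
    assert_eq!(dense.goto(b'a'), 1);
    assert_eq!(sparse.goto(b'a'), 1);
    assert_eq!(goto_dense(&table, b'a'), 1);
    assert_eq!(goto_dense(&table, b'b'), FAIL);
}
```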
Rebased and added
Still no improvement from this so I am finished with all optimizations I am able to do here. |
ping @BurntSushi (this is done) |
Awesome! Thanks so much for this. Sorry it took a while. I needed to make time to carefully review this. I think I buy everything! I've merged this locally (to drop the unneeded CI commit---thank you for that! I fixed it elsewhere) and pushed it to crates.io in These speed improvements are really really nice. It actually makes a huge difference in a real test case I've been playing with: BurntSushi/ripgrep#838 |
(The two benchmarks were done on different computers)