Trie hash optimizations #5827
Depends on alloy-rs/nybbles#1
cc @DaniPopes this should be in alloy-trie, correct?
Only the primitives change. If you don't mind, you can open a PR to https://github.com/alloy-rs/trie with the prealloc commit @dvush
LGTM after comments are resolved. cc @rkrasiuk
@dvush in any follow-up perf PRs it would be great to see a side-by-side comparison of flamegraphs / benches before and after the change. otherwise, it's tough to verify the performance boost from the changes. even some obvious "improvements" might degrade performance, as we've experienced in the past
overall lgtm modulo my comment re perf PRs above. tested at mainnet tip and ran execution debug script on holesky
This PR improves the performance of the trie when calculating hashes.
In my benchmark it reduces the time from 100ms to 90ms, and these 10ms are more noticeable in an implementation that caches db reads for the trie.
The most influential commits are:
This key clone alone is mostly responsible for the 10ms drop.
Recently hashbrown was removed from the repo, and that made all hash maps / sets much slower. It's mostly due to hashbrown using ahash for hashing while std uses SipHash. This only changes the hash maps in trie-related code, but other parts of reth can benefit from it too.
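To illustrate the hasher point, here is a minimal sketch of swapping the hasher on a std `HashMap`. It is not the code from this PR: to stay dependency-free it uses a hand-written FNV-1a hasher as a stand-in for ahash, and the names `Fnv1a`, `FastHashMap`, and `count_keys` are hypothetical.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Minimal FNV-1a hasher. Assumption: used here only as a stand-in for a
/// fast non-cryptographic hasher such as ahash; it is not what the PR uses.
pub struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // 64-bit FNV offset basis.
        Fnv1a(0xcbf29ce484222325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            // 64-bit FNV prime.
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hypothetical alias: a std HashMap whose default SipHash hasher is
/// replaced by the hasher above.
pub type FastHashMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Hypothetical helper: count how often each byte-string key occurs.
pub fn count_keys(keys: &[Vec<u8>]) -> FastHashMap<Vec<u8>, usize> {
    let mut counts = FastHashMap::default();
    for key in keys {
        *counts.entry(key.clone()).or_insert(0) += 1;
    }
    counts
}
```

The map API is unchanged; only the third type parameter differs. SipHash is designed to resist hash-flooding attacks, so a faster hasher trades that protection for speed, and any gain should be confirmed with a benchmark on the real workload.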
The prealloc children commit speeds up `push_branch_node` a bit (without it, about 20% of the method is spent doing this collect). Moved to "prealloc children", alloy-rs/trie#1.
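The preallocation idea can be sketched roughly as follows. This is an illustration under assumptions, not the actual `push_branch_node` code: the function names, the `u16` state mask, and the `u64` child type are all hypothetical.

```rust
/// Hypothetical baseline: gather the children selected by a 16-bit mask
/// with `collect`. A filtered iterator reports a lower size bound of 0,
/// so the Vec may reallocate several times as it grows.
pub fn children_collect(state_mask: u16, slots: &[u64; 16]) -> Vec<u64> {
    (0..16usize)
        .filter(|&i| state_mask & (1u16 << i) != 0)
        .map(|i| slots[i])
        .collect()
}

/// Hypothetical preallocating variant: the number of set bits is known up
/// front, so the Vec is allocated once with exactly the needed capacity.
pub fn children_prealloc(state_mask: u16, slots: &[u64; 16]) -> Vec<u64> {
    let mut children = Vec::with_capacity(state_mask.count_ones() as usize);
    for i in 0..16usize {
        if state_mask & (1u16 << i) != 0 {
            children.push(slots[i]);
        }
    }
    children
}
```

Both functions return the same children in the same order; the second one only avoids the repeated growth of the vector. Whether that matters in practice depends on how hot the call site is, which is what a flamegraph comparison would show.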