-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uses fold+reduce for handling duplicate pubkeys during index generation #34011
Uses fold+reduce for handling duplicate pubkeys during index generation #34011
Conversation
Codecov Report
@@ Coverage Diff @@
## master #34011 +/- ##
=========================================
- Coverage 81.9% 81.9% -0.1%
=========================================
Files 811 811
Lines 219412 219428 +16
=========================================
+ Hits 179792 179797 +5
- Misses 39620 39631 +11 |
5c414c2
to
7512e9a
Compare
self.accounts_index | ||
.add_uncleaned_roots(uncleaned_roots.into_iter()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was moved up into the block that makes uncleaned_roots
. Same for the other "removed" lines just above here.
accounts_data_len_dedup_timer.stop(); | ||
timings.accounts_data_len_dedup_time_us = accounts_data_len_dedup_timer.as_us(); | ||
timings.slots_to_clean = uncleaned_roots.len() as u64; | ||
|
||
self.accounts_index | ||
.add_uncleaned_roots(uncleaned_roots.into_iter()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following from https://github.com/solana-labs/solana/pull/34011/files#r1389679091, those lines were moved here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find. lgtm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Problem
When generating the index we visit duplicate pubkeys to get uncleaned roots and also the accounts data len from those duplicates. This visitation is farmed out to multiple threads with rayon. To capture the results, we currently use a mutex. This causes undue communication and contention between these threads, since rayon has primitives for doing these reductions.
Summary of Changes
Replace the mutex with an impl that uses Rayon's
fold
andreduce
.Results
Using ledger-tool and a recent mnb snapshot, I compared master (as of
69ab8a8234
) to this PR. Since this change only impacts index generation, I only looked at that metric, and specifically theaccounts_data_len_dedup_time_us
datapoint:1,974,473
1,700,625
273,848
So not huge, but it's something!