-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
store/cachekv, x/bank/types: algorithmically fix pathologically slow code #8719
Conversation
…code After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766
0bda379
to
c19be27
Compare
Codecov Report
@@ Coverage Diff @@
## master #8719 +/- ##
==========================================
+ Coverage 61.35% 61.38% +0.02%
==========================================
Files 671 671
Lines 38322 38343 +21
==========================================
+ Hits 23512 23536 +24
+ Misses 12351 12348 -3
Partials 2459 2459
|
// from string -> []byte to speed up operations, it is not meant | ||
// to be used generally, but for a specific pattern to check for available | ||
// keys within a domain. | ||
func strToByte(s string) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly memo to self: there is another implementation in types/address/hash.go
. We'll eventually need to 1. find a place where to put these functions (and I hope we won't come up with another catch-all util
package) 2. reduce code duplication
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, we can do so later indeed, but for now this'll suffice and keep the PR focused :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree! Thanks for the great work! 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job @odeke-em. Are you planning to merge str/bytes convertion functions? Otherwise I can do a quick PR for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @robert-zaremba! Please go ahead and add them, but perhaps we should put them in an internal package so that no other packages out of the cosmos-sdk can import them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chapeau @odeke-em. Thank you so much. Approved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome thank you very much @odeke-em! 🙌 🙌
Mind also adding an improvement changelog entry?
We should backport this to 0.41 and launchpad. |
Thank you for the reviews and kind words, @alessio @fedekunze @zmanian.
Done, @fedekunze, please take a look and thank you. |
…code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com>
…code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com>
* store/cachekv, x/bank/types: algorithmically fix pathologically slow code (#8719) After continuously profiling InitGensis with 100K accounts, it showed pathologically slow code, that was the result of a couple of patterns: * Unconditional and not always necessary map lookups * O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive, during a quicksort The remedy involved 4 parts: * O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort * Only doing map lookups when the domain key check has passed * Using a black magic compiler technique of the map clearing idiom * Zero allocation []byte<->string conversion With 100K accounts, this brings InitGenesis down to ~6min, instead of 20+min, it reduces the sort code from ~7sec down to 50ms. Also some simple benchmark reflect the change: ```shell name old time/op new time/op delta SanitizeBalances500-8 19.3ms ±10% 1.5ms ± 5% -92.46% (p=0.000 n=20+20) SanitizeBalances1000-8 41.9ms ± 8% 3.0ms ±12% -92.92% (p=0.000 n=20+20) name old alloc/op new alloc/op delta SanitizeBalances500-8 9.05MB ± 6% 0.56MB ± 0% -93.76% (p=0.000 n=20+18) SanitizeBalances1000-8 20.2MB ± 3% 1.1MB ± 0% -94.37% (p=0.000 n=20+19) name old allocs/op new allocs/op delta SanitizeBalances500-8 72.4k ± 6% 4.5k ± 0% -93.76% (p=0.000 n=20+20) SanitizeBalances1000-8 162k ± 3% 9k ± 0% -94.40% (p=0.000 n=20+20) ``` The CPU profiles show the radical change as per #7766 (comment) Later on, we shall do more profiling and fixes but for now this brings down the run-time for InitGenesis. Fixes #7766 Co-authored-by: Alessio Treglia <alessio@tendermint.com> * Fix issues with merge * Apply suggestions from code review Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Co-authored-by: Alessio Treglia <alessio@tendermint.com> Co-authored-by: Federico Kunze <31522760+fedekunze@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Impressive and awesome job @odeke-em I liked your profiling approach. Thanks a lot! 👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
After continuously profiling InitGensis with 100K accounts, it showed
pathologically slow code, that was the result of a couple of patterns:
during a quicksort
The remedy involved 4 parts:
With 100K accounts, this brings InitGenesis down to ~6min, instead of
20+min, it reduces the sort code from ~7sec down to 50ms.
Also some simple benchmark reflect the change:
The CPU profiles show the radical change as per
#7766 (comment)
Later on, we shall do more profiling and fixes but for now this brings
down the run-time for InitGenesis.
Fixes #7766
Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.
docs/
) or specification (x/<module>/spec/
)godoc
comments.Unreleased
section inCHANGELOG.md
Files changed
in the Github PR explorerCodecov Report
in the comment section below once CI passes