Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store/cachekv, x/bank/types: algorithmically fix pathologically slow code #8719

Merged
merged 4 commits into from
Feb 27, 2021

Conversation

odeke-em
Copy link
Collaborator

After continuously profiling InitGensis with 100K accounts, it showed
pathologically slow code, that was the result of a couple of patterns:

  • Unconditional and not always necessary map lookups
  • O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive,
    during a quicksort

The remedy involved 4 parts:

  • O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort
  • Only doing map lookups when the domain key check has passed
  • Using a black magic compiler technique of the map clearing idiom
  • Zero allocation []byte<->string conversion

With 100K accounts, this brings InitGenesis down to ~6min, instead of
20+min, it reduces the sort code from ~7sec down to 50ms.

Also some simple benchmark reflect the change:

name                    old time/op    new time/op    delta
SanitizeBalances500-8     19.3ms ±10%     1.5ms ± 5%  -92.46%  (p=0.000 n=20+20)
SanitizeBalances1000-8    41.9ms ± 8%     3.0ms ±12%  -92.92%  (p=0.000 n=20+20)

name                    old alloc/op   new alloc/op   delta
SanitizeBalances500-8     9.05MB ± 6%    0.56MB ± 0%  -93.76%  (p=0.000 n=20+18)
SanitizeBalances1000-8    20.2MB ± 3%     1.1MB ± 0%  -94.37%  (p=0.000 n=20+19)

name                    old allocs/op  new allocs/op  delta
SanitizeBalances500-8      72.4k ± 6%      4.5k ± 0%  -93.76%  (p=0.000 n=20+20)
SanitizeBalances1000-8      162k ± 3%        9k ± 0%  -94.40%  (p=0.000 n=20+20)

The CPU profiles show the radical change as per
#7766 (comment)

Later on, we shall do more profiling and fixes but for now this brings
down the run-time for InitGenesis.

Fixes #7766


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer
  • Review Codecov Report in the comment section below once CI passes

…code

After continuously profiling InitGensis with 100K accounts, it showed
pathologically slow code, that was the result of a couple of patterns:
* Unconditional and not always necessary map lookups
* O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive,
during a quicksort

The remedy involved 4 parts:
* O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort
* Only doing map lookups when the domain key check has passed
* Using a black magic compiler technique of the map clearing idiom
* Zero allocation []byte<->string conversion

With 100K accounts, this brings InitGenesis down to ~6min, instead of
20+min, it reduces the sort code from ~7sec down to 50ms.

Also some simple benchmark reflect the change:
```shell
name                    old time/op    new time/op    delta
SanitizeBalances500-8     19.3ms ±10%     1.5ms ± 5%  -92.46%  (p=0.000 n=20+20)
SanitizeBalances1000-8    41.9ms ± 8%     3.0ms ±12%  -92.92%  (p=0.000 n=20+20)

name                    old alloc/op   new alloc/op   delta
SanitizeBalances500-8     9.05MB ± 6%    0.56MB ± 0%  -93.76%  (p=0.000 n=20+18)
SanitizeBalances1000-8    20.2MB ± 3%     1.1MB ± 0%  -94.37%  (p=0.000 n=20+19)

name                    old allocs/op  new allocs/op  delta
SanitizeBalances500-8      72.4k ± 6%      4.5k ± 0%  -93.76%  (p=0.000 n=20+20)
SanitizeBalances1000-8      162k ± 3%        9k ± 0%  -94.40%  (p=0.000 n=20+20)
```

The CPU profiles show the radical change as per
#7766 (comment)

Later on, we shall do more profiling and fixes but for now this brings
down the run-time for InitGenesis.

Fixes #7766
@odeke-em odeke-em force-pushed the algorithmically-fix-slow-InitGenesis branch from 0bda379 to c19be27 Compare February 27, 2021 04:51
@codecov
Copy link

codecov bot commented Feb 27, 2021

Codecov Report

Merging #8719 (7828624) into master (68e7a3a) will increase coverage by 0.02%.
The diff coverage is 86.20%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8719      +/-   ##
==========================================
+ Coverage   61.35%   61.38%   +0.02%     
==========================================
  Files         671      671              
  Lines       38322    38343      +21     
==========================================
+ Hits        23512    23536      +24     
+ Misses      12351    12348       -3     
  Partials     2459     2459              
Impacted Files Coverage Δ
store/cachekv/store.go 87.12% <71.42%> (-2.76%) ⬇️
x/bank/types/balance.go 89.18% <100.00%> (+28.47%) ⬆️

// from string -> []byte to speed up operations, it is not meant
// to be used generally, but for a specific pattern to check for available
// keys within a domain.
func strToByte(s string) []byte {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly memo to self: there is another implementation in types/address/hash.go. We'll eventually need to 1. find a place where to put these functions (and I hope we won't come up with another catch-all util package) 2. reduce code duplication

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, we can do so later indeed, but for now this'll suffice and keep the PR focused :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely agree! Thanks for the great work! 🙏

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job @odeke-em. Are you planning to merge str/bytes convertion functions? Otherwise I can do a quick PR for that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @robert-zaremba! Please go ahead and add them, but perhaps we should put them in an internal package so that no other packages out of the cosmos-sdk can import them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, 👍

x/bank/types/balance.go Outdated Show resolved Hide resolved
Copy link
Contributor

@alessio alessio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chapeau @odeke-em. Thank you so much. Approved.

Copy link
Collaborator

@fedekunze fedekunze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome thank you very much @odeke-em! 🙌 🙌

Mind also adding an improvement changelog entry?

@zmanian
Copy link
Member

zmanian commented Feb 27, 2021

We should backport this to 0.41 and launchpad.

@odeke-em
Copy link
Collaborator Author

Thank you for the reviews and kind words, @alessio @fedekunze @zmanian.

Mind also adding an improvement changelog entry?

Done, @fedekunze, please take a look and thank you.

@odeke-em odeke-em merged commit dbb9923 into master Feb 27, 2021
@odeke-em odeke-em deleted the algorithmically-fix-slow-InitGenesis branch February 27, 2021 15:26
zmanian pushed a commit that referenced this pull request Feb 27, 2021
…code (#8719)

After continuously profiling InitGensis with 100K accounts, it showed
pathologically slow code, that was the result of a couple of patterns:
* Unconditional and not always necessary map lookups
* O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive,
during a quicksort

The remedy involved 4 parts:
* O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort
* Only doing map lookups when the domain key check has passed
* Using a black magic compiler technique of the map clearing idiom
* Zero allocation []byte<->string conversion

With 100K accounts, this brings InitGenesis down to ~6min, instead of
20+min, it reduces the sort code from ~7sec down to 50ms.

Also some simple benchmark reflect the change:
```shell
name                    old time/op    new time/op    delta
SanitizeBalances500-8     19.3ms ±10%     1.5ms ± 5%  -92.46%  (p=0.000 n=20+20)
SanitizeBalances1000-8    41.9ms ± 8%     3.0ms ±12%  -92.92%  (p=0.000 n=20+20)

name                    old alloc/op   new alloc/op   delta
SanitizeBalances500-8     9.05MB ± 6%    0.56MB ± 0%  -93.76%  (p=0.000 n=20+18)
SanitizeBalances1000-8    20.2MB ± 3%     1.1MB ± 0%  -94.37%  (p=0.000 n=20+19)

name                    old allocs/op  new allocs/op  delta
SanitizeBalances500-8      72.4k ± 6%      4.5k ± 0%  -93.76%  (p=0.000 n=20+20)
SanitizeBalances1000-8      162k ± 3%        9k ± 0%  -94.40%  (p=0.000 n=20+20)
```

The CPU profiles show the radical change as per
#7766 (comment)

Later on, we shall do more profiling and fixes but for now this brings
down the run-time for InitGenesis.

Fixes #7766

Co-authored-by: Alessio Treglia <alessio@tendermint.com>
zmanian pushed a commit that referenced this pull request Feb 27, 2021
…code (#8719)

After continuously profiling InitGensis with 100K accounts, it showed
pathologically slow code, that was the result of a couple of patterns:
* Unconditional and not always necessary map lookups
* O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive,
during a quicksort

The remedy involved 4 parts:
* O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort
* Only doing map lookups when the domain key check has passed
* Using a black magic compiler technique of the map clearing idiom
* Zero allocation []byte<->string conversion

With 100K accounts, this brings InitGenesis down to ~6min, instead of
20+min, it reduces the sort code from ~7sec down to 50ms.

Also some simple benchmark reflect the change:
```shell
name                    old time/op    new time/op    delta
SanitizeBalances500-8     19.3ms ±10%     1.5ms ± 5%  -92.46%  (p=0.000 n=20+20)
SanitizeBalances1000-8    41.9ms ± 8%     3.0ms ±12%  -92.92%  (p=0.000 n=20+20)

name                    old alloc/op   new alloc/op   delta
SanitizeBalances500-8     9.05MB ± 6%    0.56MB ± 0%  -93.76%  (p=0.000 n=20+18)
SanitizeBalances1000-8    20.2MB ± 3%     1.1MB ± 0%  -94.37%  (p=0.000 n=20+19)

name                    old allocs/op  new allocs/op  delta
SanitizeBalances500-8      72.4k ± 6%      4.5k ± 0%  -93.76%  (p=0.000 n=20+20)
SanitizeBalances1000-8      162k ± 3%        9k ± 0%  -94.40%  (p=0.000 n=20+20)
```

The CPU profiles show the radical change as per
#7766 (comment)

Later on, we shall do more profiling and fixes but for now this brings
down the run-time for InitGenesis.

Fixes #7766

Co-authored-by: Alessio Treglia <alessio@tendermint.com>
alessio pushed a commit that referenced this pull request Mar 1, 2021
From: #8719
Fixes: #7766

Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com>
Co-authored-by: Alessio Treglia <alessio@tendermint.com>
Co-authored-by: Federico Kunze <31522760+fedekunze@users.noreply.github.com>
alessio pushed a commit that referenced this pull request Mar 1, 2021
* store/cachekv, x/bank/types: algorithmically fix pathologically slow code (#8719)

After continuously profiling InitGensis with 100K accounts, it showed
pathologically slow code, that was the result of a couple of patterns:
* Unconditional and not always necessary map lookups
* O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive,
during a quicksort

The remedy involved 4 parts:
* O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort
* Only doing map lookups when the domain key check has passed
* Using a black magic compiler technique of the map clearing idiom
* Zero allocation []byte<->string conversion

With 100K accounts, this brings InitGenesis down to ~6min, instead of
20+min, it reduces the sort code from ~7sec down to 50ms.

Also some simple benchmark reflect the change:
```shell
name                    old time/op    new time/op    delta
SanitizeBalances500-8     19.3ms ±10%     1.5ms ± 5%  -92.46%  (p=0.000 n=20+20)
SanitizeBalances1000-8    41.9ms ± 8%     3.0ms ±12%  -92.92%  (p=0.000 n=20+20)

name                    old alloc/op   new alloc/op   delta
SanitizeBalances500-8     9.05MB ± 6%    0.56MB ± 0%  -93.76%  (p=0.000 n=20+18)
SanitizeBalances1000-8    20.2MB ± 3%     1.1MB ± 0%  -94.37%  (p=0.000 n=20+19)

name                    old allocs/op  new allocs/op  delta
SanitizeBalances500-8      72.4k ± 6%      4.5k ± 0%  -93.76%  (p=0.000 n=20+20)
SanitizeBalances1000-8      162k ± 3%        9k ± 0%  -94.40%  (p=0.000 n=20+20)
```

The CPU profiles show the radical change as per
#7766 (comment)

Later on, we shall do more profiling and fixes but for now this brings
down the run-time for InitGenesis.

Fixes #7766

Co-authored-by: Alessio Treglia <alessio@tendermint.com>

* Fix issues with merge

* Apply suggestions from code review

Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com>
Co-authored-by: Alessio Treglia <alessio@tendermint.com>
Co-authored-by: Federico Kunze <31522760+fedekunze@users.noreply.github.com>
Copy link

@alain2sf alain2sf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive and awesome job @odeke-em I liked your profiling approach. Thanks a lot! 👍🏻

Copy link

@alain2sf alain2sf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inspect and fix a slow pattern in InitGenesis of certain modules
6 participants