Skip to content

Commit

Permalink
Backport KVCache speeds ups to launchpad. (#8723)
Browse files Browse the repository at this point in the history
* store/cachekv, x/bank/types: algorithmically fix pathologically slow code (#8719)

After continuously profiling InitGensis with 100K accounts, it showed
pathologically slow code, that was the result of a couple of patterns:
* Unconditional and not always necessary map lookups
* O(n^2) sdk.AccAddressFromBech32 retrievals when the code is expensive,
during a quicksort

The remedy involved 4 parts:
* O(n) sdk.AccAddressFromBech32 invocations, down from O(n^2) in the quicksort
* Only doing map lookups when the domain key check has passed
* Using a black magic compiler technique of the map clearing idiom
* Zero allocation []byte<->string conversion

With 100K accounts, this brings InitGenesis down to ~6min, instead of
20+min, it reduces the sort code from ~7sec down to 50ms.

Also some simple benchmark reflect the change:
```shell
name                    old time/op    new time/op    delta
SanitizeBalances500-8     19.3ms ±10%     1.5ms ± 5%  -92.46%  (p=0.000 n=20+20)
SanitizeBalances1000-8    41.9ms ± 8%     3.0ms ±12%  -92.92%  (p=0.000 n=20+20)

name                    old alloc/op   new alloc/op   delta
SanitizeBalances500-8     9.05MB ± 6%    0.56MB ± 0%  -93.76%  (p=0.000 n=20+18)
SanitizeBalances1000-8    20.2MB ± 3%     1.1MB ± 0%  -94.37%  (p=0.000 n=20+19)

name                    old allocs/op  new allocs/op  delta
SanitizeBalances500-8      72.4k ± 6%      4.5k ± 0%  -93.76%  (p=0.000 n=20+20)
SanitizeBalances1000-8      162k ± 3%        9k ± 0%  -94.40%  (p=0.000 n=20+20)
```

The CPU profiles show the radical change as per
#7766 (comment)

Later on, we shall do more profiling and fixes but for now this brings
down the run-time for InitGenesis.

Fixes #7766

Co-authored-by: Alessio Treglia <alessio@tendermint.com>

* Fix issues with merge

* Apply suggestions from code review

Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com>
Co-authored-by: Alessio Treglia <alessio@tendermint.com>
Co-authored-by: Federico Kunze <31522760+fedekunze@users.noreply.github.com>
  • Loading branch information
4 people authored Mar 1, 2021
1 parent 2965099 commit bd1897c
Show file tree
Hide file tree
Showing 2 changed files with 42 additions and 2 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,12 @@ Ref: https://keepachangelog.com/en/1.0.0/

# Changelog

## [Unreleased]

### Improvements

* (store/cachekv) [\#8719](https://github.com/cosmos/cosmos-sdk/pull/8719) Algorithmically fix pathologically slow code

## [v0.39.3]

### Improvements
Expand Down
38 changes: 36 additions & 2 deletions store/cachekv/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,10 @@ import (
"bytes"
"container/list"
"io"
"reflect"
"sort"
"sync"
"unsafe"

tmkv "github.com/tendermint/tendermint/libs/kv"
dbm "github.com/tendermint/tm-db"
Expand Down Expand Up @@ -172,16 +174,48 @@ func (store *Store) iterator(start, end []byte, ascending bool) types.Iterator {
return newCacheMergeIterator(parent, cache, ascending)
}

// strToByte is meant to make a zero allocation conversion
// from string -> []byte to speed up operations, it is not meant
// to be used generally, but for a specific pattern to check for available
// keys within a domain.
func strToByte(s string) []byte {
var b []byte
hdr := (*reflect.SliceHeader)(unsafe.Pointer(&b))
hdr.Cap = len(s)
hdr.Len = len(s)
hdr.Data = (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
return b
}

// byteSliceToStr is meant to make a zero allocation conversion
// from []byte -> string to speed up operations, it is not meant
// to be used generally, but for a specific pattern to delete keys
// from a map.
func byteSliceToStr(b []byte) string {
hdr := (*reflect.StringHeader)(unsafe.Pointer(&b))
return *(*string)(unsafe.Pointer(hdr))
}

// Constructs a slice of dirty items, to use w/ memIterator.
func (store *Store) dirtyItems(start, end []byte) {
unsorted := make([]*tmkv.Pair, 0)

n := len(store.unsortedCache)
for key := range store.unsortedCache {
cacheValue := store.cache[key]
if dbm.IsKeyInDomain([]byte(key), start, end) {
if dbm.IsKeyInDomain(strToByte(key), start, end) {
cacheValue := store.cache[key]
unsorted = append(unsorted, &tmkv.Pair{Key: []byte(key), Value: cacheValue.value})
}
}

if len(unsorted) == n { // This pattern allows the Go compiler to emit the map clearing idiom for the entire map.
for key := range store.unsortedCache {
delete(store.unsortedCache, key)
}
} else { // Otherwise, normally delete the unsorted keys from the map.
for _, kv := range unsorted {
delete(store.unsortedCache, byteSliceToStr(kv.Key))
}
}

sort.Slice(unsorted, func(i, j int) bool {
Expand Down

0 comments on commit bd1897c

Please sign in to comment.