feat: Fast storage optimization for queries and iterations (cosmos#468)
* Add unbounded key string to KeyFormat

* Add test vectors for unbounded length keys

* Add some notes

* update .gitignore

* Add FastNode struct

* WIP: make Get work with new FastNode

* when retrieving fastnode fails, return errors vs. panic

* add comments clarifying what index represents

* make the linter happy

* Add small tweaks to fast_node.go

* add TODO & small linter tweaks

* Update comment

* update fast node cache in set

* add debugging output when falling back to original logic

* return error instead of panic

* WIP: refactor set ops to work with fast store

* update Set of mutable tree, begin unit testing

* update GetVersioned to check fast nodes before trying the immutable

* check fast node version before nil value check in get of immutable tree

* fix small bugs and typos, continue writing unit tests for Set

* unit test saveFastNodeVersion

* simplify storing unsaved fast nodes

* resolve a bug with not writing prefix for fast node to disk

* remove fast nodes from disk on save and clear fast cache when version is deleted, fix all tests but random and with index

* resolve an issue with randomized tests caused by the fast node cache not being cleared when latest version is saved

* split unsaved fast node changes into additions and removals

* save fast node removals

* move fast node cache clearing to nodedb

* use regular logic only when fast node version is greater than immutable tree's

* clean up tree_random_test.go

* clear unsaved fast node removals on rollback

* fix randomized test failures caused by a typo in ndb DeleteVersion for loop

* implement GetFast method to preserve Get with correct index return, convert unit tests from Get to GetFast where applicable

* ensure Get and GetFast return the same values in tree_random_test.go

* test fast node cache is live in random tree tests

* improve mutable tree unit tests related to new functionality

* clean up tree_test.go

* implement GetVersionedFast to preserve the index in GetVersioned

* restore accidentally deleted test in mutable tree test

* spurious whitespace

* refactor mutable tree

* fix comment in mutable tree

* add benchmark results

* avoid redundant full tree search in GetFast of immutable tree when fast node is nil and tree is latest

* fix naming for bench test get on random keys

* use method for getting latest version in get fast

* optimize GetFast, perform a refactor to ensure that fast nodes on disk are matched and better test

* add latest bench

* Fast Node Iteration (osmosis-labs#7)

* propagate errors from nodedb and avoid panicking

* begin implementing fast node iteration

* resolve rebase issue in random tests

* fix iteration to deserialize fast node for extracting the correct value

* finalize iteration and let all unit tests pass

* add benchmarks

* merge GetVersioned and GetVersionedFast

* remove fast node deletion from DeleteVersion and DeleteVersionRange and benchmark

* fix and unit test iteration on mutable and immutable trees

* implement tests for iterator and iterate, begin testing fast iterator

* fix and unit test fast iterator

* refactor iterate methods of mutable and immutable trees

* resolve some warnings

* remove old bench results

* refactor bench tests for iteration

* Fast Cache Migration (osmosis-labs#9)

* implement nodedb changes to set and get chain version from the database

* implement and unit test upgrade to fast storage in mutable tree

* refactor for auto upgrade to fast version in mutable tree constructor, load version and lazy load version

* use proper functionality for getting latest version

* remove unused error

* fix comment

* use fast version value in tests

* spurious tab

* fix style problems and remove redundant code in tests

* Rename Get to GetWithIndex and GetFast to Get

* refactor and clean up bench tests

* remove extra byte from fast node length

* clean up immutable tree

* refactor nil tree or ndb error for the iterators and their tests

* avoid exporting methods for getting unsaved additions and removals

* refactor fast upgrade to read from tree instead of traversing disk nodes and orphans, update unit tests

* remove unneeded comment

* refer to storage version consistently across the project

* fix more warnings

* optimize removal of fast nodes from cache on deletion

* small changes in the mutable tree

* correct storage version key

* auto set fast version in SaveVersion

* avoid exporting new methods of the immutable tree

* go fmt

* Fix comment in fast iterator

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>

* add extra comment for domain in fast iterator

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>

* add comments for moving the iterator before the first element

* add comment for describing what mirror is in assertIterator of testutils

* fix checking the mirror for descending iterator in tests

* Update testutils_test.go with a comment

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>

* Update benchmarks/bench_test.go with a comment for runKnownQueriesFast

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>

* Update benchmarks/bench_test.go with a comment for runQueriesFast

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>

* export IsFastCacheEnabled and add an assert in bench tests

* Update comment immutable_tree.go

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>

* Update comment for migration in mutable_tree.go

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>

* simplify Iterate in mutable tree, add debug log for

* Fast Cache - Downgrade - reupgrade protection and other improvements (osmosis-labs#12)

* add leaf hash to fast node and unit test

* refactor get with index and get by index, fix migration in load version and lazy load version

* use Get in GetVersioned of mutable tree

* refactor non membership proof to use fast storage if available

* bench non-membership proof

* fix bench tests to work with the new changes

* add downgrade-reupgrade protection and unit test

* remove leaf hash from fast node

* resolve multithreading bug related to iterators not being closed

* clean up

* use correct tree in bench tests

* add cache to tree used to bench non membership proofs

* add bench tests for GetWithIndex and GetByIndex

* revert GetWithIndex and GetByIndex

* remove unused import

* unit test re-upgrade protection and fix small issues

* remove redundant setStorageVersion method

* fix bug with appending to live stage version to storage version and unit test

* add comment for setFastStorageVersionToBatch

* refactor and improve unit tests for reupgrade protection

* rename ndb's isFastStorageEnabled to hasUpgradedToFastStorage and add comments

* comment out new implementation for GetNonMembershipProof

* update comments in nodedb to reflect the difference between hasUpgradedToFastStorage and shouldForceFastStorageUpdate

* refactor nodedb tests

* downgrade tendermint to 0.34.14 - osmosis's latest cosmos sdk does not support 0.35.0

* fix bug where fast storage was not enabled when version 0 was attempted to be loaded

* implement unsaved fast iterator to be used in mutable tree (osmosis-labs#16)

* address comments from unsaved fast iterator PR

* expose isUpgradeable method on mutable tree and unit test (osmosis-labs#17)

* expose isUpgradeable method on mutable tree and unit test

* go fmt

* resolve problems with rebasing

* update CHANGELOG.md

* tendermint v0.35.0

* fix String() bench

* fix duplication lint in iterator_test.go

* fix lint for tree.ndb.DeleteFastNode error return not checked

* fix Error return value of `ndb.resetBatch` is not checked

* fix Error return value of `ndb.traversePrefix` is not checked

* fix Error: struct of size 64 bytes could be of size 56 bytes

* fix Error: `comitted` is a misspelling of `committed`

* fix issues in basic_test.go

* address comments in fast_iterator.go

* address comments in immutable tree

* address comments in iterator.go

* address comments in key_format.go

* address remaining comments

* fix Error: Error return value of `ndb.batch.Write` is not checked

* fix Error: receiver name t should be consistent with previous receiver name tree for MutableTree

* fix Error: struct of size 48 bytes could be of size 40 bytes

* go fmt

* more linter fixes

* fix remaining linter problems

* upgrade tm-db and comment out broken bencher databases

* skip iterations for BenchmarkLevelDBLargeData bench

* attempt to fix linter

* force GC, no cache during migration, auto heap profile (osmosis-labs#19)

* force GC, no cache during migration, auto heap profile

* resolve a potential deadlock from racing between reset and stop

* fix small lint issue

* remove logs and pprof logic

* remove unused libraries

* add comment explaining the reason for RAM optimizations

* sync access to fast node cache to avoid concurrent write fatal error (osmosis-labs#23)

* update go.mod and go.sum to match master versions

* revert osmosis-labs#23 (sync access to fast node cache), fix bug related to old height export (osmosis-labs#33)

* Revert "sync access to fast node cache to avoid concurrent write fatal error (osmosis-labs#23)"

This reverts commit 2a1daf4.

* return correct iterator in mutable tree

* fix concurrent map panic when querying and committing concurrently

* avoid clearing fast node cache during pruning (osmosis-labs#35)

* fix data race related to VersionExists (osmosis-labs#36)

* fix data race related to VersionExists

* use regular lock instead of RW in mutable_tree.go

* hardcode fast node cache size to 100k

* go fmt

* restore proof_ics23.go

* fix linter

Co-authored-by: ValarDragon <dojha12@gmail.com>
Co-authored-by: jtieri <37750742+jtieri@users.noreply.github.com>
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Roman Akhtariev <34196718+akhtariev@users.noreply.github.com>
Co-authored-by: Roman <34196718+r0mvn@users.noreply.github.com>
6 people authored Apr 9, 2022
1 parent ccfb418 commit 0dcb21b
Showing 34 changed files with 3,985 additions and 385 deletions.
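
Several bullets in the commit message describe when a lookup may trust a fast node and when it has to fall back to the regular tree search ("use regular logic only when fast node version is greater than immutable tree", "avoid redundant full tree search ... when fast node is nil and tree is latest"). The following is a minimal sketch of that decision only, pieced together from those bullets. The names `fastNode`, `toyTree`, `lastUpdatedAt`, and `traverse` are hypothetical and are not taken from the iavl source; the maps stand in for the on-disk fast storage and for a full tree traversal.

```go
package main

import "fmt"

// fastNode is a hypothetical stand-in for the commit's FastNode: the live
// value of a key plus the version at which that value was last written.
type fastNode struct {
	value         []byte
	lastUpdatedAt int64
}

// toyTree models one tree version that shares fast storage with the
// latest version. All fields are assumptions made for illustration.
type toyTree struct {
	version       int64
	latestVersion int64
	fastNodes     map[string]*fastNode // shared "fast storage" (live state)
	treeData      map[string][]byte    // what a full traversal of this version finds
	traversals    int                  // counts fallbacks to the slow path
}

// traverse stands in for the regular (slow) tree search.
func (t *toyTree) traverse(key string) []byte {
	t.traversals++
	return t.treeData[key]
}

// get prefers the fast node and falls back to traversal only when the
// fast node cannot describe this tree version.
func (t *toyTree) get(key string) []byte {
	fn, ok := t.fastNodes[key]
	if !ok {
		// Fast storage mirrors live state, so a miss on the latest
		// version means the key is absent; no traversal is needed.
		if t.version == t.latestVersion {
			return nil
		}
		return t.traverse(key)
	}
	if fn.lastUpdatedAt <= t.version {
		return fn.value
	}
	// The key was rewritten after this version was saved.
	return t.traverse(key)
}

// newToyTrees builds a latest tree (version 3) and an older tree
// (version 2) over the same fast storage.
func newToyTrees() (latest, old *toyTree) {
	fast := map[string]*fastNode{
		"a": {value: []byte("new"), lastUpdatedAt: 3},
		"b": {value: []byte("old"), lastUpdatedAt: 1},
	}
	latest = &toyTree{version: 3, latestVersion: 3, fastNodes: fast,
		treeData: map[string][]byte{"a": []byte("new"), "b": []byte("old")}}
	old = &toyTree{version: 2, latestVersion: 3, fastNodes: fast,
		treeData: map[string][]byte{"a": []byte("prev"), "b": []byte("old")}}
	return latest, old
}

func main() {
	latest, old := newToyTrees()
	fmt.Printf("latest a=%q traversals=%d\n", latest.get("a"), latest.traversals)
	fmt.Printf("old    a=%q traversals=%d\n", old.get("a"), old.traversals)
	fmt.Printf("old    b=%q traversals=%d\n", old.get("b"), old.traversals)
}
```

In this sketch only the older tree's lookup of `a` pays for a traversal, because `a` was rewritten at a version newer than the tree being queried.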
3 changes: 3 additions & 0 deletions .gitignore
@@ -12,3 +12,6 @@ cpu*.out
 mem*.out
 cpu*.pdf
 mem*.pdf
+
+# IDE files
+.idea/*
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,12 @@

 ### Improvements

+- [\#468](https://github.com/cosmos/iavl/pull/468) Fast storage optimization for queries and iterations
+
+## 0.17.3 (December 1, 2021)
+
+### Improvements
+
 - [\#445](https://github.com/cosmos/iavl/pull/445) Bump github.com/tendermint/tendermint to v0.35.0
 - [\#452](https://github.com/cosmos/iavl/pull/452) Optimization: remove unnecessary (*bytes.Buffer).Reset right after creating buffer.
 - [\#453](https://github.com/cosmos/iavl/pull/453),[\#456](https://github.com/cosmos/iavl/pull/456) Optimization: buffer reuse
101 changes: 78 additions & 23 deletions basic_test.go
@@ -35,71 +35,126 @@ func TestBasic(t *testing.T) {

 	// Test 0x00
 	{
-		idx, val := tree.Get([]byte{0x00})
+		key := []byte{0x00}
+		expected := ""
+
+		idx, val := tree.GetWithIndex(key)
 		if val != nil {
-			t.Errorf("Expected no value to exist")
+			t.Error("Expected no value to exist")
 		}
 		if idx != 0 {
 			t.Errorf("Unexpected idx %x", idx)
 		}
-		if string(val) != "" {
-			t.Errorf("Unexpected value %v", string(val))
+		if string(val) != expected {
+			t.Errorf("Unexpected value %s", val)
 		}
+
+		val = tree.Get(key)
+		if val != nil {
+			t.Error("Fast method - expected no value to exist")
+		}
+		if string(val) != expected {
+			t.Errorf("Fast method - Unexpected value %s", val)
+		}
 	}

 	// Test "1"
 	{
-		idx, val := tree.Get([]byte("1"))
+		key := []byte("1")
+		expected := "one"
+
+		idx, val := tree.GetWithIndex(key)
 		if val == nil {
-			t.Errorf("Expected value to exist")
+			t.Error("Expected value to exist")
 		}
 		if idx != 0 {
 			t.Errorf("Unexpected idx %x", idx)
 		}
-		if string(val) != "one" {
-			t.Errorf("Unexpected value %v", string(val))
+		if string(val) != expected {
+			t.Errorf("Unexpected value %s", val)
 		}
+
+		val = tree.Get(key)
+		if val == nil {
+			t.Error("Fast method - expected value to exist")
+		}
+		if string(val) != expected {
+			t.Errorf("Fast method - Unexpected value %s", val)
+		}
 	}

 	// Test "2"
 	{
-		idx, val := tree.Get([]byte("2"))
+		key := []byte("2")
+		expected := "TWO"
+
+		idx, val := tree.GetWithIndex(key)
 		if val == nil {
-			t.Errorf("Expected value to exist")
+			t.Error("Expected value to exist")
 		}
 		if idx != 1 {
 			t.Errorf("Unexpected idx %x", idx)
 		}
-		if string(val) != "TWO" {
-			t.Errorf("Unexpected value %v", string(val))
+		if string(val) != expected {
+			t.Errorf("Unexpected value %s", val)
 		}
+
+		val = tree.Get(key)
+		if val == nil {
+			t.Error("Fast method - expected value to exist")
+		}
+		if string(val) != expected {
+			t.Errorf("Fast method - Unexpected value %s", val)
+		}
 	}

 	// Test "4"
 	{
-		idx, val := tree.Get([]byte("4"))
+		key := []byte("4")
+		expected := ""
+
+		idx, val := tree.GetWithIndex(key)
 		if val != nil {
-			t.Errorf("Expected no value to exist")
+			t.Error("Expected no value to exist")
 		}
 		if idx != 2 {
 			t.Errorf("Unexpected idx %x", idx)
 		}
-		if string(val) != "" {
-			t.Errorf("Unexpected value %v", string(val))
+		if string(val) != expected {
+			t.Errorf("Unexpected value %s", val)
 		}
+
+		val = tree.Get(key)
+		if val != nil {
+			t.Error("Fast method - expected no value to exist")
+		}
+		if string(val) != expected {
+			t.Errorf("Fast method - Unexpected value %s", val)
+		}
 	}

 	// Test "6"
 	{
-		idx, val := tree.Get([]byte("6"))
+		key := []byte("6")
+		expected := ""
+
+		idx, val := tree.GetWithIndex(key)
 		if val != nil {
-			t.Errorf("Expected no value to exist")
+			t.Error("Expected no value to exist")
 		}
 		if idx != 3 {
 			t.Errorf("Unexpected idx %x", idx)
 		}
-		if string(val) != "" {
-			t.Errorf("Unexpected value %v", string(val))
+		if string(val) != expected {
+			t.Errorf("Unexpected value %s", val)
 		}
+
+		val = tree.Get(key)
+		if val != nil {
+			t.Error("Fast method - expected no value to exist")
+		}
+		if string(val) != expected {
+			t.Errorf("Fast method - Unexpected value %s", val)
+		}
 	}
 }
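
The assertions in this hunk exercise two lookup shapes: `GetWithIndex` returns a position together with the value, while the renamed `Get` returns only the value. A minimal sketch of that difference follows, using a sorted slice for the ordered lookup and a map for the value-only lookup. It is an illustration, not iavl's implementation: `toyStore` and its methods are hypothetical, and the stored key `"5"` with value `"three"` is an assumption chosen so that the indexes match the ones asserted in the test (`"4"` at 2, `"6"` at 3).

```go
package main

import (
	"fmt"
	"sort"
)

// toyStore is a hypothetical stand-in for the tree. keys plays the role of
// the ordered structure, values the role of the key-to-value fast storage.
type toyStore struct {
	keys   []string
	values map[string]string
}

func newToyStore(kv map[string]string) *toyStore {
	s := &toyStore{values: make(map[string]string, len(kv))}
	for k, v := range kv {
		s.keys = append(s.keys, k)
		s.values[k] = v
	}
	sort.Strings(s.keys)
	return s
}

// getWithIndex returns the position the key occupies, or would occupy if
// inserted, and the value (nil when the key is absent).
func (s *toyStore) getWithIndex(key string) (int, []byte) {
	i := sort.SearchStrings(s.keys, key)
	if i < len(s.keys) && s.keys[i] == key {
		return i, []byte(s.values[key])
	}
	return i, nil
}

// get returns only the value, without computing any ordering information.
func (s *toyStore) get(key string) []byte {
	v, ok := s.values[key]
	if !ok {
		return nil
	}
	return []byte(v)
}

func main() {
	s := newToyStore(map[string]string{"1": "one", "2": "TWO", "5": "three"})
	for _, k := range []string{"\x00", "1", "2", "4", "6"} {
		idx, val := s.getWithIndex(k)
		fmt.Printf("key=%q idx=%d val=%q get=%q\n", k, idx, val, s.get(k))
	}
}
```

Callers that only need the value can use the value-only lookup and skip the ordered search, which is the motivation the commit title gives for the fast path.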
@@ -252,7 +307,7 @@ func TestIntegration(t *testing.T) {
 		if has := tree.Has([]byte(randstr(12))); has {
 			t.Error("Table has extra key")
 		}
-		if _, val := tree.Get([]byte(r.key)); string(val) != r.value {
+		if val := tree.Get([]byte(r.key)); string(val) != r.value {
 			t.Error("wrong value")
 		}
 	}
@@ -270,7 +325,7 @@ func TestIntegration(t *testing.T) {
 		if has := tree.Has([]byte(randstr(12))); has {
 			t.Error("Table has extra key")
 		}
-		_, val := tree.Get([]byte(r.key))
+		val := tree.Get([]byte(r.key))
 		if string(val) != r.value {
 			t.Error("wrong value")
 		}
@@ -388,7 +443,7 @@ func TestPersistence(t *testing.T) {
 	require.NoError(t, err)
 	t2.Load()
 	for key, value := range records {
-		_, t2value := t2.Get([]byte(key))
+		t2value := t2.Get([]byte(key))
 		if string(t2value) != value {
 			t.Fatalf("Invalid value. Expected %v, got %v", value, t2value)
 		}
131 changes: 114 additions & 17 deletions benchmarks/bench_test.go
@@ -19,7 +19,6 @@ func randBytes(length int) []byte {
 	key := make([]byte, length)
 	// math.rand.Read always returns err=nil
 	// we do not need cryptographic randomness for this test:
-	//nolint:gosec
 	rand.Read(key)
 	return key
 }
@@ -57,21 +56,95 @@ func commitTree(b *testing.B, t *iavl.MutableTree) {
 	}
 }

-func runQueries(b *testing.B, t *iavl.MutableTree, keyLen int) {
+// queries random keys against live state. Keys are almost certainly not in the tree.
+func runQueriesFast(b *testing.B, t *iavl.MutableTree, keyLen int) {
+	require.True(b, t.IsFastCacheEnabled())
 	for i := 0; i < b.N; i++ {
 		q := randBytes(keyLen)
 		t.Get(q)
 	}
 }

-func runKnownQueries(b *testing.B, t *iavl.MutableTree, keys [][]byte) {
+// queries keys that are known to be in state
+func runKnownQueriesFast(b *testing.B, t *iavl.MutableTree, keys [][]byte) {
+	require.True(b, t.IsFastCacheEnabled()) // to ensure fast storage is enabled
 	l := int32(len(keys))
 	for i := 0; i < b.N; i++ {
 		q := keys[rand.Int31n(l)]
 		t.Get(q)
 	}
 }

+func runQueriesSlow(b *testing.B, t *iavl.MutableTree, keyLen int) {
+	b.StopTimer()
+	// Save version to get an old immutable tree to query against,
+	// Fast storage is not enabled on old tree versions, allowing us to bench the desired behavior.
+	_, version, err := t.SaveVersion()
+	require.NoError(b, err)
+
+	itree, err := t.GetImmutable(version - 1)
+	require.NoError(b, err)
+	require.False(b, itree.IsFastCacheEnabled()) // to ensure fast storage is not enabled
+
+	b.StartTimer()
+	for i := 0; i < b.N; i++ {
+		q := randBytes(keyLen)
+		itree.GetWithIndex(q)
+	}
+}
+
+func runKnownQueriesSlow(b *testing.B, t *iavl.MutableTree, keys [][]byte) {
+	b.StopTimer()
+	// Save version to get an old immutable tree to query against,
+	// Fast storage is not enabled on old tree versions, allowing us to bench the desired behavior.
+	_, version, err := t.SaveVersion()
+	require.NoError(b, err)
+
+	itree, err := t.GetImmutable(version - 1)
+	require.NoError(b, err)
+	require.False(b, itree.IsFastCacheEnabled()) // to ensure fast storage is not enabled
+	b.StartTimer()
+	l := int32(len(keys))
+	for i := 0; i < b.N; i++ {
+		q := keys[rand.Int31n(l)]
+		index, value := itree.GetWithIndex(q)
+		require.True(b, index >= 0, "the index must not be negative")
+		require.NotNil(b, value, "the value should exist")
+	}
+}
+
+func runIterationFast(b *testing.B, t *iavl.MutableTree, expectedSize int) {
+	require.True(b, t.IsFastCacheEnabled()) // to ensure fast storage is enabled
+	for i := 0; i < b.N; i++ {
+		itr := t.ImmutableTree.Iterator(nil, nil, false)
+		iterate(b, itr, expectedSize)
+		require.Nil(b, itr.Close(), ".Close should not error out")
+	}
+}
+
+func runIterationSlow(b *testing.B, t *iavl.MutableTree, expectedSize int) {
+	for i := 0; i < b.N; i++ {
+		itr := iavl.NewIterator(nil, nil, false, t.ImmutableTree) // create slow iterator directly
+		iterate(b, itr, expectedSize)
+		require.Nil(b, itr.Close(), ".Close should not error out")
+	}
+}
+
+func iterate(b *testing.B, itr db.Iterator, expectedSize int) {
+	b.StartTimer()
+	keyValuePairs := make([][][]byte, 0, expectedSize)
+	for i := 0; i < expectedSize && itr.Valid(); i++ {
+		itr.Next()
+		keyValuePairs = append(keyValuePairs, [][]byte{itr.Key(), itr.Value()})
+	}
+	b.StopTimer()
+	if g, w := len(keyValuePairs), expectedSize; g != w {
+		b.Errorf("iteration count mismatch: got=%d, want=%d", g, w)
+	} else {
+		b.Logf("completed %d iterations", len(keyValuePairs))
+	}
+}
+
 // func runInsert(b *testing.B, t *iavl.MutableTree, keyLen, dataLen, blockSize int) *iavl.MutableTree {
 // for i := 1; i <= b.N; i++ {
 // t.Set(randBytes(keyLen), randBytes(dataLen))
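
The `iterate` helper added in this hunk drains an iterator through `Valid`, `Next`, `Key`, and `Value`, bounded by an expected size, and the callers close the iterator afterwards. As a rough illustration of that drain-and-close shape, here is a sketch over a hypothetical slice-backed iterator. `sliceIterator`, `pair`, and `drain` are made up for this example and are not part of iavl or tm-db. Unlike the helper, which calls `Next` before reading, this sketch reads the current entry first and then advances, because the toy `Key` and `Value` would panic on an exhausted iterator.

```go
package main

import "fmt"

// pair is one key/value entry held by the toy iterator.
type pair struct{ key, value []byte }

// sliceIterator is a hypothetical in-memory iterator exposing the
// Valid/Next/Key/Value/Close shape used by the benchmark helper.
type sliceIterator struct {
	items  []pair
	pos    int
	closed bool
}

func (it *sliceIterator) Valid() bool   { return !it.closed && it.pos < len(it.items) }
func (it *sliceIterator) Next()         { it.pos++ }
func (it *sliceIterator) Key() []byte   { return it.items[it.pos].key }   // panics when exhausted
func (it *sliceIterator) Value() []byte { return it.items[it.pos].value } // panics when exhausted
func (it *sliceIterator) Close() error  { it.closed = true; return nil }

// drain collects at most limit entries, reading the current entry before
// advancing so Key and Value are only called while the iterator is valid.
func drain(it *sliceIterator, limit int) [][2][]byte {
	out := make([][2][]byte, 0, limit)
	for ; len(out) < limit && it.Valid(); it.Next() {
		out = append(out, [2][]byte{it.Key(), it.Value()})
	}
	return out
}

// newSliceIterator builds an iterator over three fixed entries.
func newSliceIterator() *sliceIterator {
	return &sliceIterator{items: []pair{
		{[]byte("a"), []byte("1")},
		{[]byte("b"), []byte("2")},
		{[]byte("c"), []byte("3")},
	}}
}

func main() {
	it := newSliceIterator()
	got := drain(it, 5)
	if err := it.Close(); err != nil {
		fmt.Println("close failed:", err)
	}
	fmt.Printf("collected %d entries, first key %q\n", len(got), got[0][0])
}
```

Closing the iterator after every pass mirrors the `require.Nil(b, itr.Close(), ...)` calls in `runIterationFast` and `runIterationSlow`; the commit message separately mentions a multithreading bug caused by iterators not being closed.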
@@ -132,7 +205,7 @@ func runBlock(b *testing.B, t *iavl.MutableTree, keyLen, dataLen, blockSize int,
 			data := randBytes(dataLen)

 			// perform query and write on check and then real
-			// check.Get(key)
+			// check.GetFast(key)
 			// check.Set(key, data)
 			real.Get(key)
 			real.Set(key, data)
@@ -175,11 +248,11 @@ func BenchmarkMedium(b *testing.B) {
 	benchmarks := []benchmark{
 		{"memdb", 100000, 100, 16, 40},
 		{"goleveldb", 100000, 100, 16, 40},
-		{"cleveldb", 100000, 100, 16, 40},
+		// {"cleveldb", 100000, 100, 16, 40},
 		// FIXME: idk why boltdb is too slow !?
 		// {"boltdb", 100000, 100, 16, 40},
-		{"rocksdb", 100000, 100, 16, 40},
-		{"badgerdb", 100000, 100, 16, 40},
+		// {"rocksdb", 100000, 100, 16, 40},
+		// {"badgerdb", 100000, 100, 16, 40},
 	}
 	runBenchmarks(b, benchmarks)
 }
@@ -188,10 +261,10 @@ func BenchmarkSmall(b *testing.B) {
 	benchmarks := []benchmark{
 		{"memdb", 1000, 100, 4, 10},
 		{"goleveldb", 1000, 100, 4, 10},
-		{"cleveldb", 1000, 100, 4, 10},
-		{"boltdb", 1000, 100, 4, 10},
-		{"rocksdb", 1000, 100, 4, 10},
-		{"badgerdb", 1000, 100, 4, 10},
+		// {"cleveldb", 1000, 100, 4, 10},
+		// {"boltdb", 1000, 100, 4, 10},
+		// {"rocksdb", 1000, 100, 4, 10},
+		// {"badgerdb", 1000, 100, 4, 10},
 	}
 	runBenchmarks(b, benchmarks)
 }
@@ -202,8 +275,8 @@ func BenchmarkLarge(b *testing.B) {
 		{"goleveldb", 1000000, 100, 16, 40},
 		// FIXME: idk why boltdb is too slow !?
 		// {"boltdb", 1000000, 100, 16, 40},
-		{"rocksdb", 1000000, 100, 16, 40},
-		{"badgerdb", 1000000, 100, 16, 40},
+		// {"rocksdb", 1000000, 100, 16, 40},
+		// {"badgerdb", 1000000, 100, 16, 40},
 	}
 	runBenchmarks(b, benchmarks)
 }
@@ -287,14 +360,38 @@ func runSuite(b *testing.B, d db.DB, initSize, blockSize, keyLen, dataLen int) {

 	b.ResetTimer()

-	b.Run("query-miss", func(sub *testing.B) {
+	b.Run("query-no-in-tree-guarantee-fast", func(sub *testing.B) {
+		sub.ReportAllocs()
+		runQueriesFast(sub, t, keyLen)
+	})
+	b.Run("query-no-in-tree-guarantee-slow", func(sub *testing.B) {
+		sub.ReportAllocs()
+		runQueriesSlow(sub, t, keyLen)
+	})
+	//
+	b.Run("query-hits-fast", func(sub *testing.B) {
 		sub.ReportAllocs()
-		runQueries(sub, t, keyLen)
+		runKnownQueriesFast(sub, t, keys)
 	})
-	b.Run("query-hits", func(sub *testing.B) {
+	b.Run("query-hits-slow", func(sub *testing.B) {
 		sub.ReportAllocs()
-		runKnownQueries(sub, t, keys)
+		runKnownQueriesSlow(sub, t, keys)
 	})
+	//
+	// Iterations for BenchmarkLevelDBLargeData timeout bencher in CI so
+	// we must skip them.
+	if b.Name() != "BenchmarkLevelDBLargeData" {
+		b.Run("iteration-fast", func(sub *testing.B) {
+			sub.ReportAllocs()
+			runIterationFast(sub, t, initSize)
+		})
+		b.Run("iteration-slow", func(sub *testing.B) {
+			sub.ReportAllocs()
+			runIterationSlow(sub, t, initSize)
+		})
+	}
+
+	//
 	b.Run("update", func(sub *testing.B) {
 		sub.ReportAllocs()
 		t = runUpdate(sub, t, dataLen, blockSize, keys)