refactor: implement cache abstraction and unit test #506

p0mvn · 2022-06-14T01:45:25Z

Context

See here: Implement configurable fast node cache size osmosis-labs/osmosis#1163
To sum up:
- better abstraction for cache to eventually enable the limit by bytes as opposed to the number of nodes
- unit-tested cache
- avoid caching nodes with the same key (the old one is now replaced)
- cache adapted from: https://github.com/golang/groupcache/blob/master/lru/lru.go

This change has been running on Osmosis for almost 2 months so it is relatively low-risk.

Original Benchstat from Osmosis fork

name                                                                 old time/op    new time/op    delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-4    4.22µs ± 9%    4.15µs ± 6%     ~     (p=1.000 n=5+5)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-4    15.3µs ±11%    16.6µs ± 6%     ~     (p=0.056 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-fast-4                     566ns ± 8%     613ns ±11%     ~     (p=0.222 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-slow-4                    21.2µs ± 5%    21.0µs ± 4%     ~     (p=0.690 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-fast-4                     78.0ms ± 6%    81.1ms ±11%     ~     (p=0.548 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-slow-4                      1.79s ±13%     1.90s ±11%     ~     (p=0.222 n=5+5)
Medium/goleveldb-100000-100-16-40/update-4                              233µs ±15%     203µs ± 8%  -12.78%  (p=0.008 n=5+5)
Medium/goleveldb-100000-100-16-40/block-4                              29.0ms ± 4%    25.8ms ±10%  -11.04%  (p=0.016 n=5+5)

name                                                                 old alloc/op   new alloc/op   delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-4      814B ± 0%      814B ± 0%     ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-4    1.41kB ± 1%    1.41kB ± 0%     ~     (p=1.000 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-fast-4                     0.00B          0.00B          ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-slow-4                    1.99kB ± 1%    2.00kB ± 1%     ~     (p=0.460 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-fast-4                     29.3MB ± 0%    29.3MB ± 0%     ~     (p=0.381 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-slow-4                      276MB ± 0%     276MB ± 0%   +0.02%  (p=0.029 n=4+4)
Medium/goleveldb-100000-100-16-40/update-4                             47.4kB ± 2%    46.5kB ± 4%     ~     (p=0.151 n=5+5)
Medium/goleveldb-100000-100-16-40/block-4                              5.65MB ± 2%    5.41MB ± 2%   -4.27%  (p=0.008 n=5+5)

name                                                                 old allocs/op  new allocs/op  delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-4      16.0 ± 0%      16.0 ± 0%     ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-4      24.0 ± 0%      24.0 ± 0%     ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-fast-4                      0.00           0.00          ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-slow-4                      34.0 ± 0%      34.0 ± 0%     ~     (all equal)
Medium/goleveldb-100000-100-16-40/iteration-fast-4                       523k ± 0%      523k ± 0%     ~     (p=0.484 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-slow-4                      4.71M ± 0%     4.71M ± 0%   +0.01%  (p=0.029 n=4+4)
Medium/goleveldb-100000-100-16-40/update-4                                502 ± 9%       482 ± 6%     ~     (p=0.421 n=5+5)
Medium/goleveldb-100000-100-16-40/block-4                               62.1k ± 3%     59.6k ± 2%   -4.05%  (p=0.016 n=5+5)

* implementing cache abstraction, unit test Add * create benchmark for Add(), start working on benchmark for Remove() * implement no duplicates in cache and unit test * switch fast node cache to the new abstraction * switch regular cache to the new abstraction * fmt and add interface godoc * rename receiver to c from nc * const fastNodeCacheLimit * move benchmarks to a separate file

tac0turtle

amazing, thank you!

tac0turtle · 2022-06-16T10:03:37Z

could you resolve the conflicts then we can merge it, thank you

robert-zaremba · 2022-06-16T15:45:36Z

Did anyone make an in depth review?

p0mvn · 2022-06-16T16:47:34Z

Did anyone make an in depth review?

In addition to @marbar3778 's review, @alexanderbez helped with looking into this in the Osmosis fork a few months ago: osmosis-labs#38

I'm happy to wait if anyone else would like to look at this

robert-zaremba

Let's consider using https://github.com/hashicorp/golang-lru/tree/master/simplelru rather than implementing our own lRU.

robert-zaremba · 2022-06-16T17:32:19Z

cache/cache.go

+	dict       map[string]*list.Element // FastNode cache.
+	cacheLimit int                      // FastNode cache size limit in elements.
+	ll         *list.List               // LRU queue of cache elements. Used for deletion.
+}


I would prefer if we use https://github.com/hashicorp/golang-lru or https://github.com/hashicorp/golang-lru/tree/master/simplelru ? We are already using it in various places.

I've looked into these options and, unfortunately, they do not allow for enough customization necessary for future work.

I'm planning to further improve the custom cache by adding the ability to have 2 kinds of limits:

in terms of bytes

in terms of the number of nodes

I have this work started here that I'm hoping to move to this repository: osmosis-labs#40

More background about why I'm introducing this cache and its limits can be found here: osmosis-labs/osmosis#1163

In short, we found it not intuitive configuring the limits in terms of nodes so we would like to provide the ability to do so in bytes only for fast cache

Maybe let's add a short comment in the code, before someone else will come and propose to remove this and reuse other library.

@p0mvn how about https://dgraph.io/blog/post/introducing-ristretto-high-perf-go-cache/ It supports setting byte high watermarks and that blogpost talks about their journey in figuring out some LRUs and caches.

@odeke-em thanks for the suggestion. I've investigated using ristretto by swapping it for fast nodes cache but have run into issues.

The major one is a ~17% increase in time/op. I suspect that is due to ristretto having overhead from concurrency guarantees which we might not really need.

Also, some randomized tests started failing for reasons I could not explain.

The current state of the attempt, summary, and benchmarks can be found here: #508

robert-zaremba · 2022-06-16T17:36:27Z

nodedb.go

+		opts:           *opts,
+		latestVersion:  0, // initially invalid
+		nodeCache:      cache.New(cacheSize),
+		fastNodeCache:  cache.New(fastNodeCacheLimit),


why do we still need fastNodeCache?

According to benchmarks, removing it significantly degrades the performance: osmosis-labs/osmosis#1041

Thanks for the link! It seams we both had the same concern :) Good job!

I looked at it too fast. I don't see a benchmark with nodeCache and no-fastNodeCache. The benchmark you have is noCache vs with fastNodeCache. Sorry, maybe I'm not interpreting it correctly as I don't see the whole setting.

I think I just wasn't consistent with how I was referring to the benchmarks, sorry for the confusion.

The comparison is most certainly between both node caches and regular cache but no-fastNodeCache.
When I referred to:
cache.log = both node caches
nocache.log = regular cache but no-fastNodeCache

Looking at the benchstat, the delta is coming mostly from query-hits-fast-16 and iteration-fast-16, proving that the regular cache was untouched. If it was touched, we would see delta in query-hits-slow-16 and iteration-slow-16

To explain the reasons for trying this, our node operators were getting OOM'ed. pprof results were showing memory growth only specific to fast cache. There was never a motivation for trying things w/o regular cache.

To mitigate the memory growth we decided to hardcode the value for the fast cache to 100000

iavl/nodedb.go

Line 33 in d07ef28

fastNodeCacheLimit = 100000

The value was chosen by spinning up a bunch of nodes at 10k, 50k, 100k, 150 and 200k. 100k proved to be the most optimal.

Spinning up several nodes to find a good value seemed like an unnecessarily complex solution. That's why I decided to do this refactor to eventually introduce byte cache for fast nodes. I'm also hoping to expose a parameter in config to let node operators choose this byte value.

p0mvn · 2022-06-16T19:43:23Z

@robert-zaremba thanks for the review. Addressed all comments in threads. Please let me know what you think

p0mvn · 2022-06-16T23:47:37Z

gobencher is failing for unrelated to this PR reasons

odeke-em

Thank you @p0mvn for the work! Thanks @marbar3778 for the ping!

@p0mvn I've added some comments, please take a look.

odeke-em · 2022-06-17T19:42:15Z

cache/cache.go

+type Cache interface {
+	// Adds node to cache. If full and had to remove the oldest element,
+	// returns the oldest, otherwise nil.
+	Add(node Node) Node


What happens when someone adds a nil Node? That'll break this entire interface. If nil nodes can be permitted, it might be much more useful to return (last Node, evicted bool)

There should never be a nil Node added. Would it be sufficient to add a comment saying that the contract for this method is that node is never nil?

From reading this discussion, I could either:

use reflection to check against nil in the Add method

define IsNil() method for the Node interface and implement it for every node struct.

However, if we were to go with one of the approaches, I think that this would be trading performance for safety unnecessarily at such a low database layer.

A comment will perhaps suffice. Thank you @p0mvn.

odeke-em · 2022-06-17T19:42:29Z

cache/cache.go

+	Add(node Node) Node
+
+	// Returns Node for the key, if exists. nil otherwise.
+	Get(key []byte) Node


What the value stored is nil?

It should never be possible for the value to be nil.

Currently, I added the CONTRACT to the spec of Add(node Node) explaining that node must never be nil.

Given the contract, it is impossible to have a nil value

odeke-em · 2022-06-17T19:48:51Z

cache/cache.go

+	dict       map[string]*list.Element // FastNode cache.
+	cacheLimit int                      // FastNode cache size limit in elements.
+	ll         *list.List               // LRU queue of cache elements. Used for deletion.
+}


@p0mvn how about https://dgraph.io/blog/post/introducing-ristretto-high-perf-go-cache/ It supports setting byte high watermarks and that blogpost talks about their journey in figuring out some LRUs and caches.

odeke-em · 2022-06-17T19:53:42Z

cache/cache.go

+	elem := c.ll.PushFront(node)
+	c.dict[string(node.GetKey())] = elem
+
+	if c.ll.Len() > c.cacheLimit {


Shouldn't this be >= ?

When I named this limit, I meant maximum, including in tests. I changed the names to reflect that so that the condition is kept as is.

odeke-em · 2022-06-17T19:58:15Z

cache/cache_bench_test.go

+		b.Run(name, func(b *testing.B) {
+			for i := 0; i < b.N; i++ {
+				_ = cache.Add(&testNode{
+					key: randBytes(tc.keySize),


I highly suspect that randBytes is going to skew the benchmark results due to the fact that it reads from crypto/rand.Reader. Please either

invoke b.StopTimer(), generate the bytes, the b.StartTimer()

Or

At init time, generate a whole lot of these keys then just iterate by the keyCount

But option 1 is the most pragmatic

Went with option 1, thanks

odeke-em · 2022-06-17T20:00:27Z

cache/cache_bench_test.go

+
+	for i := 0; i < b.N; i++ {
+		key := existentKeyMirror[r.Intn(len(existentKeyMirror))]
+		b.ResetTimer()


This benchmark doesn't make any sense. You keep resetting the the timer in the most important of the benchmark for every b.N iteration. This looks like a mistake.

odeke-em · 2022-06-17T20:01:03Z

cache/cache_bench_test.go

+
+	randSeed := 498727689 // For deterministic tests
+	r := rand.New(rand.NewSource(int64(randSeed)))
+


I think you should have instead invoked b.ResetTimer() before the loop.

robert-zaremba · 2022-07-04T16:16:49Z

@p0mvn any chance you will finish this PR this week?

p0mvn · 2022-07-04T16:34:29Z

@p0mvn any chance you will finish this PR this week?

Yes, I'll aim to complete this in the next couple of days. Apologies for the delay

robert-zaremba · 2022-07-04T17:48:25Z

thanks!

…o roman/cache-abstractions

p0mvn · 2022-07-06T16:14:06Z

@robert-zaremba @odeke-em addressed your comments, please take a look when you have a chance

robert-zaremba

utACK.
Would be also good to include a short doc explaining diff between nodeCache and fastNodeCache.

p0mvn · 2022-07-07T17:58:40Z

Added comments for the difference between 2 cache fields in the struct declaration.

I think it would be good to have a README explaining the motivation for the fast index change overall. I can work on this over the next week and open a separate PR.

robert-zaremba · 2022-07-07T18:27:12Z

test is still failing

p0mvn · 2022-07-19T00:41:28Z

Hey everyone. This is ready for review / final approval. Please take a look when you can

p0mvn and others added 2 commits June 13, 2022 21:31

fix bench

48c3a5e

p0mvn marked this pull request as ready for review June 14, 2022 02:02

p0mvn requested review from a team and alexanderbez June 14, 2022 02:03

tac0turtle approved these changes Jun 16, 2022

View reviewed changes

robert-zaremba reviewed Jun 16, 2022

View reviewed changes

p0mvn added 3 commits June 16, 2022 18:46

comment about the reasons for implementing cache

1c4c288

Merge branch 'master' into roman/cache-abstractions

d07ef28

expand test cases in TestNodeCacheStatisic for readability

d696f75

odeke-em reviewed Jun 17, 2022

View reviewed changes

p0mvn mentioned this pull request Jun 18, 2022

investigate: ristretto cache with memory bounding #508

Closed

Merge branch 'master' into roman/cache-abstractions

e194c16

p0mvn added 4 commits July 6, 2022 11:20

contract comment for adding nil

8bf11a3

rename limit to max / size

5a5f34b

fix benchmarks

e935d8f

Merge branch 'roman/cache-abstractions' of github.com:cosmos/iavl int…

d2b985c

…o roman/cache-abstractions

robert-zaremba approved these changes Jul 7, 2022

View reviewed changes

robert-zaremba and others added 2 commits July 7, 2022 15:44

Merge branch 'master' into roman/cache-abstractions

a7152c2

comment for nodeCache vs fastNodeCache

6eac72a

attempt to fix failing stats test

77dce5b

p0mvn mentioned this pull request Jul 21, 2022

Upstream osmosis iavl changes #526

Closed

Merge branch 'master' into roman/cache-abstractions

b5a27ad

tac0turtle approved these changes Jul 22, 2022

View reviewed changes

tac0turtle merged commit 9b14c86 into master Jul 22, 2022

tac0turtle deleted the roman/cache-abstractions branch July 22, 2022 07:34

coderabbitai bot mentioned this pull request May 13, 2024

chore: changelog onto master #946

Merged


		randSeed := 498727689 // For deterministic tests
		r := rand.New(rand.NewSource(int64(randSeed)))

refactor: implement cache abstraction and unit test #506

refactor: implement cache abstraction and unit test #506

Conversation

p0mvn commented Jun 14, 2022 • edited Loading

tac0turtle left a comment

Choose a reason for hiding this comment

tac0turtle commented Jun 16, 2022

robert-zaremba commented Jun 16, 2022

p0mvn commented Jun 16, 2022

robert-zaremba left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p0mvn Jun 18, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p0mvn Jun 16, 2022 • edited Loading

Choose a reason for hiding this comment

p0mvn commented Jun 16, 2022

p0mvn commented Jun 16, 2022

odeke-em left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

robert-zaremba commented Jul 4, 2022

p0mvn commented Jul 4, 2022

robert-zaremba commented Jul 4, 2022

p0mvn commented Jul 6, 2022

robert-zaremba left a comment

Choose a reason for hiding this comment

p0mvn commented Jul 7, 2022

robert-zaremba commented Jul 7, 2022

p0mvn commented Jul 19, 2022 • edited Loading

p0mvn commented Jun 14, 2022 •

edited

Loading

p0mvn Jun 18, 2022 •

edited

Loading

p0mvn Jun 16, 2022 •

edited

Loading

p0mvn commented Jul 19, 2022 •

edited

Loading