Enable parallel prefetching of nodes within the same storage trie #425

BrianBland · 2024-10-31T02:30:34Z

Description

This extends the statedb prefetcher to support parallel fetches within the context of a single trie. The purpose of the statedb prefetcher is only to prewarm database caches, ensuring that all trie nodes are quickly accessible when computing the MPT state root.

Without this change, the statedb's prefetcher will perform concurrent fetching jobs for each unique trie, but perform all fetches for each trie sequentially. In certain cases, such as blocks which contain a large number of storage updates in a single large trie, this sequential prefetching behavior can result in a significant performance degradation. In extreme cases, blocks may contain updates to thousands of storage slots in a single trie, which are then fetched entirely sequentially.

This implementation uses a fixed worker pool per prefetcher and allows each per-trie subfetcher to clone its trie up to N times, where N is the maximum number of concurrent goroutines. This approach was chosen because the trie itself is not safe for concurrent use, and the cost of copying the trie is negligible compared to the round-trip latency of fetching the associated trie nodes from the local database.

This PR selects a somewhat arbitrary 16 goroutines as the default concurrency limit for the prefetcher, based on locally run benchmarks on an M1 Pro with 16GB of memory, using a matrix of trieSize (1k, 100k, 10M), keyCount (10, 100, 1k, 10k), and maxConcurrency (1, 4, 16, 64) values. We see a minor increase in overhead associated with concurrent access when accessing tries containing fewer than 1000 nodes, but around a 10x reduction in latency when accessing at least 10 keys from very large tries (10M keys). This is likely a worthwhile tradeoff for high-throughput EVM chains, as this improvement directly targets a subset of the worst-performing blocks.
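For reference, the benchmark matrix described above could be enumerated along these lines. This is a hypothetical sketch and not the benchmarking code used for this PR: `benchCase` and `benchMatrix` are made-up names, and only the parameter values and the sub-benchmark naming scheme (as it appears in the results posted in this thread) are taken from the PR.

```go
package main

import "fmt"

// benchCase is a hypothetical description of one cell of the benchmark matrix.
type benchCase struct {
	trieSize       int
	prefetchKeys   int
	maxConcurrency int
}

// name formats the case the same way the sub-benchmark names appear in the results.
func (c benchCase) name() string {
	return fmt.Sprintf("trieSize=%d,prefetchKeys=%d,maxConcurrency=%d",
		c.trieSize, c.prefetchKeys, c.maxConcurrency)
}

// benchMatrix enumerates every combination, with maxConcurrency as the
// outermost dimension to match the order of the posted results.
func benchMatrix() []benchCase {
	trieSizes := []int{1_000, 100_000, 10_000_000}
	keyCounts := []int{10, 100, 1_000, 10_000}
	concurrencies := []int{1, 4, 16, 64}

	var cases []benchCase
	for _, mc := range concurrencies {
		for _, ts := range trieSizes {
			for _, kc := range keyCounts {
				cases = append(cases, benchCase{trieSize: ts, prefetchKeys: kc, maxConcurrency: mc})
			}
		}
	}
	return cases
}

func main() {
	for _, c := range benchMatrix() {
		fmt.Println(c.name())
	}
}
```

In a real `_test.go` file, each case would typically be handed to `b.Run(c.name(), ...)` from the standard `testing` package so that every combination is reported as its own sub-benchmark.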

Tests

TODO: include links to benchmarking code.

Additional context

Metadata

See ethereum/go-ethereum#28266 for additional context and prior discussions.

ajsutton · 2024-10-31T02:43:49Z

I haven't had a chance to look at the code yet, but it is worth noting that the cannon fault proof VM does not support threading, so the concurrency here will need to be able to be disabled, either by an API call that op-program can call or when there is a single CPU. To be compatible with cannon, no new goroutines can be created.

ajsutton · 2024-10-31T02:46:14Z

#370 added a worker group with two implementations, one of which doesn't create any goroutines. I'd forgotten again, but we actually want to avoid concurrency when running in native mode as well, because the PreimageOracle communication isn't thread-safe. So #373 switched to using an explicit flag for single-threaded mode instead of enabling it automatically when there is one CPU.
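As a rough illustration of the "two implementations" idea mentioned above, a worker group can be hidden behind a small interface so that one variant runs tasks inline and never creates a goroutine. The names below (`taskGroup`, `inlineGroup`, `goroutineGroup`, `newTaskGroup`) are hypothetical and are not taken from #370 or #373; this is only a sketch of the pattern.

```go
package main

import (
	"fmt"
	"sync"
)

// taskGroup is a hypothetical abstraction over "run these tasks, then wait".
type taskGroup interface {
	Go(task func())
	Wait()
}

// inlineGroup runs every task synchronously on the caller's goroutine,
// so it never creates a new goroutine.
type inlineGroup struct{}

func (inlineGroup) Go(task func()) { task() }
func (inlineGroup) Wait()          {}

// goroutineGroup runs every task on its own goroutine.
type goroutineGroup struct{ wg sync.WaitGroup }

func (g *goroutineGroup) Go(task func()) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		task()
	}()
}

func (g *goroutineGroup) Wait() { g.wg.Wait() }

// newTaskGroup picks the implementation from an explicit flag rather than
// from the number of available CPUs.
func newTaskGroup(singleThreaded bool) taskGroup {
	if singleThreaded {
		return inlineGroup{}
	}
	return &goroutineGroup{}
}

// sumSquares shows that callers can stay agnostic about which group they get.
func sumSquares(g taskGroup, n int) int {
	var mu sync.Mutex
	total := 0
	for i := 1; i <= n; i++ {
		i := i // capture the loop variable for the closure
		g.Go(func() {
			mu.Lock()
			total += i * i
			mu.Unlock()
		})
	}
	g.Wait()
	return total
}

func main() {
	fmt.Println(sumSquares(newTaskGroup(true), 4))
	fmt.Println(sumSquares(newTaskGroup(false), 4))
}
```

Selecting the implementation from an explicit flag keeps the single-threaded path available on multi-core hosts as well, which is the concern raised for native mode.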

BrianBland · 2024-10-31T03:13:16Z

@ajsutton while this PR adds more concurrency capabilities to the prefetcher, it already strictly depends on goroutines (https://github.com/ethereum-optimism/op-geth/blob/optimism/core/state/trie_prefetcher.go#L270). I don't expect any of the prefetcher code to actually be invoked by op-program, given that this would already violate the single-threading requirement.

ajsutton · 2024-10-31T03:14:53Z

Awesome thanks. I just wanted to jump in early and flag the risk since I won't have time to look at this for a while and likely (hopefully) someone else will get to it first.

BrianBland · 2024-10-31T04:27:12Z

Results from this benchmarking test:

goos: darwin
goarch: arm64
pkg: github.com/ethereum/go-ethereum/core/state
cpu: Apple M1 Pro
BenchmarkTriePrefetcher
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10,maxConcurrency=1-8         	       1	    368541 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=100,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=100,maxConcurrency=1-8        	       1	    765125 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=1000,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=1000,maxConcurrency=1-8       	       1	   1578500 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10000,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10000,maxConcurrency=1-8      	       1	   2763167 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10,maxConcurrency=1-8       	       1	   4306791 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=100,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=100,maxConcurrency=1-8      	       1	   2168458 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=1000,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=1000,maxConcurrency=1-8     	       1	  10609667 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10000,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10000,maxConcurrency=1-8    	       1	  57159875 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10,maxConcurrency=1-8     	       1	  75022250 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=100,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=100,maxConcurrency=1-8    	       1	 184448583 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=1000,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=1000,maxConcurrency=1-8   	       1	1268485792 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10000,maxConcurrency=1
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10000,maxConcurrency=1-8  	       1	10005077083 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10,maxConcurrency=4-8         	       1	   1327708 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=100,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=100,maxConcurrency=4-8        	       1	    697541 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=1000,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=1000,maxConcurrency=4-8       	       1	   2919875 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10000,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10000,maxConcurrency=4-8      	       1	   4309250 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10,maxConcurrency=4-8       	       1	    637250 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=100,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=100,maxConcurrency=4-8      	       1	   1541541 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=1000,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=1000,maxConcurrency=4-8     	       1	  12693291 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10000,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10000,maxConcurrency=4-8    	       1	  94442750 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10,maxConcurrency=4-8     	       1	  46073917 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=100,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=100,maxConcurrency=4-8    	       1	  75421666 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=1000,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=1000,maxConcurrency=4-8   	       1	 411606792 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10000,maxConcurrency=4
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10000,maxConcurrency=4-8  	       1	2490629375 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10,maxConcurrency=16-8        	       1	    477667 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=100,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=100,maxConcurrency=16-8       	       1	    712250 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=1000,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=1000,maxConcurrency=16-8      	       1	   2741791 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10000,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10000,maxConcurrency=16-8     	       1	   4372875 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10,maxConcurrency=16-8      	       1	    560625 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=100,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=100,maxConcurrency=16-8     	       1	   1536542 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=1000,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=1000,maxConcurrency=16-8    	       1	  10867167 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10000,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10000,maxConcurrency=16-8   	       1	  97409166 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10,maxConcurrency=16-8    	       1	  13836709 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=100,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=100,maxConcurrency=16-8   	       1	  71473833 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=1000,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=1000,maxConcurrency=16-8  	       1	 340721250 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10000,maxConcurrency=16
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10000,maxConcurrency=16-8 	       1	2163701750 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10,maxConcurrency=64-8        	       1	    557584 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=100,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=100,maxConcurrency=64-8       	       1	    782584 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=1000,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=1000,maxConcurrency=64-8      	       1	   2665083 ns/op
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10000,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=1000,prefetchKeys=10000,maxConcurrency=64-8     	       1	   4527458 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10,maxConcurrency=64-8      	       1	    577250 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=100,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=100,maxConcurrency=64-8     	       1	   1662959 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=1000,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=1000,maxConcurrency=64-8    	       1	   9062125 ns/op
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10000,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=100000,prefetchKeys=10000,maxConcurrency=64-8   	       1	 105606000 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10,maxConcurrency=64-8    	       1	  12950041 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=100,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=100,maxConcurrency=64-8   	       1	  64940916 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=1000,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=1000,maxConcurrency=64-8  	       1	 433275333 ns/op
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10000,maxConcurrency=64
BenchmarkTriePrefetcher/trieSize=10000000,prefetchKeys=10000,maxConcurrency=64-8 	       1	1146698292 ns/op
PASS
ok  	github.com/ethereum/go-ethereum/core/state	2923.199s
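To read the table above as relative speedups, one can divide the `maxConcurrency=1` timing by the `maxConcurrency=16` timing for the same trie size and key count. The sketch below does this for the `trieSize=10000000` rows, with the ns/op values copied from the output above; the `result` type and `speedup` helper are illustrative names only. Each benchmark ran for a single iteration, so the ratios should be treated as indicative rather than precise.

```go
package main

import "fmt"

// result pairs the sequential and parallel timings for one key count
// at trieSize=10000000, copied from the benchmark output.
type result struct {
	prefetchKeys int
	sequentialNs float64 // ns/op at maxConcurrency=1
	parallelNs   float64 // ns/op at maxConcurrency=16
}

// speedup returns how many times faster the parallel run was.
func speedup(sequentialNs, parallelNs float64) float64 {
	return sequentialNs / parallelNs
}

func main() {
	results := []result{
		{10, 75022250, 13836709},
		{100, 184448583, 71473833},
		{1000, 1268485792, 340721250},
		{10000, 10005077083, 2163701750},
	}
	for _, r := range results {
		fmt.Printf("prefetchKeys=%d: %.2fx\n", r.prefetchKeys, speedup(r.sequentialNs, r.parallelNs))
	}
}
```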

Enable parallel prefetching of nodes within the same storage trie

6c82c46

BrianBland added 2 commits October 30, 2024 20:18

Update other callers of the prefetcher

f518a2b

Add cache.prefetcher.parallelism flag, default to 16

8510e35
