
core: improve trie updates (part 2) #21047

Merged (2 commits) on Jan 20, 2021

Conversation

holiman (Contributor) commented May 8, 2020

This is part 2 of #20796. It's very much work in progress, and a bit hacky.

  • Deliver main tries
  • Deliver account tries
  • Testing

Feedback regarding the design is appreciated

holiman (Contributor Author) commented May 9, 2020

I'm going to restart the benchmark from block 5M now, due to the problems with master.
I did get some nice charts for account_update, around the transition to byzantium (when we stop doing intermediate roots after every tx -- which is the point where I expect this PR to make a big dent).

Screenshot_2020-05-09 Dual Geth - Grafana

So far, this PR only exports the account trie, not yet storage tries.

holiman (Contributor Author) commented May 9, 2020

On the new benchmark run, both finished generating the snapshot after about 2h 15m.

May 09 15:12:36 bench01.ethdevops.io geth INFO [05-09|13:12:35.949] Starting peer-to-peer node instance=Geth/v1.9.14-unstable-89a36171-20200508/linux-amd64/go1.14.2
May 09 15:12:42 bench02.ethdevops.io geth INFO [05-09|13:12:42.508] Starting peer-to-peer node instance=Geth/v1.9.14-unstable-82f9ed49-20200508/linux-amd64/go1.14.2 
...
May 09 17:31:56 bench02.ethdevops.io geth INFO [05-09|15:31:56.621] Generated state snapshot accounts=26380419 slots=41264679 storage=4.22GiB elapsed=2h18m54.750s
May 09 17:32:20 bench01.ethdevops.io geth INFO [05-09|15:32:20.731] Generated state snapshot accounts=26380345 slots=41263951 storage=4.22GiB elapsed=2h19m25.169s 

holiman (Contributor Author) commented May 10, 2020

Block around 57622XX:

May 10 14:36:15 bench01.ethdevops.io geth INFO [05-10|12:36:15.239] Imported new chain segment blocks=74 txs=9891 mgas=487.228 elapsed=8.022s mgasps=60.733 number=5762219 hash="24988a…16b544" age=1y11mo1w dirty=1022.94MiB 
...
May 10 17:03:07 bench02.ethdevops.io geth INFO [05-10|15:03:07.256] Imported new chain segment blocks=45 txs=7366 mgas=308.569 elapsed=8.024s mgasps=38.454 number=5762249 hash="135387…4100a7" age=1y11mo1w dirty=1023.05MiB 

So after ~24h, bench01 has a 2.5-hour lead, or roughly a 10% improvement in block import times.

Here are account and storage update charts:
Screenshot_2020-05-10 Dual Geth - Grafana -- which looks like ~20ms savings per block, on average.

Start block after snapshot generation was around 5049092.
This gives,

  • bench01: 106ms/block on average,
  • bench02: 118ms/block on average

I have some new modifications which I will push soon, to deliver storage tries in object form too.

holiman (Contributor Author) commented May 11, 2020

About to deploy a new update; here's the result chart for the first run (account tries delivered as objects, storage tries warmed up via prefetching but not delivered as objects).
Screenshot_2020-05-11 Dual Geth - Grafana

holiman (Contributor Author) commented May 11, 2020

New benchmark started. Started at 5M, this is the chart from where both of them were done generating the snapshot: https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&var-exp=bench01&var-master=bench02&var-percentile=50&from=1589212697550&to=now

holiman (Contributor Author) commented May 12, 2020

After ~12 hours, here's a chart showing execution combined with storage_update and account_update.
Screenshot_2020-05-12 Dual Geth - Grafana


holiman (Contributor Author) commented May 12, 2020

First run

The first run was based on the version which delivers account tries, but not storage tries.

This PR:

May 09 17:32:20 bench01.ethdevops.io geth INFO [05-09|15:32:20.731] Generated state snapshot accounts=26380345 slots=41263951 storage=4.22GiB elapsed=2h19m25.169s 
May 09 17:32:26 bench01.ethdevops.io geth INFO [05-09|15:32:26.386] Imported new chain segment blocks=70 txs=9546 mgas=433.862 elapsed=8.090s mgasps=53.629 number=5049092 hash="3ad8d0…2395aa" age=2y3mo1w dirty=1018.35MiB 
...
May 10 14:36:15 bench01.ethdevops.io geth INFO [05-10|12:36:15.239] Imported new chain segment blocks=74 txs=9891 mgas=487.228 elapsed=8.022s mgasps=60.733 number=5762219 hash="24988a…16b544" age=1y11mo1w dirty=1022.94MiB 

Start block after snapshot generation: 5049092, reached block 5762219 (713127 blocks later) after 21h4m: 106.3 ms/block

master

May 09 17:31:56 bench02.ethdevops.io geth INFO [05-09|15:31:56.621] Generated state snapshot accounts=26380419 slots=41264679 storage=4.22GiB elapsed=2h18m54.750s
May 09 17:32:00 bench02.ethdevops.io geth INFO [05-09|15:32:00.354] Imported new chain segment blocks=58 txs=7810 mgas=378.535 elapsed=8.010s mgasps=47.256 number=5048322 hash="153512…a7fc60" age=2y3mo1w dirty=1022.80MiB 
...
May 10 17:03:07 bench02.ethdevops.io geth INFO [05-10|15:03:07.256] Imported new chain segment blocks=45 txs=7366 mgas=308.569 elapsed=8.024s mgasps=38.454 number=5762249 hash="135387…4100a7" age=1y11mo1w dirty=1023.05MiB 

Start block after snapshot generation: 5048322, reached block 5762249 (713927 blocks later) after 23h31m: 118.6 ms/block

Second run

The second run is this PR with storage-trie-delivery added in

May 11 17:55:40 bench01.ethdevops.io geth INFO [05-11|15:55:40.118] Imported new chain segment blocks=67 txs=8194 mgas=456.993 elapsed=8.053s mgasps=56.743 number=5050584 hash="ab7b29…16feee" age=2y3mo1w dirty=1022.65MiB 
May 12 14:35:25 bench01.ethdevops.io geth INFO [05-12|12:35:25.353] Imported new chain segment blocks=77 txs=10947 mgas=512.181 elapsed=8.065s mgasps=63.507 number=5762236 hash="c660bd…ee51b7" age=1y11mo1w dirty=1023.21MiB 

Start block after snapshot generation: 5050584, reached block 5762236 (711652 blocks later) after 20h40m: 104.5 ms/block

master

 May 11 17:52:50 bench02.ethdevops.io geth INFO [05-11|15:52:49.966] Imported new chain segment blocks=52 txs=6657 mgas=379.182 elapsed=8.020s mgasps=47.275 number=5048225 hash="36e92d…c09c8e" age=2y3mo1w dirty=1022.88MiB 
 May 12 17:22:40 bench02.ethdevops.io geth INFO [05-12|15:22:40.635] Imported new chain segment blocks=63 txs=9208 mgas=412.976 elapsed=8.044s mgasps=51.337 number=5762229 hash="f05b55…f7ec57" age=1y11mo1w dirty=1022.64MiB 

Start block after snapshot generation: 5048225, reached block 5762229 (714004 blocks later) after 23h30m: 118.5 ms/block.

holiman (Contributor Author) commented May 13, 2020

Interestingly, the removal of the regular prefetcher causes the loading-from-snapshot to take a hit.

So the gains won on account_update and storage_update are partly eaten up by the slowdown in snapshot_storage_read and snapshot_account_read:
Screenshot_2020-05-13 Dual Geth - Grafana

holiman (Contributor Author) commented May 14, 2020

The last couple of days:
Screenshot_2020-05-14 Dual Geth - Grafana

The block heights are 6.95M vs 7.22M (in favour of this PR), both started at 5.05M

holiman (Contributor Author) commented May 15, 2020

Preparing a third run now, where I've re-enabled the old prefetcher, which executes the next block on the current state. The hope is that this addresses the regression on snapshot_account_read and snapshot_storage_read.

holiman (Contributor Author) commented May 15, 2020

Third run in progress

May 15 14:58:43 bench01.ethdevops.io geth INFO [05-15|12:58:43.545] Generated state snapshot accounts=26358822 slots=41225212 storage=4.22GiB elapsed=2h22m45.514s
May 15 14:58:49 bench01.ethdevops.io geth INFO [05-15|12:58:49.194] Imported new chain segment blocks=56 txs=7633 mgas=374.765 elapsed=8.045s mgasps=46.579 number=5048448 hash="272e82…614245" age=2y3mo2w dirty=1023.13MiB 
May 15 14:56:13 bench02.ethdevops.io geth INFO [05-15|12:56:13.641] Generated state snapshot accounts=26386458 slots=41276090 storage=4.22GiB elapsed=2h20m10.229s
May 15 14:56:13 bench02.ethdevops.io geth INFO [05-15|12:56:13.910] Imported new chain segment blocks=28 txs=4216 mgas=205.895 elapsed=8.056s mgasps=25.557 number=5048614 hash="81cd35…68d0e6" age=2y3mo2w dirty=1022.68MiB 

Charts: https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&var-exp=bench01&var-master=bench02&var-percentile=50&from=1589547665049&to=now

holiman (Contributor Author) commented May 16, 2020

 May 16 12:44:57 bench01.ethdevops.io geth INFO [05-16|10:44:57.160] Imported new chain segment blocks=62 txs=9586 mgas=418.097 elapsed=8.320s mgasps=50.248 number=5762239 hash="ae28f8…91122e" age=1y11mo2w dirty=1023.24MiB 
| Version | Run | Start time | Start block | End time | End block | Speed |
|---------|-----|------------|-------------|----------|-----------|-------|
| PR v1   | 1   | 15:32:26   | 5049092     | 12:36:15 | 5762219   | 106.3 ms/block |
| master  | 1   | 15:32:00   | 5048322     | 15:03:07 | 5762249   | 118.6 ms/block |
| PR v2   | 2   | 15:55:40   | 5050584     | 12:35:25 | 5762236   | 104.5 ms/block |
| master  | 2   | 15:52:49   | 5048225     | 15:22:40 | 5762229   | 118.5 ms/block |
| PR v3   | 3   | 12:58:49   | 5048448     | 10:44:57 | 5762239   | 109.8 ms/block |
| master  | 3   | 12:56:13   | 5048614     |          |           |       |

Looks like adding back the original prefetcher introduces a regression, although the snapshot reads are now on par with master:
Screenshot_2020-05-16 Dual Geth - Grafana

I can't really explain it -- my best guess is that we're hitting some IO bottleneck: although we improve in some parts, there's an overall higher sluggishness in all IO.

(there's no obvious extra egress from experimental on the last run either)

holiman (Contributor Author) commented May 18, 2020

Comparing run 2 with run 3:

Run 2, blocks 5050253 - 6001565 : https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&from=1589212470448&to=1589312346463

run-2

Run 3, blocks 5056095 - 6000581: https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&from=1589548255204&to=1589652724888

run-3

| op | avg run 2 | avg run 3 | diff |
|----|-----------|-----------|------|
| execution | 37.5 ms | 39.7 ms | +1.8 |
| commit | 15.9 ms | 16.9 ms | +1.0 |
| snapshot storage read | 9.3 ms | 8.2 ms | -1.1 |
| account commit | 8.0 ms | 9.7 ms | +1.7 |
| snapshot account read | 6.4 ms | 5.7 ms | -0.7 |
| storage commit | 6.3 ms | 8.0 ms | +1.7 |
| storage hash | 5.4 ms | 5.8 ms | +0.4 |
| account hash | 3.8 ms | 4.0 ms | +0.2 |
| account update | 3.4 ms | 3.4 ms | - |
| validation | 3.2 ms | 3.3 ms | +0.1 |
| snapshot commit | 2.9 ms | 2.8 ms | -0.1 |
| storage update | 2.5 ms | 2.5 ms | - |
| account read | 573 ns | 0 ns | -0.5 |
| storage read | 454 ns | 0 ns | -0.5 |
| SUM | 105.6 ms | 110.0 ms | |

holiman (Contributor Author) commented May 18, 2020

Looking at the efficiency of the trie prefetcher:
(second run):
Screenshot_2020-05-18 Dual Geth - Grafana
(third run):
Screenshot_2020-05-18 Dual Geth - Grafana(1)

We can see that the second run has 5.18K fetches on average, whereas the third run has 4.94K fetches on average, which I guess is why the storage/account commit operations saw a regression in the third run.

holiman marked this pull request as ready for review May 19, 2020 09:36
holiman (Contributor Author) commented May 19, 2020

Ready for review. Should I disable the 'regular' prefetcher for good, or leave that as a separate thing for later?

holiman (Contributor Author) commented May 24, 2020

Here are the last 6 hours, with both of them synced up to head.
Screenshot_2020-05-24 Dual Geth - Grafana

I'll swap the machines around for a sanity check now

holiman (Contributor Author) commented May 24, 2020

Fourth run

bench02, now with this PR

May 24 19:27:43 bench02.ethdevops.io geth INFO [05-24|17:27:43.527] Generated state snapshot accounts=26356296 slots=41222789 storage=4.22GiB elapsed=2h19m22.077s
May 24 19:27:45 bench02.ethdevops.io geth INFO [05-24|17:27:45.797] Imported new chain segment blocks=17 txs=3209 mgas=134.557 elapsed=8.114s mgasps=16.583 number=5048551 hash="7b8e34…5c0703" age=2y3mo3w dirty=1023.79MiB 
...
May 25 17:01:45 bench02.ethdevops.io geth INFO [05-25|15:01:45.639] Imported new chain segment blocks=69 txs=10013 mgas=460.737 elapsed=8.033s mgasps=57.352 number=5762233 hash="434e0a…bcf21d" age=1y11mo3w dirty=1023.34MiB 

and bench01, with master:

May 24 19:22:35 bench01.ethdevops.io geth INFO [05-24|17:22:34.993] Generated state snapshot accounts=26350530 slots=41211811 storage=4.22GiB elapsed=2h14m18.436s
May 24 19:22:38 bench01.ethdevops.io geth INFO [05-24|17:22:38.724] Imported new chain segment blocks=43 txs=6352 mgas=300.193 elapsed=8.204s mgasps=36.587 number=5047272 hash="38ddff…2e7ed7" age=2y3mo3w dirty=1022.12MiB 
...
May 25 19:09:13 bench01.ethdevops.io geth INFO [05-25|17:09:13.180] Imported new chain segment blocks=58 txs=9057 mgas=398.132 elapsed=8.238s mgasps=48.324 number=5762238 hash="fa34b4…435bd0" age=1y11mo3w dirty=1023.08MiB 
| Version | Run | Start time | Start block | End time | End block | Speed |
|---------|-----|------------|-------------|----------|-----------|-------|
| PR v1   | 1   | 15:32:26   | 5049092     | 12:36:15 | 5762219   | 106.3 ms/block |
| master  | 1   | 15:32:00   | 5048322     | 15:03:07 | 5762249   | 118.6 ms/block |
| PR v2   | 2   | 15:55:40   | 5050584     | 12:35:25 | 5762236   | 104.5 ms/block |
| master  | 2   | 15:52:49   | 5048225     | 15:22:40 | 5762229   | 118.5 ms/block |
| PR v3   | 3   | 12:58:49   | 5048448     | 10:44:57 | 5762239   | 109.8 ms/block |
| master  | 3   | 12:56:13   | 5048614     |          |           |       |
| PR v3   | 4   | 17:27:45   | 5048551     | 15:01:45 | 5762233   | 108.7 ms/block |
| master  | 4   | 17:22:38   | 5047272     | 17:09:13 | 5762238   | 119.9 ms/block |

Q.E.D
:shipit:

holiman (Contributor Author) commented Jun 9, 2020

rebased

holiman force-pushed the improve_updates_2 branch 2 times, most recently from 3bd0b1a to e9edf05 on June 9, 2020 13:58
@@ -1769,6 +1782,7 @@ func (bc *BlockChain) insertChain(chain types.Blocks, verifySeals bool) (int, er
parent = bc.GetHeader(block.ParentHash(), block.NumberU64()-1)
}
statedb, err := state.New(parent.Root, bc.stateCache, bc.snaps)
statedb.UsePrefetcher(bc.triePrefetcher)
Member:

Can we pass the prefetcher pointer to the statedb directly?

Also, I think the triePrefetcher only makes sense if we use the snapshot. If the snapshot is disabled, then we should disable this too.

holiman (Contributor Author):

The bc.triePrefetcher will be nil if it's disabled. The rest of the snapshot-stuff is handled via bc, so it's a bit cumbersome to make the connection directly to statedb.
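A minimal, self-contained sketch of that convention (toy stand-ins, not the actual go-ethereum types): the blockchain owns at most one shared prefetcher, a nil pointer means "disabled", and the statedb just stores whatever it is handed:

```go
package main

import "fmt"

type triePrefetcher struct{} // stand-in for the core/state prefetcher

type StateDB struct{ prefetcher *triePrefetcher }

// UsePrefetcher mirrors the statedb.UsePrefetcher(bc.triePrefetcher) call in
// the diff: a nil argument simply leaves prefetching switched off.
func (s *StateDB) UsePrefetcher(p *triePrefetcher) { s.prefetcher = p }

type BlockChain struct {
	triePrefetcher *triePrefetcher // nil when snapshots/prefetching are disabled
}

func (bc *BlockChain) newStateDB() *StateDB {
	statedb := new(StateDB)
	statedb.UsePrefetcher(bc.triePrefetcher) // harmless no-op when nil
	return statedb
}

func main() {
	withPrefetch := &BlockChain{triePrefetcher: new(triePrefetcher)}
	withoutPrefetch := &BlockChain{} // e.g. snapshot disabled

	fmt.Println(withPrefetch.newStateDB().prefetcher != nil)    // true
	fmt.Println(withoutPrefetch.newStateDB().prefetcher != nil) // false
}
```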

// prefetcher
s.trie = s.db.prefetcher.GetTrie(s.data.Root)
}
if s.trie == nil {
Member:

If we load the data from the trie directly (without hitting the snapshot for some reason), is it OK to reload the path again via the triePrefetcher?

I think it's fine.
(a) Reloading will be super cheap if it's in memory.
(b) Usually we can hit all slots in the snapshot.

holiman (Contributor Author):

I think the only case where we don't use the snapshot is while we're in the process of creating it, which is a temporary headache, and it's OK to load the data twice in that case, since it would also have been warmed up in the cache.

Another similar thing is if we have two identical storage tries, in two separate contracts. We can only "hand out" the trie to one of them, and the second one will have to load from disk/cache. Otherwise, the first one would modify the trie, and the second one would get the modified version.
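A minimal, self-contained sketch of that "hand out only once" behaviour (illustrative names, not the actual prefetcher code): the first account asking for a given storage root gets the prefetched trie and the entry is deleted, so a second account with an identical root has to load its own instance instead of sharing a mutable one:

```go
package main

import "fmt"

type trie struct{ root string } // stand-in for a state.Trie

type prefetcher struct {
	storageTries map[string]*trie // prefetched storage tries, keyed by root
}

// getTrie hands out a prefetched storage trie at most once per root.
func (p *prefetcher) getTrie(root string) *trie {
	if t, ok := p.storageTries[root]; ok {
		delete(p.storageTries, root) // never hand the same instance out twice
		return t
	}
	return nil // caller falls back to loading from disk/cache
}

func main() {
	p := &prefetcher{storageTries: map[string]*trie{"0xroot": {root: "0xroot"}}}
	fmt.Println(p.getTrie("0xroot") != nil) // true: first account gets the prefetched trie
	fmt.Println(p.getTrie("0xroot") != nil) // false: second identical root must load its own
}
```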

trieChanges = append(trieChanges, key)
}
if len(trieChanges) > 0 && s.db.prefetcher != nil && s.data.Root != emptyRoot {
s.db.prefetcher.PrefetchStorage(s.data.Root, trieChanges)
Member:

Perhaps we can cut down the number of slots to be resolved.

If the original value in storage is empty, then we don't need to resolve the path of those slots.

But I have no idea how many slots of this kind we will have (newly created slots in a contract).

holiman (Contributor Author):

Why don't we need to resolve the path if the original value is empty? If we're going to write to it, we'll still need to resolve the path.

var storage map[common.Hash][]byte
if s.db.snap != nil {
Member:

Any particular reason to move this storage initialization?

holiman (Contributor Author):

Yes, we might not need to do it at all, if there are no pending storage changes

holiman (Contributor Author):

That is, if the value == originStorage
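A minimal, self-contained sketch of that lazy initialization (illustrative names, not the actual state_object.go fields): the snapshot storage map is only allocated once the first slot whose pending value actually differs from its original value is seen:

```go
package main

import "fmt"

type slot = string // stand-in for common.Hash

// updateStorage allocates the snapshot storage map lazily: if every pending
// value equals its origin value, no map is ever created.
func updateStorage(pending, origin map[slot]string) map[slot]string {
	var storage map[slot]string // not allocated yet
	for key, value := range pending {
		if value == origin[key] {
			continue // value == originStorage: nothing to write
		}
		if storage == nil {
			storage = make(map[slot]string) // first real change: allocate now
		}
		storage[key] = value
	}
	return storage // stays nil when there were no real changes
}

func main() {
	origin := map[slot]string{"a": "1", "b": "2"}
	fmt.Println(updateStorage(map[slot]string{"a": "1"}, origin))           // no change: nil map
	fmt.Println(updateStorage(map[slot]string{"a": "9", "b": "2"}, origin)) // map[a:9]
}
```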

s.prefetcher.Pause()
// We only want to do this _once_, if someone calls IntermediateRoot again,
// we shouldn't fetch the trie again
if s.originalRoot != (common.Hash{}) {
Member:

Perhaps we can do a similar thing as with the storage trie: whenever we retrieve the resolved trie from the prefetcher, mark it as nil.

holiman (Contributor Author):

Do you mean to just release the reference for GC, or do you mean for correctness?

Member:

The cached tries should only be accessed once and then thrown away, e.g. the storage trie.
I mean it's the same for the state trie: if we fetch the state trie from the prefetcher, then mark it as nil
(although the current code does the same thing).

holiman (Contributor Author) commented Jul 6, 2020

I addressed some of the concerns; I need to think a bit more about the remaining ones and do a rebase.

holiman (Contributor Author) commented Dec 17, 2020

Wow, this PR is really shining after a few days
Screenshot_2020-12-17 Dual Geth - Grafana

holiman (Contributor Author) commented Dec 19, 2020

Here's a later segment, when neither of the nodes was serving a lot of data to other peers:
Screenshot_2020-12-19 Dual Geth - Grafana

Note: the scales are different; this PR is ~50ms faster per block.

holiman (Contributor Author) commented Dec 28, 2020

Running a new benchmark now:

  • bench03: a modified version of this PR, which splits account prefetching and storage prefetching into two separate goroutines.
  • bench04: this PR.

Both were just stopped from an earlier run and not wiped, so they should continue where they left off.

holiman (Contributor Author) commented Jan 4, 2021

I made an alternative version here: https://github.com/holiman/go-ethereum/tree/improve_updates_3 -- it splits the account fetching and storage fetching into two separate goroutines. This PR and that branch are running on the benchmarkers now, and they're getting about the same performance stats -- the same number of deliveries. Both work fine; it's rather a question of which implementation is 'nicest'. The latter one is perhaps less hacky, but has a bit more code.
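A very rough sketch of that split (illustrative structure only; "pausing" is modelled simply as closing the feed channel): account and storage prefetching run as two separate goroutines fed by separate channels, so they can be stopped independently, which is what enables the phased hashing discussed further down:

```go
package main

import "sync"

type splitPrefetcher struct {
	accountCh chan []byte // account keys to warm up
	storageCh chan []byte // storage slot keys to warm up
	wg        sync.WaitGroup
}

func newSplitPrefetcher() *splitPrefetcher {
	p := &splitPrefetcher{
		accountCh: make(chan []byte, 1024),
		storageCh: make(chan []byte, 1024),
	}
	p.wg.Add(2) // register both workers before launching them
	go p.loop(p.accountCh) // account worker
	go p.loop(p.storageCh) // storage worker
	return p
}

func (p *splitPrefetcher) loop(ch <-chan []byte) {
	defer p.wg.Done()
	for key := range ch {
		_ = key // resolve the key's trie path here (omitted in this sketch)
	}
}

// In this sketch, pausing a worker just closes its feed channel; the real
// code would need a resumable pause, this only illustrates the split.
func (p *splitPrefetcher) PauseAccounts() { close(p.accountCh) }
func (p *splitPrefetcher) PauseStorage()  { close(p.storageCh) }

func main() {
	p := newSplitPrefetcher()
	p.storageCh <- []byte("storage slot")
	p.PauseStorage() // storage hashing could start here
	p.accountCh <- []byte("account address")
	p.PauseAccounts() // account hashing could start here
	p.wg.Wait()
}
```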

tp := state.NewTriePrefetcher(bc.stateCache)

go func() {
bc.wg.Add(1)
Contributor:

Please call Add in the outer goroutine instead of the one that's being launched.
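A minimal sketch of the point being made here: sync.WaitGroup.Add must run in the launching goroutine, before the `go` statement, otherwise a concurrent Wait can observe a zero counter and return while the worker is still about to register itself:

```go
package main

import "sync"

func main() {
	var wg sync.WaitGroup

	// Racy pattern (as in the quoted diff): Add runs inside the new goroutine,
	// so Wait may return before Add has executed.
	//
	//   go func() {
	//       wg.Add(1)
	//       defer wg.Done()
	//       // ... work ...
	//   }()

	// Safe pattern: Add in the outer goroutine, before launching the worker.
	wg.Add(1)
	go func() {
		defer wg.Done()
		// ... work, e.g. the prefetcher loop ...
	}()

	wg.Wait()
}
```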

root: common.Hash{},
}
// Wait for it
<-p.deliveryCh
holiman (Contributor Author):

Pass in a chan with the cmd instead
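A rough sketch of that note (hypothetical types, not the actual trie_prefetcher code): instead of a shared p.deliveryCh, every command carries its own reply channel, so responses cannot get mixed up between concurrent requests:

```go
package main

import "fmt"

type command struct {
	root  string      // which trie is requested (stand-in for common.Hash)
	reply chan string // per-request delivery channel (stand-in for chan Trie)
}

// prefetcherLoop answers each command on the channel embedded in the command
// itself, rather than on one shared delivery channel.
func prefetcherLoop(cmdCh <-chan command) {
	for cmd := range cmdCh {
		cmd.reply <- "trie for " + cmd.root // deliver directly to the requester
	}
}

func main() {
	cmdCh := make(chan command)
	go prefetcherLoop(cmdCh)

	reply := make(chan string, 1)
	cmdCh <- command{root: "0xabc", reply: reply}
	fmt.Println(<-reply)
	close(cmdCh)
}
```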

Comment on lines 349 to 261
if storageTrie, ok := p.storageTries[root]; ok {
	// Two accounts may well have the same storage root, but we cannot allow
	// them both to make updates to the same trie instance. Therefore,
	// we need to either delete the trie now, or deliver a copy of the trie.
	delete(p.storageTries, root)
	return storageTrie
}
holiman (Contributor Author):

@karalabe I've been thinking about this. The current Trie implementation already copies on modifications, so if we just copy the trie root node before delivering it, I think it would be fine if two accounts have the same storage root and obtain the "same" trie.
If we were to use a less copy-happy trie implementation, we'd have to do the deletion here, but we're not.
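A minimal sketch of that copy-on-deliver alternative (hypothetical helper names, not the actual geth code): instead of deleting the map entry, a copy of the trie is handed out, so two accounts sharing a storage root each get an independent instance and cannot see each other's modifications:

```go
package main

import "fmt"

type trie struct{ root string } // stand-in for a state.Trie

// copyTrie stands in for db.CopyTrie; since the real trie copies nodes on
// modification anyway, copying just the root here is cheap.
func copyTrie(t *trie) *trie {
	cp := *t
	return &cp
}

type prefetcher struct {
	storageTries map[string]*trie
}

func (p *prefetcher) getTrie(root string) *trie {
	if t, ok := p.storageTries[root]; ok {
		return copyTrie(t) // keep the original; every caller gets its own copy
	}
	return nil
}

func main() {
	p := &prefetcher{storageTries: map[string]*trie{"0xroot": {}}}
	a, b := p.getTrie("0xroot"), p.getTrie("0xroot")
	fmt.Println(a != nil, b != nil, a != b) // true true true: independent copies
}
```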

Member:

True, though I don't expect to see a gain (i.e. the chance of two tries being the same is more of a deliberate attack than a thing that would normally happen). Still, the code would get cleaner.

@@ -808,6 +846,10 @@ func (s *StateDB) IntermediateRoot(deleteEmptyObjects bool) common.Hash {
obj.updateRoot(s.db)
s.updateStateObject(obj)
}
usedAddresses = append(usedAddresses, addr)
}
holiman (Contributor Author):

If we were to use two separate routines for accounts/storage respectively, and had two separate Pause functions, we could do it in separate phases:

  1. Pause the storage worker,
  2. Do the storage hashing
  3. Pause the account worker
  4. Do the account hashing

It would look something like this:

	// Update the root of all stateobjects in pending
	for addr := range s.stateObjectsPending {
		obj := s.stateObjects[addr]
		if !obj.deleted {
			obj.updateRoot(s.db)
		}
	}
	if s.prefetcher != nil {
		s.prefetcher.PauseAccounts()
		if trie := s.prefetcher.GetTrie(s.originalRoot); trie != nil {
			s.trie = trie
		}
	}
	// Now update the account trie
	for addr := range s.stateObjectsPending {
		obj := s.stateObjects[addr]
		if obj.deleted {
			s.deleteStateObject(obj)
		} else {
			s.updateStateObject(obj)
		}
		usedAddresses = append(usedAddresses, addr)
	}

holiman (Contributor Author):

Here's a working implementation, perhaps still with some rough edges: https://github.com/holiman/go-ethereum/pull/23/files

holiman (Contributor Author) commented Jan 10, 2021:

If the storage hashing takes ~5-10ms of mostly non-disk-accessing processing, then an additional 5-10ms given to the prefetcher for the remaining accounts might improve the account trie performance.

EDIT: on more recent blocks, storage hash is closer to 15ms.

Member:

I'm wondering whether we should go all in and have one prefetcher per trie (root). That would maximize concurrency and make it more generic, in that there's no account/storage distinction. Will try to have a stab at it. Maybe we should prep a new database with the snapshots generated.
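A minimal sketch of that "one prefetcher per trie root" direction (hypothetical names, not the final implementation): a map of independent subfetchers keyed by root, so account and storage tries are treated uniformly and each root gets its own goroutine:

```go
package main

import "fmt"

type subfetcher struct {
	root  string
	tasks chan []byte
}

func newSubfetcher(root string) *subfetcher {
	sf := &subfetcher{root: root, tasks: make(chan []byte, 128)}
	go sf.loop()
	return sf
}

func (sf *subfetcher) loop() {
	for key := range sf.tasks {
		_ = key // resolve the key's path in the trie identified by sf.root (omitted)
	}
}

type triePrefetcher struct {
	fetchers map[string]*subfetcher // one concurrent fetcher per trie root
}

// prefetch schedules keys on the subfetcher for the given root, spawning a
// dedicated goroutine the first time a root is seen.
func (p *triePrefetcher) prefetch(root string, keys [][]byte) {
	sf, ok := p.fetchers[root]
	if !ok {
		sf = newSubfetcher(root)
		p.fetchers[root] = sf
	}
	for _, key := range keys {
		sf.tasks <- key
	}
}

func main() {
	p := &triePrefetcher{fetchers: make(map[string]*subfetcher)}
	p.prefetch("accountRoot", [][]byte{[]byte("addr1")})
	p.prefetch("storageRoot", [][]byte{[]byte("slot1")})
	fmt.Println(len(p.fetchers)) // 2: one subfetcher per distinct root
}
```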

karalabe (Member) commented:

The last run, with a bit of extended (expensive) stats:

  • A fairly large number of accounts and storage slots are loaded twice (i.e. in multiple transactions). I'm unsure whether it's worth deduplicating these vs. "just reloading" them, since the latter only iterates the trie path and detects that there's nothing to load anyway, vs. maintaining a hash set of all paths seen.
  • The "wastes" chart shows how many slots we load that are not needed in the end (e.g. one transaction changes a slot and a later one reverts it); or within a single tx a slot is written and then reverted, which I think is currently not detected due to some very weird corner case. We might accept this 5% overhead or we might want to optimize it a bit further. The measurement should be dropped from the final code since it's possibly expensive (a sketch of how such a measurement could be gathered follows after the chart below).

Screenshot from 2021-01-11 09-43-15
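As referenced in the list above, a sketch of how the duplicate and waste numbers could be gathered (illustrative only, and as noted the waste tracking is likely too expensive to keep in the final code): the fetcher remembers every key it has scheduled, and at delivery time the state writer marks which keys were actually used.

```go
package main

import "fmt"

type fetchStats struct {
	seen map[string]struct{} // every key scheduled for prefetching
	dups int                 // keys scheduled more than once (multiple txs)
	used map[string]struct{} // keys the block actually ended up needing
}

func newFetchStats() *fetchStats {
	return &fetchStats{seen: map[string]struct{}{}, used: map[string]struct{}{}}
}

func (s *fetchStats) schedule(key string) {
	if _, ok := s.seen[key]; ok {
		s.dups++ // would be filtered out by a dedup set
		return
	}
	s.seen[key] = struct{}{}
	// ... actually resolve the trie path here ...
}

func (s *fetchStats) markUsed(key string) { s.used[key] = struct{}{} }

// waste counts slots that were prefetched but never needed at commit time.
func (s *fetchStats) waste() int { return len(s.seen) - len(s.used) }

func main() {
	s := newFetchStats()
	s.schedule("slotA")
	s.schedule("slotA") // loaded twice across two transactions
	s.schedule("slotB") // written, then reverted: never needed at commit
	s.markUsed("slotA")
	fmt.Println(s.dups, s.waste()) // 1 duplicate, 1 wasted slot
}
```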

karalabe (Member) commented:

The account and storage updates indeed drop as expected.

Screenshot from 2021-01-11 09-52-36

However, it's important to keep in mind that account and storage reads from the snapshot bump upwards, since they are racing for resources against the prefetcher. Not sure if we can somehow prioritize the snapshot, or if it's even worth it. This might be a counterargument against preloading on multiple threads, since each would be a hit vs. execution.

Screenshot from 2021-01-11 09-53-04

The account and storage commits also increased in this last run. This might however be due to the extra overhead in maintaining and calculating the "wasted" slots. Would be interesting to see a run with this feature disabled eventually, just to put a number on it.

Screenshot from 2021-01-11 09-53-16

holiman (Contributor Author) left a comment:

Mainly LGTM; I might need some more time with it to fully understand everything new.


// If there's a prefetcher running, make an inactive copy of it that can
// only access data but does not actively preload (since the user will not
// know that they need to explicitly terminate an acive copy).
holiman (Contributor Author):

acive -> active

storageSkipMeter: p.storageSkipMeter,
storageWasteMeter: p.storageWasteMeter,
}
// If the prefetcher is alreacy a copy, duplicate the data
holiman (Contributor Author):

alreacy -> already

@@ -49,49 +37,114 @@ var (
type triePrefetcher struct {
db Database // Database to fetch trie nodes through
root common.Hash // Root hash of theaccount trie for metrics
fetches map[common.Hash]Trie // Partially or fully fetcher tries
holiman (Contributor Author):

It's quite easy to misread fetches and fetchers when skimming through the code.

Comment on lines +282 to +286
trie, err := sf.db.OpenTrie(sf.root)
if err != nil {
	log.Warn("Trie prefetcher failed opening trie", "root", sf.root, "err", err)
	return
}
holiman (Contributor Author):

If, for some reason, we are missing a root, I think the effect will be that the subfetcher fails to start (and logs a warning), and later on, when we try to peek, nothing is listening on the copy chan, and we'll simply block forever.

I don't have a good idea on how to handle it differently, just making a note of it for now...

Well, I guess one idea would be to, instead of returning here, have a loop which just listens on the channels, doesn't actually do any work, and always delivers nil if data is requested.
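A rough sketch of that fallback idea (names modelled loosely on the quoted snippet, but this is not the actual subfetcher code): if the root cannot be opened, the loop keeps serving the control channels and answers every copy request with nil, so peek() never blocks forever:

```go
package main

import (
	"errors"
	"fmt"
)

type Trie interface{} // stand-in for the state.Trie interface

var errMissingRoot = errors.New("missing trie root")

type subfetcher struct {
	copy chan chan Trie // peek() sends a reply channel on this
	stop chan struct{}
	term chan struct{} // closed when the loop exits
}

func (sf *subfetcher) loop(openTrie func() (Trie, error)) {
	defer close(sf.term)

	trie, err := openTrie()
	if err != nil {
		// Root could not be opened: instead of returning (and leaving peek()
		// blocked), keep draining the control channels and deliver nil.
		for {
			select {
			case ch := <-sf.copy:
				ch <- nil
			case <-sf.stop:
				return
			}
		}
	}
	_ = trie
	// ... the normal prefetch loop would go here ...
}

func main() {
	sf := &subfetcher{
		copy: make(chan chan Trie),
		stop: make(chan struct{}),
		term: make(chan struct{}),
	}
	go sf.loop(func() (Trie, error) { return nil, errMissingRoot })

	ch := make(chan Trie)
	sf.copy <- ch            // what peek() does
	fmt.Println(<-ch == nil) // true: the caller gets nil instead of blocking
	close(sf.stop)
}
```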

Comment on lines +251 to +253
case sf.copy <- ch:
// Subfetcher still alive, return copy from it
return <-ch
holiman (Contributor Author):

My previous comment, about being blocked on the peek -- it might be possible to handle that by doing the same check here as is done a few lines below:

		if sf.trie == nil {
			return nil
		}

Because if the root could not be opened, then sf.trie will be nil.

Comment on lines +248 to +262
func (sf *subfetcher) peek() Trie {
	ch := make(chan Trie)
	select {
	case sf.copy <- ch:
		// Subfetcher still alive, return copy from it
		return <-ch

	case <-sf.term:
		// Subfetcher already terminated, return a copy directly
		if sf.trie == nil {
			return nil
		}
		return sf.db.CopyTrie(sf.trie)
	}
}
holiman (Contributor Author):

So maybe this would be more 'safe'

Suggested change -- from:

	func (sf *subfetcher) peek() Trie {
		ch := make(chan Trie)
		select {
		case sf.copy <- ch:
			// Subfetcher still alive, return copy from it
			return <-ch
		case <-sf.term:
			// Subfetcher already terminated, return a copy directly
			if sf.trie == nil {
				return nil
			}
			return sf.db.CopyTrie(sf.trie)
		}
	}

to:

	func (sf *subfetcher) peek() Trie {
		if sf.trie == nil {
			return nil
		}
		ch := make(chan Trie)
		select {
		case sf.copy <- ch:
			// Subfetcher still alive, return copy from it
			return <-ch
		case <-sf.term:
			// Subfetcher already terminated, return a copy directly
			return sf.db.CopyTrie(sf.trie)
		}
	}

holiman (Contributor Author):

Ah wait, you always close the term channel when exiting the loop, so it won't actually block here. Never mind the noise then!

holiman (Contributor Author) commented Jan 20, 2021

LGTM!!

holiman and others added 2 commits January 21, 2021 01:46
Squashed from the following commits:

core/state: lazily init snapshot storage map
core/state: fix flawed meter on storage reads
core/state: make statedb/stateobjects reuse a hasher
core/blockchain, core/state: implement new trie prefetcher
core: make trie prefetcher deliver tries to statedb
core/state: refactor trie_prefetcher, export storage tries
blockchain: re-enable the next-block-prefetcher
state: remove panics in trie prefetcher
core/state/trie_prefetcher: address some review concerns

sq