
core/state: implement fast storage deletion #27955

Merged
merged 7 commits into ethereum:master on Aug 26, 2023

Conversation

rjl493456442 (Member) commented Aug 21, 2023

Deletion of 328239 slots (22.47 MiB of state) in 812.148619ms.
~800ms with this change vs 3.6s before.

holiman (Contributor) commented Aug 21, 2023

Looking at the old code now... OK, so for hashdb, it first iterates the storage trie and collects it into the set variable: aborted, slots, set, err := s.deleteStorage(addr, addrHash, prev.Root)

The set is then merged into nodes:

		if err := nodes.Merge(set); err != nil {
			return nil, err
		}

Other changes are merged into the same nodes, and eventually we update the triedb:

		if err := s.db.TrieDB().Update(root, origin, block, nodes, triestate.New(s.accountsOrigin, s.storagesOrigin, incomplete)); err != nil {
			return common.Hash{}, err
		}

Eventually it reaches Update, which does:

	for _, owner := range order {
		subset := nodes.Sets[owner]
		subset.ForEachWithOrder(func(path string, n *trienode.Node) {
			if n.IsDeleted() {
				return // ignore deletion
			}

So it just ignores all deletions in the end. Was all the work we did there simply discarded, or was there some period where it was used in a memory capacity and we just discard the actual disk-write?

If we indeed just iterate/collect the storage nodes only to ignore them in hash mode, it seems we could optimize that particular pipeline earlier, and ignore the iteration too?
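A rough sketch of what skipping that iteration earlier could look like (the scheme string and helper shape below are illustrative assumptions, not the PR's actual diff):

```go
// Illustrative only: skip the storage-trie walk when the backend cannot use
// the result anyway. "hash" stands for the hash-based scheme; collect is the
// expensive iteration that gathers the nodes to delete.
func storageNodesToDelete(scheme string, collect func() ([]string, error)) ([]string, error) {
	if scheme == "hash" {
		// Hash mode never deletes trie nodes and Update drops deletion
		// markers, so collecting them is wasted work.
		return nil, nil
	}
	return collect() // path mode: really gather the deletion set
}
```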

holiman (Contributor) commented Aug 21, 2023

Some more thoughts. Wouldn't this be a behavior-preserving refactoring?

The original code collects the keys first, but it doesn't actually do anything with them, like sorting; it just iterates them again. It looks like the only reason for the first collection is to filter out the empty element, but that can easily be done in the second loop. Am I missing something?

diff --git a/trie/triedb/hashdb/database.go b/trie/triedb/hashdb/database.go
index b3ae54dbe3..e00b082a65 100644
--- a/trie/triedb/hashdb/database.go
+++ b/trie/triedb/hashdb/database.go
@@ -587,18 +587,10 @@ func (db *Database) Update(root common.Hash, parent common.Hash, block uint64, n
 	//
 	// Note, the storage tries must be flushed before the account trie to
 	// retain the invariant that children go into the dirty cache first.
-	var order []common.Hash
-	for owner := range nodes.Sets {
+	for owner, subset := range nodes.Sets {
 		if owner == (common.Hash{}) {
 			continue
 		}
-		order = append(order, owner)
-	}
-	if _, ok := nodes.Sets[common.Hash{}]; ok {
-		order = append(order, common.Hash{})
-	}
-	for _, owner := range order {
-		subset := nodes.Sets[owner]
 		subset.ForEachWithOrder(func(path string, n *trienode.Node) {
 			if n.IsDeleted() {
 				return // ignore deletion
@@ -609,6 +601,12 @@ func (db *Database) Update(root common.Hash, parent common.Hash, block uint64, n
 	// Link up the account trie and storage trie if the node points
 	// to an account trie leaf.
 	if set, present := nodes.Sets[common.Hash{}]; present {
+		set.ForEachWithOrder(func(path string, n *trienode.Node) {
+			if n.IsDeleted() {
+				return // ignore deletion
+			}
+			db.insert(n.Hash, n.Blob)
+		})
 		for _, n := range set.Leaves {
 			var account types.StateAccount
 			if err := rlp.DecodeBytes(n.Blob, &account); err != nil {

rjl493456442 (Member, Author) replied:

> So it just ignores all deletions in the end. Was all the work we did there simply discarded, or was there some period where it was used in a memory capacity and we just discard the actual disk-write?

Yes, deletion is not supported in hash mode, so all that work is totally useless there; it was only done to align with path mode.

> and ignore the iteration too?

I added this logic in this PR.

holiman (Contributor) commented Aug 21, 2023

> I added this logic in this PR.

Right, gotcha! I'm a couple of steps behind here :)

rjl493456442 (Member, Author) replied:

> It looks like the only reason for the first collection is to filter out the empty element, but that can easily be done in the second loop. Am I missing something?

Yep, basically we need to flush the storage trie(s) first, and then the account trie. I think the refactor is logically OK, but I would prefer not to change it, because the last time I changed the logic here it resulted in a big bug in this part :)
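For context, the ordering invariant being protected here can be sketched like this (a simplified stand-in using a plain map keyed by owner hash, where the zero hash is the account trie; not the real trienode.MergedNodeSet API):

```go
// Simplified illustration of the flush order discussed above: storage tries
// (non-zero owners) are written into the dirty cache first, and the account
// trie (zero owner) last, so children exist before the account nodes that
// reference them.
func flushOrdered(sets map[[32]byte][]string, flush func(owner [32]byte, nodes []string)) {
	var zero [32]byte
	for owner, nodes := range sets {
		if owner == zero {
			continue // handle the account trie last
		}
		flush(owner, nodes)
	}
	if nodes, ok := sets[zero]; ok {
		flush(zero, nodes)
	}
}
```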

rjl493456442 (Member, Author) commented:

Running a full sync on sepolia to ensure nothing is broken.

rjl493456442 marked this pull request as ready for review August 22, 2023 07:03
Comment on lines +1063 to +1065
	if stack.Hash() != root {
		return false, 0, nil, nil, fmt.Errorf("snapshot is not matched, exp %x, got %x", root, stack.Hash())
	}
Contributor:
Suggested change:
	if it.Err() != nil {
		return false, 0, nil, nil, it.Err()
	}
	if stack.Hash() != root {
		return false, 0, nil, nil, fmt.Errorf("snapshot is not matched, exp %x, got %x", root, stack.Hash())
	}

Contributor:

In theory, both it.Next() and it.Hash() can internally error, so we should check it.Err() after each invocation. If we fail to do so, and ignore an error from it.Hash(), and subsequently call it.Next(), then we will be rewarded with panic(fmt.Sprintf("called Next of failed iterator: %v", it.fail))

But I guess it should be enough to have one check after the loop, like in my comment, and another check right after slots[iter.Hash()] = slot?

Member Author:

Technically we need to check for failure after each iteration; however, I guess it's unnecessary, since the cached error won't be cleared anyway.

And the reason I didn't add the iterator error check here is: if we encounter any failure in the iterator, the stackTrie will produce a different root hash anyway.

But yeah, I will add one check right after the loop. It's cheap anyway and would help bubble up the "real issue".
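A compact sketch of that agreed-upon pattern (the iterator interface below is a simplified stand-in for the snapshot storage iterator, not the real type): the cached error is checked right after the values are read inside the loop, and once more after the loop, before the stack-trie root is compared.

```go
// Simplified stand-in for the snapshot storage iterator discussed above.
type storageIter interface {
	Next() bool
	Hash() [32]byte
	Slot() []byte
	Err() error
}

// drainSlots shows where the two checks land: one right after using
// Hash()/Slot(), and one after the loop, so an iteration failure surfaces as
// an error rather than a later panic or a confusing root mismatch.
func drainSlots(it storageIter) (map[[32]byte][]byte, error) {
	slots := make(map[[32]byte][]byte)
	for it.Next() {
		slot := append([]byte(nil), it.Slot()...)
		slots[it.Hash()] = slot
		if err := it.Err(); err != nil {
			return nil, err
		}
	}
	if err := it.Err(); err != nil {
		return nil, err
	}
	return slots, nil
}
```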

Contributor:

> And the reason I didn't add the iterator error check here is: if we encounter any failure in the iterator, the stackTrie will produce a different root hash anyway.

Ah, no: as I pointed out, we won't get a different root; we will hit a panic instead.

holiman (Contributor) left a comment:

I added a test as well -- maybe it was superfluous, I couldn't really tell. Anyway, LGTM!

holiman added this to the 1.13.0 milestone Aug 23, 2023
holiman merged commit 3ff6b3c into ethereum:master Aug 26, 2023
1 of 2 checks passed
devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
This change implements faster post-selfdestruct iteration of storage slots for deletion, by using snapshot storage + stacktrie to recover the trie nodes to be deleted. This mechanism is only implemented for the path-based scheme.

For the hash-based scheme, the entire post-selfdestruct storage iteration is skipped with this change, since the hash-based scheme does not actually perform deletion anyway.

---------

Co-authored-by: Martin Holst Swende <martin@swende.se>
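A schematic of that mechanism with simplified stand-in types (not go-ethereum's actual API): slots are read from the flat snapshot in key order and replayed into a stack trie; the stack trie's node-commit callback (not shown here) marks every finalized node path as deleted, and the reconstructed root is checked against the account's storage root as a sanity check.

```go
import "fmt"

// Simplified stand-ins; the real code uses the snapshot storage iterator and
// go-ethereum's stack trie.
type stackTrieSketch interface {
	Update(key, value []byte) error // keys must be fed in ascending order
	Hash() [32]byte                 // finalizes remaining nodes, returns root
}

type slotKV struct {
	key [32]byte // hashed slot key; snapshot order equals trie insertion order
	val []byte   // slot value
}

// replayForDeletion drives the stack trie with the snapshot slots of a
// self-destructed account and verifies the reconstructed root, so the
// deletion markers emitted by the stack trie's commit callback can be trusted.
func replayForDeletion(slots []slotKV, st stackTrieSketch, storageRoot [32]byte) error {
	for _, s := range slots {
		if err := st.Update(s.key[:], s.val); err != nil {
			return err
		}
	}
	if got := st.Hash(); got != storageRoot {
		return fmt.Errorf("snapshot is not matched, exp %x, got %x", storageRoot, got)
	}
	return nil
}
```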
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
maoueh pushed a commit to streamingfast/go-ethereum that referenced this pull request Nov 29, 2023