Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trie: remove internal nodes between shortNode and child in path mode #28163

Merged
merged 7 commits into from
Sep 22, 2023

Conversation

rjl493456442
Copy link
Member

@rjl493456442 rjl493456442 commented Sep 20, 2023

This pull requests fix the state healer in path-mode context, by removing the internal disk nodes within the path range occupied by a shortNode, ensuring the guarantee of state healing that each existing sub-trie in disk should be complete.

Although the condition to trigger the issue is super hard, thus we never see the state corruption after snap sync in real.

trie/sync.go Outdated
Comment on lines 460 to 465
// Theoretically, it's necessary to check for the presence before
// blindly caching deletion commands. However, due to the fact that
// Pebble doesn't use a bloom filter to enhance read performance
// for non-existent items, this check would significantly slow down
// overall performance. FIX IT(rjl493456442)
if rawdb.HasTrieNodeInPath(s.database, owner, append(inner, key[:i]...)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Theoretically, it's necessary to check for the presence before
// blindly caching deletion command
...
if rawdb.HasTrieNodeInPath (...

You do check for presence, before deletion, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to make this faster with some range-delete, alternatively use a range-iterator + delete. Because the layout of the keys are incremental, it should be reasonably efficient

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, i just want to see the performance impact, but looks like no big difference..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Range deletions makes a bit more assumptions about the db layout than I'm comfortable with. We can definitely test the performance and see how often this happens to consider if it's worth it. But if there's a way, I'd prefer to be precise vs range delete.

@@ -141,6 +141,24 @@ func DeleteStorageTrieNode(db ethdb.KeyValueWriter, accountHash common.Hash, pat
}
}

// HasTrieNodeInPath checks for the presence of the trie node with the specified
// account hash and node path, regardless of the node hash.
func HasTrieNodeInPath(db ethdb.KeyValueReader, accountHash common.Hash, path []byte) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since for all other operation we have separate methods for Account and Storage TrieNode, I'd recommend making this also separate. HasAccountTrieNodeInPath and HadStorageTrieNodeInPath. Also, do we need the InPath suffix? Isn't that implicit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we already defined HasAccountTrieNode(db ethdb.KeyValueReader, path []byte, hash common.Hash), so..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is basically the same thing, just does a hash check too? Ok, not the same, because it does a Get vs HAs doesn't need to touch the value tables.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we then have?

ExistsAccountTrieNode
ExistsStorageTrieNode

Comment on lines +369 to +372
for path := range s.membatch.deletes {
owner, inner := ResolvePath([]byte(path))
rawdb.DeleteTrieNode(dbw, owner, inner, common.Hash{} /* unused */, s.scheme)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you do the deletions before the Writes? In case we're about to overwrite some parts of the path, the deletions might contain all parts, and we don't want to delete the things we just wrote. Right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also thinking about it. IMO the deletions should only contain the short-key-internal parts, so it should not duplicate any existing new node we've just written.

Copy link
Member Author

@rjl493456442 rjl493456442 Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for each written node and deleted node, the path should be unique. I will add a comment to clarify it.

//
// This step is only necessary for path mode, as there is no deletion
// in hash mode at all.
if _, ok := node.Val.(hashNode); ok && s.scheme == rawdb.PathScheme {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you check against a hashNode? Wouldn't the same issue happen if the child was a valueNode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few possibilities there:

a) the child node is a shortNode, it means the child node is a complete node stored in disk. In this case, we need to cleanup internal nodes to reclaim the ownership of this path.

b) the child node is a valueNode, it is always embedded in the parent. In this case, it means the path is terminated at this shortNode and have no more nodes stored in disk. We don't need to clean up the disk nodes after this shortNode because the path after it is not occupied by us.

c) the child node is an embedded full node(smaller than 32b). It's identical with case b).

// in hash mode at all.
if _, ok := node.Val.(hashNode); ok && s.scheme == rawdb.PathScheme {
owner, inner := ResolvePath(req.path)
for i := 1; i < len(key); i++ {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, we loop over all possible keys and request them from pebble... this is gonna hurt...

Copy link
Member Author

@rjl493456442 rjl493456442 Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, it won't be too much. The target shortNode is the one between two fullNodes in a path, just in case a few elements share a path prefix. And in mainnet, the shared prefix won't be too long(possibly a few nibbles). And for the shortNode contains a value, then the key can be quite long, but it's not our target.

trie/sync.go Outdated
// overall performance. FIX IT(rjl493456442)
if rawdb.HasTrieNodeInPath(s.database, owner, append(inner, key[:i]...)) {
req.deletes = append(req.deletes, key[:i])
log.Info("Detected dangling node", "owner", owner, "path", append(inner, key[:i]...))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be lowered down to Debug in the final code before merge

exists = rawdb.ExistsStorageTrieNode(s.database, owner, append(inner, key[:i]...))
}
if exists {
req.deletes = append(req.deletes, key[:i])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tripped me up. In req.deletes, you store key[:i], which is the key which this shortnode (extension-node) is holding. However, the actual full key is larger, and you use inner to construct the full one:

rawdb.ExistsAccountTrieNode(s.database, append(inner, key[:i]...))

So, a bit later in this code, when going from req.deletes -> membatch.deletes, you convert the "partial paths" into full paths by prefixing:

for _, segment := range req.deletes {
		path := append(req.path, segment...)
		s.membatch.deletes[string(path)] = struct{}{}
		s.membatch.size += uint64(len(path))
	}

I missed the req.deletes -> membatch.deletes conversion the first time around, and couldn't figure out how on earth the snippet which does the deletion could work, using non-full paths:

	for path := range s.membatch.deletes {
		owner, inner := ResolvePath([]byte(path))
		rawdb.DeleteTrieNode(dbw, owner, inner, common.Hash{} /* unused */, s.scheme)
	}

I guess you save a bit of memory by not storing the expanded path here, and recalculating it later. So I can't really object, just noting that it makes it a bit more complex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the intention, yes.

Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@karalabe karalabe added this to the 1.13.2 milestone Sep 22, 2023
@karalabe karalabe merged commit 4773dcb into ethereum:master Sep 22, 2023
1 check passed
tyler-smith pushed a commit to blocknative/go-ethereum that referenced this pull request Oct 12, 2023
…thereum#28163)

* trie: remove internal nodes between shortNode and child in path mode

* trie: address comments

* core/rawdb, trie: address comments

* core/rawdb: delete unused func

* trie: change comments

* trie: add missing tests

* trie: fix lint
tyler-smith pushed a commit to blocknative/go-ethereum that referenced this pull request Oct 12, 2023
…thereum#28163)

* trie: remove internal nodes between shortNode and child in path mode

* trie: address comments

* core/rawdb, trie: address comments

* core/rawdb: delete unused func

* trie: change comments

* trie: add missing tests

* trie: fix lint
siosw pushed a commit to fabriqnetwork/go-ethereum that referenced this pull request Oct 16, 2023
…thereum#28163)

* trie: remove internal nodes between shortNode and child in path mode

* trie: address comments

* core/rawdb, trie: address comments

* core/rawdb: delete unused func

* trie: change comments

* trie: add missing tests

* trie: fix lint
devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
…thereum#28163)

* trie: remove internal nodes between shortNode and child in path mode

* trie: address comments

* core/rawdb, trie: address comments

* core/rawdb: delete unused func

* trie: change comments

* trie: add missing tests

* trie: fix lint
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
maoueh pushed a commit to streamingfast/go-ethereum that referenced this pull request Nov 29, 2023
…thereum#28163)

* trie: remove internal nodes between shortNode and child in path mode

* trie: address comments

* core/rawdb, trie: address comments

* core/rawdb: delete unused func

* trie: change comments

* trie: add missing tests

* trie: fix lint
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants