Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd, core/state, eth, tests, trie: improve state reader #27428

Merged
merged 4 commits into from
Jun 20, 2023

Conversation

rjl493456442
Copy link
Member

@rjl493456442 rjl493456442 commented Jun 6, 2023

The state availability is checked when a state reader is requested to be created.

  • In hash-based database, if the specified root node is not existent in disk, then
    the state reader won't be created and an error will be returned.

  • In path-based database, if the specified state layer is not available, then the
    state reader won't be created and an error will be returned.

And also this PR emphasize a notion that: once the statedb instance is committed,
it's not usable. A new instance must be created with new root upon the updated
database in order to access full state correctly. This behavior is aligned with underlying
trie.

@rjl493456442 rjl493456442 force-pushed the fix-state-commit-2 branch 2 times, most recently from 5d9bb4a to a94dc6c Compare June 6, 2023 08:11
@rjl493456442 rjl493456442 marked this pull request as ready for review June 6, 2023 08:11
core/state/statedb_test.go Show resolved Hide resolved
trie/trie_reader.go Outdated Show resolved Hide resolved
trie/triedb/hashdb/database.go Outdated Show resolved Hide resolved
@rjl493456442 rjl493456442 force-pushed the fix-state-commit-2 branch 2 times, most recently from dc57f2b to c9f369b Compare June 14, 2023 02:44
//
// Once the state is committed, it's not usable anymore. A new state instance
// must be created with new root and updated database for accessing latest
// states.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a big problem for me, as this means calling OpenTrie, which becomes an expensive operation with verkle. Is there a way that the trie could be reused?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case, we always re-create a new state-db after processing a block, even now.

Re reusing the trie, it will be slightly ugly. Perhaps we can introduce some cache layer above the trie database which caches the decoded nodes. So that reloading trie node can be cheap?

@holiman
Copy link
Contributor

holiman commented Jun 15, 2023

For me, it's still a bit unclear what Once the state is committed, it's not usable anymore means, in practice.

For example, cmd/evm/runner.go does this:

	if ctx.Bool(DumpFlag.Name) {
		statedb.Commit(true)
		statedb.IntermediateRoot(true)
		fmt.Println(string(statedb.Dump(nil)))
	}

Is it "wrong"? Is it relying on undefined behaviour but is ok? Will it misbehave?

And in tests/state_test_util.go, we return a statedb which has already committed:

	root, _ := statedb.Commit(config.IsEIP158(block.Number()))
	return snaps, statedb, root, err
}

Later on, that statedb is used, for intermediate root and rawdumping:

			_, s, err := test.Run(st, cfg, false)
			// print state root for evmlab tracing
			if s != nil {
				root := s.IntermediateRoot(false)
..
				if dump && s != nil {
					dump := s.RawDump(nil)

And in cmd/evm/internal/t8ntool/execution.go, we commit and later ask for logs

	root, err := statedb.Commit(chainConfig.IsEIP158(vmContext.BlockNumber))
	if err != nil {
		return nil, nil, NewError(ErrorEVM, fmt.Errorf("could not commit state: %v", err))
	}
	execRs := &ExecutionResult{
...
		LogsHash:    rlpHash(statedb.Logs()),

We do some funky commit then commit in core/chain_makers.go

			// Write state changes to db
			root, err := statedb.Commit(config.IsEIP158(b.header.Number))
			if err != nil {
				panic(fmt.Sprintf("state write error: %v", err))
			}
			if err := statedb.Database().TrieDB().Commit(root, false); err != nil {
				panic(fmt.Sprintf("trie write error: %v", err))
			}

So, TLDR is that it needs to be more obvious what is allowed vs what is not allowed, and potentially we should even make it explicit: e.g. we could panic if the api is misused (or maybe, in production code, print a very stern warning is better than panic)

Comment on lines +42 to +51
func newTrieReader(stateRoot, owner common.Hash, db *Database) (*trieReader, error) {
if stateRoot == (common.Hash{}) || stateRoot == types.EmptyRootHash {
if stateRoot == (common.Hash{}) {
log.Error("Zero state root hash!")
}
return &trieReader{owner: owner}, nil
}
reader, err := db.Reader(stateRoot)
if err != nil {
return nil, &MissingNodeError{Owner: owner, NodeHash: stateRoot, err: err}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewers: This is the big functional change, as far as I understand. This change makes it so that a Commit-then-Copy leads to a non-functional statedb, since the trieReader does not actually have a reader.

@rjl493456442
Copy link
Member Author

e.g. we could panic if the api is misused (or maybe, in production code, print a very stern warning is better than panic)

I will think about it. I am not a big fan of this approach. But I totally get it. I didn't do it because we need a marker to represent this state-db is committed and following operations should be rejected. It's a big change then.

But now we just assume that users won't use committed state-db, and a part of APIs will still work, and may not work(e.g. a copied state-db from the committed one). It's just too implicit.

Maybe I need to find a line, that this API is not usable after commit, and these APIs can still work.

@holiman
Copy link
Contributor

holiman commented Jun 15, 2023 via email

@rjl493456442
Copy link
Member Author

I'm curious though... Those examples I listed, what is the status on them? Ok or Not?

They can still work..

statedb.IntermediateRoot(true) works because root hash is cached.
dump := s.RawDump(nil) rawdump shouldn't work, I will check it later.
statedb.Logs() logs work.

In another word, only the trie is not functional after commit, if trie is not touched, then it's fine.

@rjl493456442
Copy link
Member Author

@holiman And in tests/state_test_util.go, we return a statedb which has already committed:

The case you mentioned above is actually be fixed by re-initialize a new one here https://github.com/rjl493456442/go-ethereum/blob/fix-state-commit-2/tests/state_test_util.go#L208

@rjl493456442 rjl493456442 force-pushed the fix-state-commit-2 branch 2 times, most recently from c37006c to 20d7fd4 Compare June 19, 2023 03:53
Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review-wise this looks fine to me now, but I haven't tested the code thoroughly to check if there are any new places where it fails now, due to the new error-condition.
Did you run it on any of our bench-machines @rjl493456442 ?

@rjl493456442
Copy link
Member Author

rjl493456442 commented Jun 19, 2023

@holiman deployed on benchmark08, on top of a sync'd node. Let's see if it goes well.

The state availibilty is checked when a state reader is requested
to be created. In hash-based database, if the specified root node
is not existent in disk, then the state reader won't be created
and an error will be returned.
@rjl493456442
Copy link
Member Author

After running it for a few hours on benchmark08, nothing is wrong.

@holiman holiman merged commit 6d2aeb4 into ethereum:master Jun 20, 2023
@rjl493456442 rjl493456442 added this to the 1.12.1 milestone Jun 21, 2023
Tristan-Wilson added a commit to OffchainLabs/go-ethereum that referenced this pull request Oct 27, 2023
statedb must now be recreated after Commit is called on it, see
ethereum/go-ethereum#27428
devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
The state availability is checked during the creation of a state reader.

-    In hash-based database, if the specified root node does not exist on disk disk, then
    the state reader won't be created and an error will be returned.

-    In path-based database, if the specified state layer is not available, then the
    state reader won't be created and an error will be returned.

This change also contains a stricter semantics regarding the `Commit` operation: once it has been performed, the trie is no longer usable, and certain operations will return an error.
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants