-
Notifications
You must be signed in to change notification settings - Fork 20.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core, eth/downloader: fix genesis state missing due to state sync #28124
Conversation
// | ||
// The solution is to reset the state to the genesis state. Although it may not | ||
// match the sync target, the state healer will later address and correct any | ||
// inconsistencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope to add a log here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resetState
will print something. But one thing to mention that by disabling the trie database at the beginning of snap sync, the trie database will be in "uninitialized" status until the state is fully sync'd.
In the next restart, the missing genesis status will be re-inited in SetupGenesisBlockWithOverride
. Therefore, this logic is a bit redundant, but still necessary.
// trie database unusable until the state is fully synced. To prevent any | ||
// subsequent state reads, explicitly disable the trie database and state | ||
// syncer is responsible to address and correct any state missing. | ||
if d.blockchain.TrieDB().Scheme() == rawdb.PathScheme { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trie database and state snapshot are explicitly disabled here, at the beginning of the snap sync procedure.
However, this approach may need reconsideration. During a snap sync procedure, we may initiate multiple state sync cycles with different targets, particularly when the pivot point shifts. It is possible that a state has already been sync'd, and we begin synchronizing another one.
The enabling and disabling of these components should align more cohesively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the fix #28126 , we have the guarantee that the committed state won't be resync'd. In another word, for a snap sync process, we either end up with a complete state and re-enable the trie database and state snapshot for single time or exits abnormally. It's safe to invalidate the trie database and state snapshot only at the beginning of snap sync.
A few things to highlight for reviewers, the trie database and state snapshot will be disabled at the beginning of snap sync, and will only be re-enabled if the state is completely synced. Therefore, If restart occurs before the sync is finished, the trie database will be in the uninitialized status, and results in the re-init genesis operation here https://github.com/ethereum/go-ethereum/blob/master/core/genesis.go#L335. But it's correct, re-init genesis state in the middle of snap sync won't affect the correctness. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
…hereum#28124) * core: fix chain repair corner case in path-based scheme * eth/downloader: disable trie database whenever state sync is launched
…hereum#28124) * core: fix chain repair corner case in path-based scheme * eth/downloader: disable trie database whenever state sync is launched
…sync (ethereum#28124)" This reverts commit 5a51327.
…sync (ethereum#28124)" This reverts commit 5a51327.
eth/downloader: fix genesis state missing due to state sync (ethereum#28124) Conflicts: core/blockchain.go These conflicts were dut to our changes in setHeadBeyondRoot. In the function call upstream only indented the code, but tested Upstream also added to the file an assumption that genesis is block 0, which we removed.
Fixes #28120
This pull request fixes two main flaws introduced by path-based scheme.
a) In the blockchain repair stage, initiate the repair process to restore the missing genesis state if the head block is still the genesis block.
This is indeed a unique and exceptional scenario that can only occur in the path-based scheme. The root cause of this issue is the state syncer overwriting the persistent genesis state, leading to a situation where no complete and usable state is available.
This pull requests fixes this issue by restoring the genesis state. It won't affect the correctness of following state sync, due to the fact that the written genesis state is complete and state healer will address and fix any inconsistencies between the genesis state and sync target.
It's an necessary operation to guarantee the associated state of head block is present and usable, otherwise it will break the assumption in many places which blindly create the state of head block(e.g. txpool).
b) Disable trie database and state snap whenever state sync is launched.
State sync overwrites the persistent state, and this action can disrupt states that reference the current persistent state. The appropriate solution in such a situation is to mark the entire trie database as unusable, which prevents any subsequent state reads, and raises a
layer stale
error if the read operation is done by the prev-created stale state. The trie database will remain in this state until the state sync process is completed, and a usable persistent state is assembled.Besides, these two components will be disabled whenever a new state sync cycle is launched, not just at the beginning of the whole sync process. It's because the complete state will be re-synced because of pivot point moving.