-
Notifications
You must be signed in to change notification settings - Fork 840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add experimental --Xsnapsync-bft-enabled
which enables snap sync for BFT chains
#7140
Conversation
d9240ee
to
4002539
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM. It is probably worth having @matkt have a look also, but this behavior seems well contained for *bft. We might want to push the "snap-sync minimum distance" up into PivotSelectorFromPeers. That is a worthwhile distinction to make on any consensus mechanism imo
// pivot distance | ||
if (bestPeer.get().chainState().getEstimatedHeight() | ||
<= syncConfig.getFastSyncPivotDistance()) { | ||
throw new NoSyncRequiredException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥇 This alone could solve a majority of the snap at genesis problems. I think griefing could be an issue, but for a private chain, a bootstrap node should help solve issues of only finding peers with no chain history.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just not sure why we need snapsync for a new chain. a fullsync will work well
&& validatorProvider | ||
.getValidatorsAtHead() | ||
.contains(Util.publicKeyToAddress(nodeKey.getPublicKey()))) { | ||
throw new NoSyncRequiredException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
// pivot distance | ||
if (bestPeer.get().chainState().getEstimatedHeight() | ||
<= syncConfig.getFastSyncPivotDistance()) { | ||
throw new NoSyncRequiredException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just not sure why we need snapsync for a new chain. a fullsync will work well
@@ -80,7 +80,7 @@ public CompletableFuture<FastSyncState> start() { | |||
if (!running.compareAndSet(false, true)) { | |||
throw new IllegalStateException("SyncDownloader already running"); | |||
} | |||
LOG.info("Starting sync"); | |||
LOG.info("Starting fast sync"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not clean but this class is also used by snapsync. I think it's why we removed the fast
...um/eth/src/main/java/org/hyperledger/besu/ethereum/eth/sync/fastsync/FastSyncDownloader.java
Outdated
Show resolved
Hide resolved
...um/eth/src/main/java/org/hyperledger/besu/ethereum/eth/sync/fastsync/FastSyncDownloader.java
Outdated
Show resolved
Hide resolved
...um/eth/src/main/java/org/hyperledger/besu/ethereum/eth/sync/snapsync/SnapSyncDownloader.java
Outdated
Show resolved
Hide resolved
Please test these changes with a Mainnet or Holesky / sepolia node to look for unintended side-effects as called out by Karim. Or insane logging or strange behavior. A Holesky snap sync should complete in a few hours and be considered an acceptance test for this PR. |
Getting back into this PR now. I have some refactoring planned but pleased to see that the general direction seems to be OK. @matkt I think there are benefits of snap sync to permissioned chains in the same way as public chains. Adding a new node (validator or otherwise) to a permissioned chain will take a long time to sync if the chain has been running for a while. I'll look at your comments on the log changes I made. I found it clearer to see what was going on and to match that to the code if the log entries were more specific to the class they were coming from. |
syncState.markInitialSyncPhaseAsDone(); | ||
|
||
if (result instanceof NoSyncRequiredState) { | ||
LOG.info("Sync completed (no sync required)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wonder in terms of readability if we say that the sync is finished and then we have a backward sync or a fullsync which starts. I don't know if we could slightly modify the log, but honestly I haven't yet found a word that would allow snap/fast that download directly the state to be qualified differently from the other syncs that execute the blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I completely agree - I've been struggling to come up with a term that refers to all of the non-full sync methods. It would be really useful if we could come up with a consistent term that we can use. "Checkpoint" feels like a good common term for it, but since there's a specific sync method called "checkpoint" we can't really use that. Maybe "pivot-based sync" is a a reasonable description of them all? That would make the above log message "Sync completed (no pivot-based sync required)" or similar. Perhaps to make it clear that no sync took place I should change it to "Sync ended (no pivot-based sync required)". I think "ended" is clearer that nothing happened, where as "completed" suggests something did happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is checkpoint sync appropriate for BFT networks? It should work based on a specified checkpoint but users would probably use a full snap sync in this case? Maybe "no snap sync required" is still valuable. If we are attempting to deprecate fast, we can likely avoid that language.
This page lays out some language we could try to mirror.
...reum/eth/src/main/java/org/hyperledger/besu/ethereum/eth/sync/SynchronizerConfiguration.java
Show resolved
Hide resolved
--Xsnapsync-bft-enabled
which enables snap sync for BFT chains
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
…ental flag to enabled snap + BFT Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Completed a Holesky snap-sync with no unintended consequences (with the flag set). Some logging noise for cleanup. |
…account range handler. Add pipeline name and logs Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Karim Taam <karim.t2am@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
…falls back to full sync if peer status isn't received quickly enough Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
…-based sync types Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
@garyschulte @matkt think I've tidied up remaining loose ends:
|
I won't die on the Would mark as 🚢 with that minor change. |
Yeah that's reasonable. I'll refactor the |
I think it will be better to test also a mainnet node just to be sure ( we have more pivot block swtich, more interesting usecases, etc). I don't think the modification will add regression but I will prefer to test (checkpoint is enough) |
@@ -143,13 +143,6 @@ public static DataStorageOptions create() { | |||
*/ | |||
public void validate(final CommandLine commandLine, final SyncMode syncMode) { | |||
if (DataStorageFormat.BONSAI == dataStorageFormat && bonsaiLimitTrieLogsEnabled) { | |||
if (SyncMode.FULL == syncMode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we are removing this one ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to get refactored into a separate PR (see Gary's comment #7140 (comment)).
But the reason for removing it is I've changed the logic to set --bonsai-limit-trie-logs-enabled=false
if --sync-mode=FULL
, rather than force the user to set --bonsai-limit-trie-logs-enabled=false
. See
if (syncMode == SyncMode.FULL |
It's a bit like how we automatically set --tx-pool-price-bump=0
if --min-gas-price=0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks for the info 🙏
@@ -99,7 +99,7 @@ private int preloadQueue() { | |||
} | |||
} | |||
|
|||
public void addToPruneQueue(final long blockNumber, final Hash blockHash) { | |||
public synchronized void addToPruneQueue(final long blockNumber, final Hash blockHash) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you explain why this modification is needed ? I think if it was not needed on mainnet, I think it should be the same on qbft ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is an existing bug that hasn't been hit on mainnet because the rate of receiving and handling new trie-logs from peers isn't fast enough to hit the threading issue.
I was getting ConcurrentModificationExceptions
on the trieLogBlocksAndForksByDescendingBlockNumber
map when a QBFT node was syncing from several peers due to addToPruneQueue
modifying the map while pruneFromQueue
was iterating over it.
Since it hasn't been hit on mainnet before, I'm assuming the making the methods synchronized
isn't going to affect mainnet syncing performance at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's useful I can try recreating it to show the full ConcurrentModificationException
stack, but it was definitely an intermittent (but reliable) exception once we had got the basic syncing logic working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I just ping @siladu who has coded this part and can verify that we are good
@@ -94,7 +94,7 @@ protected CompletableFuture<FastSyncState> start(final FastSyncState fastSyncSta | |||
onBonsai.clearFlatDatabase(); | |||
onBonsai.clearTrieLog(); | |||
}); | |||
LOG.debug("Start sync with initial sync state {}", fastSyncState); | |||
LOG.debug("Start fast sync with initial sync state {}", fastSyncState); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can do the same as here https://github.com/hyperledger/besu/pull/7140/files#diff-c971b57a1d994b1c08fd5f43fd3cf282a41ccd524cafa8421c6eb1bad9087b08R83
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case it genuinely is fast
sync. The start()
method is common to this class and the SnapSyncDownload
subclass.
The start(final FastSyncState fastSyncState)
method is actually doing fast
sync in this class, and is doing snap
sync in SnapSyncDownloader.java
so in SnapSyncDownloader
the log says "Start snap sync..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok missed that thanks 🙏
…f sync-mode = FULL (moving to another PR) Signed-off-by: Matthew Whitehead <matthew1001@gmail.com>
The latest commit removes the trie-log change and I'll raise a separate PR for that shortly where any additional discussion can take place |
Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io>
Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GTG by me, 🚢
…r BFT chains (hyperledger#7140) * Create a BFT-specific pivot block handler Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Change visibility Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor sync-peer-count internal variable to match name, add experimental flag to enabled snap + BFT Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Merge with main Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix uppercase Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Address synchronization issue with trie pruning. Create BFT-specific account range handler. Add pipeline name and logs Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove debug log Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * fixing snapsync for empty state Signed-off-by: Karim Taam <karim.t2am@gmail.com> * Don't queue up events we can't handle Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix timing window where a validator with an empty data dir sometimes falls back to full sync if peer status isn't received quickly enough Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove BFT-specific account request class. Not needed Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor some more 'fast' sync variables that are common to all pivot-based sync types Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * In FULL sync mode, disable bonsai-limit-trie-logs-enabled instead of failing to start Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add javadoc comments, clarify overriding bonsai-limit-trie-logs-enabled Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add BFT pivot block selector tests Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix failure error message Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove the unnamed Pipe constructor and update tests to set a pipe name Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Revert some info logs back to debug given the feedback on noise in the logs syncing with holesky Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor fastSyncPivotDistance to syncPivotDistance Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Incomplete refactoring Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Update BFT event queueing tests Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Event queue test fixes Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove automatic setting of bonsai-limit-trie-logs-enabled to false if sync-mode = FULL (moving to another PR) Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> --------- Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> Signed-off-by: Karim Taam <karim.t2am@gmail.com> Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io> Co-authored-by: Karim Taam <karim.t2am@gmail.com>
…r BFT chains (hyperledger#7140) * Create a BFT-specific pivot block handler Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Change visibility Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor sync-peer-count internal variable to match name, add experimental flag to enabled snap + BFT Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Merge with main Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix uppercase Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Address synchronization issue with trie pruning. Create BFT-specific account range handler. Add pipeline name and logs Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove debug log Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * fixing snapsync for empty state Signed-off-by: Karim Taam <karim.t2am@gmail.com> * Don't queue up events we can't handle Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix timing window where a validator with an empty data dir sometimes falls back to full sync if peer status isn't received quickly enough Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove BFT-specific account request class. Not needed Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor some more 'fast' sync variables that are common to all pivot-based sync types Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * In FULL sync mode, disable bonsai-limit-trie-logs-enabled instead of failing to start Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add javadoc comments, clarify overriding bonsai-limit-trie-logs-enabled Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add BFT pivot block selector tests Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix failure error message Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove the unnamed Pipe constructor and update tests to set a pipe name Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Revert some info logs back to debug given the feedback on noise in the logs syncing with holesky Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor fastSyncPivotDistance to syncPivotDistance Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Incomplete refactoring Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Update BFT event queueing tests Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Event queue test fixes Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove automatic setting of bonsai-limit-trie-logs-enabled to false if sync-mode = FULL (moving to another PR) Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> --------- Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> Signed-off-by: Karim Taam <karim.t2am@gmail.com> Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io> Co-authored-by: Karim Taam <karim.t2am@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net>
* EIP 7702 first draft Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * added CHANGELOG.md entry Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * bug fixes, added first tests Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * container verify GitHub workflow (#7239) Container verification step in release process automated with the container verify GitHub workflow. New workflow is triggered at the end of the release workflow which will check the release container images starts successfully. Verification test only checks container starts and reach the Ethereum main loop Signed-off-by: Chaminda Divitotawela <cdivitotawela@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Investigate chain halts when syncing (#7162) Fix some reasons for chain download halts when syncing Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net> Signed-off-by: Stefan Pingel <16143240+pinges@users.noreply.github.com> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Check for EOFCreate subcontainer rules (#7232) Check and test for the unused container rule, and only returncontract targets can have truncated data rule. Also test the other subcontainer rules in unit tests. Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Remove deprecation message for `--Xp2p-peer-lower-bound` (#7247) Signed-off-by: Gabriel-Trintinalia <gabriel.trintinalia@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * less invasive code injection approach Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * added missing java doc & fixed test Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * added (currently non-working) acceptance test, some bug fixes in the transaction validation and tx pool logic Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * fix spotless Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * updated acceptance test, still not working, newPayload request seems to be necessary before final fork choice update Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * use correct world state to inject temporary code, inject code in existing accounts as well Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * renamed test service to prague, because the engine versions used are only available in the prague hard fork Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * fixed acceptance test, some bug fixes if authorized account does not yet exist Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Add build version option to prefix git hash with custom version property (#7222) * Add build version option to prefix git hash with custom version property * Refactor to make appending the git hash a boolean property. Include a commented-out example of how to use the properties in the gradle file Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Handle invalid snap getTrieNode requests with empty paths gracefully (#7221) Signed-off-by: Jason Frame <jason.frame@consensys.net> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * fix typos in CHANGELOG (#7226) Signed-off-by: Ties <71668189+TiesD@users.noreply.github.com> Co-authored-by: Matt Nelson <85905982+non-fungible-nelson@users.noreply.github.com> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * feat: Add network option for LUKSO Mainnet (#7223) * Add option for LUKSO network Signed-off-by: Wolmin <lamonos123@gmail.com> * Add tests for LUKSO Signed-off-by: Wolmin <lamonos123@gmail.com> * Apply spotless Signed-off-by: Wolmin <lamonos123@gmail.com> * Add changelog entry Signed-off-by: Wolmin <lamonos123@gmail.com> * Fix duplicate func Signed-off-by: Wolmin <lamonos123@gmail.com> * Fix changelog Signed-off-by: Wolmin <lamonos123@gmail.com> * Add bootnodes to genesis Signed-off-by: Wolmin <lamonos123@gmail.com> --------- Signed-off-by: Wolmin <lamonos123@gmail.com> Signed-off-by: Wolmin <44748271+Wolmin@users.noreply.github.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Update Docker base image to Ubuntu 24.04 (#7251) Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Reconfigure how Protocol Specs are created WRT EVM condiguration (#7245) Make the max code size and max initcode size a part of the EVM configuration. As part of the change we need to move the tasks CodeFactory once handled as a static class and move it into the EVM. This has a nice follow on effect that we don't need to pass in max EOF versions or max code sizes anymore. Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Fix the wrong 'Identifier' and 'Synchronizer' usage (#7252) * fix the synchronizer usage Signed-off-by: Leni <leniram159@gmail.com> * fix Identifier usage Signed-off-by: Leni <leniram159@gmail.com> --------- Signed-off-by: Leni <leniram159@gmail.com> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Fix flaky SECP256R1 test (#7249) Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * update to work with the new max retries value (#7253) Signed-off-by: Justin Florentine <justin+github@florentine.us> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Temporary CancunEOF fork for EOF testing. (#7227) Add Genesis ("CancunEOFTime") and reference test ("CancunEOF") support for a temporary Cancun+EOF fork, in anticipation of potential devnets. Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * fixed bug introduced through merge of main, made acceptance test easier to understand Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * added missing java docs Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * removed unnecessary tag Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * make encodeSingleSetCode public again Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Snapserver responses to return at least one response (#7190) Signed-off-by: Jason Frame <jason.frame@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * copy setCodeTransactionPayloads as well Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * fixed bug during tests with forrest db Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Snapserver GetTrieNodes request to handle short hash for storage (#7264) Signed-off-by: Jason Frame <jason.frame@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * javadoc: Adding javadoc for ethstats module (#7269) * javadoc: Adding javadoc for ethstats module --------- Signed-off-by: Usman Saleem <usman@usmans.info> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Fix javadoc for ethereum:core top level package (#7270) * javadoc - Apply javadoc to ethereum core top level package --------- Signed-off-by: Usman Saleem <usman@usmans.info> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Disable Flaky tests - permissioning (#7256) * disable some flaky tests Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> * correct name for test Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> * formatting Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> * disable some flaky tests Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> --------- Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> Co-authored-by: Usman Saleem <usman@usmans.info> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Add bootnodes to the maintained peer list (#7257) * Add bootnodes to the maintained peer list Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Update unit tests Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add entry in changelog Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Tweak unit test Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor to keep common steps the same for both cases Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add debug log, call sanitizePeers() only once Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> --------- Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Fix javadoc for ethereum api module, graphql package (#7272) * javadoc - Adding missing javadocs ethereum:api graphql package Signed-off-by: Usman Saleem <usman@usmans.info> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * T8n support for isStateTest and empty accounts (#7275) Update t8n executor to support new isStateTest env flag that will disable extra-transactional processing such as block rewards and beacon root. Also, make sure such extra-transactional commits don't create empty accounts. Signed-off-by: Danno Ferrin <danno@numisight.com> Co-authored-by: Usman Saleem <usman@usmans.info> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Promote storage x-trie-log subcommand to trie-log (#7278) Signed-off-by: Simon Dudley <simon.dudley@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Evm tool readme update (#7274) * update paths to binary. update docker build to use java 21 * updated suggested jdk --------- Signed-off-by: Justin Florentine <justin+github@florentine.us> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * javadoc - Add missing javadoc for evmtool module (#7277) Signed-off-by: Usman Saleem <usman@usmans.info> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Rename ValidatorPublicKey to ValidatorPubKey (#7280) Adapt to EIP-7002 name change for validator public key in all places. Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Add info-level diagnostic logs to aid with resolving stalled BFT chains (#7271) * Add info-level diagnostic logs to aid with resolving stalled BFT chains Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add javadoc Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> --------- Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Update EIP-2935 contract (#7281) Use the updated contract and address for EIP-2539. Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * add evmtool compability, fixing bugs related to sender recovery of 7702 txs and handling authorizations to empty accounts Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Deeper tracing of self-destructed accounts (#7284) Consider previously self-destructed accounts when creating accounts. Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * next release version after 24.7.0 (#7285) Signed-off-by: garyschulte <garyschulte@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Add experimental `--Xsnapsync-bft-enabled` which enables snap sync for BFT chains (#7140) * Create a BFT-specific pivot block handler Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Change visibility Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor sync-peer-count internal variable to match name, add experimental flag to enabled snap + BFT Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Merge with main Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix uppercase Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Address synchronization issue with trie pruning. Create BFT-specific account range handler. Add pipeline name and logs Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove debug log Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * fixing snapsync for empty state Signed-off-by: Karim Taam <karim.t2am@gmail.com> * Don't queue up events we can't handle Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix timing window where a validator with an empty data dir sometimes falls back to full sync if peer status isn't received quickly enough Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove BFT-specific account request class. Not needed Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor some more 'fast' sync variables that are common to all pivot-based sync types Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * In FULL sync mode, disable bonsai-limit-trie-logs-enabled instead of failing to start Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add javadoc comments, clarify overriding bonsai-limit-trie-logs-enabled Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Add BFT pivot block selector tests Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Fix failure error message Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove the unnamed Pipe constructor and update tests to set a pipe name Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Revert some info logs back to debug given the feedback on noise in the logs syncing with holesky Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Refactor fastSyncPivotDistance to syncPivotDistance Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Incomplete refactoring Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Update BFT event queueing tests Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Event queue test fixes Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> * Remove automatic setting of bonsai-limit-trie-logs-enabled to false if sync-mode = FULL (moving to another PR) Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> --------- Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> Signed-off-by: Karim Taam <karim.t2am@gmail.com> Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io> Co-authored-by: Karim Taam <karim.t2am@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Turn off CicleCI for Besu (#7291) All the CI jobs run in GitHub actions and Circle CI it no longer needed in Besu project Signed-off-by: Chaminda Divitotawela <cdivitotawela@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Check for snap server (#6609) * EthPeer add isServingSnap to be able to make sure that we have enough snap servers connected when we are snap syncing Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net> Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Implement System Calls (#7263) Signed-off-by: Gabriel-Trintinalia <gabriel.trintinalia@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * wrap WorldUpdater inside a WorldUpdaterService to inject the authorized code whenever needed Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Update limit trie logs validation message for sync-mode=FULL (#7279) Signed-off-by: Simon Dudley <simon.dudley@consensys.net> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Execute requests before block persist (#7295) Signed-off-by: Gabriel-Trintinalia <gabriel.trintinalia@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * fixed MainnetTransactionProcessor retrieval of correctn `to` account with injected code, fixed code injection Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * only first authorization is accepted, all the following ones are ignored Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * don't cache account with empty code Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * revert wrapping of world updater, as its `updater()` method creates a type of nesting that is not compatible with wrapping it. Instead a service is injected in the world updater to inject the code into the authorized accounts Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Fixed outdated tech redirect link. (#7297) * fix wiki link Signed-off-by: Snazzy <snazzysam933@gmail.com> * fix format Signed-off-by: Snazzy <snazzysam933@gmail.com> * change knownHash Signed-off-by: Snazzy <snazzysam933@gmail.com> --------- Signed-off-by: Snazzy <snazzysam933@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Increment private nonce even if transaction failed. (#6593) Increment private nonce even if transaction failed Signed-off-by: George Tebrean <george@web3labs.com> Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net> Co-authored-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net> Co-authored-by: Stefan Pingel <16143240+pinges@users.noreply.github.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * feat: Enhance --profile to load external profiles (#7292) * feat: --profile can load external profiles * fix external profile name method * fix ProfilesCompletionCandidate * test: Add unit tests * changelog: Update changelog * test: Fix TomlConfigurationDefaultProviderTest * test: Fix BesuCommandTest --------- Signed-off-by: Usman Saleem <usman@usmans.info> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Fix status badge for documentation (#7304) Documentation has been moved to GitHub pages and no longer use readthedocs. Updated the README status badge for docs with correct link Signed-off-by: Chaminda Divitotawela <cdivitotawela@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * [MINOR] Fixed some typos (#7299) * typos Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> --------- Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * refactored to share one AuthorizedAccountService between the different instances of the world updater, renamed some classes Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * spotless Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * load code for authorization at the beginning of the transaction Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * ignore authorization if chain id doesn't match Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * cache authority address, evmtool: do not fail if sender address is wrong Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Add evmtool block-test subcommand (#7293) * Add evmtool block-test subcommand Add an evmtool subcommand that will run non-hive blockchain tests. Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Make the retrying snap tasks switching (#7307) * make snap tasks switching Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * 6612: Remove deprecated sync modes and related helper methods (#7309) * 6612: Remove deprecated sync modes and related helper methods Signed-off-by: Matilda-Clerke <matilda.shay.clerke@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * EOF Reference Test Fixes (#7306) Fix a number of issues found in reference tests and evmone tests. - Be tolerant of more nulls in json - Support ContainerKind in reference tests - re-order EXTCALL oeprands - correct return value for REVERT in EXT*CALL - re-order EOFCREATE code validation Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * test template refactor, bump besu-native to 0.9.2 (#7315) Signed-off-by: garyschulte <garyschulte@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Feature/use gnark-crypto for eip-2537 (#7316) * use gnark-crypto for bls precompiles Signed-off-by: garyschulte <garyschulte@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * 6612 update changelog with removed syncmodes (#7320) * 6612: Update changelog with removal of deprecated sync modes Signed-off-by: Matilda Clerke <matilda.clerke@consensys.net> * 6612: Update changelog with removal of deprecated sync modes Signed-off-by: Matilda Clerke <matilda.clerke@consensys.net> * 6612: Update changelog with removal of deprecated sync modes Signed-off-by: Matilda Clerke <matilda.clerke@consensys.net> --------- Signed-off-by: Matilda Clerke <matilda.clerke@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Update datacopy (#7319) Check for OOG earlier in DataCopy. Add unit tests to cover operation branches. Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * disable flaky test (#7308) * disable flaky test Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> * disable flaky test Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> --------- Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Update unit test (#7317) * Update parameterized unit tests so the enumerate with --dry-run * Update the prague-withdrawal.json unit test to handle current code Signed-off-by: Danno Ferrin <danno@numisight.com> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * persist accounts that have storage updates, but no nonce, balance nor code Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Revert "persist accounts that have storage updates, but no nonce, balance nor code" This reverts commit 9c9121a. Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * removed PKI backed QBFT (#7310) * removed PKI backed QBFT Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> * changelog Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> --------- Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * EIP-7251 add consolidation request type (#7266) * add request type for consolidations, encoder, decoder and tests * added raw tx for consolidation * add consolidation reqs to EngineGetPayloadResultV4 * set storage slot value to 0 initially and value for tx * updates plugin api Signed-off-by: Justin Florentine <justin+github@florentine.us> Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> --------- Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Justin Florentine <justin+github@florentine.us> Co-authored-by: Justin Florentine <justin+github@florentine.us> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * fix: `eth_call` deserialization to correctly ignore unknown fields in the transaction object (#7323) * fix: Use Builder for JsonCallParameter * changelog * add additional unit tests * fix: Update builder to withGas to match the json eth_call --------- Signed-off-by: Usman Saleem <usman@usmans.info> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * Stop transaction selection on TX_EVALUATION_TOO_LONG (#7330) Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net> Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * message frame buider will create AuthorizedCodeService by itsef if it isn't set Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * get correct nonce for authorization Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * nonce only returns a vaid nonce, new method nonceList returns all the nonces Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * plugs leaky abstraction Signed-off-by: Justin Florentine <justin+github@florentine.us> * some renaming, acceptance tests checks for exact balance of tx sponsor at the end. Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * inject the 7702 code into DiffBasedWorldStateUpdateAccumulator.createAccount Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * spotless Signed-off-by: Justin Florentine <justin+github@florentine.us> * spotless fix, removed todos Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> * make AuthorityProcessor & chain id for it optional Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> --------- Signed-off-by: Daniel Lehrner <daniel.lehrner@consensys.net> Signed-off-by: Chaminda Divitotawela <cdivitotawela@gmail.com> Signed-off-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net> Signed-off-by: Stefan Pingel <16143240+pinges@users.noreply.github.com> Signed-off-by: Danno Ferrin <danno@numisight.com> Signed-off-by: Gabriel-Trintinalia <gabriel.trintinalia@consensys.net> Signed-off-by: Matthew Whitehead <matthew1001@gmail.com> Signed-off-by: Jason Frame <jason.frame@consensys.net> Signed-off-by: Ties <71668189+TiesD@users.noreply.github.com> Signed-off-by: Wolmin <lamonos123@gmail.com> Signed-off-by: Wolmin <44748271+Wolmin@users.noreply.github.com> Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net> Signed-off-by: Leni <leniram159@gmail.com> Signed-off-by: Justin Florentine <justin+github@florentine.us> Signed-off-by: Usman Saleem <usman@usmans.info> Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com> Signed-off-by: Simon Dudley <simon.dudley@consensys.net> Signed-off-by: Matt Whitehead <matthew.whitehead@kaleido.io> Signed-off-by: garyschulte <garyschulte@gmail.com> Signed-off-by: Karim Taam <karim.t2am@gmail.com> Signed-off-by: Snazzy <snazzysam933@gmail.com> Signed-off-by: George Tebrean <george@web3labs.com> Signed-off-by: Matilda-Clerke <matilda.shay.clerke@gmail.com> Signed-off-by: Matilda Clerke <matilda.clerke@consensys.net> Co-authored-by: Chaminda Divitotawela <cdivitotawela@users.noreply.github.com> Co-authored-by: Stefan Pingel <16143240+pinges@users.noreply.github.com> Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com> Co-authored-by: Danno Ferrin <danno@numisight.com> Co-authored-by: Gabriel-Trintinalia <gabriel.trintinalia@consensys.net> Co-authored-by: Matt Whitehead <matthew.whitehead@kaleido.io> Co-authored-by: Jason Frame <jason.frame@consensys.net> Co-authored-by: Ties <71668189+TiesD@users.noreply.github.com> Co-authored-by: Matt Nelson <85905982+non-fungible-nelson@users.noreply.github.com> Co-authored-by: Wolmin <44748271+Wolmin@users.noreply.github.com> Co-authored-by: Fabio Di Fabio <fabio.difabio@consensys.net> Co-authored-by: leniram159 <leniram159@gmail.com> Co-authored-by: Justin Florentine <justin+github@florentine.us> Co-authored-by: Usman Saleem <usman@usmans.info> Co-authored-by: Simon Dudley <simon.dudley@consensys.net> Co-authored-by: garyschulte <garyschulte@gmail.com> Co-authored-by: Karim Taam <karim.t2am@gmail.com> Co-authored-by: gringsam <snazzysam933@gmail.com> Co-authored-by: George Tebrean <99179176+gtebrean@users.noreply.github.com> Co-authored-by: stefan.pingel@consensys.net <stefan.pingel@consensys.net> Co-authored-by: Matilda-Clerke <matilda.shay.clerke@gmail.com>
PR description
This PR adds various changes/fixes that allow snap-sync and BFT to be used together.
Since this is a relatively new combination of configuration I have added an experimental flag
--Xsnapsync-bft-enabled
which must be set in order for Besu to start.My general opinion for enterprise chains is that it is probably reasonable for a BFT chain to have
sync-min-peers
set to1
. The trust requirements of a public chain are not as applicable to an enterprise chain, so the need to have e.g.5
peers before you are willing to consider their pivot-block data is too restrictive. At the very least it makes it impossible to have a chain of 4 validators, or a chain of 6 validators with one unavailable. I've broadly outlined below thesnap
sync logic that this PR uses, some of which assumes that an enterprise user will set--sync-min-peers=1
if their chain only has a small number of nodes. I think there's an argument for the--profile=enterprise
defaults to set--sync-min-peers=1
, but I'm going to save that for a separate PR.The basic logic used for sync is as follows:
--Xsynchronizer-fast-sync-pivot-distance
(50
by default) it skips any attempt atsnap
sync. Quitting the snap sync process allows the node to start contributing to BFT voting which ensures the chain isn't stalled waiting for this node. (The state for the most recent 50 blocks is obtained usingfull
sync, which is existing behaviour)snap
sync but believes itself to be the only validator it exitssnap
sync. As above, this ensures that the node can then start its BFT mining coordinator and continue producing new blocks, rather than get stuck in a loop of failedsnap
sync attempts which it will probably never get out of. This is particularly useful for the case where a single BFT validator node is started as a new chain, to which other validators will be added later on. Without this logic it would wait for at least 1 other peer, which may never come along.sync-min-peers
peers who are validators, but none of them has any usable pivot data, it quits thesnap
sync process, again to allow its BFT mining coordinator to start producing new blocks. This is necessary in a case where a number of new validators have been created for a brand new chain, but because they are all new and have no state to share they would get stuck in a loop of failedsnap
sync attempts.(It's also worth noting that I've continued the process of renaming the fast-sync internal variables to be more generic as they also apply to
snap
andcheckpoint
sync)(Note: see related PR #7204 which fixes a follow on issue)
Documentation
The broad topic of sync-ing is covered in the public section of the docs here: https://besu.hyperledger.org/development/public-networks/get-started/connect/sync-node
I think the private section of the docs would do with its own sync-ing topic, which refers to the article above for the basic "how do the different sync modes work" but explains the permissioned-chain specific behaviours that I've implemented in this PR. I think the explanation in the description above is probably a reasonable starting point to write those docs, but I'm happy to review and edit the new topic once an initial draft is there.