ethcore: fix detection of major import #9552
LGTM, but I wonder if it wouldn't make more sense to pass a |
I'm pretty sure it will break warp-barrier. If `WaitingForPeers` is not the expected state we should change `get_init_state` or use a different method in `reset`.
Resetting to `WaitingForPeers` should be fine if there is a timeout set that will fall back to regular sync if we can't warp.
ethcore/sync/src/chain/mod.rs
Outdated
@@ -700,6 +700,7 @@ impl ChainSync {
 	fn complete_sync(&mut self, io: &mut SyncIo) {
 		trace!(target: "sync", "Sync complete");
 		self.reset(io);
+		self.state = SyncState::Idle;
Reset is setting the state to `ChainSync::get_init_state(self.warp_sync, io.chain());` already. Setting it to `Idle` here will break warp-barrier.
Oups, you're right @tomusdrw !
You're right, it does break the warp barrier, but I also don't think the current behavior of setting it to `get_init_state` is correct; that's why I forced it to `Idle`. Every time we're synced with the tip of the chain we go into `WaitingPeers` state, which tries to trigger a snapshot sync. If you monitor `eth_syncing`, it goes into syncing state briefly every time a new block is imported.
Does it make sense to set it to `get_init_state` until `WAIT_PEERS_TIMEOUT` has elapsed, and force it to idle afterwards?
I think this line: https://github.com/paritytech/parity-ethereum/blob/master/ethcore/sync/src/chain/mod.rs#L451 should be modified as you are saying: set to `Idle` when `WAIT_PEERS_TIMEOUT` has elapsed.
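The timeout idea discussed above could look roughly like the following. This is only a sketch under stated assumptions: `state_after_wait` is a hypothetical helper, `SyncState` is reduced to two variants, and the value given to `WAIT_PEERS_TIMEOUT` here is made up for illustration.

```rust
use std::time::{Duration, Instant};

/// Simplified stand-in for the sync state; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncState {
    WaitingPeers,
    Idle,
}

/// Hypothetical value, chosen for illustration only.
pub const WAIT_PEERS_TIMEOUT: Duration = Duration::from_secs(5);

/// Hypothetical helper: keep waiting for warp peers until the timeout
/// has elapsed, then fall back to `Idle` so regular sync can proceed.
pub fn state_after_wait(state: SyncState, waiting_since: Instant, now: Instant) -> SyncState {
    match state {
        SyncState::WaitingPeers
            if now.saturating_duration_since(waiting_since) >= WAIT_PEERS_TIMEOUT =>
        {
            SyncState::Idle
        }
        // Any other state, or a wait that has not timed out yet, is left unchanged.
        other => other,
    }
}
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the decision easy to test.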
Okay, I was wrong. This doesn't break the warp-barrier, it keeps on searching for the given snapshot until it finds it. It doesn't break the warp barrier because it will be stuck in `WaitingPeers` state and therefore will never complete any sync round (i.e. `complete_sync`, which would force the state to `Idle`, won't be called). It does break the informant because of the changes I made to `is_major_importing`: it will print as if the node was syncing while it is actually searching for the snapshot. Though I'm not sure what the correct semantics are here. When `eth_syncing` returns `false`, should it mean that the node just isn't syncing, or that it's not syncing because it doesn't need to, i.e. it is already synced to the tip of the chain?
Yeah, I guess if we're in `complete_sync`, then we've already reached warp barrier. We can make `reset` take an additional state param and pass either `ChainSync::get_init_state(self.warp_sync, io.chain())` or `SyncState::Idle`, but I don't see it as blocking this PR.
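A minimal sketch of the "extra state param" suggestion, assuming heavily simplified types: the real `reset` also takes a `SyncIo` and clears per-peer data, and the real `get_init_state` also inspects the chain. The `Option<SyncState>` parameter is a hypothetical shape, not necessarily what was merged.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncState {
    WaitingPeers,
    Blocks,
    Idle,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WarpSync {
    Enabled,
    Disabled,
}

pub struct ChainSync {
    pub state: SyncState,
    pub warp_sync: WarpSync,
}

impl ChainSync {
    /// Simplified: the real function also looks at the chain.
    pub fn get_init_state(warp_sync: WarpSync) -> SyncState {
        match warp_sync {
            WarpSync::Enabled => SyncState::WaitingPeers,
            WarpSync::Disabled => SyncState::Idle,
        }
    }

    /// Hypothetical signature: the caller may pick the state to reset to,
    /// otherwise the initial state is used.
    pub fn reset(&mut self, new_state: Option<SyncState>) {
        self.state = new_state.unwrap_or_else(|| Self::get_init_state(self.warp_sync));
    }

    /// After a completed sync round we go straight to `Idle`,
    /// so the state is set exactly once.
    pub fn complete_sync(&mut self) {
        self.reset(Some(SyncState::Idle));
    }
}
```

With this shape the state is never set in `reset` and then overwritten by the caller.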
I think the fix should be moved from `complete_sync()` to `reset()`: it's brittle to have the state first set in `reset()` and then clobbered right afterwards in `complete_sync()`.
I am new to this code and might be completely wrong here, but it seems like we're conflating state with configuration (`WarpSync::Enabled`). Maybe I'm making the wrong assumption here, but isn't it true that once warp syncing is done we want it turned off until the node is restarted? Or are there cases when a node reaches the tip of the chain at one point t and starts syncing normally and then at t + n needs to go back to warp syncing? If not, then `Idle` would be the correct state as soon as `complete_sync` is called?
In other words: if we're beyond the warp barrier or at the tip, calling `complete_sync()` should put us in `Idle` (but I'd move that logic to `reset()` if possible).
> Or are there cases when a node reaches the tip of the chain at one point t and starts syncing normally and then at t + n need to go back to warp syncing?

Yes, we can switch back to warp syncing, e.g. if our node went offline for a while (or is just slow) and is too far away (`SNAPSHOT_RESTORE_THRESHOLD`) from the current tip (see `maybe_start_snapshot_sync`, which is called regularly on a timer).
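The distance check described above can be sketched as follows. The function `should_try_snapshot_sync` and the constant's value are assumptions made for illustration; the real `maybe_start_snapshot_sync` likely checks more conditions than this (for example, which peers offer a snapshot).

```rust
/// Hypothetical value, chosen for illustration only.
pub const SNAPSHOT_RESTORE_THRESHOLD: u64 = 30_000;

/// Hypothetical helper: a node with warp sync enabled is a candidate for
/// switching back to snapshot sync only when it is further behind the
/// best known peer block than the threshold.
pub fn should_try_snapshot_sync(
    warp_enabled: bool,
    our_best_block: u64,
    highest_peer_block: u64,
) -> bool {
    // `saturating_sub` avoids underflow when we are ahead of our peers.
    let distance = highest_peer_block.saturating_sub(our_best_block);
    warp_enabled && distance > SNAPSHOT_RESTORE_THRESHOLD
}
```

A node that is only a few blocks behind stays in regular sync under this check, which is why briefly entering `WaitingPeers` at the tip is not useful.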
rpc/src/v1/helpers/block_import.rs
Outdated
-/// be considered a syncing state.
-pub fn is_major_importing_or_waiting(sync_state: Option<SyncState>, queue_info: BlockQueueInfo, waiting_is_syncing_state: bool) -> bool {
+/// Check if client is during major sync or during block import.
+pub fn is_major_importing(sync_state: Option<SyncState>, queue_info: BlockQueueInfo) -> bool {
I think the flag was introduced quite recently, no? `WaitingForPeers` is the initial state in case we are attempting warp sync, and we don't really want to start regular sync just yet (we first wait for at least a couple of warp-peers before falling back to regular sync).
Yes, it was added recently (#9112), but it's not necessary if we don't switch into `WaitingPeers` state at the beginning of each sync round, and in that case we should always consider `WaitingPeers` to be a syncing state.
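To make the point concrete, here is a sketch of an `is_major_importing` that always counts `WaitingPeers` as a syncing state, so no extra flag is needed. The enum is trimmed down, and the `BlockQueueInfo` field names and the `QUEUE_THRESHOLD` cutoff are assumptions rather than the actual definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncState {
    WaitingPeers,
    Blocks,
    NewBlocks,
    Idle,
}

/// Field names are assumed for this sketch.
pub struct BlockQueueInfo {
    pub unverified_queue_size: usize,
    pub verified_queue_size: usize,
}

/// Hypothetical cutoff for a "busy" import queue.
const QUEUE_THRESHOLD: usize = 3;

/// Check if the client is in a major sync or is busy importing blocks.
pub fn is_major_importing(sync_state: Option<SyncState>, queue_info: BlockQueueInfo) -> bool {
    // Only `Idle` and `NewBlocks` count as "not syncing";
    // `WaitingPeers` is treated like any other syncing state.
    let is_syncing_state = match sync_state {
        Some(SyncState::Idle) | Some(SyncState::NewBlocks) | None => false,
        Some(_) => true,
    };
    let queued = queue_info.unverified_queue_size + queue_info.verified_queue_size;
    is_syncing_state || queued > QUEUE_THRESHOLD
}
```

This only gives sensible results if the node does not re-enter `WaitingPeers` after every completed sync round, which is what the change to `complete_sync` is meant to guarantee.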
Please rebase on master, sorry for the inconvenience.
Force-pushed: 2383ad1 to f128ce8
Force-pushed: f128ce8 to 6d06c7a
Force-pushed: 6d06c7a to e040a97
* sync: set state to idle after sync is completed * sync: refactor sync reset
…mon-deps * origin/master: fix (light/provider) : Make `read_only executions` read-only (#9591) ethcore: fix detection of major import (#9552) return 0 on error (#9705) ethcore: delay ropsten hardfork (#9704) make instantSeal engine backwards compatible, closes #9696 (#9700) Implement CREATE2 gas changes and fix some potential overflowing (#9694) Don't hash the init_code of CREATE. (#9688) ethcore: minor optimization of modexp by using LR exponentiation (#9697) removed redundant clone before each block import (#9683)
* parity-version: bump beta to 2.1.2 * docs(rpc): push the branch along with tags (#9578) * docs(rpc): push the branch along with tags * ci: remove old rpc docs script * Remove snapcraft clean (#9585) * Revert " add snapcraft package image (master) (#9584)" This reverts commit ceaedbb. * Update package-snap.sh * Update .gitlab-ci.yml * ci: fix regex 🙄 (#9597) * docs(rpc): annotate tag with the provided message (#9601) * Update ropsten.json (#9602) * HF in POA Sokol (2018-09-19) (#9607) poanetwork/poa-chain-spec#86 * fix(network): don't disconnect reserved peers (#9608) The priority of && and || was borked. * fix failing node-table tests on mac os, closes #9632 (#9633) * ethcore-io retries failed work steal (#9651) * ethcore-io uses newer version of crossbeam && retries failed work steal * ethcore-io non-mio service uses newer crossbeam * remove master from releasable branches (#9655) * remove master from releasable branches need backporting in beta fix https://gitlab.parity.io/parity/parity-ethereum/-/jobs/101065 etc * add except for snap packages for master * Test fix for windows cache name... (#9658) * Test fix for windows cache name... * Fix variable name. * fix(light_fetch): avoid race with BlockNumber::Latest (#9665) * Calculate sha3 instead of sha256 for push-release. (#9673) * Calculate sha3 instead of sha256 for push-release. * Add pushes to the script. * Hardfork the testnets (#9562) * ethcore: propose hardfork block number 4230000 for ropsten * ethcore: propose hardfork block number 9000000 for kovan * ethcore: enable kip-4 and kip-6 on kovan * etcore: bump kovan hardfork to block 9.2M * ethcore: fix ropsten constantinople block number to 4.2M * ethcore: disable difficulty_test_ropsten until ethereum/tests are updated upstream * ci: fix push script (#9679) * ci: fix push script * Fix copying & running on windows. 
* CI: Remove unnecessary pipes (#9681) * ci: reduce gitlab pipelines significantly * ci: build pipeline for PR * ci: remove dead weight * ci: remove github release script * ci: remove forever broken aura tests * ci: add random stuff to the end of the pipes * ci: add wind and mac to the end of the pipe * ci: remove snap artifacts * ci: (re)move dockerfiles * ci: clarify job names * ci: add cargo audit job * ci: make audit script executable * ci: ignore snap and docker files for rust check * ci: simplify audit script * ci: rename misc to optional * ci: add publish script to releaseable branches * ci: more verbose cp command for windows build * ci: fix weird binary checksum logic in push script * ci: fix regex in push script for windows * ci: simplify gitlab caching * docs: align README with ci changes * ci: specify default cargo target dir * ci: print verbose environment * ci: proper naming of scripts * ci: restore docker files * ci: use docker hub file * ci: use cargo home instead of cargo target dir * ci: touch random rust file to trigger real builds * ci: set cargo target dir for audit script * ci: remove temp file * ci: don't export the cargo target dir in the audit script * ci: fix windows unbound variable * docs: fix gitlab badge path * rename deprecated gitlab ci variables https://docs.gitlab.com/ee/ci/variables/#9-0-renaming * ci: fix git compare for nightly builds * test: skip c++ example for all platforms but linux * ci: add random rust file to trigger tests * ci: remove random rust file * disable cpp lib test for mac, win and beta (#9686) * cleanup ci merge * ci: fix tests * fix bad-block reporting no reason (#9638) * ethcore: fix detection of major import (#9552) * sync: set state to idle after sync is completed * sync: refactor sync reset * Don't hash the init_code of CREATE. 
(#9688) * Docker: run as parity user (#9689) * Implement CREATE2 gas changes and fix some potential overflowing (#9694) * Implement CREATE2 gas changes and fix some potential overflowing * Ignore create2 state tests * Split CREATE and CREATE2 in gasometer * Generalize rounding (x + 31) / 32 to to_word_size * make instantSeal engine backwards compatible, closes #9696 (#9700) * ethcore: delay ropsten hardfork (#9704) * fix (light/provider) : Make `read_only executions` read-only (#9591) * `ExecutionsRequest` from light-clients as read-only This changes so all `ExecutionRequests` from light-clients are executed as read-only which the `virtual``flag == true ensures. This boost up the current transaction to always succeed Note, this only affects `eth_estimateGas` and `eth_call` AFAIK. * grumbles(revert renaming) : TransactionProof * grumbles(trace) : remove incorrect trace * grumbles(state/prove_tx) : explicit `virt` Remove the boolean flag to determine that a `state::prove_transaction` whether it should be executed in a virtual context or not. Because of that also rename the function to `state::prove_transction_virtual` to make more clear * CI: Skip docs job for nightly (#9693) * ci: force-tag wiki changes * ci: force-tag wiki changes * ci: skip docs job for master and nightly * ci: revert docs job checking for nightly tag * ci: exclude docs job from nightly builds in gitlab script
* parity-version: bump stable to 2.0.7 * HF in POA Sokol (2018-09-19) (#9607) poanetwork/poa-chain-spec#86 * fix failing node-table tests on mac os, closes #9632 (#9633) * fix(light_fetch): avoid race with BlockNumber::Latest (#9665) * CI: Remove unnecessary pipes (#9681) * ci: reduce gitlab pipelines significantly * ci: build pipeline for PR * ci: remove dead weight * ci: remove github release script * ci: remove forever broken aura tests * ci: add random stuff to the end of the pipes * ci: add wind and mac to the end of the pipe * ci: remove snap artifacts * ci: (re)move dockerfiles * ci: clarify job names * ci: add cargo audit job * ci: make audit script executable * ci: ignore snap and docker files for rust check * ci: simplify audit script * ci: rename misc to optional * ci: add publish script to releaseable branches * ci: more verbose cp command for windows build * ci: fix weird binary checksum logic in push script * ci: fix regex in push script for windows * ci: simplify gitlab caching * docs: align README with ci changes * ci: specify default cargo target dir * ci: print verbose environment * ci: proper naming of scripts * ci: restore docker files * ci: use docker hub file * ci: use cargo home instead of cargo target dir * ci: touch random rust file to trigger real builds * ci: set cargo target dir for audit script * ci: remove temp file * ci: don't export the cargo target dir in the audit script * ci: fix windows unbound variable * docs: fix gitlab badge path * rename deprecated gitlab ci variables https://docs.gitlab.com/ee/ci/variables/#9-0-renaming * ci: fix git compare for nightly builds * test: skip c++ example for all platforms but linux * ci: add random rust file to trigger tests * ci: remove random rust file * disable cpp lib test for mac, win and beta (#9686) * cleanup ci merge * parity: bump clib * ci: fix tests * ci: disable c++ example * Docker: run as parity user (#9689) * CI: Skip docs job for nightly (#9693) * ci: force-tag wiki changes * 
ci: force-tag wiki changes * ci: skip docs job for master and nightly * ci: revert docs job checking for nightly tag * ci: exclude docs job from nightly builds in gitlab script * fix (light/provider) : Make `read_only executions` read-only (#9591) * `ExecutionsRequest` from light-clients as read-only This changes so all `ExecutionRequests` from light-clients are executed as read-only which the `virtual``flag == true ensures. This boost up the current transaction to always succeed Note, this only affects `eth_estimateGas` and `eth_call` AFAIK. * grumbles(revert renaming) : TransactionProof * grumbles(trace) : remove incorrect trace * grumbles(state/prove_tx) : explicit `virt` Remove the boolean flag to determine that a `state::prove_transaction` whether it should be executed in a virtual context or not. Because of that also rename the function to `state::prove_transction_virtual` to make more clear * ethcore: fix detection of major import (#9552) * sync: set state to idle after sync is completed * sync: refactor sync reset * parity: revert clib bump and fix tests * Fix path to parity.h (#9274) * Fix path to parity.h * Fix other paths as well * ethcore-io retries failed work steal (#9651) * ethcore-io uses newer version of crossbeam && retries failed work steal * ethcore-io non-mio service uses newer crossbeam
After a sync round completes successfully we call `ChainSync::reset`, which in turn sets the sync state to the initial state by calling `ChainSync::get_init_state`. If warp sync is enabled the initial state will be `WaitingPeers`, which is incorrect: it makes the client try to start a warp restore and also leads to incorrect detection of the major import state. The fix is to force the state to `Idle` after the sync is complete. Also reverted the changes introduced in #9112.
I tested this by hammering the `eth_syncing` RPC and comparing the results to current master. On current master, whenever a new block is imported, `eth_syncing` temporarily reports that the node is syncing.
Fixes #9428.