Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Reject GetBlockHeaders requests while importing blocks with possible fork #10895

Closed
wants to merge 24 commits into from

Conversation

grbIzl
Copy link
Collaborator

@grbIzl grbIzl commented Jul 16, 2019

Purpose:
This PR aims to fix the following scenario

  1. There are two connected nodes A and F
  2. Sync is going up to block N for both nodes
  3. Node F receives new canonical subchain starting from N block (blocks N' and N' + 1)
  4. Before starting new subchain's import node F sends broadcast about new blocks (this notification was introduced by @tomusdrw in his Improve block and transaction propagation #9954 )
  5. Node A receives notification about new block N' + 1, sees, that it has unknown parent.
  6. Node A starts retraction in order to find the beginning of fork, requesting headers for blocks N, N-1, N-2, N-3 etc from F
  7. Meanwhile node F has not completed import of N' and N' + 1 blocks yet and replied to all requests from A with previous blocks (up to N)
  8. Node A quickly runs into N - 128 block (128 is the limit for number of requested headers)
  9. Node F finishes new blocks import
  10. Unfortunately retraction on node A went too far already and A requested 128 headers beginning from N - M block, where M > 128. As a result F returns old 128 headers without new N' and N'+ 1
  11. As node A doesn't receive new blocks N' and N' + 1, it continues retraction to 0.
  12. Sync stops.

Comments to the scenario.
This problem is reproduced on configuration from #10724 when A and F run locally on one machine and A is only connected to F (F connected to the whole mainnet). As a result, network latency between A and F is low and A can quickly descend down to N - M block to the stuck condition. For regular configurations stuck is less likely but it still causes overhead by sequences of GetBlock requests emitted by nodes, trying to find the common parent in such situation.

Solution.
I suggest (with help of @arkpar ) to break this sequence on step 7 on node F. And do not reply to GetBlockHeaders request, if there is a suspicion, that node is currently importing a block, that can cause reorg. In order to define this condition, I check block verification queue and look, if there is a block there, that has current best block as a parent. If there are no such block processing, I assume, that the queue has block with fork, and node should not reply to GetBlockHeaders. Also I used empirical limitation for queue size and assume, that if it's longer, than 8 elements, it doesn't contain fork block.

It's the second part of #10864 because this problem was masked by #10864, so these PRs should go together.

soc1c and others added 9 commits April 2, 2019 09:37
* Reject crazy timestamps instead of truncating.

* fix(light cull): poll light cull instead of timer (#10559)

* fix(light cull): poll light cull instead of timer

* fix(grumbles): remove error + updated docs

* fix(on-demand request): `expect()` reason

* docs(remove misleading info)
* version: bump beta to 2.5.1

* fix(whisper expiry): current time + work + ttl (#10587)

* update bootnodes (#10595)

* config: update goerli bootnodes

* config: update kotti bootnodes

* adds rpc error message for --no-ancient-blocks (#10608)

* adds error message for --no-ancient-blocks, closes #10261

* Apply suggestions from code review

Co-Authored-By: seunlanlege <seunlanlege@gmail.com>

* Constantinople HF on POA Core (#10606)

* Constantinople HF on POA Core

Plan Constantinople/St.Petersfork HF on POA Core network at block 8582254.
Original PR in POA repository: poanetwork/poa-chain-spec#110

* Remove extra empty line

* evm: add some mulmod benches (#10600)

* evm: add blockhash_mulmod bench

* evm: use num-bigint for mod ops

* Clique: zero-fill extradata when the supplied value is less than 32 bytes in length (#10605)

* Update kovan.json to switch validator set to POA Consensus Contracts (#10628)

* Fix publish docs (#10635)

* Fix publish docs

* this never should be forced, either way compiling previous versions will produce outdated docs

* fix array, var was moved to the group project global variables list

* Fix rinkeby petersburg fork (#10632)
* ci: publish docs debug (#10638)

* ci: backport missing diff from master
* version: bump beta to 2.5.2

* [CI] allow cargo audit to fail (#10676)

* [CI] allow cargo audit to fail

* [.gitlab-ci.yml] add a comment about cargo audit

* [Cargo.lock] cargo update -p protobuf

* Reset blockchain properly (#10669)

* delete BlockDetails from COL_EXTRA

* better proofs

* added tests

* PR suggestions

* new image (#10673)

* Update publishing (#10644)

* docker images are now built on k8s: test run

* copy check_sync.sh in build-linux job

* copy scripts/docker/hub/* in build-linux job

* removed cache var

* cleanup, no more nightly dockers

* cleanup in dockerfile

* some new tags

* removed sccsche debug log, cleanup

* no_gits, new artifacts dir, changed scripts. Test run.

* define version once

* one source for TRACK

* stop kovan onchain updates

* moved changes for two images to a new branch

* rename Dockerfile

* no need in libudev-dev

* enable lto for release builds (#10717)

* Use RUSTFLAGS to set the optimization level (#10719)

* Use RUSTFLAGS to set the optimization level

Cargo has a [quirk]() in how configuration settings are propagated when `cargo test` runs: local code respect the settings in `[profile.test]` but all dependencies use the `[profile.dev]` settings. Here we force `opt-level=3` for all dependencies.

* Remove unused profile settings

* Maybe like this?

* Turn off incremental compilation

* Remove colors; try again with overflow-checks on

* Use quiet CI machine

* Turn overflow checking back on

* Be explicit about what options we use

* Remove "quiet machine" override

* ethcore: enable ECIP-1054 for classic (#10731)

* config: enable atlantis on ethereum classic

* config: enable atlantis on morden classic

* config: enable atlantis on morden classic

* config: enable atlantis on kotti classic

* ethcore: move kotti fork block to 0xAEF49

* ethcore: move morden fork block to 0x4829BA

* ethcore: move classic fork block to 0x81B320

* remove trailing comma

* remove trailing comma

* fix chainspec

* ethcore: move classic fork block to 0x7fffffffffffffff
* ethcore/res: activate atlantis classic hf on block 8772000 (#10766)

* fix docker tags for publishing (#10741)

* fix: aura don't add `SystemTime::now()` (#10720)

This commit does the following:
- Prevent overflow in `verify_timestamp()` by not adding `now` to found faulty timestamp
- Use explicit `CheckedSystemTime::checked_add` to prevent potential consensus issues because SystemTime is platform
depedent
- remove `#[cfg(not(time_checked_add))]` conditional compilation

* Update version

* Treat empty account the same as non-exist accounts in EIP-1052 (#10775)

* DevP2p: Get node IP address and udp port from Socket, if not included in PING packet (#10705)

* get node IP address and udp port from Socket, if not included in PING packet

* prevent bootnodes from being added to host nodes

* code corrections

* code corrections

* code corrections

* code corrections

* docs

* code corrections

* code corrections

* Apply suggestions from code review

Co-Authored-By: David <dvdplm@gmail.com>

* Add a way to signal shutdown to snapshotting threads (#10744)

* Add a way to signal shutdown to snapshotting threads

* Pass Progress to fat_rlps() so we can abort from there too.

* Checking for abort in a single spot

* Remove nightly-only weak/strong counts

* fix warning

* Fix tests

* Add dummy impl to abort snapshots

* Add another dummy impl for TestSnapshotService

* Remove debugging code

* Return error instead of the odd Ok(())
Switch to AtomicU64

* revert .as_bytes() change

* fix build

* fix build maybe
* cargo update -p smallvec (#10822)

Fixes servo/rust-smallvec#148

* Update version to v2.5.3

Signed-off-by: Martin Pugh <pugh@s3kr.it>
@grbIzl grbIzl added A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. M4-core ⛓ Core client code / Rust. labels Jul 16, 2019
@grbIzl grbIzl requested a review from tomusdrw July 16, 2019 12:58
s3krit added 2 commits August 12, 2019 18:55
  -  Fix cargo audit (#10921)
  - Add support for Energy Web Foundation's new chains (#10957)
  - Kaspersky AV whitelisting (#10919)
  - Avast whitelist script (#10900)
  - Docker images renaming (#10863)
  - Remove excessive warning (#10831)
  - Allow --nat extip:your.host.here.org (#10830)
  - When updating the client or when called from RPC, sleep should mean sleep (#10814)
  - added new ropsten-bootnode and removed old one (#10794)
  - ethkey no longer uses byteorder (#10786)
  - Do not drop the peer with None difficulty (#10772)
  - docs: Update Readme with TOC, Contributor Guideline. Update Cargo package descriptions (#10652)
* [trace] check mem diff within range (#11002)

* Update version (v2.5.7-stable)
@dvdplm
Copy link
Collaborator

dvdplm commented Sep 5, 2019

@grbIzl is this PR still relevant?

@grbIzl
Copy link
Collaborator Author

grbIzl commented Sep 6, 2019

@grbIzl is this PR still relevant?

I'll put this PR on ice. If we are not able to verify it in the following weeks, I'll prefer to close it.

@grbIzl grbIzl added the A1-onice 🌨 Pull request is reviewed well, but should not yet be merged. label Sep 6, 2019
s3krit and others added 6 commits September 12, 2019 18:43
* add more tx tests (#11038)
* Fix parallel transactions race-condition (#10995)
* Add blake2_f precompile (#11017)
* [trace] introduce trace failed to Ext (#11019)
* Edit publish-onchain.sh to use https (#11016)
* Fix deadlock in network-devp2p (#11013)
* EIP 1108: Reduce alt_bn128 precompile gas costs (#11008)
* xDai chain support and nodes list update (#10989)
* EIP 2028: transaction gas lowered from 68 to 16 (#10987)
* EIP-1344 Add CHAINID op-code (#10983)
* manual publish jobs for releases, no changes for nightlies (#10977)
* [blooms-db] Fix benchmarks (#10974)
* Verify transaction against its block during import (#10954)
* Better error message for rpc gas price errors (#10931)
* tx-pool: accept local tx with higher gas price when pool full (#10901)
* Fix fork choice (#10837)
* Cleanup unused vm dependencies (#10787)
* Fix compilation on recent nightlies (#10991)
* EIP 1884 Re-pricing of trie-size dependent operations  (#10992)
* Implement EIP-1283 reenable transition, EIP-1706 and EIP-2200  (#10191)
* ethcore/res: activate Istanbul on Ropsten, Görli, Rinkeby, Kovan (#11068)

* ethcore/res: activate Istanbul on Ropsten block 6485846

* ethcore/res: activate Istanbul on Goerli block 1561651

* ethcore/res: use hex values for Istanbul specs

* ethcore/res: fix trailing comma

* ethcore/res: be pedantic about EIP-1283 in Petersburg and Istanbul test specs

* ethcore/res: activate Istanbul on Rinkeby block 5435345

* ethcore/res: activate Istanbul on Kovan block 14111141

* ethcore/res: fix kovan istanbul number to 0xd751a5

* [json-spec] make blake2 pricing spec more readable (#11034)

* [json-spec] make blake2 pricing spec more readable

* [ethcore] fix compilation

* Manual backport of #11033
* Update CHANGELOG.md and version
@grbIzl grbIzl force-pushed the HandleImportOfForks branch from 5277eb3 to 4ad44bb Compare October 25, 2019 08:04
@grbIzl grbIzl force-pushed the HandleImportOfForks branch from bcdd470 to 3b555c2 Compare November 6, 2019 15:28
@grbIzl
Copy link
Collaborator Author

grbIzl commented Nov 15, 2019

The fix is verified. I'm closing this PR in favor of newly created #11265 with clear history of commits.

@grbIzl grbIzl closed this Nov 15, 2019
@denisgranha denisgranha deleted the HandleImportOfForks branch March 17, 2020 11:39
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A1-onice 🌨 Pull request is reviewed well, but should not yet be merged. A3-inprogress ⏳ Pull request is in progress. No review needed at this stage. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants