This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Fix unstable peers and slowness in sync #9967

Merged
merged 5 commits into from
Nov 28, 2018

ngotchac
Contributor

This PR fixes some issues with the stability of peer counts, and overall slowness in block sync.

Here are graphs that compare block height against time for the current stable and this branch:
[Figure: block_heights]
Here are the peer counts:
[Figure: peer_counts]

The issue was that `sync.continue_sync` was called every time a packet was received, so each packet triggered a pass over the whole list of peers (randomizing it, and so on). This could be quite expensive when connected to a lot of peers.

Now it is called on a timer, every 2.5 seconds.
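
As a rough illustration of the change described above, here is a minimal sketch that keeps the per-packet path cheap and moves the full pass over all peers onto a timer tick. `ToySync`, `on_packet`, `on_timer` and the counters are hypothetical names made up for this example and are not the actual types in `ethcore/sync`; only the name `continue_sync` and the 2.5 second interval come from the description.

```rust
use std::time::{Duration, Instant};

/// Interval between full passes over the peer list (2.5 seconds, per the description).
const CONTINUE_SYNC_INTERVAL: Duration = Duration::from_millis(2500);

/// Hypothetical stand-in for the sync state; not the real sync implementation.
struct ToySync {
    peers: Vec<u32>,
    last_full_pass: Instant,
    full_passes: usize,
    packets_handled: usize,
}

impl ToySync {
    fn new(peers: Vec<u32>, now: Instant) -> Self {
        ToySync { peers, last_full_pass: now, full_passes: 0, packets_handled: 0 }
    }

    /// Cheap per-packet path: only accounts for the packet that arrived.
    /// Before the change, this is where the full pass would have been triggered.
    fn on_packet(&mut self, _peer: u32) {
        self.packets_handled += 1;
    }

    /// Expensive path: visits every peer and returns how many were visited.
    fn continue_sync(&mut self) -> usize {
        let mut visited = 0;
        for _peer in &self.peers {
            // A real client would decide here what to request from this peer.
            visited += 1;
        }
        self.full_passes += 1;
        visited
    }

    /// Timer tick: runs the full pass only if the interval has elapsed.
    fn on_timer(&mut self, now: Instant) {
        if now.duration_since(self.last_full_pass) >= CONTINUE_SYNC_INTERVAL {
            self.continue_sync();
            self.last_full_pass = now;
        }
    }
}

fn main() {
    let start = Instant::now();
    let mut sync = ToySync::new((0..50).collect(), start);

    // Many packets arrive; none of them triggers a full pass.
    for i in 0..1000u32 {
        sync.on_packet(i % 50);
    }

    // Simulated timer ticks, using explicit timestamps instead of sleeping.
    sync.on_timer(start + Duration::from_secs(1)); // too early, skipped
    sync.on_timer(start + Duration::from_millis(2500)); // full pass
    sync.on_timer(start + Duration::from_secs(5)); // full pass

    println!("packets: {}, full passes: {}", sync.packets_handled, sync.full_passes);
}
```

With this shape, the cost of the full pass scales with the number of timer ticks rather than with the number of packets received.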

@ngotchac ngotchac added A0-pleasereview 🤓 Pull request needs code review. M4-core ⛓ Core client code / Rust. labels Nov 26, 2018
Collaborator

@sorpaas sorpaas left a comment


LGTM!

Some tests would need to be fixed. It seems we can just add a call to `continue_sync` in `sync::tests::helpers::EthPeer::sync_step`.

ethcore/sync/src/api.rs (outdated, resolved)
ethcore/sync/src/chain/mod.rs (outdated, resolved)
ethcore/sync/src/chain/mod.rs (outdated, resolved)
@sorpaas sorpaas added A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. and removed A0-pleasereview 🤓 Pull request needs code review. labels Nov 26, 2018
@5chdn 5chdn added this to the 2.3 milestone Nov 26, 2018
@5chdn 5chdn added A6-mustntgrumble 💦 Pull request has areas for improvement. The author need not address them before merging. and removed A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. labels Nov 27, 2018
@sorpaas sorpaas added A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. and removed A6-mustntgrumble 💦 Pull request has areas for improvement. The author need not address them before merging. labels Nov 27, 2018
Collaborator

@tomusdrw tomusdrw left a comment


Amazing 🎉! Great work @ngotchac!

Contributor

@andresilva andresilva left a comment


LGTM! Great job 🥇 :)

@andresilva
Contributor

Should we backport this as well?

@sorpaas sorpaas added A8-looksgood 🦄 Pull request is reviewed well. and removed A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. labels Nov 28, 2018
@5chdn 5chdn merged commit 5e9dc18 into master Nov 28, 2018
@5chdn 5chdn deleted the ng-stable-peers-2 branch November 28, 2018 11:19
5chdn pushed a commit that referenced this pull request Nov 28, 2018
* Don't sync all peers after each response

* Update formating

* Fix tests: add `continue_sync` to `Sync_step`

* Update ethcore/sync/src/chain/mod.rs

Co-Authored-By: ngotchac <ngotchac@gmail.com>
@5chdn 5chdn mentioned this pull request Nov 28, 2018
12 tasks
@folsen
Contributor

folsen commented Nov 29, 2018

@ngotchac just out of curiosity, how did you generate the graphs?

5chdn added a commit that referenced this pull request Nov 29, 2018
* version: bump beta to 2.2.2

* Add experimental RPCs flag (#9928)

* WiP

* Enable experimental RPCs.

* Keep existing blocks when restoring a Snapshot (#8643)

* Rename db_restore => client

* First step: make it compile!

* Second step: working implementation!

* Refactoring

* Fix tests

* PR Grumbles

* PR Grumbles WIP

* Migrate ancient blocks interating backward

* Early return in block migration if snapshot is aborted

* Remove RwLock getter (PR Grumble I)

* Remove dependency on `Client`: only used Traits

* Add test for recovering aborted snapshot recovery

* Add test for migrating old blocks

* Fix build

* PR Grumble I

* PR Grumble II

* PR Grumble III

* PR Grumble IV

* PR Grumble V

* PR Grumble VI

* Fix one test

* Fix test

* PR Grumble

* PR Grumbles

* PR Grumbles II

* Fix tests

* Release RwLock earlier

* Revert Cargo.lock

* Update _update ancient block_ logic: set local in `commit`

* Update typo in ethcore/src/snapshot/service.rs

Co-Authored-By: ngotchac <ngotchac@gmail.com>

* Adjust requests costs for light client (#9925)

* PIP Table Cost relative to average peers instead of max peers

* Add tracing in PIP new_cost_table

* Update stat peer_count

* Use number of leeching peers for Light serve costs

* Fix test::light_params_load_share_depends_on_max_peers (wrong type)

* Remove (now) useless test

* Remove `load_share` from LightParams.Config
Prevent div. by 0

* Add LEECHER_COUNT_FACTOR

* PR Grumble: u64 to u32 for f64 casting

* Prevent u32 overflow for avg_peer_count

* Add tests for LightSync::Statistics

* Fix empty steps (#9939)

* Don't send empty step twice or empty step then block.

* Perform basic validation of locally sealed blocks.

* Don't include empty step twice.

* prevent silent errors in daemon mode, closes #9367 (#9946)

* Fix a deadlock (#9952)

* Update informant:
  - decimal in Mgas/s
  - print every 5s (not randomly between 5s and 10s)

* Fix dead-lock in `blockchain.rs`

* Update locks ordering

* Fix light client informant while syncing (#9932)

* Add `is_idle` to LightSync to check importing status

* Use SyncStateWrapper to make sure is_idle gets updates

* Update is_major_import to use verified queue size as well

* Add comment for `is_idle`

* Add Debug to `SyncStateWrapper`

* `fn get` -> `fn into_inner`

*  ci: rearrange pipeline by logic (#9970)

* ci: rearrange pipeline by logic

* ci: rename docs script

* fix docker build (#9971)

* Deny unknown fields for chainspec (#9972)

* Add deny_unknown_fields to chainspec

* Add tests and fix existing one

* Remove serde_ignored dependency for chainspec

* Fix rpc test eth chain spec

* Fix starting_nonce_test spec

* Improve block and transaction propagation (#9954)

* Refactor sync to add priority tasks.

* Send priority tasks notifications.

* Propagate blocks, optimize transactions.

* Implement transaction propagation. Use sync_channel.

* Tone down info.

* Prevent deadlock by not waiting forever for sync lock.

* Fix lock order.

* Don't use sync_channel to prevent deadlocks.

* Fix tests.

* Fix unstable peers and slowness in sync (#9967)

* Don't sync all peers after each response

* Update formating

* Fix tests: add `continue_sync` to `Sync_step`

* Update ethcore/sync/src/chain/mod.rs

Co-Authored-By: ngotchac <ngotchac@gmail.com>

* fix rpc middlewares

* fix Cargo.lock

* json: resolve merge in spec

* rpc: fix starting_nonce_test

* ci: allow nightl job to fail
@ngotchac
Contributor Author

@folsen I wrote a small Rust script that basically runs a given binary with a given config, then plots the graphs using gnuplot. Are you interested? I can publish it on GitHub.

@folsen
Contributor

folsen commented Nov 29, 2018

@ngotchac definitely interested, seems to cross over with what some of the QA guys are working on

@ngotchac
Contributor Author

The script used to generate the graphs: https://github.com/ngotchac/eth-metrics
It's very much a proof of concept for now.
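
The linked repository is the actual tool. Purely as an illustration of the general idea (sample values over time, then hand them to gnuplot), a sketch might look like the following. The helper names, file names and sample values are hypothetical and are not taken from eth-metrics; the sketch only builds the data and script text and does not launch a node or invoke gnuplot.

```rust
/// Hypothetical helper: one "elapsed_seconds block_height" row per sample,
/// in the whitespace-separated layout gnuplot reads by default.
fn to_gnuplot_data(samples: &[(f64, u64)]) -> String {
    samples
        .iter()
        .map(|(t, h)| format!("{:.1} {}\n", t, h))
        .collect()
}

/// Hypothetical helper: a gnuplot script that plots column 2 against column 1
/// of `data_file` and writes the result to `output_png`.
fn gnuplot_script(data_file: &str, output_png: &str) -> String {
    format!(
        "set terminal png\n\
         set output \"{out}\"\n\
         set xlabel \"time (s)\"\n\
         set ylabel \"block height\"\n\
         plot \"{data}\" using 1:2 with lines title \"block height\"\n",
        out = output_png,
        data = data_file
    )
}

fn main() {
    // Made-up samples for illustration only; a real run would collect these
    // by polling the node while it syncs.
    let samples: [(f64, u64); 3] = [(0.0, 100), (2.5, 160), (5.0, 240)];
    print!("{}", to_gnuplot_data(&samples));
    print!("{}", gnuplot_script("heights.dat", "block_heights.png"));
}
```

Writing the two strings to `heights.dat` and a script file, then running gnuplot on the script, would be one way to produce a plot like the ones in the description.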

niklasad1 pushed a commit that referenced this pull request Dec 16, 2018
* Don't sync all peers after each response

* Update formating

* Fix tests: add `continue_sync` to `Sync_step`

* Update ethcore/sync/src/chain/mod.rs

Co-Authored-By: ngotchac <ngotchac@gmail.com>
Labels
A8-looksgood 🦄 Pull request is reviewed well. M4-core ⛓ Core client code / Rust.
6 participants