
eth, p2p/msgrate: move peer QoS tracking to its own package and use it for snap #22876

Merged: 5 commits into ethereum:master on May 19, 2021

Conversation

@karalabe (Member) commented on May 13, 2021

Way back when we wrote the downloader in 2015, we added various mechanisms to try and estimate the live throughput of our connected peers. The rationale was that a node will always have both faster and slower peers, so assigning the same amount of retrieval work to each would result in wildly different response times. This is problematic in chain syncing, because a very slow peer can block faster ones from delivering, as we need to import (verify) the data as a stream. By live-tracking the capacities of the different peers, we could make delivery times roughly the same regardless of how fast each peer was, which fully stabilized chain data import.
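To make the idea concrete, here is a minimal sketch (illustrative only, not the downloader's actual code) of the kind of bookkeeping this implies: smooth each peer's measured delivery rate with an exponential moving average, then size the next request so the response should land within a shared target round-trip time. All names and constants below are made up for the example.

```go
// Illustrative sketch only: per-peer throughput tracking with an exponential
// moving average, used to size requests for a common target round-trip time.
package main

import (
	"fmt"
	"time"
)

// peerRate holds the smoothed delivery rate of a single peer.
type peerRate struct {
	itemsPerSec float64 // smoothed delivery rate, items per second
}

const ewmaAlpha = 0.1 // smoothing factor; smaller reacts more slowly

// update folds a completed delivery (items served in elapsed time) into the
// smoothed rate estimate.
func (p *peerRate) update(items int, elapsed time.Duration) {
	measured := float64(items) / elapsed.Seconds()
	p.itemsPerSec = (1-ewmaAlpha)*p.itemsPerSec + ewmaAlpha*measured
}

// capacity returns how many items to request so the peer can (roughly) answer
// within the target round-trip time: fast peers get bigger requests.
func (p *peerRate) capacity(targetRTT time.Duration) int {
	items := int(p.itemsPerSec * targetRTT.Seconds())
	if items < 1 {
		return 1
	}
	return items
}

func main() {
	p := &peerRate{itemsPerSec: 64} // optimistic starting estimate
	p.update(128, 500*time.Millisecond)
	fmt.Println("next request size:", p.capacity(time.Second))
}
```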

A second important feature that grew out of this idea was automatic adjustment to local resources (e.g. bandwidth). If a node has many peers connected, together they might have enough capacity to overload the local node. By dynamically tracking how much data peers can deliver to us, we implicitly also ensure that we never request more than what we can download from them globally. This keeps the downloader more or less stable across various connection types, and it can work even on satellite links with huge latencies.


Whilst the mechanism was effective, it was hacked in all over the downloader. With snap sync being implemented completely outside of the original downloader, it became obvious that we needed to abstract out all that capacity-estimation and rate-limiting logic if we wanted to reuse it elsewhere too. Hence this PR, which does exactly that. It introduces a p2p/msgrate package to track and limit requests based on delivery times and amounts; it refactors the downloader to use the new package instead of the old mechanisms; and it also adds the new traffic shaping to snap sync.
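The usage pattern this enables looks roughly like the sketch below; the interface and message-kind names are illustrative stand-ins, not the exact p2p/msgrate API. A sync loop asks the per-peer tracker how much to request, issues the request, then feeds the observed delivery time back so future estimates (and timeouts) adapt.

```go
// Hypothetical usage sketch: the tracker interface and message kind here are
// stand-ins, not the exact p2p/msgrate API.
package main

import (
	"fmt"
	"time"
)

// tracker is a stand-in for a per-peer QoS tracker.
type tracker interface {
	// Capacity suggests how many items to request for a message kind so that
	// the reply should arrive within targetRTT.
	Capacity(kind uint64, targetRTT time.Duration) int
	// Update records that a number of items was delivered after elapsed time,
	// refining the throughput estimate for that message kind.
	Update(kind uint64, elapsed time.Duration, items int)
}

const accountRangeMsg = 0x00 // illustrative message kind identifier

// requestAccounts performs one request/deliver round against a peer whose
// throughput is tracked by t; send is the actual network call.
func requestAccounts(t tracker, targetRTT time.Duration, send func(max int) (int, error)) error {
	max := t.Capacity(accountRangeMsg, targetRTT)

	start := time.Now()
	items, err := send(max)
	if err != nil {
		return err
	}
	t.Update(accountRangeMsg, time.Since(start), items)
	return nil
}

// fixedTracker is a trivial tracker, only here to make the sketch runnable.
type fixedTracker struct{ cap int }

func (f *fixedTracker) Capacity(uint64, time.Duration) int { return f.cap }
func (f *fixedTracker) Update(uint64, time.Duration, int)  {}

func main() {
	t := &fixedTracker{cap: 128}
	err := requestAccounts(t, time.Second, func(max int) (int, error) {
		fmt.Println("requesting up to", max, "accounts")
		return max, nil // pretend the peer delivered a full response
	})
	fmt.Println("err:", err)
}
```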

Performance-wise, snap syncing with the old code and the new code takes the same amount of time.


What we can notice with this PR is that whilst previously snap sync had quite random serving times (at 100KB request caps) across the different peers:

[Screenshot from 2021-05-13 21-29-53: per-peer serving times before this PR]

This PR stabilizes the serving times so they hover around the same mean and have a nice normal distribution, all whilst allowing packet sizes to range between 64KB and 512KB.

[Screenshot from 2021-05-13 21-30-48: per-peer serving times with this PR]


Packet counts also go down, as the PR can detect that larger packets can be fulfilled within the same time limits, so there is no need for as many round trips.

Review thread on eth/protocols/snap/sync.go (outdated, resolved)
@fjl self-assigned this on May 18, 2021
@fjl removed the status:triage label on May 18, 2021
@fjl added this to the 1.10.4 milestone on May 18, 2021
@fjl changed the title from "eth, p2p/msgrate: extract sync message rater, add to snap too" to "eth, p2p/msgrate: move peer QoS tracking to its own package and use it for snap" on May 19, 2021
@fjl merged commit 3e79588 into ethereum:master on May 19, 2021
atif-konasl pushed a commit to frozeman/pandora-execution-engine that referenced this pull request on Oct 15, 2021:
eth, p2p/msgrate: move peer QoS tracking to its own package and use it for snap (ethereum#22876)

This change extracts the peer QoS tracking logic from eth/downloader, moving
it into the new package p2p/msgrate. The job of msgrate.Tracker is determining
suitable timeout values and request sizes per peer.

The snap sync scheduler now uses msgrate.Tracker instead of the hard-coded 15s
timeout. This should make the sync work better on network links with high latency.
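As a hedged illustration of what replacing the hard-coded timeout with a tracked one means (a sketch of the idea, not the msgrate implementation), a per-request timeout can be derived from the round-trip times observed across the connected peers and clamped to sane bounds:

```go
// Sketch: derive the request timeout from observed peer round-trip times
// instead of a fixed constant. Constants and names are illustrative.
package main

import (
	"fmt"
	"sort"
	"time"
)

const (
	minTimeout    = 2 * time.Second  // never cut peers off too aggressively
	maxTimeout    = 30 * time.Second // never wait forever on a stalled peer
	timeoutFactor = 3                // allow a few RTTs of slack before timing out
)

// dynamicTimeout picks a timeout a few multiples above the median observed
// round-trip time, tolerating high-latency links without letting a dead peer
// block progress indefinitely.
func dynamicTimeout(observedRTTs []time.Duration) time.Duration {
	if len(observedRTTs) == 0 {
		return maxTimeout // no data yet, be generous
	}
	sorted := append([]time.Duration(nil), observedRTTs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	median := sorted[len(sorted)/2]

	timeout := median * timeoutFactor
	if timeout < minTimeout {
		return minTimeout
	}
	if timeout > maxTimeout {
		return maxTimeout
	}
	return timeout
}

func main() {
	rtts := []time.Duration{120 * time.Millisecond, 800 * time.Millisecond, 2 * time.Second}
	fmt.Println("request timeout:", dynamicTimeout(rtts))
}
```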