Fix: snappy downloader #5393
…if there's download pressure
… so that we only forward results that contain blocks (drop tx and stackerdb messages)
…-network/stacks-blockchain into fix/relayer-drain-channel
…ded), and merge un-sent NetworkResult's in order to keep the queue length bound to at most one outstanding NetworkResult
… and clean out completed tenures based on whether or not we see them processed
LGTM
I think this breaks the simple_neon_integration test. I don't see it failing anywhere else (it passes on develop with prom metrics enabled). It seems a change to the prometheus metric in this PR is what breaks it.
Will reapprove once simple_neon_integration test is fixed.
LGTM -- will approve once the prom test issue is resolved
… reward set for nakamoto prepare phases eagerly, and pass the stacks tip height via NetworkResult to the relayer so it can update prometheus
LGTM, just some logging comments
LGTM!
This fixes a few bugs in the relayer and networking stack:
It removes a convoy effect that can happen when the node is under load. Before, the channel between the p2p thread and relayer thread could grow unbounded if the relayer couldn't keep up with bursts of NetworkResults. In this PR, the p2p thread merges outstanding NetworkResults into a single NetworkResult and drops / consolidates obsolete data, which both minimizes the relayer's total workload and minimizes the time between receiving a data-bearing message and processing it.
It fixes the block downloader so that it detects and deprioritizes unhealthy replicas during block download, so that most of the time, the node is only querying replicas that can serve it data. It also improves error and retry logging in the downloader.
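For reviewers who want a mental model of the merge-before-send behavior in the first item, here is a minimal sketch in Rust. It is not the code in this PR: `PendingResult`, `Forwarder`, and their fields are hypothetical stand-ins for the real NetworkResult and p2p-to-relayer plumbing, and dropping a block-less result outright under download pressure is an assumption made to keep the example short. It only illustrates the idea of keeping at most one outstanding result in the channel and folding anything unsent into the next one.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{SyncSender, TrySendError};

/// Hypothetical, simplified stand-in for the node's NetworkResult.
#[derive(Debug, Default, Clone, PartialEq)]
struct PendingResult {
    /// Blocks keyed by height; always retained when merging.
    blocks: BTreeMap<u64, Vec<u8>>,
    /// Transactions; dropped when there is download pressure.
    txs: Vec<Vec<u8>>,
    /// Highest chain tip height seen so far.
    tip_height: u64,
}

impl PendingResult {
    fn has_blocks(&self) -> bool {
        !self.blocks.is_empty()
    }

    /// Fold a newer result into this one, consolidating overlapping data.
    fn merge(mut self, newer: PendingResult) -> PendingResult {
        self.blocks.extend(newer.blocks); // newer copy of a height wins
        self.txs.extend(newer.txs);
        self.tip_height = self.tip_height.max(newer.tip_height);
        self
    }
}

/// Hypothetical sender-side wrapper around a bounded channel.
struct Forwarder {
    tx: SyncSender<PendingResult>,
    /// A result that could not be sent because the channel was full.
    unsent: Option<PendingResult>,
}

impl Forwarder {
    fn new(tx: SyncSender<PendingResult>) -> Self {
        Forwarder { tx, unsent: None }
    }

    /// Try to forward `result`; returns true if something was sent.
    fn forward(&mut self, result: PendingResult, download_pressure: bool) -> bool {
        // Merge with whatever is still waiting from a previous attempt.
        let mut merged = match self.unsent.take() {
            Some(older) => older.merge(result),
            None => result,
        };
        if download_pressure {
            // Assumption: under pressure, only block-bearing results matter.
            merged.txs.clear();
            if !merged.has_blocks() {
                return false;
            }
        }
        match self.tx.try_send(merged) {
            Ok(()) => true,
            Err(TrySendError::Full(back)) => {
                // Receiver is busy: hold on to the merged result for next time.
                self.unsent = Some(back);
                false
            }
            Err(TrySendError::Disconnected(_)) => false,
        }
    }
}
```

With a channel created by `std::sync::mpsc::sync_channel(1)`, the receiver never has more than one result queued, and any burst that arrives while it is busy collapses into a single follow-up result.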
To stress-test the downloader, it adds an option to disable block-push altogether, so the node is forced to download everything.
It fixes an off-by-one error in the p2p stack which was preventing it from caching reward sets. Instead, the p2p stack would always fetch reward sets from disk, which led to performance degradation.
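As a rough illustration of what "detects and deprioritizes unhealthy replicas" in the downloader item could look like, here is a small sketch in Rust. It is not the downloader's actual logic: `ReplicaTracker`, `ReplicaStats`, and the ordering rule (fewest recorded failures first, then most successes) are assumptions chosen for the example.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Hypothetical per-replica bookkeeping; the real downloader state differs.
#[derive(Debug, Default, Clone, Copy)]
struct ReplicaStats {
    successes: u32,
    failures: u32,
}

/// Hypothetical tracker that remembers how each replica has behaved.
#[derive(Debug, Default)]
struct ReplicaTracker {
    stats: HashMap<String, ReplicaStats>,
}

impl ReplicaTracker {
    /// Record the outcome of one download attempt against `replica`.
    fn record(&mut self, replica: &str, ok: bool) {
        let entry = self.stats.entry(replica.to_string()).or_default();
        if ok {
            entry.successes += 1;
        } else {
            entry.failures += 1;
        }
    }

    /// Order candidates so replicas with fewer failures are queried first.
    /// Ties are broken by more successes; the sort is stable, so replicas
    /// with identical stats keep their original relative order.
    fn prioritize(&self, candidates: &[&str]) -> Vec<String> {
        let mut ordered: Vec<String> = candidates.iter().map(|s| s.to_string()).collect();
        ordered.sort_by_key(|name| {
            let s = self.stats.get(name).copied().unwrap_or_default();
            (s.failures, Reverse(s.successes))
        });
        ordered
    }
}
```

Unhealthy replicas are moved to the back rather than removed, so they can still be tried if every healthier replica fails to serve the data.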