chore: update the default per peer bandwidth limits #2390
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2390 +/- ##
==========================================
- Coverage 20.62% 20.61% -0.01%
==========================================
Files 131 131
Lines 15266 15269 +3
==========================================
Hits 3148 3148
- Misses 11816 11819 +3
Partials 302 302
@@ -175,6 +183,9 @@ func DefaultConsensusConfig() *tmcfg.Config {
	cfg.Consensus.TimeoutCommit = appconsts.TimeoutCommit
	cfg.Consensus.SkipTimeoutCommit = false
	cfg.TxIndex.Indexer = "null"

	cfg.P2P.SendRate = int64(defaultBandwidth / cfg.P2P.MaxNumInboundPeers)
Shouldn't it be:

- cfg.P2P.SendRate = int64(defaultBandwidth / cfg.P2P.MaxNumInboundPeers)
+ cfg.P2P.SendRate = int64(defaultBandwidth / cfg.P2P.MaxNumOutboundPeers)
In theory yeah, but I think having a default where peers are sending more bandwidth than that peer is configured to receive is likely a bad idea. That being said, I haven't tried it.
This being due to nodes by default having more inbound than outbound peers.
having a default where peers are sending more bandwidth than that peer is configured to receive is likely a bad idea.
The total amount of consumed bandwidth for upload and download would remain the same for a node; only the amount dedicated to each connected peer would change.
this being due to nodes by default having more inbound than outbound peers
I see. Is the concern about the symmetry in the send and receive rate per connection? Unless two connected peers ensure parity in their counts of inbound connections, disparities in their sending and receiving bandwidth could arise.
I see. Is the concern about the symmetry in the send and receive rate per connection?
Yeah, exactly. Having a default send rate per peer that is higher than the receive rate naively seemed like a bad idea. Our default OutboundPeers max is 10, while our default inbound is 40, so we could end up sending 4 times as much data as our peer is willing to receive.
After further thought, I think there may be disadvantages to increasing this, namely that Tendermint sends mempool txs in a constant loop even if it has already sent them before, so nodes would constantly use 1 Gbit even when they don't need to. We should take a step back and consider the issue holistically.
Good point, converted to a draft for now. We can definitely wait until we make CAT the default.
What if we only doubled RecvRate and SendRate to 1MB each for now? The fact that the mempool gossiping isn't efficient is unfortunate, but it shouldn't force us to under-utilize the validators' bandwidth. So I think increasing it to 1MB, which would use 400 Mbit/s of bandwidth, is a good first step/compromise to see how the validators behave. Any thoughts @liamsi?
celestiaorg/celestia-core#1089 would fix this.
@@ -28,6 +28,14 @@ import (
	coretypes "github.com/tendermint/tendermint/types"
)

const (
	// oneGigabit 1Gbit == 1 billion bits
	oneGigabit = 1_000_000_000
What's the rationale behind this number?
IIRC, this was the bandwidth recommendation for a consensus node (side note: this should be gigabyte, I think, since we're converting to bits below).
I could be wrong though, in which case we should update the default bandwidth expectation below.
Isn't it currently 5MB/s, so doubling would be 10MB/s?
AFAIK it's 512KB/s in the config.toml, last time I checked.
The default is currently 5MB/s. This PR sets it to 12.5MB/s.
Currently this PR tries to use the default outbound peers to determine bandwidth limits per peer, but I don't think that's accurate since we have 40 inbound.
Is this from a recent default config? I vividly remember it being 512KB/s, and that was also what peers were throttled to. It says 512KB/s here: https://github.com/celestiaorg/celestia-core/blob/86e67b726f017e6c05c4a627c137ba78667dbb66/p2p/conn/connection.go#L43
The default send and recv rate for
Closing this draft until further experiments have been done. It also makes incorrect assumptions about inbound and outbound peers.
Overview
part of #2387
Fair warning: this is technically using the metric version of bandwidth, so 1 Gigabit = 1_000_000_000 bits (a power of 10 instead of a power of 2).
Checklist