-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ETCM-446: Connection limit ranges #833
Conversation
cf34f4a
to
5ed1481
Compare
…until min handshaked is reached.
989f989
to
e8e0ee6
Compare
) | ||
} | ||
|
||
def prunePeers( | ||
incoming: Boolean, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this flag is necessary ? Maybe i am missing some thing, but in call sites we are passing here true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory we could prune outgoing connections as well, the communication is bi-directional on them, maybe some of them would be better replaced with other nodes, perhaps if they are slow. But in this PR it's just about the incoming ones, to free incoming slots.
|
||
case class ConnectedPeers( | ||
private val incomingPendingPeers: Map[PeerId, Peer], | ||
private val outgoingPendingPeers: Map[PeerId, Peer], | ||
private val handshakedPeers: Map[PeerId, Peer] | ||
private val handshakedPeers: Map[PeerId, Peer], | ||
private val pruningPeers: Map[PeerId, Peer], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to make sure, this is necessary to not prune peer twice while waiting for its proper disconnect ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, not to prune the same peer twice and also to remember the overall number of peers we are trying to prune, so that if the routine is ran again, it will know that it doesn't have to prune yet another 10 random nodes.
Co-authored-by: Enrique Rodríguez <enrique.rodriguez@iohk.io>
…utput-hk/mantis into ETCM-446-connection-limit-ranges
Description
Currently the node will try to connect to nodes until their
max-outgoing-peers
is reached and limit incoming connection tomax-incoming-peers
. The problem is once incoming peers are saturated over the network it's difficult to find nodes to connect to, and this situation is likely to arise ifoutgoing >= incoming
is configured.The PR changes the connection strategy to be based on acceptable ranges:
min-outgoing-peers
: The node is not seeking new connections if the outgoing handshaked connection count is above this number. Otherwise it tries to open up tomax-outgoing-peers
connections.prune-incoming-peers
: A number of incoming peers to try to prune if the incoming handshaked peer count hit the maximum and we are rejecting connections withTooManyPeers
.min-prune-age
: A duration that a peer is minimally allowed to be connected before being selected for random pruning, and also a minimum amount of time that needs to pass between pruning attempts, to avoid hostile takeovers.Proposed Solution
There are two ideas:
The new default settings target at a 40/60 ratio for
min-outgoing / max-incoming
, to protect against eclipse attacks but also maintain thattotal(max-incoming) >= total(min-outgoing)
on the the network level.max-outgoing
is high to allow quickly trying to open many connections after startup, anticipating that most nodes will not be compatible or receptive.Currently the pruning only kicks in when a new incoming connection would bring the number above the maximum. We could also prune periodically whenever the number equals the max. That would allow new joiners to be accepted immediately rather than when they try to re-connect after a short blacklisting.
To decide which connections to prune we take the average number of received responses over the lifetime of the peers, with higher being better. The statistics are collected for the last 12 hours, which is the same time as the bonding duration in discovery.