Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ETCM-446: Connection limit ranges #833

Merged
merged 21 commits into from
Dec 15, 2020
Merged

Conversation

aakoshh
Copy link
Contributor

@aakoshh aakoshh commented Dec 4, 2020

Description

Currently the node will try to connect to nodes until their max-outgoing-peers is reached and limit incoming connection to max-incoming-peers. The problem is once incoming peers are saturated over the network it's difficult to find nodes to connect to, and this situation is likely to arise if outgoing >= incoming is configured.

The PR changes the connection strategy to be based on acceptable ranges:

  • min-outgoing-peers: The node is not seeking new connections if the outgoing handshaked connection count is above this number. Otherwise it tries to open up to max-outgoing-peers connections.
  • prune-incoming-peers: A number of incoming peers to try to prune if the incoming handshaked peer count hit the maximum and we are rejecting connections with TooManyPeers.
  • min-prune-age: A duration that a peer is minimally allowed to be connected before being selected for random pruning, and also a minimum amount of time that needs to pass between pruning attempts, to avoid hostile takeovers.

Proposed Solution

There are two ideas:

  1. We can try to aggressively connect to a lot of nodes in the beginning, expecting that many of them won't accept our connection, but once we have enough peers, we can stop at a lower level, only opening more if we fall below that.
  2. We can close down incoming connections if the overall number goes beyond a limit, so that new nodes can be accepted into the network instead. This also compensates to the aggressive over-connecting in the beginning.

The new default settings target at a 40/60 ratio for min-outgoing / max-incoming, to protect against eclipse attacks but also maintain that total(max-incoming) >= total(min-outgoing) on the the network level. max-outgoing is high to allow quickly trying to open many connections after startup, anticipating that most nodes will not be compatible or receptive.

Currently the pruning only kicks in when a new incoming connection would bring the number above the maximum. We could also prune periodically whenever the number equals the max. That would allow new joiners to be accepted immediately rather than when they try to re-connect after a short blacklisting.

To decide which connections to prune we take the average number of received responses over the lifetime of the peers, with higher being better. The statistics are collected for the last 12 hours, which is the same time as the bonding duration in discovery.

@aakoshh aakoshh force-pushed the ETCM-446-connection-limit-ranges branch 5 times, most recently from cf34f4a to 5ed1481 Compare December 4, 2020 15:31
@aakoshh aakoshh force-pushed the ETCM-446-connection-limit-ranges branch from 989f989 to e8e0ee6 Compare December 4, 2020 17:13
@aakoshh aakoshh marked this pull request as ready for review December 7, 2020 15:40
src/main/resources/application.conf Outdated Show resolved Hide resolved
)
}

def prunePeers(
incoming: Boolean,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this flag is necessary ? Maybe i am missing some thing, but in call sites we are passing here true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we could prune outgoing connections as well, the communication is bi-directional on them, maybe some of them would be better replaced with other nodes, perhaps if they are slow. But in this PR it's just about the incoming ones, to free incoming slots.


case class ConnectedPeers(
private val incomingPendingPeers: Map[PeerId, Peer],
private val outgoingPendingPeers: Map[PeerId, Peer],
private val handshakedPeers: Map[PeerId, Peer]
private val handshakedPeers: Map[PeerId, Peer],
private val pruningPeers: Map[PeerId, Peer],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to make sure, this is necessary to not prune peer twice while waiting for its proper disconnect ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, not to prune the same peer twice and also to remember the overall number of peers we are trying to prune, so that if the routine is ran again, it will know that it doesn't have to prune yet another 10 random nodes.

@ntallar ntallar removed their request for review December 14, 2020 18:25
aakoshh and others added 3 commits December 14, 2020 20:50
@aakoshh aakoshh requested a review from enriquerodbe December 15, 2020 12:24
@aakoshh aakoshh assigned KonradStaniec and unassigned aakoshh Dec 15, 2020
@aakoshh aakoshh merged commit b368808 into develop Dec 15, 2020
@aakoshh aakoshh deleted the ETCM-446-connection-limit-ranges branch December 15, 2020 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants