ETCM-446: Connection limit ranges #833

aakoshh · 2020-12-04T13:19:52Z

Description

Currently the node tries to connect to other nodes until its max-outgoing-peers is reached, and limits incoming connections to max-incoming-peers. The problem is that once incoming slots are saturated across the network, it is difficult to find nodes to connect to, and this situation is likely to arise if outgoing >= incoming is configured.

The PR changes the connection strategy to be based on acceptable ranges:

min-outgoing-peers: The node is not seeking new connections if the outgoing handshaked connection count is above this number. Otherwise it tries to open up to max-outgoing-peers connections.
prune-incoming-peers: The number of incoming peers to try to prune when the incoming handshaked peer count hits the maximum and we are rejecting connections with TooManyPeers.
min-prune-age: The minimum duration a peer must have been connected before it can be selected for random pruning, and also the minimum time that must pass between pruning attempts, to avoid hostile takeovers.
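The outgoing range can be sketched as follows. This is a minimal illustration, not the PR's code: `OutgoingLimits`, `OutgoingConnections` and `attemptsToStart` are hypothetical names, the value 20 is made up for the example, and treating "reached the minimum" as sufficient is an assumption about the boundary case.

```scala
// Hypothetical types for illustration; only the setting names come from the description.
final case class OutgoingLimits(minOutgoingPeers: Int, maxOutgoingPeers: Int) {
  require(minOutgoingPeers <= maxOutgoingPeers, "min must not exceed max")
}

object OutgoingConnections {

  /** Number of new outgoing connection attempts to start.
    *
    * @param handshaked outgoing connections that completed the handshake
    * @param pending    outgoing connection attempts still in progress
    */
  def attemptsToStart(limits: OutgoingLimits, handshaked: Int, pending: Int): Int =
    if (handshaked >= limits.minOutgoingPeers) {
      // Assumption: once the minimum is reached we stop seeking peers.
      0
    } else {
      // Below the minimum: fill up to the maximum, counting attempts in flight.
      math.max(0, limits.maxOutgoingPeers - handshaked - pending)
    }
}
```

With a minimum of 20 and a maximum of 50, a node holding 5 handshaked and 10 pending outgoing connections would start 35 more attempts, while a node that already has 20 handshaked connections would start none.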

Proposed Solution

There are two ideas:

We can try to aggressively connect to a lot of nodes in the beginning, expecting that many of them won't accept our connection, but once we have enough peers, we can stop at a lower level, only opening more if we fall below that.
We can close down incoming connections if the overall number goes beyond a limit, so that new nodes can be accepted into the network instead. This also compensates for the aggressive over-connecting in the beginning.

The new default settings target a 40/60 ratio for min-outgoing / max-incoming, to protect against eclipse attacks while also maintaining total(max-incoming) >= total(min-outgoing) at the network level. max-outgoing is high to allow quickly trying to open many connections after startup, anticipating that most nodes will not be compatible or receptive.

Currently the pruning only kicks in when a new incoming connection would bring the number above the maximum. We could also prune periodically whenever the number equals the max. That would allow new joiners to be accepted immediately rather than when they try to re-connect after a short blacklisting.
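A rough sketch of the trigger condition, combining the maximum check with min-prune-age. The object and method names are hypothetical, and passing timestamps as epoch milliseconds is an assumption made to keep the example self-contained.

```scala
import scala.concurrent.duration._

object PruneTrigger {

  /** Whether a pruning round may start now (timestamps in epoch millis).
    * Pruning needs the incoming slots to be full and enough time to have
    * passed since the previous pruning attempt.
    */
  def shouldPrune(
      incomingHandshaked: Int,
      maxIncomingPeers: Int,
      lastPruneMillis: Long,
      nowMillis: Long,
      minPruneAge: FiniteDuration
  ): Boolean =
    incomingHandshaked >= maxIncomingPeers &&
      nowMillis - lastPruneMillis >= minPruneAge.toMillis

  /** Whether a single peer has been connected long enough to be a candidate. */
  def isOldEnough(connectedAtMillis: Long, nowMillis: Long, minPruneAge: FiniteDuration): Boolean =
    nowMillis - connectedAtMillis >= minPruneAge.toMillis
}
```

The same predicate could be evaluated either when an incoming connection is rejected or from a periodic timer; only the call site would differ.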

To decide which connections to prune we take the average number of received responses over the lifetime of each peer, with higher being better. The statistics are collected over the last 12 hours, which matches the bonding duration in discovery.
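The priority could look roughly like this. `PeerSummary` and the functions below are hypothetical stand-ins for whatever the statistics actor provides, and clamping the lifetime to at least one second is an assumption added to avoid division by zero.

```scala
object PrunePriority {

  /** Hypothetical per-peer summary; not the PR's actual types. */
  final case class PeerSummary(id: String, responsesReceived: Long, firstSeenMillis: Long)

  /** Average responses received per second over the observed lifetime; higher is better. */
  def priority(peer: PeerSummary, nowMillis: Long): Double = {
    // Clamp to one second so a brand-new peer does not divide by zero.
    val lifetimeSeconds = math.max(1L, (nowMillis - peer.firstSeenMillis) / 1000)
    peer.responsesReceived.toDouble / lifetimeSeconds
  }

  /** The `count` peers with the lowest priority, i.e. the first candidates to prune. */
  def lowestPriority(peers: Seq[PeerSummary], nowMillis: Long, count: Int): Seq[PeerSummary] =
    peers.sortBy(priority(_, nowMillis)).take(count)
}
```

Dividing by lifetime rather than using the raw count keeps long-lived peers from winning purely because they have been connected longer.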


KonradStaniec · 2020-12-09T11:37:39Z

src/main/scala/io/iohk/ethereum/network/ConnectedPeers.scala

    )
  }
+
+  def prunePeers(
+      incoming: Boolean,


Why is this flag necessary? Maybe I am missing something, but at the call sites we are passing true here.

In theory we could prune outgoing connections as well, since communication on them is bi-directional. Some of them might be better replaced with other nodes, for example if they are slow. But this PR is only about the incoming ones, to free incoming slots.

KonradStaniec · 2020-12-09T11:39:14Z

src/main/scala/io/iohk/ethereum/network/ConnectedPeers.scala


 case class ConnectedPeers(
    private val incomingPendingPeers: Map[PeerId, Peer],
    private val outgoingPendingPeers: Map[PeerId, Peer],
-    private val handshakedPeers: Map[PeerId, Peer]
+    private val handshakedPeers: Map[PeerId, Peer],
+    private val pruningPeers: Map[PeerId, Peer],


Just to make sure: is this necessary so that we do not prune a peer twice while waiting for its proper disconnect?

Yes, so that we do not prune the same peer twice, and also to remember the overall number of peers we are trying to prune, so that if the routine is run again, it knows that it doesn't have to prune yet another 10 random nodes.
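The bookkeeping described in this reply can be sketched like this. The names are hypothetical and peers are reduced to string ids for brevity; the point is only that peers already being pruned count against the prune-incoming-peers budget.

```scala
object PruningBudget {

  /** How many additional peers to select, given how many are already being pruned. */
  def remainingToPrune(pruneIncomingPeers: Int, alreadyPruning: Int): Int =
    math.max(0, pruneIncomingPeers - alreadyPruning)

  /** Select new peers to prune, skipping those whose disconnect is still pending. */
  def select(candidates: Seq[String], pruning: Set[String], pruneIncomingPeers: Int): Seq[String] =
    candidates
      .filterNot(pruning.contains)
      .take(remainingToPrune(pruneIncomingPeers, pruning.size))
}
```

If the routine runs again before the earlier disconnects complete, `select` returns fewer peers, or none, instead of starting a fresh batch.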

aakoshh force-pushed the ETCM-446-connection-limit-ranges branch 5 times, most recently from cf34f4a to 5ed1481, December 4, 2020 15:31

aakoshh added 4 commits December 4, 2020 17:08

ETCM-446: Add min-outgoing-peers setting. Only seek more connections …

0014c54

…until min handshaked is reached.

ETCM-466: Prune incoming peers beyond a certain age if hit the maximu…

42d0392

…m limit.

ETCM-446: Only update last prune timestamp if we pruned somebody.

12f1fe1

ETCM-446: Skip shuffling if prune count is 0.

e8e0ee6

aakoshh force-pushed the ETCM-446-connection-limit-ranges branch from 989f989 to e8e0ee6, December 4, 2020 17:13

aakoshh added 2 commits December 5, 2020 15:06

ETCM-446: Tweak connection numbers around the 40/60 ratio.

8e23e66

ETCM-446: Test pruning functions

bb9d6e0

aakoshh marked this pull request as ready for review December 7, 2020 15:40

aakoshh requested review from ntallar and KonradStaniec December 7, 2020 15:40

Merge remote-tracking branch 'origin/develop' into ETCM-446-connectio…

1ec3691

…n-limit-ranges

aakoshh requested review from mmrozek and kapke December 7, 2020 16:35

aakoshh assigned ntallar Dec 9, 2020

Merge remote-tracking branch 'origin/develop' into ETCM-446-connectio…

e6a53e2

…n-limit-ranges

KonradStaniec reviewed Dec 9, 2020

aakoshh added 3 commits December 9, 2020 12:07

ETCM-446: Raise max-outgoing-peers to 50.

35a8a73

ETCM-446: Move pruneability check to separate function.

0810e4c

Merge remote-tracking branch 'origin/develop' into ETCM-446-connectio…

8cdcb25

…n-limit-ranges

aakoshh mentioned this pull request Dec 9, 2020

ETCM-463: Add PeerStatisticsActor to track message counts #849

Merged

aakoshh assigned aakoshh and unassigned ntallar Dec 10, 2020

aakoshh added 3 commits December 14, 2020 12:17

Merge branch 'ETCM-463-track-peer-stats' into ETCM-446-connection-lim…

d162592

…it-ranges

ETCM-446: Use the average received response per second as priority fo…

c662b6b

…r pruning.

ETCM-446: Test the priority function.

f0f09dc

ETCM-446: Configure 12 hour stats duration.

fec5168

ntallar removed their request for review December 14, 2020 18:25

ETCM-446: Add pruning to the actor tests.

d6a0d76

aakoshh requested review from KonradStaniec, enriquerodbe and biandratti December 14, 2020 18:50

enriquerodbe reviewed Dec 14, 2020

src/main/scala/io/iohk/ethereum/network/PeerManagerActor.scala (outdated)

src/main/scala/io/iohk/ethereum/network/TimeSlotStats.scala (outdated)

aakoshh and others added 3 commits December 14, 2020 20:50

Update src/main/scala/io/iohk/ethereum/network/PeerManagerActor.scala

756d31a

Co-authored-by: Enrique Rodríguez <enrique.rodriguez@iohk.io>

Merge remote-tracking branch 'origin/develop' into ETCM-446-connectio…

afa0c46

…n-limit-ranges

Merge branch 'ETCM-446-connection-limit-ranges' of github.com:input-o…

b316476

…utput-hk/mantis into ETCM-446-connection-limit-ranges

aakoshh requested a review from enriquerodbe December 15, 2020 12:24

aakoshh assigned KonradStaniec and unassigned aakoshh Dec 15, 2020

ETCM-446: Fix test comments.

69efa13

enriquerodbe approved these changes Dec 15, 2020

Merge remote-tracking branch 'origin/develop' into ETCM-446-connectio…

87aa286

…n-limit-ranges

aakoshh merged commit b368808 into develop Dec 15, 2020

aakoshh deleted the ETCM-446-connection-limit-ranges branch December 15, 2020 17:19
