Implement dormant connections/low overhead connections #7797
Comments
What changes in particular do you anticipate? I thought that at the Polkadot level we could simply avoid sending "dormant" peers any data and that would reduce overhead enough.
Ain't clear if dormant is the right word, but low overhead sounds correct. We'll have a lot of these open. Linux can handle enough TCP connections provided ulimit is set correctly, but each one still takes the same TCP connection plus Noise/TLS state plus libp2p state. In fact, I'd presume this amounts to reconsidering the overhead that we impose upon every connection. As an example, are we trying to gossip everything from all validators to all validators? That'd create problems. We also cannot remove gossip and ask everyone to send their own shit to everyone, because then many messages never arrive. In the short term, we could simply take some peers out of the gossip pool, although maybe we're better off with a purely randomized gossip. In any case, I think this is about the overhead that libp2p and Substrate impose upon connections because they expect them to be used in a specific way.
Interesting. That would be easier, for sure. I was under the impression that a lot of stuff is already happening at the Substrate level. My goal would have been to prevent the opening of substreams altogether for low overhead/dormant connections, so we more or less end up with the overhead TCP imposes, but nothing more.
This is what we're trying not to do! But at the moment we can't do it and also be connected to all validators. paritytech/polkadot#2177 has more details.
In terms of prioritization, I think that unless substreams require a lot of overhead to maintain, the first and easiest thing to build will be to just avoid sending gossip messages to most peers. All of that can be done at a higher level. How much does it take to maintain a substream?
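A minimal sketch of the "just avoid sending gossip messages to most peers" idea, assuming a purely higher-level filter. Everything here is hypothetical and for illustration only: `PeerId` is a plain integer rather than libp2p's type, and `PeerMode`, `PeerTable`, `set_mode` and `gossip_targets` are made-up names, not Substrate or Polkadot APIs.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a real peer identifier (assumption).
type PeerId = u32;

/// Whether a connected peer takes part in gossip.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PeerMode {
    /// Receives gossip messages as usual.
    Active,
    /// Connection stays open, but no gossip is sent to it.
    Dormant,
}

/// Hypothetical table of connected peers and their modes.
#[derive(Default)]
struct PeerTable {
    peers: BTreeMap<PeerId, PeerMode>,
}

impl PeerTable {
    /// Insert a peer or change the mode of an existing one.
    fn set_mode(&mut self, peer: PeerId, mode: PeerMode) {
        self.peers.insert(peer, mode);
    }

    /// Peers that gossip should be sent to, in ascending id order.
    /// Dormant peers are skipped but remain in the table.
    fn gossip_targets(&self) -> Vec<PeerId> {
        self.peers
            .iter()
            .filter(|(_, mode)| **mode == PeerMode::Active)
            .map(|(peer, _)| *peer)
            .collect()
    }
}
```

Note that this only filters what gets sent; it does nothing about the per-connection substream and protocol overhead discussed in the other comments.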
This has to be implemented in Substrate. Otherwise stuff like grandpa or sync would still run and this is clearly too much for a "dormant" peer.
See this comment. I think it would make more sense to change Substrate to make the substream handshake customizable, and to be able to pass a list of peers to keep a connection open with even when they don't have any notifications substream (i.e. dormant).
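The "list of peers to keep a connection open with" part could look roughly like the sketch below. This is only an illustration under assumptions: `ConnectionPolicy` and `should_keep_alive` are hypothetical names, `PeerId` is a plain integer, and the real keep-alive logic in Substrate/libp2p is not modeled here.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a real peer identifier (assumption).
type PeerId = u32;

/// Hypothetical policy holding the peers whose connections should stay
/// open even when they have no notifications substream (i.e. dormant).
struct ConnectionPolicy {
    keep_open: HashSet<PeerId>,
}

impl ConnectionPolicy {
    /// Build the policy from the list of peers to keep connected.
    fn new(keep_open: impl IntoIterator<Item = PeerId>) -> Self {
        Self {
            keep_open: keep_open.into_iter().collect(),
        }
    }

    /// A connection is kept if it has at least one open substream,
    /// or if the peer is on the keep-open list.
    fn should_keep_alive(&self, peer: PeerId, open_substreams: usize) -> bool {
        open_substreams > 0 || self.keep_open.contains(&peer)
    }
}
```

With a list of `[7, 9]`, peer 7 would stay connected with zero substreams, while an unlisted peer would only stay connected as long as it has at least one substream open.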
Hmm, so far I have come to the conclusion that with your latest PR it should be enough to just create a new peerset just for availability distribution, which happens to have the size of the validator set (or two validator sets). So yeah, I also don't think we need another layer. With regard to keeping it open, I don't know the timeouts in place, but if we are regularly sending out availability chunks, the connections probably are not closed anyway, and if we don't send them out regularly, it should not matter much if connections need to be re-opened. This could in effect provide the right means to support collators with the same code path as validators (assuming we are doing something similar with availability recovery).
I don't believe it should be just for availability distribution. Availability Recovery and PoV Distribution also need direct validator<>validator connections.
Closing as this should all be Polkadot based on #7700.
Much of this will need to be implemented in Substrate.