This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Implement dormant connections/low overhead connections #7797

Closed
eskimor opened this issue Dec 28, 2020 · 10 comments
Assignees
Labels
I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. J2-unconfirmed Issue might be valid, but it’s not yet known.

Comments

@eskimor
Member

eskimor commented Dec 28, 2020

Much of this will need to be implemented in substrate.

@eskimor eskimor added the I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. label Dec 28, 2020
@eskimor eskimor self-assigned this Dec 28, 2020
@github-actions github-actions bot added the J2-unconfirmed Issue might be valid, but it’s not yet known. label Dec 28, 2020
@rphmeier
Contributor

rphmeier commented Jan 2, 2021

What changes in particular do you anticipate? I thought that at the Polkadot level we could simply avoid sending "dormant" peers any data and that would reduce overhead enough.

@burdges

burdges commented Jan 2, 2021

Ain't clear if dormant is the right word, but low overhead sounds correct. We'll have a lot of these open; Linux can handle that many TCP connections provided ulimit is set correctly, but each one still carries TCP state plus Noise/TLS state plus libp2p state. In fact, I'd presume this amounts to reconsidering the overhead that we impose upon every connection.

As an example, are we trying to gossip everything from all validators to all validators? That would create problems. We also cannot remove gossip and ask everyone to send their own messages directly to everyone, because then many messages never arrive. In the short term, we could simply take some peers out of the gossip pool, although maybe we're better off with a purely randomized gossip.
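The randomized-gossip idea above could be sketched roughly as follows. This is a hypothetical illustration, not sc-network or libp2p code: `gossip_targets`, the `PeerId` alias, and the keyed selection scheme are all made up for the example.

```rust
use std::collections::HashSet;

/// Hypothetical peer identifier; real code would use libp2p's PeerId.
type PeerId = u64;

/// Pick a pseudo-random subset of `fanout` peers to forward a message to,
/// instead of gossiping to every connected peer. The selection is keyed on
/// the message hash so that different messages take different routes.
/// Assumes `peers` contains distinct entries.
fn gossip_targets(peers: &[PeerId], msg_hash: u64, fanout: usize) -> HashSet<PeerId> {
    let mut targets = HashSet::new();
    if peers.is_empty() {
        return targets;
    }
    // Simple deterministic LCG mixing; a real implementation would use a
    // proper RNG seeded per message.
    let mut state = msg_hash;
    while targets.len() < fanout.min(peers.len()) {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let idx = (state >> 33) as usize % peers.len();
        targets.insert(peers[idx]);
    }
    targets
}
```

The point of keying on the message hash is that no single subset of peers is responsible for relaying everything, which is what distinguishes randomized gossip from simply shrinking a fixed gossip pool.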

In any case, I think this is about the overhead that libp2p and substrate impose upon connections due to expecting them to be used in a specific way.

@eskimor
Member Author

eskimor commented Jan 2, 2021

> What changes in particular do you anticipate? I thought that at the Polkadot level we could simply avoid sending "dormant" peers any data and that would reduce overhead enough.

Interesting. That would certainly be easier. I was under the impression that a lot of this already happens at the Substrate level. My goal would have been to prevent the opening of substreams altogether for low-overhead/dormant connections, so we more or less end up with the overhead TCP imposes, but nothing more.
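The dormant-vs-active distinction being proposed here could be modeled as something like the following. This is a sketch with made-up names (`ConnectionState` is not an sc-network type); it only illustrates that a dormant connection keeps the transport alive while carrying no substreams.

```rust
/// Hypothetical model of the proposal: a dormant connection keeps only the
/// underlying transport (TCP + encryption session), while an active one
/// additionally carries open notification substreams.
enum ConnectionState {
    /// Transport only; no substreams are opened, so no per-protocol state.
    Dormant,
    /// Fully set up: one substream per listed protocol name.
    Active { substreams: Vec<String> },
}

impl ConnectionState {
    /// Number of open substreams this connection carries.
    fn substream_count(&self) -> usize {
        match self {
            ConnectionState::Dormant => 0,
            ConnectionState::Active { substreams } => substreams.len(),
        }
    }
}
```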

@rphmeier
Contributor

rphmeier commented Jan 2, 2021

> As an example, are we trying to gossip everything from all validators to all validators?

This is what we're trying not to do! But at the moment we can't avoid it while also being connected to all validators. paritytech/polkadot#2177 has more details.

> My goal would have been to prevent the opening of substreams altogether for low overhead/dormant connections, so we more or less end up with the overhead TCP imposes, but nothing more.

In terms of prioritization, I think that unless substreams require a lot of overhead to maintain, the first and easiest thing to build will be to just avoid sending gossip messages to most peers. That can all be done at a higher level. How much overhead does it take to maintain a substream?

@bkchr
Member

bkchr commented Jan 3, 2021

This has to be implemented in Substrate. Otherwise protocols like GRANDPA or sync would still run, and that is clearly too much for a "dormant" peer.

@rphmeier
Contributor

rphmeier commented Jan 3, 2021

@bkchr That's not true, at least after #7700. GRANDPA and sync can be limited to 25–50 peers even if parachain validation has 1000+.

@tomaka
Contributor

tomaka commented Jan 4, 2021

See this comment.
While it doesn't take much to maintain a substream, this new layer looks extremely redundant with what sc-network already does.

I think it would make more sense to change Substrate to make the substream handshake customizable, and to be able to pass a list of peers with which to keep a connection open even when no notifications substream is open (i.e. dormant).
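The keep-alive list suggested here could amount to a policy like the following. This is a hypothetical sketch, not the sc-network API; `should_keep_connection` and the bare `PeerId` alias are invented for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical peer identifier; real code would use libp2p's PeerId.
type PeerId = u64;

/// Keep the underlying connection to a peer open if it either has open
/// substreams or is on an explicit keep-alive list (a "dormant" peer in
/// the sense discussed above).
fn should_keep_connection(
    peer: PeerId,
    open_substreams: usize,
    keep_alive: &HashSet<PeerId>,
) -> bool {
    open_substreams > 0 || keep_alive.contains(&peer)
}
```

The design point is that the keep-alive decision lives below the notification-substream layer, so a peer can stay connected at the transport level with zero protocol state.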

@eskimor
Member Author

eskimor commented Jan 4, 2021

Hmm, so far I have come to the conclusion that, with your latest PR, it should be enough to create a new peerset just for availability distribution, sized to the validator set (or two validator sets). So yeah, I also don't think we need another layer.
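The sizing rule described here could be sketched as follows. All names are hypothetical (`PeerSetConfig` and `availability_peerset` are not Substrate types); the sketch only captures "one peerset, sized to one or two validator sets".

```rust
/// Hypothetical peerset configuration, sized to the validator set.
struct PeerSetConfig {
    name: &'static str,
    target_size: usize,
}

/// Build a dedicated peerset for availability distribution. When spanning a
/// validator-set rotation, we may need connections to both the outgoing and
/// the incoming set, hence the doubled target.
fn availability_peerset(validator_count: usize, spanning_rotation: bool) -> PeerSetConfig {
    PeerSetConfig {
        name: "availability-distribution",
        target_size: if spanning_rotation {
            2 * validator_count
        } else {
            validator_count
        },
    }
}
```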

With regard to keeping connections open: I don't know the timeouts in place, but if we are regularly sending out availability chunks, the connections are probably not closed anyway; and if we don't send them out regularly, it should not matter much if connections need to be re-opened.

This could in effect provide the right means to support collators with the same code path as validators (assuming we are doing something similar with availability recovery).

@rphmeier
Contributor

rphmeier commented Jan 5, 2021

I don't believe it should be just for availability distribution. Availability Recovery and PoV Distribution also need direct validator<>validator connections.

@eskimor
Member Author

eskimor commented Jan 15, 2021

Closing, as this should all be on the Polkadot side, based on #7700.

@eskimor eskimor closed this as completed Jan 15, 2021
5 participants