
refactor(swarm)!: don't share event buffer for established connections #3188

Merged · 11 commits into master · Jan 19, 2023

Conversation

@thomaseizinger (Contributor) commented Dec 2, 2022

Description

Currently, we only have a single channel for all established connections. This requires us to construct the channel ahead of time, before we even have a connection. As it turns out, sharing this buffer across all connections actually has downsides. In particular, this means a single, very busy connection can starve others by filling up this buffer, forcing other connections to wait until they can emit an event.
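The starvation problem can be sketched with a minimal std-only example (using std::sync::mpsc::sync_channel as a stand-in for the async channel the swarm actually uses; the capacity of 4 and the string events are made up for illustration):

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

fn main() {
    // One buffer shared by all connections: a busy connection can fill it
    // and starve the others.
    let (shared_tx, _shared_rx) = sync_channel::<&str>(4);
    let busy_conn = shared_tx.clone();
    for _ in 0..4 {
        busy_conn.try_send("busy event").unwrap();
    }
    let quiet_conn = shared_tx.clone();
    // The quiet connection cannot emit anything until the buffer drains.
    assert!(matches!(quiet_conn.try_send("quiet"), Err(TrySendError::Full(_))));

    // One buffer per connection: the busy connection only blocks itself.
    let (busy_tx, _busy_rx) = sync_channel::<&str>(4);
    let (quiet_tx, _quiet_rx) = sync_channel::<&str>(4);
    for _ in 0..4 {
        busy_tx.try_send("busy event").unwrap();
    }
    assert!(quiet_tx.try_send("quiet").is_ok());
    println!("per-connection buffers isolate the busy connection");
}
```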

Notes

Depends-On: #3187

Links to any relevant issues

Open Questions

  • Does this need a changelog entry?

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

The task for a pending connection only ever sends one event into this
channel: Either a success or a failure. Cloning a sender adds one
slot to the capacity of the channel. Hence, we can start this capacity
at 0 and have the `cloning` of the `Sender` take care of properly
increasing the capacity.
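The capacity behaviour the commit message relies on can be modelled with a small std-only toy (this is not the real futures::channel::mpsc API, just a sketch of its documented semantics: every live Sender owns one guaranteed slot, so a channel created with a buffer of 0 still lets each cloned Sender deliver one event):

```rust
use std::sync::{Arc, atomic::{AtomicUsize, Ordering}};

struct Sender {
    senders: Arc<AtomicUsize>,
    buffer: usize,
}

impl Clone for Sender {
    fn clone(&self) -> Self {
        // Mirrors the mpsc semantics: each live Sender owns one guaranteed
        // slot, so cloning grows the effective capacity by one.
        // (Dropping a Sender would shrink it again; omitted for brevity.)
        self.senders.fetch_add(1, Ordering::SeqCst);
        Sender { senders: Arc::clone(&self.senders), buffer: self.buffer }
    }
}

impl Sender {
    fn new(buffer: usize) -> Self {
        Sender { senders: Arc::new(AtomicUsize::new(1)), buffer }
    }

    fn effective_capacity(&self) -> usize {
        self.buffer + self.senders.load(Ordering::SeqCst)
    }
}

fn main() {
    // Start the buffer at 0, as the commit message suggests.
    let tx = Sender::new(0);
    assert_eq!(tx.effective_capacity(), 1);

    // One clone per pending connection adds exactly one slot.
    let tx2 = tx.clone();
    assert_eq!(tx2.effective_capacity(), 2);
    println!("capacity grows with each cloned Sender");
}
```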
@thomaseizinger thomaseizinger changed the base branch from master to 3186-no-buffer-size-pending-connections December 2, 2022 02:20
@mxinden (Member) left a comment

In general, having a channel per connection sounds reasonable to me.

I am not yet sure I agree with the underlying motivation, namely whether we should get rid of SwarmBuilder. Happy to proceed here though.

Review thread on swarm/src/connection/pool.rs (outdated, resolved)
@thomaseizinger (Contributor, Author)

> I am not yet sure I agree with the underlying motivation, namely whether we should get rid of SwarmBuilder.

Do you mind posting your concerns/arguments to #3186?

@elenaf9 (Contributor) left a comment

I am not sure whether we should simply release this as a patch.
The default for this value was (and remains) 7.
Users that are dealing with a large number of connections per node probably changed this value in their implementation to a much higher number (e.g. iroh to 256).
Buffering 256 events in total vs 256 events per connection is imo a significant difference.

I would actually prefer to make an explicit breaking change here by removing/renaming connection_event_buffer_size and adding per_connection_event_buffer_size, to ensure that users take note of this change.

Review thread on swarm/src/connection/pool.rs (outdated, resolved)
@thomaseizinger (Contributor, Author)

Explicit breaking changes can be a useful tool but I think it might be a bit exaggerated here. What problems other than more memory usage do you see that might arise from this?

Also, breaking changes in libp2p-swarm and libp2p-core are such a pain to pull through the workspace that I am honestly not willing to do it for such a (IMO) minor thing. Happy to do it if we are already in the middle of a release cycle with breaking changes in it.

@dignifiedquire (Member)

From a user perspective I think, as long as this is clearly communicated in the release notes/changelog, it is fine to not make this a breaking change.

@dignifiedquire (Member)

Generally I am glad to see this change, as I suspect we actually ran into this issue, where a protocol emitting a lot of events per connection would overflow the rest of the system.

@mergify bot commented Dec 14, 2022

This pull request has merge conflicts. Could you please resolve them @thomaseizinger? 🙏

Base automatically changed from 3186-no-buffer-size-pending-connections to master December 19, 2022 05:25
@elenaf9 (Contributor) commented Dec 20, 2022

> Explicit breaking changes can be a useful tool but I think it might be a bit exaggerated here. What problems other than more memory usage do you see that might arise from this?
>
> Also, breaking changes in libp2p-swarm and libp2p-core are such a pain to pull through the workspace that I am honestly not willing to do it for such a (IMO) minor thing. Happy to do it if we are already in the middle of a release cycle with breaking changes in it.

> From a user perspective I think, as long as this is clearly communicated in the release notes/changelog, it is fine to not make this a breaking change.

I don't feel strongly about it; maybe I am being overly careful. I am currently on vacation and don't have time to re-review.
Feel free to move forward here without the explicit breaking change.

elenaf9 previously approved these changes Dec 20, 2022
@mergify bot commented Dec 23, 2022

This pull request has merge conflicts. Could you please resolve them @thomaseizinger? 🙏

@mxinden (Member) commented Jan 2, 2023

> Generally I am glad to see this change, as I suspect we actually ran into this issue, where a protocol emitting a lot of events per connection would overflow the rest of the system.

@dignifiedquire note that this does not introduce a limit per protocol, but per connection. Thus, in case your protocol was previously reaching the limit and thus harming other protocols, it might still be doing so with this change, just a little delayed and per connection.

@mxinden (Member) commented Jan 2, 2023

> I am not sure whether we should simply release this as a patch. The default for this value was (and remains) 7. Users that are dealing with a large number of connections per node probably changed this value in their implementation to a much higher number (e.g. iroh to 256). Buffering 256 events in total vs 256 events per connection is imo a significant difference.
>
> I would actually prefer to do an explicit breaking change here by removing/renaming connection_event_buffer_size and adding per_connection_event_buffer_size to ensure that users take note of this change.

I agree with @elenaf9. I do think this is a significant change for folks in low-resource or high-connection environments and thus I think this change in functionality warrants a breaking change.

> What problems other than more memory usage do you see that might arise from this?

Higher memory usage can be used as an attack, e.g. forcing the process to run out of memory. This is especially relevant as this is the size of the channel from the ConnectionHandler to the NetworkBehaviour and not the other way around. The former is potentially controlled by an attacker, in the worst case with large-sized messages.
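A rough back-of-the-envelope sketch of the worst-case difference (the per-event size, buffer size, and connection count are hypothetical, chosen only to make the orders of magnitude visible; 256 is the buffer value iroh reportedly used):

```rust
fn main() {
    // Hypothetical numbers: 1 KiB per buffered event, a buffer of 256
    // events, and 1000 open connections.
    let event_size: usize = 1024;
    let buffer: usize = 256;
    let connections: usize = 1000;

    // Before: one buffer shared by every connection.
    let shared = buffer * event_size; // 256 KiB total
    // After: one buffer per connection; worst case scales with connections.
    let per_connection = buffer * event_size * connections; // ~250 MiB total

    assert_eq!(per_connection, shared * connections);
    println!(
        "shared: {} KiB, per-connection worst case: {} MiB",
        shared / 1024,
        per_connection / (1024 * 1024)
    );
}
```

This worst case only materialises if every connection fills its buffer simultaneously, but it illustrates why an attacker-controlled ConnectionHandler makes the sizing relevant.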

@thomaseizinger (Contributor, Author)

> I am not sure whether we should simply release this as a patch. The default for this value was (and remains) 7. Users that are dealing with a large number of connections per node probably changed this value in their implementation to a much higher number (e.g. iroh to 256). Buffering 256 events in total vs 256 events per connection is imo a significant difference.
>
> I would actually prefer to do an explicit breaking change here by removing/renaming connection_event_buffer_size and adding per_connection_event_buffer_size to ensure that users take note of this change.

> I agree with @elenaf9. I do think this is a significant change for folks in low-resource or high-connection environments and thus I think this change in functionality warrants a breaking change.

Should it come with a deprecation warning first then? i.e. should we maintain the old functionality in parallel so users can gradually migrate?

@thomaseizinger (Contributor, Author)

Also, are we fine with users who are sticking to the default value potentially not noticing this?

@mxinden (Member) commented Jan 16, 2023

> Should it come with a deprecation warning first then? i.e. should we maintain the old functionality in parallel so users can gradually migrate?

I am assuming (intuition) that maintaining both introduces unreasonable complexity. Would you agree?

I am fine with a hard breaking change without deprecation. It does not require a large change on the user side. I consider this an expert-option, only set by people deeply familiar with libp2p.

> Also, are we fine with users who are sticking to the default value potentially not noticing this?

I consider both the former and the new default value sane. Thus I don't think further notifications beyond the changelog are needed.

@thomaseizinger thomaseizinger changed the title refactor(swarm): don't share event buffer between established connections refactor(swarm)!: don't share event buffer between established connections Jan 17, 2023
@thomaseizinger (Contributor, Author)

Ready for re-review.

@thomaseizinger thomaseizinger changed the title refactor(swarm)!: don't share event buffer between established connections refactor(swarm)!: don't share event buffer for established connections Jan 17, 2023
@mxinden (Member) left a comment

Thanks for the follow-ups!

@mergify mergify bot merged commit b5a3f81 into master Jan 19, 2023
@mergify mergify bot deleted the 3186-one-channel-per-connection branch January 19, 2023 22:49
4 participants