Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

{core,swarm}: Remove Network abstraction #2492

Merged
merged 10 commits into from
Feb 13, 2022
Merged

Conversation

mxinden
Copy link
Member

@mxinden mxinden commented Feb 8, 2022

This commit removes the Network abstraction, thus managing Listeners
and the connection Pool in Swarm directly. This is done under the
assumption that noone uses the Network abstraction directly, but
instead everyone always uses it through Swarm. Both Listeners and
Pool are moved from libp2p-core into libp2p-swarm. Given that they
are no longer exposed via Network, they can be treated as an
implementation detail of libp2p-swarm and Swarm.

This change does not include any behavioural changes.

This change has the followin benefits:

  • Removal of NetworkEvent, which was mostly an isomorphism of
    SwarmEvent.
  • Removal of the never-directly-used Network abstraction.
  • Removal of now obsolete verbose Peer (core/src/network/peer.rs)
    construct.
  • Removal of libp2p-core DialOpts, which is a direct mapping of
    libp2p-swarm DialOpts.
  • Allowing breaking changes to the connection handling and Swarm API
    interface without a breaking change in libp2p-core and thus a
    without a breaking change in /transport protocols.

This change enables the following potential future changes:

  • Removal of NodeHandler and ConnectionHandler. Thus allowing to
    rename ProtocolsHandler into ConnectionHandler.
  • Moving NetworkBehaviour and ProtocolsHandler into libp2p-core,
    having libp2p-xxx protocol crates only depend on libp2p-core and
    thus allowing general breaking changes to Swarm without breaking all
    libp2p-xxx crates.

This commit removes the `Network` abstraction, thus managing `Listeners`
and the connection `Pool` in `Swarm` directly. This is done under the
assumption that noone uses the `Network` abstraction directly, but
instead everyone always uses it through `Swarm`. Both `Listeners` and
`Pool` are moved from `libp2p-core` into `libp2p-swarm`. Given that they
are no longer exposed via `Network`, they can be treated as an
implementation detail of `libp2p-swarm` and `Swarm`.

This change does not include any behavioural changes.

This change has the followin benefits:

- Removal of `NetworkEvent`, which was mostly an isomorphism of
  `SwarmEvent`.
- Removal of the never-directly-used `Network` abstraction.
- Removal of now obsolete verbose `Peer` (`core/src/network/peer.rs`)
  construct.
- Removal of `libp2p-core` `DialOpts`, which is a direct mapping of
  `libp2p-swarm` `DialOpts`.
- Allowing breaking changes to the connection handling and `Swarm` API
  interface without a breaking change in `libp2p-core` and thus a
  without a breaking change in `/transport` protocols.

This change enables the following potential future changes:

- Removal of `NodeHandler` and `ConnectionHandler`. Thus allowing to
  rename `ProtocolsHandler` into `ConnectionHandler`.
- Moving `NetworkBehaviour` and `ProtocolsHandler` into `libp2p-core`,
  having `libp2p-xxx` protocol crates only depend on `libp2p-core` and
  thus allowing general breaking changes to `Swarm` without breaking all
  `libp2p-xxx` crates.
@mxinden
Copy link
Member Author

mxinden commented Feb 8, 2022

+1,315 −3,147

I would argue that this is a pretty good ratio as a refactoring 🙂

@thomaseizinger
Copy link
Contributor

I am excited to review this, hopefully will get to it today or tomorrow!

Copy link
Contributor

@rkuhn rkuhn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, nice simplification!

}
}
Poll::Ready(Ok(ConnectionHandlerEvent::OutboundSubstreamRequest(user_data))) => {
self.muxing.open_substream(user_data);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such an unbounded loop in poll always makes me uneasy — this is cooperative scheduling after all. How about using cx.waker().wake_by_ref()? That also makes me uneasy in the case that this poll is called deep inside some nested task, so the optimal choice isn’t crystal clear (at least not to me).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

 this is cooperative scheduling after all

Yes, good point.

poll is called deep inside some nested task

This is running on the connection specific task. In other words, there is only one level above it, namely the future polling the channel to the Swarm and Connection::poll.

(Note that this has not been changed in this pull request, merely copied from core/src/connection.rs. Not arguing to not change it, just suggesting to change it within another pull request.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so in this case wake_by_ref is desirable. This is one unfortunate consequence of Rust’s Future design: reusing this future somewhere else may break this assumption, which will silently introduce performance or fairness issues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order for wake_by_ref to work, the following guarantee needs to be made by an executor:

If the waker of a task is woken, where the task itself is currently running, the task is still rescheduled to be polled again.

In other words, the optimization of ignoring wake signals for running tasks needs to be forbidden.

If I am not mistaken this is an unwritten rule, or do you know of a place where this behavior is documented @rkuhn? Maybe @tomaka knows this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that wouldn’t be an optimisation, it would break a lot of code. Including Tokio.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat related and an interesting read: https://tokio.rs/blog/2020-04-preemption

Copy link
Contributor

@MarcoPolo MarcoPolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yessss. Let's code to understand! Very excited for the potential future changes, especially:

protocol crates only depend on libp2p-core and
thus allowing general breaking changes to Swarm without breaking all
libp2p-xxx crates.

That would be very cool!

@@ -0,0 +1,204 @@
// Copyright 2020 Parity Technologies (UK) Ltd.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useful for seeing that this is just core/src/connection.rs with less stuff:

git diff upstream/master:core/src/connection.rs head:swarm/src/connection.rs

@@ -397,8 +385,8 @@ where
/// All connections to the peer, whether pending or established are
/// closed asap and no more events from these connections are emitted
/// by the pool effective immediately.
pub fn disconnect(&mut self, peer: &PeerId) {
if let Some(conns) = self.established.get_mut(peer) {
pub fn disconnect(&mut self, peer: PeerId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PeerId is Copy, thus there is no need to pass a reference. Given that this is an internal API now, I thought breaking the signature is worth the simplifications on the caller side.

Happy to revert this change @MarcoPolo. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I didn’t realize it was copy.

/// longer be able to deliver events to the associated `ConnectionHandler`,
/// thus exerting back-pressure on the connection and peer API.
pub fn with_notify_handler_buffer_size(mut self, n: NonZeroUsize) -> Self {
self.task_command_buffer_size = n.get() - 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is pre-existing, but why -1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The -1 is to hide the implementation detail of futures::channel::mpsc::channel:

The channel’s capacity is equal to buffer + num-senders. In other words, each sender gets a guaranteed slot in the channel capacity, and on top of that there are buffer “first come, first serve” slots available to all senders.

https://docs.rs/futures/latest/futures/channel/mpsc/fn.channel.html

Given that each sender gets an additional slot, and given that we have one sender, we need to subtract 1 to have the right buffer size.

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work!

In my view, this is a really good change in the right direction. I am looking forward to the simplifications mentioned in the PR description! :)

@@ -153,12 +154,12 @@ fn p2p_addr(peer: Option<PeerId>, addr: Multiaddr) -> Result<Multiaddr, Multiadd
None => return Ok(addr),
};

if let Some(multiaddr::Protocol::P2p(hash)) = addr.iter().last() {
if let Some(Protocol::P2p(hash)) = addr.iter().last() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't comment on the file directly so will do it here:

Why can't ConcurrentDial not stay within libp2p-core? It only depends on libp2p-core defined interfaces from what I can see so could be a useful primitive for building on top of libp2p-core without libp2p-swarm.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would have to make it public for that which is different from what it was before!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be a useful primitive for building on top of libp2p-core without libp2p-swarm.

Agreed that in the long run we could expose it and move it to libp2p-core.

We would have to make it public for that which is different from what it was before!

There might be a couple of follow-up simplifications that might touch ConcurrentDial, thus I would like to wait will those are done, before exposing it to external users.

@dvc94ch
Copy link
Contributor

dvc94ch commented Feb 13, 2022

might also be nice to simplify the NetworkBehaviour trait by adding an inject_swarm_event, although I've tried it before and it never seemed to work. maybe after this it's possible to do.

@mxinden mxinden merged commit 7fc342e into libp2p:master Feb 13, 2022
@@ -1,5 +1,9 @@
# 0.32.0 [unreleased]

- Remove `Network`. `libp2p-core` is from now on an auxiliary crate only. Users
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting take (the "auxiliary" part).

My PoV is that libp2p-core is / should remain a standalone foundation that should make it possible to write wire-compatible libp2p applications. i.e. the entire connection upgrade process etc that is currently in there is pretty foundational. Perhaps we agree on this and "auxiliary" was just not the best word to use here but if we don't agree then I'd like to discuss it :D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think we are on the same page. "Foundational" would have been a better word here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants