Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

{core/,swarm/}: Dial with handler and return handler on error and closed #2191

Merged
merged 43 commits into from
Aug 31, 2021

Conversation

mxinden
Copy link
Member

@mxinden mxinden commented Aug 12, 2021

Background

As promised in #2118 (reply in thread) I wanted to trial adding something similar to ProtocolsHandler::OutboundOpenInfo for connections instead of substreams to NetworkBehaviourAction::{DialPeer,DialAddress}.

#2118 (comment) details the benefits. Long story short, it allows one to attach state to a dial request, thus not having to keep track of it within a NetworkBehaviour implementation.

While trialing the above, I had the ideas for #2182 and #2183, both simplifying trait parameters, one in libp2p-swarm and one in libp2p-core. Passing an opaque OpenInfo type along with NetworkBehaviourAction::{DialPeer,DialAddress} would require yet another trait parameter. To circumvent this additional complexity, I came up with the below, in other words with the change proposed in this pull request.

Proposal

  • Require NetworkBehaviourAction::{DialPeer,DialAddress} to contain a ProtocolsHandler (see diff below). This allows a behaviour to attach custom state to its handler. The behaviour would no longer need to track this state separately during connection establishment, thus reducing state required in a behaviour. E.g. in the case of libp2p-kad the behaviour can include a GetRecord request in its handler, or e.g. in the case of libp2p-request-response the behaviour can include the first request in the handler.

    Diff
    +pub enum NetworkBehaviourAction<TInEvent, TOutEvent, THandler> {
         /// Instructs the `Swarm` to return an event when it is being polled.
         GenerateEvent(TOutEvent),
     
    @@ -237,6 +238,7 @@ pub enum NetworkBehaviourAction<TInEvent, TOutEvent> {
         DialAddress {
             /// The address to dial.
             address: Multiaddr,
    +        handler: THandler,
         },
     
         /// Instructs the swarm to dial a known `PeerId`.
    @@ -254,6 +256,7 @@ pub enum NetworkBehaviourAction<TInEvent, TOutEvent> {
             peer_id: PeerId,
             /// The condition for initiating a new dialing attempt.
             condition: DialPeerCondition,
    +        handler: THandler,
         },
  • Return ProtocolsHandler on connection error and close (see diff below). This allows a behaviour to extract its custom state previously included in the handler on connection failure and connection closing. E.g. in the case of libp2p-kad the behaviour could extract the attached GetRecord from the handler of the failed connection and then start another connection attempt with a new handler with the same GetRecord or bubble up an error to the user.

    Diff
    diff --git a/core/src/network/event.rs b/core/src/network/event.rs
    index 7b415826..82f12b13 100644
    --- a/core/src/network/event.rs
    +++ b/core/src/network/event.rs
    @@ -92,6 +92,7 @@ where
             send_back_addr: Multiaddr,
             /// The error that happened.
             error: PendingConnectionError<TTrans::Error>,
    +        handler: THandler,
         },
     
         /// A new connection to a peer has been established.
    @@ -124,12 +125,13 @@ where
             error: Option<ConnectionError<<THandler::Handler as ConnectionHandler>::Error>>,
             /// The remaining number of established connections to the same peer.
             num_established: u32,
    +        handler: THandler::Handler,
         },
     
         /// A dialing attempt to an address of a peer failed.
         DialError {
             /// The number of remaining dialing attempts.
    -        attempts_remaining: u32,
    +        attempts_remaining: DialAttemptsRemaining<THandler>,
     
             /// Id of the peer we were trying to dial.
             peer_id: PeerId,
    @@ -148,6 +150,8 @@ where
     
             /// The error that happened.
             error: PendingConnectionError<TTrans::Error>,
    +
    +        handler: THandler,
         },
     
         /// An established connection produced an event.
    @@ -169,6 +173,21 @@ where
         },
     }
     
    +pub enum DialAttemptsRemaining<THandler> {
    +    // TODO: Make this a NonZeroU32.
    +    Some(u32),
    +    None(THandler),
    +}

What do people think? Do you see additional use-cases this would support or do you see use-cases that would not be but should be supported? Especially interested n @thomaseizinger opinion given your initial proposal on #2118.

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice idea. I think it is a good direction.

swarm/src/lib.rs Outdated Show resolved Hide resolved
@@ -65,6 +65,11 @@ where
self.substream_upgrade_protocol_override = version;
self
}

// TODO: Rethink this method.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've always found the indirection between ConnectionHandler and ProtocolsHandler a bit annoying. I understand that it is necessary because libp2p-core can't know about the swarm abstraction. I do wonder though on whether or not libp2p-core couldn't provide the ProtocolsHandler abstraction right away instead of us having to effectively re-define almost the same interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is an option worth exploring. Though I don't think we should do so in this pull request to separate concerns. Note that moving ProtocolsHandler into libp2p-core would require moving concepts like KeepAlive into libp2p-core as well.

Comment on lines +250 to +254
pub enum NetworkBehaviourAction<
TOutEvent,
THandler: IntoProtocolsHandler,
TInEvent = THandlerInEvent<THandler>,
> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to THandlerInEvent, I think we could define NetworkBehaviourAction in terms of the NetworkBehaviour trait. That would reduce the amount of type parameters down to one.

It would also make it easier to refer to because you can just say NetworkBehaviourAction<Self> within the NetworkBehaviour.

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah we would have to give up our map_XYZ functions then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea @thomaseizinger which I started to explore. See mxinden#5

Ah we would have to give up our map_XYZ functions then.

Not as far as I can tell with additional trait parameters (each defaulting to a derivative of TBehaviour). See PR above.

Unfortunately I don't see a way to use TBehaviour in NetworkBehaviourAction. Unless I do so, compilation fails.

Let's move the discussion to mxinden#5.

@mxinden
Copy link
Member Author

mxinden commented Aug 25, 2021

Reviews of all kinds are very much appreciated! (//CC @thomaseizinger in case you have time and interest.)

/// (ie. the objects returned by `new_handler`) can communicate by passing messages. Messages
/// sent from the handler to the behaviour are injected with [`NetworkBehaviour::inject_event`],
/// and the behaviour can send a message to the handler by making [`NetworkBehaviour::poll`]
/// return [`NetworkBehaviourAction::SendEvent`].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you rename it? Otherwise I think this is actually called NotifyHandler :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 It was previously SendEvent, I changed it to ['NetworkBehaviour::SendEvent']. 4d0faf9 changes it to ['NetworkBehaviour::NotifyHandler'].

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, thanks for taking this on!

Left a few nits but from what I can see, this should work :)

Maybe a test or an example within libp2p-swarm that demonstrates this usage could be useful? I am talking about the actual usecase of carrying a message within the handler and reporting back to the user in case it failed or sending it directly once the connection succeeds.

core/src/connection/manager.rs Outdated Show resolved Hide resolved
core/src/connection/pool.rs Outdated Show resolved Hide resolved
core/src/network.rs Show resolved Hide resolved
handler: THandler,
},
/// The dialing attempt is rejected because the peer being dialed is the local peer.
LocalPeerId { handler: THandler },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe SelfDial is a more descriptive name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to stress the fact that the peer ids match. This would e.g. not work by matching multiaddresses. All that said, I don't think LocalPeerId is the perfect name either.

Comment on lines 183 to 186
match attempts_remaining {
DialAttemptsRemaining::Some(attempts) => (*attempts).into(),
DialAttemptsRemaining::None(_) => 0,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be written like this at which point it is questionable that we need the From implementation?

Generally, wouldn't this be cleaner to be a regular function within an impl DialAttemptsRemaining?

Suggested change
match attempts_remaining {
DialAttemptsRemaining::Some(attempts) => (*attempts).into(),
DialAttemptsRemaining::None(_) => 0,
}
attempts_remaining.map(|attempts| attempts.get()).unwrap_or(0)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a way to do the above, given that the close of map_err would take ownership of handler, which is later on used. I don't think rustc is able to detect the early return.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this comment should go somewhere else? Or is this a GitHub glitch?

@@ -747,7 +747,7 @@ mod tests {

// check that our subscriptions are sent to each of the peers
// collect all the SendEvents
let send_events: Vec<&NetworkBehaviourAction<Arc<GossipsubHandlerIn>, GossipsubEvent>> = gs
let send_events: Vec<_> = gs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@mxinden
Copy link
Member Author

mxinden commented Aug 26, 2021

Maybe a test or an example within libp2p-swarm that demonstrates this usage could be useful? I am talking about the actual usecase of carrying a message within the handler and reporting back to the user in case it failed or sending it directly once the connection succeeds.

Yep. Good idea. I added a doc example, which thus both showcases the use-case and is continuously tested. (814ff4b)

Also note that the DCUtR implementation builds on top of this patch already: #2076

swarm/src/behaviour.rs Outdated Show resolved Hide resolved
swarm/src/lib.rs Outdated
Comment on lines 1707 to 1711

/// [`NetworkBehaviourAction::DialAddress`] and [`NetworkBehaviourAction::DialPeer`] require a
/// handler. This handler can be used to carry state. See corresponding doc comments.
#[test]
fn use_handler_to_carry_state() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a left-over from adding a dedicated test but then resorting to a doc-test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups. Thanks for the pointer. Removed in 60c7261.

@mxinden
Copy link
Member Author

mxinden commented Aug 30, 2021

Unless there are any objections or further suggestions, I will merge here sometime this week.

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

protocols/gossipsub/src/behaviour.rs Outdated Show resolved Hide resolved
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
@mxinden mxinden merged commit c161acf into libp2p:master Aug 31, 2021
dvc94ch added a commit to dvc94ch/rust-libp2p that referenced this pull request Sep 14, 2021
* protocols/gossipsub: Fix inconsistency in mesh peer tracking (libp2p#2189)

Co-authored-by: Age Manning <Age@AgeManning.com>

* misc/metrics: Add auxiliary crate to record events as OpenMetrics (libp2p#2063)

This commit adds an auxiliary crate recording protocol and Swarm events
and exposing them as metrics in the OpenMetrics format.

* README: Mention security@ipfs.io

* examples/: Add file sharing example (libp2p#2186)

Basic file sharing application with peers either providing or locating
and getting files by name.

While obviously showcasing how to build a basic file sharing
application, the actual goal of this example is **to show how to
integrate rust-libp2p into a larger application**.

Architectural properties

- Clean clonable async/await interface ([`Client`]) to interact with the
network layer.

- Single task driving the network layer, no locks required.

* examples/README: Give an overview over the many examples (libp2p#2194)

* protocols/kad: Enable filtering of (provider) records (libp2p#2163)

Introduce `KademliaStoreInserts` option, which allows to filter records.

Co-authored-by: Max Inden <mail@max-inden.de>

* swarm/src/protocols_handler: Impl ProtocolsHandler on either::Either (libp2p#2192)

Implement ProtocolsHandler on either::Either representing either of two
ProtocolsHandler implementations.

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>

* *: Make libp2p-core default features optional (libp2p#2181)

Co-authored-by: Max Inden <mail@max-inden.de>

* core/: Remove DisconnectedPeer::set_connected and Pool::add (libp2p#2195)

This logic seems to be a leftover of
libp2p#889 and unused today.

* protocols/gossipsub: Use ed25519 in tests (libp2p#2197)

With f2905c0 the secp256k1 feature is
disabled by default. Instead of enabling it in the dev-dependency,
simply use ed25519.

* build(deps): Update minicbor requirement from 0.10 to 0.11 (libp2p#2200)

Updates the requirements on [minicbor](https://gitlab.com/twittner/minicbor) to permit the latest version.
- [Release notes](https://gitlab.com/twittner/minicbor/tags)
- [Changelog](https://gitlab.com/twittner/minicbor/blob/master/CHANGELOG.md)
- [Commits](https://gitlab.com/twittner/minicbor/compare/minicbor-v0.10.0...minicbor-v0.11.0)

---
updated-dependencies:
- dependency-name: minicbor
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): Update salsa20 requirement from 0.8 to 0.9 (libp2p#2206)

* build(deps): Update salsa20 requirement from 0.8 to 0.9

Updates the requirements on [salsa20](https://github.com/RustCrypto/stream-ciphers) to permit the latest version.
- [Release notes](https://github.com/RustCrypto/stream-ciphers/releases)
- [Commits](RustCrypto/stream-ciphers@ctr-v0.8.0...salsa20-v0.9.0)

---
updated-dependencies:
- dependency-name: salsa20
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* *: Bump pnet to v0.22

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>

* *: Dial with handler and return handler on error and closed (libp2p#2191)

Require `NetworkBehaviourAction::{DialPeer,DialAddress}` to contain a
`ProtocolsHandler`. This allows a behaviour to attach custom state to its
handler. The behaviour would no longer need to track this state separately
during connection establishment, thus reducing state required in a behaviour.
E.g. in the case of `libp2p-kad` the behaviour can include a `GetRecord` request
in its handler, or e.g. in the case of `libp2p-request-response` the behaviour
can include the first request in the handler.

Return `ProtocolsHandler` on connection error and close. This allows a behaviour
to extract its custom state previously included in the handler on connection
failure and connection closing. E.g. in the case of `libp2p-kad` the behaviour
could extract the attached `GetRecord` from the handler of the failed connection
and then start another connection attempt with a new handler with the same
`GetRecord` or bubble up an error to the user.

Co-authored-by: Thomas Eizinger <thomas@eizinger.io>

* core/: Remove deprecated read/write functions (libp2p#2213)

Co-authored-by: Max Inden <mail@max-inden.de>

* protocols/ping: Revise naming of symbols (libp2p#2215)

Co-authored-by: Max Inden <mail@max-inden.de>

* protocols/rendezvous: Implement protocol (libp2p#2107)

Implement the libp2p rendezvous protocol.

> A lightweight mechanism for generalized peer discovery. It can be used for
bootstrap purposes, real time peer discovery, application specific routing, and
so on.

Co-authored-by: rishflab <rishflab@hotmail.com>
Co-authored-by: Daniel Karzel <daniel@comit.network>

* core/src/network/event.rs: Fix typo (libp2p#2218)

* protocols/mdns: Do not fire all timers at the same time. (libp2p#2212)

Co-authored-by: Max Inden <mail@max-inden.de>

* misc/metrics/src/kad: Set query_duration lowest bucket to 0.1 sec (libp2p#2219)

Probability for a Kademlia query to return in less than 100 milliseconds
is low, thus increasing the lower bucket to improve accuracy within the
higher ranges.

* misc/metrics/src/swarm: Expose role on connections_closed (libp2p#2220)

Expose whether closed connection was a Dialer or Listener.

* .github/workflows/ci.yml: Use clang 11 (libp2p#2233)

* protocols/rendezvous: Update prost (libp2p#2226)

Co-authored-by: Max Inden <mail@max-inden.de>

* *: Fix clippy warnings (libp2p#2227)

* swarm-derive/: Make event_process = false the default (libp2p#2214)

Co-authored-by: Max Inden <mail@max-inden.de>

Co-authored-by: Max Inden <mail@max-inden.de>
Co-authored-by: Age Manning <Age@AgeManning.com>
Co-authored-by: Ruben De Smet <ruben.de.smet@rubdos.be>
Co-authored-by: Thomas Eizinger <thomas@eizinger.io>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: rishflab <rishflab@hotmail.com>
Co-authored-by: Daniel Karzel <daniel@comit.network>
Co-authored-by: David Craven <david@craven.ch>
jsdanielh added a commit to nimiq/core-rs-albatross that referenced this pull request Nov 21, 2021
Fix incoming dial failure handling for connection pool behaviour.
Since the recent update of libp2p, [this
change](libp2p/rust-libp2p#2191) introduces
state to dial request that can be transmitted in
`inject_dial_failure`. So there are a couple of cases where the
function is called but we can ignore the error since the connection
is still to be completed or is already connected.
jsdanielh added a commit to nimiq/core-rs-albatross that referenced this pull request Nov 22, 2021
Fix incoming dial failure handling for connection pool behaviour.
Since the recent update of libp2p, [this
change](libp2p/rust-libp2p#2191) introduces
state to dial request that can be transmitted in
`inject_dial_failure`. So there are a couple of cases where the
function is called but we can ignore the error since the connection
is still to be completed or is already connected.
jsdanielh added a commit to nimiq/core-rs-albatross that referenced this pull request Nov 22, 2021
Fix incoming dial failure handling for connection pool behaviour.
Since the recent update of libp2p, [this
change](libp2p/rust-libp2p#2191) introduces
state to dial request that can be transmitted in
`inject_dial_failure`. So there are a couple of cases where the
function is called but we can ignore the error since the connection
is still to be completed or is already connected.

This fixes #447.
jsdanielh added a commit to nimiq/core-rs-albatross that referenced this pull request Nov 22, 2021
Fix incoming dial failure handling for connection pool behavior.
Since the recent update of libp2p, [this
change](libp2p/rust-libp2p#2191) introduces
state to dial request that can be transmitted in
`inject_dial_failure`. So there are a couple of cases where the
function is called but we can ignore the error since the connection
is still to be completed or is already connected.

This fixes #447.
jsdanielh added a commit to nimiq/core-rs-albatross that referenced this pull request Nov 22, 2021
Fix incoming dial failure handling for connection pool behavior.
Since the recent update of libp2p, [this
change](libp2p/rust-libp2p#2191) introduces
state to dial request that can be transmitted in
`inject_dial_failure`. So there are a couple of cases where the
function is called but we can ignore the error since the connection
is still to be completed or is already connected.

This fixes #447.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants