add a multiselect 2.0 spec #227

marten-seemann · 2019-11-04T04:12:33Z

I tried to condense everything down to make the new protocol as simple as possible, while providing an extension point that will allow us to deploy all of the optimizations that people have already suggested in the future.

The document here describes only the stream-based use case, and doesn't cover the packet-based use case yet. I believe that we can use a very similar format for unreliable connections, but there seem to be some subtle differences that would prevent us from reusing the exact wire format of the Use and Offer messages defined in this document:

- Offer can offer multiple protocols. This makes sense when establishing communication on a reliable channel, but it doesn't really work out in an unreliable protocol. For packet-based protocols, it probably makes sense to restrict Offer such that it's only possible to offer a single protocol.
- Use uses a oneof of name and id. This works well when a stream correlates the Use to an Offer sent before, but isn't enough in the packet-based use case, where you might have multiple Offers in flight. Here, it would make sense to define a Use message such that it includes both a name and an id.
- A packet-based protocol needs a Reject message to communicate that a certain protocol is not supported.

connections/multiselect2.md

yusefnapora · 2019-11-05T16:39:16Z

connections/multiselect2.md

+
+Note that this negotiation scheme allows peers to negotiate a "monoplexed" connection, i.e. a connection that doesn't use any stream multiplexer. Endpoints can offer support for monoplexed connections by offering the `/monoplex` stream multiplexer.
+
+**TODO**: Do we need to define a way to send an error code / error string? Or do we have something like that in libp2p already?


It would be nice to be able to send an error message if, e.g., the responder doesn't support any of the initiator's multiplexers. Perhaps that would be a potential use-case for a Reject message, even in the streaming case?

I don't believe we have any libp2p-wide error codes. I think they should be protocol specific.

yusefnapora · 2019-11-05T16:42:25Z

connections/multiselect2.md

+
+#### 0-RTT
+
+When using 0-RTT session resumption as offered by TLS 1.3 and some variants of Noise (**TODO**: specify which), the endpoints MUST remember the negotiated stream multiplexer used on the original connection. This ensures that the client can send application data in the first flight when resuming a connection.


The Noise IK handshake, which is part of the "Noise Pipes" pattern described in the spec, supports 0-RTT encryption.

Take into account that multiplexers can change across restarts.

@raulk That's on us to define. We could say that if you resume a connection, you have to use the same multiplexer. This would save us a roundtrip for negotiating the stream multiplexer in the common case.
In the rare case when a peer dropped support for a muxer between two connections, you would just reject 0-RTT and fall back to the 1-RTT handshake.
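
To make the fallback concrete, here is a minimal Go sketch of what one endpoint might keep around between connections. The type `muxerCache`, the method `resume`, and the muxer strings used with it are hypothetical names for illustration and are not part of the spec text.

```go
package main

// muxerCache remembers, per peer, the stream multiplexer that was negotiated
// on the original connection. Hypothetical type, for illustration only.
type muxerCache map[string]string

// resume reports which muxer to assume for a 0-RTT attempt with peer, and
// whether 0-RTT is possible at all. If the remembered muxer is no longer in
// the locally supported list, the caller falls back to the 1-RTT handshake.
func (c muxerCache) resume(peer string, supported []string) (string, bool) {
	remembered, ok := c[peer]
	if !ok {
		return "", false // nothing remembered: do the full handshake
	}
	for _, m := range supported {
		if m == remembered {
			return remembered, true
		}
	}
	// The muxer was dropped between the two connections: forget the entry.
	delete(c, peer)
	return "", false
}
```

The check is purely local. How the rejection of 0-RTT is signalled to the peer is left to the handshake protocol.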

vyzo

We need to sanely account for simultaneous open, as it is a hard requirement for achieving TCP hole punching.
We can't possibly deploy a protocol that doesn't allow us to deal with this very important issue.

vyzo · 2019-11-05T16:52:57Z

Note that multiselect-1.0 can handle simultaneous open with a simple protocol extension; see #196.

marten-seemann · 2019-11-05T17:02:13Z

@vyzo If I understand #196 correctly, it works by adding (yet) one more round trip to multistream/1.0. I'd like to avoid adding round trips to multiselect 2.0 in the common case (where no simultaneous open is needed).

What do you think of this strawman proposal: If a handshake fails due to an unexpected message error (e.g. in the case of TLS, when the client receives a ClientHello in response to a ClientHello), both peers clear their handshake state and start over again. The roles are determined by the following rule: Both peers calculate the hash of the message(s) they sent and they received during the last handshake attempt. The peer that sent the message that hashes to the smaller value will then act as a client, the other one as the server in the subsequent handshake.

A protocol along those lines would make sure that peers that don't use simultaneous connect don't suffer any round-trip penalty, while peers that do simultaneous connect would pay the cost of a single additional round-trip.
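
A rough Go sketch of the role rule in this strawman, under a few assumptions: the hash is SHA-256 (the strawman does not name one), "smaller" means smaller when the digest is read as a big-endian number, and identical hashes are reported as a tie that the caller has to resolve some other way. The names `role` and `retryRole` are made up for this example.

```go
package main

import (
	"bytes"
	"crypto/sha256"
)

type role int

const (
	roleClient role = iota
	roleServer
	roleTie // both sides sent identical bytes; not covered by the strawman
)

// retryRole decides the local role for the second handshake attempt from the
// bytes this endpoint sent and received during the failed attempt.
func retryRole(sent, received []byte) role {
	hs := sha256.Sum256(sent)
	hr := sha256.Sum256(received)
	// bytes.Compare orders the digests lexicographically, which matches a
	// numerical comparison of equal-length big-endian values.
	switch bytes.Compare(hs[:], hr[:]) {
	case -1:
		return roleClient // our message hashes to the smaller value
	case 1:
		return roleServer
	default:
		return roleTie
	}
}
```

Because each side swaps the two arguments, the two endpoints always arrive at opposite roles unless the hashes are equal.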

vyzo · 2019-11-05T17:53:03Z

#196 does not add a round-trip to the protocol; it is pipelined together with the protocol selection.

marten-seemann · 2019-11-06T02:32:28Z

#196 does not add a round-trip to the protocol; it is pipelined together with the protocol selection.

@vyzo The crypto protocol selection, which we're getting rid of here. This negotiation made (makes) us vulnerable to a trivial MITM attack, which was one of the motivations to develop a new version of this protocol that fixes this vulnerability (not to mention that removing it also speeds up the handshake by one round-trip).
So we clearly need something different than #196 for this new protocol. I'd really like to hear what you think of #227 (comment). Is that something we can build on to make it a workable specification for simultaneous open?

vyzo · 2019-11-06T08:55:09Z

@marten-seemann the strawman proposal is a hack, the TLS handshake can fail for all sorts of reasons, and the "unexpected message error" is not necessarily indicative of an unexpected "ClientHello".

Why can't we add an iamclient bit to the opening handshake?
If both parties set it, then they enter an initiator selection round, similar to #196 and after deciding who is the initiator they can proceed with TLS.

marten-seemann · 2019-11-06T09:34:54Z

@vyzo The problem with an iamclient message is that it's sent in cleartext, before the handshake has even started, and it would make libp2p traffic as easily identifiable as it is today, which is what the design of multiselect 2 set out to avoid (see #205):

Current libp2p traffic can be blocked by using deep packet inspection. Handshakes should be indistinguishable from ordinary HTTP/2 (in the case of TCP) and HTTP/3 (in the case of QUIC) traffic, or other popular Internet applications.

Regarding the TLS handshake,

the strawman proposal is a hack, the TLS handshake can fail for all sorts of reasons, and the "unexpected message error" is not necessarily indicative of an unexpected "ClientHello".

I wouldn't use a loaded word like "hack" for this. It's a way to solve the problem that seems consistent with our design goals. You could easily wrap a TLS connection and parse the first byte of the first incoming message, if you're uncomfortable with relying on the error message returned by the TLS stack. That would tell you if the message you received was a ClientHello or a ServerHello.
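
For illustration, a small Go sketch of that check. It assumes the buffer holds the start of the first TLS record, so the handshake type sits right after the 5-byte record header. The constant values are the TLS record and handshake type codes; the function name `classifyFirstRecord` is hypothetical.

```go
package main

const (
	recordTypeHandshake  = 22 // TLS record content type "handshake"
	handshakeClientHello = 1  // handshake message type client_hello
	handshakeServerHello = 2  // handshake message type server_hello
)

// classifyFirstRecord inspects the beginning of the first TLS record received
// from the peer. Bytes 0-4 are the record header (type, version, length) and
// byte 5 is the first byte of the handshake message, i.e. its type.
func classifyFirstRecord(b []byte) string {
	if len(b) < 6 || b[0] != recordTypeHandshake {
		return "other"
	}
	switch b[5] {
	case handshakeClientHello:
		return "ClientHello"
	case handshakeServerHello:
		return "ServerHello"
	default:
		return "other"
	}
}
```

An endpoint that sent a ClientHello and sees "ClientHello" here has hit the simultaneous open case; anything else is an ordinary handshake or an ordinary failure.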

bigs

I think this is great for the streaming case. In my opinion, it accomplishes the most important goals of #95, the original spec. As you mentioned in our chat the other day, it doesn't really cover the packet-oriented case at all, and it feels as though it'd be worth establishing a separate or--as I previously referred to it--an extension protocol for the packet case.

I think it's worth having different message types, likely defined in a separate proto file within the same directory, for that case. The differences that come to mind include:

- Sender and receiver both need to establish dynamic IDs for protocols.
- The protocol needs explicit reject messages. In my opinion this includes a "reject protocol" message as well as a potential "reject transport" message, since "closing" isn't a concept that is guaranteed to exist.
- All protocol messages should support the optional inclusion of a payload. You made a great point that this could simply follow the var-int delimited protobuf message in the packet.

I'd like us to get another draft going that begins to tackle this. Perhaps we could have a secondary PR targeting this branch?

bigs · 2019-11-06T21:10:45Z

connections/multiselect2.md

+
+### Handshake Protocol Selection
+
+Handshake protocols are not being negotiated, but are announced in the peers' multiaddrs. 


I'd probably mention that they could be negotiated in some corner case, but that the general case will be to assume a secure transport by the time multiselect is initiated.

What is that corner case you're referring to? I'm not sure if we should even build that flexibility into our stack, even if we could come up with a situation where a negotiation wouldn't be susceptible to a MITM attack. If it's not there, it can't be exploited.

bigs · 2019-11-06T21:12:40Z

connections/multiselect2.md

+
+Handshake protocols are not being negotiated, but are announced in the peers' multiaddrs. 
+
+**TODO**: Do we need to describe the format here? I guess we don't, but we will probably need another document for that change, and we can link to it from here.


I think a link to some documentation on this would be nice! I don't believe it exists at the moment, so may be worth keeping this here.

We need an extension to the multiaddr spec + multicodec table to add secure channels as an atom. Also, @yusefnapora was writing this document to specify the semantics of multiaddrs: #191.

bigs · 2019-11-06T21:52:14Z

connections/multiselect2.md

+
+Note that this negotiation scheme allows peers to negotiate a "monoplexed" connection, i.e. a connection that doesn't use any stream multiplexer. Endpoints can offer support for monoplexed connections by offering the `/monoplex` stream multiplexer.
+
+**TODO**: Do we need to define a way to send an error code / error string? Or do we have something like that in libp2p already?


I don't believe we have any libp2p-wide error codes. I think they should be protocol specific.

raulk

Initial responses to the current conversation.

raulk · 2019-11-13T16:55:11Z

connections/multiselect2.md

+
+#### 0-RTT
+
+When using 0-RTT session resumption as offered by TLS 1.3 and some variants of Noise (**TODO**: specify which), the endpoints MUST remember the negotiated stream multiplexer used on the original connection. This ensures that the client can send application data in the first flight when resuming a connection.


Take into account that multiplexers can change across restarts.

…ges)

marten-seemann · 2019-11-15T06:39:06Z

I just rewrote the Simultaneous Open section and specified how the hashing of handshake messages works to determine the roles in a subsequent handshake attempt. PTAL.

vyzo · 2019-11-15T07:16:16Z

connections/multiselect2.md


-Since secio doesn't provide this property, secio cannot be used with Multiselect 2.0.
+To determine the roles in the second handshake attempt, endpoints calculate the SHA-256 hash of the handshake messages that were sent and received (including any error message(s) that the handshake protocol might have sent) during the failed handshake attempt.


Are these handshake messages available to the application? It seems to me that these will be internal to the TLS implementation and not actually available to hash.

You can just wrap the Read and Write of the net.Conn.

This is true in go, but what about other languages?
Also, does the "wrap the Read/Write" imply that we hash all the bytes sent/received to make the decision?
It's still not clear how an implementor would implement this part of the spec.

These are the bytes sent in the clear over the wire. I'm not familiar with the TLS / Noise implementations in other languages, but I'd be surprised if this posed a major challenge in any language.
@tomaka, @yusefnapora Can you shed some light on this?
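
As a sketch of what "wrap the Read and Write of the net.Conn" could look like in Go: the wrapper below feeds every byte that crosses the connection into two running SHA-256 hashes. `hashingConn`, `newHashingConn` and `pipeExample` are hypothetical names. Hashing everything until the handshake fails is an assumption about the answer to the question above, not something the spec text settles.

```go
package main

import (
	"crypto/sha256"
	"hash"
	"io"
	"net"
)

// hashingConn wraps a net.Conn and hashes all bytes sent and received.
type hashingConn struct {
	net.Conn
	sent, received hash.Hash
}

func newHashingConn(c net.Conn) *hashingConn {
	return &hashingConn{Conn: c, sent: sha256.New(), received: sha256.New()}
}

func (c *hashingConn) Read(p []byte) (int, error) {
	n, err := c.Conn.Read(p)
	c.received.Write(p[:n]) // hash.Hash.Write never returns an error
	return n, err
}

func (c *hashingConn) Write(p []byte) (int, error) {
	n, err := c.Conn.Write(p)
	c.sent.Write(p[:n])
	return n, err
}

// pipeExample sends msg (which must be non-empty) over an in-memory pipe and
// returns the digest the writer recorded as sent and the digest the reader
// recorded as received.
func pipeExample(msg []byte) ([]byte, []byte, error) {
	a, b := net.Pipe()
	defer a.Close()
	defer b.Close()
	ca, cb := newHashingConn(a), newHashingConn(b)
	done := make(chan error, 1)
	go func() {
		_, err := ca.Write(msg)
		done <- err
	}()
	if _, err := io.ReadFull(cb, make([]byte, len(msg))); err != nil {
		return nil, nil, err
	}
	if err := <-done; err != nil {
		return nil, nil, err
	}
	return ca.sent.Sum(nil), cb.received.Sum(nil), nil
}
```

The TLS stack would be handed the wrapper instead of the raw connection. This only works where the TLS library accepts a caller-supplied stream, which is exactly the portability concern raised in this thread.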

Any language that uses a TLS library that reads/writes to the file descriptor will have problems with this.

There's an additional detail we need to specify: When receiving the unexpected handshake message, TLS does two things:

1. It sends a TLS alert to the peer, and
2. it aborts the handshake and returns an error message.

If we just restart the handshake when the error message is returned, the peer's alert will still be in flight, and typically will be received after we initiated the second handshake, leading to a handshake failure. There are three ways we can solve this:

1. Modify the TLS stack to not send the alert,
2. Filter out the alert before it reaches the TLS stack,
3. Wait for the alert to arrive, and then start the second connection attempt afterwards.

I don't think 1. is very practical, since it requires us to modify the TLS stack. 2. would be doable, but requires wrapping of the connection. 3. costs us half a roundtrip, but could be implemented without modifications to the TLS stack or wrapping of the connection.

Update: I just implemented option 3 in https://github.com/libp2p/go-libp2p-tls/compare/tcp-sim-open-option3, and unfortunately, it's inherently racy: When (*tls.Conn).Handshake() returns, we have no idea if this is before or after the TLS alert from the peer was received. We therefore can't reliably catch the TLS alert afterwards, since it might never arrive.

I think there's an option 4:

Filter out the ClientHello before it arrives at the TLS stack. If we receive one, abort the current handshake (taking care that no abort alert is sent out), and start anew.

Unfortunately, this option requires wrapping of the TLS connection, but at least it should be race-free. Furthermore, it doesn't cost us any additional (half) round-trips.

Here's an implementation of option 4, which seems to work: libp2p/go-libp2p-tls#38.
The downside is that we're duplicating a bit of the record layer parsing code of TLS, since we need to make sure that we read the whole ClientHello from the connection before we can start the second connection attempt.
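
To give an idea of how much record layer parsing is involved, here is a minimal Go sketch that reads exactly one TLS record from a reader. It is not the code from libp2p/go-libp2p-tls#38. `readRecord` and `exampleRead` are hypothetical names, and the sketch assumes the whole ClientHello fits in a single record.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"io"
)

// maxPlaintextLen is the TLS limit for the payload of a plaintext record.
const maxPlaintextLen = 1 << 14

// readRecord reads one complete TLS record: a 5-byte header (content type,
// 2-byte version, 2-byte big-endian length) followed by length payload bytes.
// It returns the content type and the payload.
func readRecord(r io.Reader) (byte, []byte, error) {
	var hdr [5]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return 0, nil, err
	}
	n := int(binary.BigEndian.Uint16(hdr[3:5]))
	if n > maxPlaintextLen {
		return 0, nil, errors.New("tls record exceeds maximum length")
	}
	body := make([]byte, n)
	if _, err := io.ReadFull(r, body); err != nil {
		return 0, nil, err
	}
	return hdr[0], body, nil
}

// exampleRead parses a hand-built handshake record followed by one trailing
// byte and reports the content type, payload length and bytes left unread.
func exampleRead() (byte, int, int, error) {
	raw := []byte{22, 3, 1, 0, 4, 1, 0, 0, 0, 0xff}
	r := bytes.NewReader(raw)
	typ, body, err := readRecord(r)
	return typ, len(body), r.Len(), err
}
```

Reading the full record matters because the second handshake attempt must start at a record boundary; any leftover ClientHello bytes would otherwise be fed into the new handshake.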

The concern here is again other language implementations and whether they can do that in reasonable fashion.
Other than that, I am fine with it.

I haven't verified this at all, so we shouldn't assume this is correct... but the docs for node's tls.connect() show that you can pass in any duplex Stream as the underlying socket which will be used for the connection.

So it seems like we should be able to wrap a net.Socket with some code similar to @marten-seemann's go code which parses the first handshake message and aborts with a custom error if the server sends a ClientHello. Then we can do the peer id comparison and retry the connection when we get the special error type.

This won't work in browsers, but I don't think we have a clear path to supporting TLS 1.3 at all there, and we don't have access to raw TCP in browsers anyway, so it's kind of a moot point.

vyzo · 2019-11-15T07:47:57Z

connections/multiselect2.md

+To determine the roles in the second handshake attempt, endpoints calculate the SHA-256 hash of the handshake messages that were sent and received (including any error message(s) that the handshake protocol might have sent) during the failed handshake attempt.
+The peer that sent the messages resulting in the numerically smaller hash value acts a client in the second handshake attempt, the peer that sent the messages resulting in the numerically larger hash value acts as a server.
+
+Since secio assign roles during the handshake, it is not possible to detect a Simultaneous Open in this case. Therefore, secio MUST NOT be used with Multiselect 2.0.


ok, this is not quite right; we need something for secio too, unless we are actively deprecating it.

Which part of it is not right?
We've been talking about deprecating it for a long time, and the main reason we've been sticking to it is for backwards compatibility. Since Multiselect 2.0 is not backwards compatible anyways, this would be a good opportunity to finally phase it out.

I agree with the push towards deprecation, as seems to be the growing consensus. Multiselect 2.0 presents a convenient opportunity to upgrade our secure channels.

There is no path for backwards compatibility here with multistream 1 correct? Pairing the secio deprecation with multiselect 2 is going to put a lot of stress on limited environments (browsers) to upgrade, or segregate them from the rest of the network. I don't think it warrants blocking the spec, especially if providing that compatibility hampers the performance/feature gains, but we should be cognizant and clear of the network rollout time table and its impact on the network.

secio is actively being deprecated and removed from the network, so this no longer needs to be a consideration.

@jacobheun @raulk

IIUC, even Noise needs an initiator and a responder. So, even after deprecating SecIO, we still need to assign roles.

I'm happy to update the document and remove any mention of secio, provided there's interest in moving forward with this document. It's been quiet for a long time...

bigs · 2019-11-26T04:18:44Z

connections/multiselect2.md

+To determine the roles in the second handshake attempt, endpoints calculate the SHA-256 hash of the handshake messages that were sent and received (including any error message(s) that the handshake protocol might have sent) during the failed handshake attempt.
+The peer that sent the messages resulting in the numerically smaller hash value acts a client in the second handshake attempt, the peer that sent the messages resulting in the numerically larger hash value acts as a server.
+
+Since secio assign roles during the handshake, it is not possible to detect a Simultaneous Open in this case. Therefore, secio MUST NOT be used with Multiselect 2.0.


I agree with the push towards deprecation, as seems to be the growing consensus. Multiselect 2.0 presents a convenient opportunity to upgrade our secure channels.

tomaka · 2019-11-26T11:00:18Z

connections/multiselect2.md

+
+TLS as well as Noise will fail the handshake if both endpoints act as clients. In case of such a handshake failure, the two endpoints need to restart the handshake. Endpoints MUST NOT close the underlying TCP connection in this case. Implementations SHOULD specifically test for this type of handshake failure, and not treat any handshake failure as a potential Simultaneous Open.
+
+To determine the roles in the second handshake attempt, endpoints compare the SHA-256 hashes of their peer IDs. The peer with the numerically smaller hash value acts as a client in the second handshake attempt, the peer with the numerically larger hash value acts as a server.


This forces us to transfer our peer ID in the initial handshake message, right?
While this is normally good, it also means we can't for example fool proxies by pretending that our traffic is regular HTTP/3.

Handling TCP simultaneous connections seems a bit off-topic to me for multistream-select, and could be specific to each transport and/or encryption protocol.

When you dial a peer, you already know its peer ID. And of course you also know your own peer ID, so we don’t have to send anything extra on the wire.

@tomaka I tend to agree with this being slightly off-topic. The reason it's in here is because ms-2.0 also changes the connection bootstrapping, and #196 won't work any more (at least not if we want to be able to mask our traffic).

What do you think of saying in this document

Secure Channels MUST define how to handle TCP simultaneous open, if they can be used over TCP.

and then moving the text here to the TLS document?

@marten-seemann

When you dial a peer, you already know its peer ID. And of course you also know your own peer ID, so we don’t have to send anything extra on the wire.

I may be missing something, but how does the responder learn the peer ID without a hard requirement for the crypto handshake to transmit it on the first message?

It's a simultaneous dial, you have both peer IDs.

At this layer, you only know you have established a connection and have exchanged some bytes. You think you know the other side’s peer ID, but it’s not authenticated. After exchanging those initial bytes, you notice the other party sends you a message you didn’t expect. Does each party work on their assumption of the peer ID of the other party? That leaves the system open to a series of attacks where a peer advertises a very large/small peer ID for itself, then responds to all SYN packets with another SYN + the initiator message, to force a conflict resolution pathway it knows it’ll always win. I'd much rather introduce a vector of randomness per-session.

@raulk What's the attack here? Sure, an attacker can go through the hassle you're describing to make sure it ends up in the client role (from the viewpoint of the cryptographic handshake). During the cryptographic handshake both peers validate each other's peer IDs, so it seems to me that the attacker gains nothing from this attack at all.

I'm not worried about peers lying about their identity. I'm worried about making the conflict resolution 100% deterministic, and the transitive attack surface that might expose going forward. We're specifying against our future selves, not against the situations we foresee now.

If there's a bug exploitable in this circumstance, an attack can be crafted that works 100% of the time, by precomputing a large/small peer identity.

I admit I may be thinking too far. But if we want to make conflict resolution probabilistic (which I think is the right way, thinking from first principles), then we can have the protocol send a random nonce and have peers XOR their nonces to calculate who wins, or something like that.

I’m not sure I understand the argument here. The security of our handshake relies on the TLS handshake. If TLS is broken, we don’t gain a lot from reducing the cost of an attack from succeeding in 100% of the cases to 50% of the cases.

raulk

Ran out of time to review the protocol specification in-depth, but I think there's enough feedback here to issue another revision for people to review a more refined doc.

raulk · 2019-11-27T14:31:48Z

connections/multiselect2.md

+
+Handshake protocols are not being negotiated, but are announced in the peers' multiaddrs. 
+
+**TODO**: Do we need to describe the format here? I guess we don't, but we will probably need another document for that change, and we can link to it from here.


We need an extension to the multiaddr spec + multicodec table to add secure channels as an atom. Also, @yusefnapora was writing this document to specify the semantics of multiaddrs: #191.

raulk · 2019-11-27T14:36:58Z

connections/multiselect2.md

+
+**TODO**: Do we need to describe the format here? I guess we don't, but we will probably need another document for that change, and we can link to it from here.
+
+Peers advertising a multiaddr that includes a handshake protocol MUST support Multiselect 2.0 as described in this document.


I know we had discussed this assumption as a way to simplify things, but I do think it'll end up being short-sighted, and in some ways, a regression compared to multistream-select 1.0, which does announce its version (although admittedly the implementations are not ready to support multiple versions, but the protocol is).

Something I've been thinking about is to create an extension to multistream-select 1.0 / upgraders that would allow us to go straight into a cryptographic handshake, as a way to deliver censorship resistance to downstream users that require it before we realistically ship ms2.0.

I'm not sure how this would work without either defining another multicodec or requiring an additional round-trip to negotiate.

raulk · 2019-11-27T14:48:34Z

connections/multiselect2.md

+
+![](handshake.png)
+
+Handshake protocols (or implementations of handshake protocols) that don't support sending of Early Data will have to run the stream multiplexer selection after the handshake completes.


Oh ok, I see that you added the fallback. We need to define how that'll work.

I'm not sure if there's anything we need to define. "Early data" is not a separate byte stream, it's the same byte stream as the rest of the connection. The only difference is that the data is sent earlier (and, depending on the handshake protocol, might use a different set of keys).
To the application, Early Data and Late (?) Data are not distinguishable at all.

raulk · 2019-11-27T14:52:26Z

connections/multiselect2.md

+
+#### 0-RTT
+
+When using 0-RTT session resumption as offered by TLS 1.3 and Noise, the endpoints MUST remember the negotiated stream multiplexer used on the original connection. This ensures that the client can send application data in the first flight when resuming a connection.


Stream multiplexers can change over time.

Peers may forget other peers' protocols.

We may need some form of opaque token with which we can verify that our assertions about the other party still remain valid, upon reconnecting. If the other party NACKs, we fall back to full connection bootstrapping.

The TLS session ticket is exactly that opaque token.

But multiselect 2.0 doesn't know about concrete secure channels. If there's a handshake-specific token that we can bind to, it needs to percolate up to this layer.

Co-Authored-By: Raúl Kripalani <raul@protocol.ai>

jacobheun · 2019-11-27T15:27:13Z

connections/multiselect2.md

+In Multiselect 2 the server makes use of Early Data by sending a list of stream multiplexers. This ensures that the client can choose a stream multiplexer as soon as the handshake completes (or fail the connection if it doesn't support any stream multiplexer offered by the server).
+
+When using TLS 1.3, the server can send Early Data after it receives the ClientHello. Early Data is encrypted, but at this point of the handshake the client's identity is not yet verified.
+While Noise in principle allows sending of unencrypted data, endpoints MUST NOT use this to send their list of stream multiplexers. An endpoint MAY send it as soon it is possible to send encrypted data, even if the peers' identity is not verified at that point.


Suggested change

While Noise in principle allows sending of unencrypted data, endpoints MUST NOT use this to send their list of stream multiplexers. An endpoint MAY send it as soon it is possible to send encrypted data, even if the peers' identity is not verified at that point.

While Noise in principle allows sending of unencrypted data, endpoints MUST NOT use this to send their list of stream multiplexers. An endpoint MAY send it as soon as it is possible to send encrypted data, even if the peers' identity is not verified at that point.

jacobheun · 2019-11-27T15:50:07Z

connections/multiselect2.md

+To determine the roles in the second handshake attempt, endpoints calculate the SHA-256 hash of the handshake messages that were sent and received (including any error message(s) that the handshake protocol might have sent) during the failed handshake attempt.
+The peer that sent the messages resulting in the numerically smaller hash value acts a client in the second handshake attempt, the peer that sent the messages resulting in the numerically larger hash value acts as a server.
+
+Since secio assign roles during the handshake, it is not possible to detect a Simultaneous Open in this case. Therefore, secio MUST NOT be used with Multiselect 2.0.


There is no path for backwards compatibility here with multistream 1 correct? Pairing the secio deprecation with multiselect 2 is going to put a lot of stress on limited environments (browsers) to upgrade, or segregate them from the rest of the network. I don't think it warrants blocking the spec, especially if providing that compatibility hampers the performance/feature gains, but we should be cognizant and clear of the network rollout time table and its impact on the network.

Co-Authored-By: Jacob Heun <jacobheun@gmail.com>

This reverts commit 86030ed.

select the stream muxers by intersecting the lists of supported muxers
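
A minimal Go sketch of selecting a muxer from the intersection of two lists. The commit title does not say whose ordering decides when several muxers are shared; this sketch assumes the first argument's order wins. `selectMuxer` is a hypothetical name.

```go
package main

// selectMuxer returns the first entry of preferred that also appears in
// offered. The boolean is false when the two lists have no muxer in common,
// in which case the connection has to fail.
func selectMuxer(preferred, offered []string) (string, bool) {
	set := make(map[string]struct{}, len(offered))
	for _, m := range offered {
		set[m] = struct{}{}
	}
	for _, m := range preferred {
		if _, ok := set[m]; ok {
			return m, true
		}
	}
	return "", false
}
```

Both endpoints have to agree on which list plays the role of `preferred`; otherwise they can pick different muxers from the same intersection.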

raulk · 2019-12-11T16:25:45Z

I have not had time to review all the commentary here and the resulting changes. We should move with this rapidly, knowing that this spec is going to continue to change. I'm happy to lock in a baseline so we can start a PoC, but not before we see the following. If these are implemented/rejected, please synthesise and explain (reiterating I have no time to go through the commentary and doc, but I have strong ideas):

Sharing session capabilities during connection bootstrapping. I'd like peers to advertise things like compressors, erasure coding, etc. protocols, that can apply to an entire connection or to individual streams.
- IMO this addresses the rationale behind @Stebalien's initial recursive/tunneling proposal.
If peer A has learnt and memorised protocol mappings of peer B in the peerstore, the next time A and B connect, they should be able to verify that the same protocol table still stands. We had discussed incorporating some opaque ID (that is fundamentally different to session resumption, which may or may not be supported by secure channels).
Not sure if the XOR pathway has been implemented ("I want to open a stream for either protocol Xv1, or Xv2, or Xv3, each accompanied with their initial message; it should be possible to have a single message for multiple protocol proposals; otherwise we'd be wasteful in some scenarios").

marten-seemann · 2019-12-11T16:42:30Z

@raulk

Sharing session capabilities during connection bootstrapping. I'd like peers to advertise things like compressors, erasure coding, etc. protocols, that can apply to an entire connection or to individual streams.

Compressors came up on #230. It was not part of our requirement document, and to be honest, I don't really understand the use case well enough to even write down the requirements for this, let alone a specification.

If peer A has learnt and memorised protocol mappings of peer B in the peerstore, the next time A and B connect, they should be able to verify that the same protocol table still stands. We had discussed incorporating some opaque ID (that is fundamentally different to session resumption, which may or may not be supported by secure channels).

We discussed this, and decided to remove this from the specification. The reason is that the cost of sending the string for one round-trip (and learning a new ID for this protocol) is minor, and doesn't justify the cost of indexing a mapping and verifying that this index is still valid.

Not sure if the XOR pathway has been implemented ("I want to open a stream for either protocol Xv1, or Xv2, or Xv3, each accompanied with their initial message; it should be possible to have a single message for multiple protocol proposals; otherwise we'd be wasteful in some scenarios").

It's possible to Offer multiple protocols in the same message, but it's not possible to send application data for each protocol. Therefore, it will take a single round-trip (once per connection) to learn which protocol version the peer supports. This leads to a much simpler wire format of ms2.0.

raulk · 2019-12-11T16:51:31Z

We discussed this, and decided to remove this from the specification. The reason is that the cost of sending the string for one round-trip (and learning a new ID for this protocol) is minor, and doesn't justify the cost of indexing a mapping and verifying that this index is still valid.

I disagree here. This assertion can be sent in early data, therefore amortised in terms of segmentation. A normal IPFS peer will need to agree on identify, dht, pubsub, bitswap, etc. Imagine the network is spotty and this peer keeps connecting and reconnecting frequently. Why would we want to renegotiate protocol indices every time, when we can exchange opaque table IDs upfront and just assert that we both maintain the same table?

It's possible to Offer multiple protocols in the same message, but it's not possible to send application data for each protocol. Therefore, it will take a single round-trip (once per connection) to learn which protocol version the peer supports. This leads to a much simpler wire format of ms2.0.

This was captured in the design doc. "A simpler wire format" is not a convincing answer for a feature that's required; of course things are going to be simple if you remove features.

Aren't we talking about an optional bytes field? I don't understand where the complication is.

raulk · 2019-12-11T17:00:17Z

> Compressors came up on #230.

Well, you merged that PR without consensus. And now we're having the discussion again, which was predictable ;-)

It's a mistake to make the early data a multiselect 2.0 message. Early data forms part of connection bootstrapping, which is really a process that's decoupled from ms2.0 -- that's the way we're designing it.

Therefore, connection bootstrapping needs its own specific payload, which IMO should include:

enum CapabilityType {
  MULTIPLEXER = 0;
  COMPRESSOR = 1;
  ERASURE_CODER = 2;
}

enum Scope {
  CONN = 0;
  STREAM = 1;
  BOTH = 2;
}

message SessionCapGroup {
  CapabilityType type = 1;
  repeated string supported = 2;
  repeated Scope scopes = 3;
}

message ConnectionBootstrap {
  fixed32 protocol_table_id = 1;   // an opaque ID that identifies the version of the protocol table we are using.
  repeated SessionCapGroup session_capabilities = 2;
}

marten-seemann · 2019-12-11T17:25:01Z

> I disagree here. This assertion can be sent in early data, therefore amortised in terms of segmentation. A normal IPFS peer will need to agree on identify, dht, pubsub, bitswap, etc. Imagine the network is spotty and this peer keeps connecting and reconnecting frequently. Why would we want to renegotiate protocol indices every time, when we can exchange opaque table IDs upfront and just assert that we both maintain the same table?

Because the overhead of it is a few bytes, for a single roundtrip. I suggest we first show that this is actually a problem before designing a complicated solution. Synchronizing state across connections is not trivial, and will add significant complexity, both in terms of specification as well as implementation.
If we can show that this complexity is actually justified by real-world benefits, the protobuf-based format of ms2.0 will allow us to define a new message, which could (for example) announce the ID(s) of a remembered mapping. Peers that are still on the old ms2.0 will ignore this message, and peers that already support this update can then send a confirmation protobuf. This is a very smooth upgrade path.

> > It's possible to Offer multiple protocols in the same message, but it's not possible to send application data for each protocol. Therefore, it will take a single round-trip (once per connection) to learn which protocol version the peer supports. This leads to a much simpler wire format of ms2.0.

> This was captured in the design doc. "A simpler wire format" is not a convincing answer for a feature that's required; of course things are going to be simple if you remove features.

It was marked "tentative" in the design doc, so it's hard to argue that this feature was a hard requirement in the protocol design. The only comment I received about this tentative point was that it would be complex. Considering that we don't have the concept of protocol upgrades at all in multistream 1, I'm not sure how we know that improving the current situation to a one-round-trip-per-connection wait for protocol upgrades would be prohibitively expensive.

> Aren't we talking about an optional bytes field? I don't understand where the complication is.

I don't think it's that easy. First of all, it breaks with the concept of ms2.0 to use protobufs for protocol messages, but not application data. Furthermore, it's unclear how this interacts with stream flow control. We might also want to think about an API that would generate this kind of data first.

> > Compressors came up on #230.

> Well, you merged that PR without consensus. And now we're having the discussion again, which was predictable ;-)

I merged #230 into this PR, not into master. Consensus would have been nice, but it's difficult to achieve if so few people actually review the PR. Compressors were unrelated to #230 anyway, so this is the better place to discuss them, at least as far as I understand them. This is a totally new requirement to me; I don't know how they're supposed to work, so I have no idea how I should argue about a specific wire format at this point.

add a multiselect 2.0 spec

6f0b02d

yusefnapora reviewed Nov 5, 2019

yusefnapora reviewed Nov 5, 2019

vyzo requested changes Nov 5, 2019

bigs reviewed Nov 6, 2019

raulk reviewed Nov 13, 2019

enable Simultaneous Open by assigning roles based on the H(sent messages)

73afa0b

marten-seemann requested a review from vyzo November 15, 2019 06:38

vyzo reviewed Nov 15, 2019

marten-seemann added 3 commits November 23, 2019 16:05

apply Raul's suggestions in the secure channel section

7e23e02

expand on Noise and Early Data

0780f9e

use SHA-256(peer id) to determine roles in simultaneous open

205dfca

marten-seemann mentioned this pull request Nov 24, 2019

handle TCP simultaneous open (option 4) libp2p/go-libp2p-tls#38

Open

bigs approved these changes Nov 26, 2019

Kubuxu self-requested a review November 26, 2019 09:01

tomaka reviewed Nov 26, 2019

marten-seemann requested a review from Stebalien November 26, 2019 16:08

raulk requested changes Nov 27, 2019

apply Raul's editorial suggestions

b13bec0

Co-Authored-By: Raúl Kripalani <raul@protocol.ai>

jacobheun reviewed Nov 27, 2019

apply @jacobheun's suggestions

f7120a3

Co-Authored-By: Jacob Heun <jacobheun@gmail.com>

marten-seemann added 4 commits December 2, 2019 10:58

move the protobuf definitions to a separate section

f53a644

editorial tweaks to the protocol description

1a5bfe3

select stream multiplexers by list intersection

572f533

remove the option to send multiple protocols in Offer

86030ed

marten-seemann mentioned this pull request Dec 2, 2019

select the stream muxers by intersecting the lists of supported muxers #230

Merged

Revert "remove the option to send multiple protocols in Offer"

0f4b930

This reverts commit 86030ed.

raulk mentioned this pull request Dec 6, 2019

noise-libp2p: introduce "handshake seal"; more. #234

Closed

marten-seemann mentioned this pull request Dec 8, 2019

noise: define API for early data libp2p/go-libp2p#1537

Closed

marten-seemann added 2 commits December 11, 2019 19:56

clarify that the client's muxer preference takes precedence

2c08388

Merge pull request #230 from libp2p/ms2-stream-muxer

cf45d4a

select the stream muxers by intersecting the lists of supported muxers

vyzo mentioned this pull request Apr 27, 2020

attacker initiated MITM? LeastAuthority/go-libp2p-pubsub#3

Open

marten-seemann added the protocol-select label Jul 25, 2021


		Note that this negotiation scheme allows peers to negotiate a "monoplexed" connection, i.e. a connection that doesn't use any stream multiplexer. Endpoints can offer support for monoplexed connections by offering the `/monoplex` stream multiplexer.
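The commit history of this PR describes stream multiplexer selection as a list intersection in which the client's preference takes precedence. A minimal Go sketch of that rule follows, assuming each side advertises an ordered list of protocol strings. The function name `selectMuxer` and the example lists are hypothetical and are not taken from the spec; `/monoplex` is the identifier from the note above.

```go
package main

import "fmt"

// selectMuxer returns the first entry in the client's ordered preference
// list that the server also supports. The boolean is false when the two
// lists do not intersect. (Hypothetical helper, not part of the spec.)
func selectMuxer(clientPrefs, serverSupported []string) (string, bool) {
	supported := make(map[string]struct{}, len(serverSupported))
	for _, m := range serverSupported {
		supported[m] = struct{}{}
	}
	for _, m := range clientPrefs {
		if _, ok := supported[m]; ok {
			return m, true
		}
	}
	return "", false
}

func main() {
	// Illustrative lists only; the server's order is deliberately ignored.
	client := []string{"/yamux/1.0.0", "/mplex/6.7.0", "/monoplex"}
	server := []string{"/monoplex", "/mplex/6.7.0"}

	m, ok := selectMuxer(client, server)
	fmt.Println(m, ok) // prints "/mplex/6.7.0 true"
}
```

Iterating over the client's list rather than the server's is what gives the client's ordering precedence; a peer that lists only `/monoplex` ends up with a monoplexed connection if the other side offers it too.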

		TODO: Do we need to define a way to send an error code / error string? Or do we have something like that in libp2p already?


		#### 0-RTT

		When using 0-RTT session resumption as offered by TLS 1.3 and some variants of Noise (TODO: specify which), the endpoints MUST remember the negotiated stream multiplexer used on the original connection. This ensures that the client can send application data in the first flight when resuming a connection.
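One way to satisfy the "MUST remember" requirement is a small cache keyed by whatever identifies the resumable session. The sketch below is only an illustration under that assumption: the type `muxerCache`, its methods, and the string session key are hypothetical, and the spec does not prescribe any storage mechanism.

```go
package main

import (
	"fmt"
	"sync"
)

// muxerCache remembers the stream multiplexer negotiated on the original
// connection so that a resumed (0-RTT) connection can reuse it without
// another negotiation. (Hypothetical type, not part of the spec.)
type muxerCache struct {
	mu        sync.Mutex
	bySession map[string]string // session key -> negotiated muxer
}

func newMuxerCache() *muxerCache {
	return &muxerCache{bySession: make(map[string]string)}
}

// Remember stores the muxer negotiated for a session.
func (c *muxerCache) Remember(sessionKey, muxer string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.bySession[sessionKey] = muxer
}

// Lookup returns the remembered muxer, if any, for a session.
func (c *muxerCache) Lookup(sessionKey string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	m, ok := c.bySession[sessionKey]
	return m, ok
}

func main() {
	cache := newMuxerCache()
	cache.Remember("session-1", "/yamux/1.0.0") // illustrative values

	m, ok := cache.Lookup("session-1")
	fmt.Println(m, ok) // prints "/yamux/1.0.0 true"
}
```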


		### Handshake Protocol Selection

		Handshake protocols are not being negotiated, but are announced in the peers' multiaddrs.


		TODO: Do we need to describe the format here? I guess we don't, but we will probably need another document for that change, and we can link to it from here.


		Since secio doesn't provide this property, secio cannot be used with Multiselect 2.0.
		To determine the roles in the second handshake attempt, endpoints calculate the SHA-256 hash of the handshake messages that were sent and received (including any error message(s) that the handshake protocol might have sent) during the failed handshake attempt.


		TLS as well as Noise will fail the handshake if both endpoints act as clients. In case of such a handshake failure, the two endpoints need to restart the handshake. Endpoints MUST NOT close the underlying TCP connection in this case. Implementations SHOULD specifically test for this type of handshake failure, and not treat any handshake failure as a potential Simultaneous Open.

		To determine the roles in the second handshake attempt, endpoints compare the SHA-256 hashes of their peer IDs. The peer with the numerically smaller hash value acts as a client in the second handshake attempt, the peer with the numerically larger hash value acts as a server.
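The role assignment above can be sketched in Go as follows. This is a minimal illustration, not the reference implementation: it assumes peer IDs are available as raw bytes and that "numerically smaller" means comparing the 32-byte digests as big-endian unsigned integers, which is equivalent to a lexicographic byte comparison. The names `Role` and `roleForRetry` are hypothetical, and returning an error for identical hashes is an assumption, since the text does not cover that case.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// Role is the part an endpoint plays in the second handshake attempt.
type Role string

const (
	Client Role = "client"
	Server Role = "server"
)

// roleForRetry compares SHA-256(local peer ID) with SHA-256(remote peer ID).
// The peer with the smaller hash acts as client, the other as server.
// (Hypothetical helper; the tie-handling error is an assumption.)
func roleForRetry(localID, remoteID []byte) (Role, error) {
	local := sha256.Sum256(localID)
	remote := sha256.Sum256(remoteID)
	switch bytes.Compare(local[:], remote[:]) {
	case -1:
		return Client, nil
	case 1:
		return Server, nil
	default:
		return "", fmt.Errorf("identical peer ID hashes: cannot assign roles")
	}
}

func main() {
	a, b := []byte("peer-A"), []byte("peer-B") // placeholder peer IDs

	roleA, _ := roleForRetry(a, b)
	roleB, _ := roleForRetry(b, a)
	fmt.Println(roleA, roleB) // the two peers always end up in opposite roles
}
```

Because both endpoints evaluate the same two hashes, they reach complementary roles without exchanging any further messages.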


		Peers advertising a multiaddr that includes a handshake protocol MUST support Multiselect 2.0 as described in this document.


		![](handshake.png)

		Handshake protocols (or implementations of handshake protocols) that don't support sending of Early Data will have to run the stream multiplexer selection after the handshake completes.

	While Noise in principle allows sending of unencrypted data, endpoints MUST NOT use this to send their list of stream multiplexers. An endpoint MAY send it as soon as it is possible to send encrypted data, even if the peers' identity is not verified at that point.
