[WIP] [RFC] Multistream-2.0 #95

Closed
wants to merge 15 commits into from
41 changes: 41 additions & 0 deletions multistream-2.0/retrospective.md
@@ -0,0 +1,41 @@
# Multistream-Select 1.0.0 Retrospective

This short document aims to motivate the need for a new stream negotiation
protocol.

There are 5 concrete issues with multistream select.

multistream-select:

1. requires at least one round trip to be sound.
2. negotiates protocols in series instead of in parallel.
3. doesn't provide any way to determine which side (possibly both) initiated the
connection/negotiation.
4. is bandwidth inefficient.
5. punishes long, descriptive, protocol names.

We ignore 1 and just accept that the protocol has some soundness issues as
actually *waiting* for a response for a protocol negotiation we know will almost
certainly succeed would kill performance.

As for 2, we make sure to remember protocols known to be spoken by the remote
endpoint so we can try to negotiate a known-good protocol first. However, this
is still inefficient.

Issue 3 gets us in trouble with TCP simultaneous connect. Basically, we need a
protocol where both sides can propose a set of protocols to speak and then
deterministically select the *same* protocol. Ideally, we'd also *expose* the
fact that both sides are initiating to the user.
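As a rough illustration of "both sides deterministically select the *same* protocol", here is a minimal sketch. The tie-break rule (lexicographically smallest common name) and the function name `select_common` are assumptions made for the example; this document does not specify a rule.

```python
from typing import Iterable, Optional


def select_common(ours: Iterable[str], theirs: Iterable[str]) -> Optional[str]:
    """Return one protocol both sides support, or None if there is none."""
    common = set(ours) & set(theirs)
    # Hypothetical tie-break (not from the spec): the lexicographically
    # smallest name wins, so the answer does not depend on who "initiated".
    return min(common) if common else None


# Each side runs the same function with the arguments swapped.
a = ["/yamux/1.0.0", "/mplex/6.7.0"]
b = ["/mplex/6.7.0", "/spdy/3.1.0", "/yamux/1.0.0"]
assert select_common(a, b) == select_common(b, a)
```

Because the rule only looks at the intersection of the two sets, it gives the same answer during a simultaneous connect, when neither side can be called the initiator.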

By 4, I mean that we repeatedly send long strings (the protocol names) back and
forth. While long strings *are* more user friendly than, e.g., port numbers,
they're, well, long. This can introduce bandwidth overheads over 30%.
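To make the overhead concrete, the arithmetic can be sketched as below. It assumes the multistream 1.0 framing of a varint length, the name, and a trailing newline; the 40-byte payload is a made-up figure for illustration, not a measurement.

```python
def framed_len(name: str) -> int:
    """Bytes on the wire for one multistream 1.0 message (length, name, newline)."""
    body = len(name.encode("utf-8")) + 1  # name plus the trailing newline
    if body >= 128:
        raise ValueError("sketch only handles single-byte varint lengths")
    return 1 + body


def overhead(name: str, payload_bytes: int) -> float:
    """Fraction of the bytes sent that is protocol-name negotiation."""
    negotiation = framed_len(name)
    return negotiation / (negotiation + payload_bytes)


# Hypothetical request size; real payload sizes vary by protocol.
ratio = overhead("/ipfs/kad/1.0.0", 40)
```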

Issue 5 is a corollary of issue 4. Because we send these protocol names *every*
time we negotiate, we don't, e.g., send longer, better protocol names like:

* /ai/protocol/p2p/bitswap/1.0
* /ipfs/QmId.../bitswap/1.0

However, multistream-select was *explicitly designed* with this use-case in
mind.
319 changes: 319 additions & 0 deletions multistream-2.0/spec.md
@@ -0,0 +1,319 @@
# Multistream 2.0

This proposal describes a replacement protocol for multistream-select.

## Protocols

This document proposes 5 new micro-protocols with two guiding principles:

1. Composition over complexity.
2. Every byte and round-trip counts.

This document *does not*, in fact, propose a protocol *negotiation* protocol.
Instead, it proposes a set of stream/protocol management protocols that can be
composed to flexibly negotiate protocols.

First, this document proposes 4 protocol "negotiation" protocols. "Negotiation"
is in quotes because none of these protocols actually involve negotiating
anything.

1. `multistream/advertise`: Inform the remote end about which protocols we
speak. This should partially replace the current identify protocol.
2. `multistream/use`: Selects the stream's protocol using a multicodec.
Contributor

I'm not a fan of these names. How about multistream/use-multicodec, multistream/use-dynamic, multistream/use-contextual?

Member Author

Renamed use -> multicodec, dynamic -> string, contextual -> dynamic.

3. `multistream/dynamic`: Selects the stream's protocol using a string protocol name.
4. `multistream/contextual`: Selects the stream's protocol using a protocol ID
defined by the *receiver*, valid for the duration of the "session"
(underlying connection). To use this, the *receiver* must have used
`multistream/advertise` to inform the initiator of *its* mapping between
protocols and contextual IDs.

Second, this document proposes an auxiliary protocol that can be used with the 4
multistream protocols to actually negotiate protocols. This is *primarily*
useful (a) in packet-based protocols (without sessions) and (b) when initially
negotiating a transport session (before protocols have been advertised and the
stream multiplexer has been configured).

1. `serial-stream`: A simple stream "multiplexer" that can multiplex multiple
Contributor

should we fold these into my eventual work on stream multiplexing? should this be an evolution of mplex?

Member Author

The idea is to define a simple multiplexer that every single implementation will support, forever. This is designed to be the dumbest, simplest, worst multiplexer in the world so we can have some multiplexer while we negotiate a real multiplexer.

streams *in serial* over the same connection. That is, it allows us to
negotiate a protocol, use it, and then return to multistream. It also allows
us to speculatively choose a single protocol and then drop back down to
multistream if that doesn't work.

All peers *must* implement `multistream/use` and *should* implement
`serial-stream`. This combination will allow us to apply a series of quick
connection upgrades (e.g., to multistream 3.0) with no round trips and no funny
business (learn from past mistakes).

Notes:

1. The "ls" feature of multistream has been removed. While useful, this really
should be a *protocol*. Given the `serial-stream` protocol, this shouldn't be
an issue as we can run as many sub-protocols over the same stream as we want.
2. To reduce RTTs, all protocols are unidirectional.
Contributor

What does this mean in practice?

Member Author

Updated.

3. These protocols were *also* designed to eventually support packet protocols.
4. We considered a `speculative-stream` protocol where the initiator
speculatively starts multiple streams and the receiver acts on at most one.
This would have allowed for 0-RTT worst-case protocol negotiation but was
deemed too complicated for inclusion in the core spec.

### Multistream Advertise

Unspecified (for now). Really, we just need to send a mapping of protocol
names/codecs to contextual IDs (and maybe some service discovery information).
Contributor

We also need the option to add a mapping later on, right?
I'm ok with not specifying the wire format (for now), but we should define exactly what we expect to be getting out of multistream/advertise.

Member Author

I agree. We need a way to add/remove mappings (although we shouldn't reuse them within a session).

This is the subset of identify needed for protocol negotiation.
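Since the wire format is left open, the following is only a sketch of the bookkeeping the receiver would need: a table from protocol names to contextual IDs that supports adding and removing mappings without reusing an ID within a session. The class name `ContextualIds` and its methods are hypothetical.

```python
from typing import Dict, Optional


class ContextualIds:
    """Receiver-side table mapping protocol names to session-local IDs."""

    def __init__(self) -> None:
        # Start at 1: ID 0 is reserved by multistream/contextual.
        self._next_id = 1
        self._by_name: Dict[str, int] = {}
        self._by_id: Dict[int, str] = {}

    def add(self, name: str) -> int:
        """Assign (or return the existing) contextual ID for a protocol."""
        if name in self._by_name:
            return self._by_name[name]
        ident = self._next_id
        self._next_id += 1  # IDs are never reused within a session
        self._by_name[name] = ident
        self._by_id[ident] = name
        return ident

    def remove(self, name: str) -> None:
        """Withdraw a mapping; its ID stays retired for this session."""
        ident = self._by_name.pop(name)
        del self._by_id[ident]

    def lookup(self, ident: int) -> Optional[str]:
        return self._by_id.get(ident)
```

Whatever the advertisement ends up looking like on the wire, it would carry the contents of a table like this one (plus any updates to it).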

### Multistream Use

The `multistream/use` protocol is simply two varint multicodecs: the
multistream-use multicodec followed by the multicodec for the protocol to be
used. This protocol supports unidirectional streams. If the stream is
bidirectional, the receiver must acknowledge a successful protocol negotiation
by responding with the same multistream-use protocol sequence.

Every stream starts with multistream-use. Every other protocol defined here will
be assigned a multicodec and selected with `multistream/use`.

This protocol should *also* be trivial to optimize in hardware simply by prefix
matching (i.e., matching on the first N (usually 16-32) bits of the
stream/message).
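A minimal sketch of the header, assuming multicodecs are encoded as unsigned LEB128 varints. The code points `MULTISTREAM_USE` and `TLS` below are placeholders invented for this example; they are not assigned multicodec values.

```python
def encode_varint(n: int) -> bytes:
    """Unsigned LEB128 varint: 7 bits per byte, high bit set on all but the last."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


# Placeholder code points for illustration only (NOT real assignments).
MULTISTREAM_USE = 0x7A
TLS = 0x7B


def use(protocol_codec: int) -> bytes:
    """Header that selects a protocol: <multistream/use><protocol multicodec>."""
    return encode_varint(MULTISTREAM_USE) + encode_varint(protocol_codec)


header = use(TLS)
# On a bidirectional stream the receiver acknowledges with the same sequence.
ack = use(TLS)
```

With single-byte code points the whole header is 16 bits, which is what makes the prefix match cheap.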

### Multistream Dynamic
Contributor

Do we really have a use case for this? Or wouldn't it be easier for peers to first announce a new code point, and then use it with multistream/use?

Member Author

Privacy, flexibility, feature parity with multistream 1.0. For example, I might listen on a non-enumerable set of protocols.

Concretely: We can use multistream to connect to peers within peers by connecting to the multistream endpoint /p2p/InnerPeerId.

> Or wouldn't it be easier for peers to first announce a new code point, and then use it with multistream/use?

Note: multistream/use is for multicodecs. For custom protocols, we'd use multistream/contextual.


The `multistream/dynamic` protocol is like the `multistream/use` protocol
*except* that it uses a string to identify the protocol. To do so, the initiator
simply sends a varint length followed by the name of the protocol.

Including the `multistream/use` portion, the initiator would send:

```
<multistream/use><multistream/dynamic><length(varint)><name(string)>
```

Note: This used to use a fixed-width 16 bit number for a length. However, a
varint *really* isn't going to cost us much, if anything, in terms of
performance as most protocol names will be < 128 bytes long. On the other hand,
using different number formats everywhere *will* cost us in terms of complexity.
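The frame above could be built roughly as follows. This is a sketch: the two code points are invented placeholders, and UTF-8 for the name is an assumption (the text only says "string").

```python
def encode_varint(n: int) -> bytes:
    """Unsigned LEB128 varint."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


# Placeholder code points for illustration only (NOT real assignments).
MULTISTREAM_USE = 0x7A
MULTISTREAM_DYNAMIC = 0x7C


def dynamic(name: str) -> bytes:
    """<multistream/use><multistream/dynamic><length(varint)><name(string)>"""
    raw = name.encode("utf-8")  # assumed encoding
    return (
        encode_varint(MULTISTREAM_USE)
        + encode_varint(MULTISTREAM_DYNAMIC)
        + encode_varint(len(raw))
        + raw
    )


frame = dynamic("/yamux/1.0.0")
```

Names shorter than 128 bytes get a one-byte length; longer names cost one extra byte, which is the trade-off the note describes.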

### Multistream Contextual

The `multistream/contextual` protocol is used to select a protocol using a
*receiver specified*, session-ephemeral protocol ID. These IDs are analogues of
ephemeral ports.

In this protocol, the stream initiator sends a varint ID specified by the
*receiver* to the receiver.

Format:

```
<multistream/use><multistream/contextual><id(varint)>
```

The ID 0 is reserved for saying "same protocol" on a bidirectional stream. The
Contributor

Why do we need this "same protocol" message? It doesn't really make sense to speak ping in one direction of the stream and identify in the other direction, does it?

Contributor

good question... is this an artifact of xor?

Member Author

We should probably just say "ID 0 means YES".

The issues are:

  1. These protocols all have unidirectional variants.
  2. Who initiated the stream isn't, unfortunately, always unambiguous.
  3. These contextual IDs are relative and, even if you know my contextual ID for a protocol, I may not know yours.

Given these three issues:

  1. I can't use your ID because I may not know it.
  2. I can't use mine because your multistream muxer is expecting your IDs.

Therefore, I'm using 0.

However, there's probably a better way to say this.

receiver of a bidirectional stream can't reuse the same contextual ID that the
initiator used as this contextual ID is relative *to* the receiver. Really, this
last rule *primarily* exists to side-step the TCP simultaneous connect issue.

This protocol has *also* been designed to be hardware friendly:

1. Hardware can compare the first 16 bits of the message against
`<multistream/use><multistream/contextual>`.
2. It can then route the message based on the contextual ID. The fact that these
IDs are chosen by the *receiver* means that the receiver can reuse the same
IDs for all connected peers (reusing the same hardware routing table).
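In software, the two hardware steps above might look like the sketch below: match the fixed prefix, then route on the contextual ID. The code points are invented placeholders and are assumed to fit in one byte each, so the prefix is exactly 16 bits.

```python
from typing import Dict, Optional, Tuple


def encode_varint(n: int) -> bytes:
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, position after the varint)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Placeholder single-byte code points (NOT real assignments).
MULTISTREAM_USE = 0x7A
MULTISTREAM_CONTEXTUAL = 0x7D
PREFIX = bytes([MULTISTREAM_USE, MULTISTREAM_CONTEXTUAL])


def contextual(ident: int) -> bytes:
    """<multistream/use><multistream/contextual><id(varint)>"""
    return PREFIX + encode_varint(ident)


def route(message: bytes, table: Dict[int, str]) -> Optional[str]:
    """Step 1: compare the first 16 bits. Step 2: look up the contextual ID."""
    if message[:2] != PREFIX:
        return None
    ident, _ = decode_varint(message, 2)
    return table.get(ident)  # ID 0 is reserved and never appears in the table
```

Because the receiver picks the IDs, one `table` can serve every connected peer.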

### Serial Stream

The `serial-stream` protocol is the simplest possible stream multiplexer.
Unlike other stream multiplexers, `serial-stream` can only multiplex streams
in *serial*. That is, it has to close the current stream to open a new one.

The protocol is:

```
<header (signed 16 bit int)>
Member

Endianness of this integer should be defined or endianness of all integers in the document should be defined.

At this time I'm assuming network byte order/big-endian.

Member Author

Yes. Network order.

<body>
```

Where the header is:

* -2 - Send a reset and return to multistream. All queued data (remote and
Contributor

presumably this reset is part of the multiplexing protocol?

edit: or is this to say -2 is a reset?

Member Author

> edit: or is this to say -2 is a reset?

Yes.

Contributor

So the distinction is that -2 is an abnormal end and -1 is a normal end?

Member Author

Yes. Updated.

local) should be discarded.
* -1 - Close: Send an EOF and return to multistream.
* 0 - Rest: Ends the `serial-stream` protocol, transitioning to a direct stream.
* \>0 - Data: The header indicates the length of the data.
Contributor

>0 is interpreted as formatting, not the literal > character


We could also use a varint but it's not really worth it. The 16 bit integer
makes implementing this protocol trivial, even in hardware.
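A minimal sketch of this framing, using a signed 16-bit header in network byte order (big-endian, as confirmed in the review discussion). The function names are hypothetical, and the reader assumes the whole stream is already buffered.

```python
import struct
from typing import Tuple

RESET, CLOSE, REST = -2, -1, 0


def frame_data(data: bytes) -> bytes:
    """Data frame: positive header giving the body length, then the body."""
    if not 0 < len(data) <= 0x7FFF:
        raise ValueError("a data frame carries 1..32767 bytes")
    return struct.pack(">h", len(data)) + data


def control(header: int) -> bytes:
    """Control frame: RESET (-2), CLOSE (-1) or REST (0); no body."""
    return struct.pack(">h", header)


def read_stream(buf: bytes) -> Tuple[bytes, int, bytes]:
    """Return (data, terminating header, bytes left after the serial stream)."""
    data, pos = bytearray(), 0
    while True:
        # Raises struct.error if the buffer ends before a terminator arrives.
        (header,) = struct.unpack_from(">h", buf, pos)
        pos += 2
        if header <= 0:
            if header == RESET:
                data.clear()  # a reset discards all queued data
            return bytes(data), header, buf[pos:]
        data += buf[pos:pos + header]
        pos += header


wire = frame_data(b"hello") + frame_data(b"!") + control(CLOSE) + b"next"
data, terminator, remaining = read_stream(wire)
```

After `CLOSE` or `RESET`, `remaining` is handed back to multistream; after `REST`, it belongs to the selected protocol as a direct stream.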

Why: This allows us to:

1. Try protocols and fall back on others.
2. More importantly, it allows us to speak a bunch of protocols before setting
up a stream multiplexer. Specifically, we can use this for
`multistream/advertise` to send an advertisement as early as possible.

## Upgrade Path

### Short term

The short-term plan is to first negotiate multistream 1.0 and *then* negotiate
an upgrade. That is, on connect, the *initiator* will send:

```
<len>/multistream/1.0.0\n
<len>/multistream/2.0.0\n
```

As a batch. It will then wait for the other side to respond with either:

```
<len>/multistream/1.0.0\n
<len>na\n
```

in which case it'll continue using multistream 1.0, or:

```
<len>/multistream/1.0.0\n
<len>/multistream/2.0.0\n
```

in which case it'll switch to multistream 2.0.

Importantly: When we switch to multistream 2.0, we'll tag the connection (and
any sub connections) with the multistream version. This way, we never have to do
this again.
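The decision the initiator makes can be sketched as follows. It assumes `<len>` is a single-byte varint counting the message plus its newline, as in multistream 1.0; the helper names are hypothetical.

```python
from typing import List


def ms1(msg: str) -> bytes:
    """Frame one multistream 1.0 message: length, message, newline."""
    body = msg.encode("utf-8") + b"\n"
    if len(body) >= 128:
        raise ValueError("sketch only handles single-byte varint lengths")
    return bytes([len(body)]) + body


def parse_ms1(buf: bytes) -> List[str]:
    """Split a buffer of framed messages back into strings."""
    msgs, pos = [], 0
    while pos < len(buf):
        n = buf[pos]
        msgs.append(buf[pos + 1:pos + 1 + n].rstrip(b"\n").decode("utf-8"))
        pos += 1 + n
    return msgs


# Sent by the initiator as one batch.
OFFER = ms1("/multistream/1.0.0") + ms1("/multistream/2.0.0")


def negotiated_version(response: bytes) -> str:
    """Decide which multistream version to tag the connection with."""
    msgs = parse_ms1(response)
    if msgs == ["/multistream/1.0.0", "/multistream/2.0.0"]:
        return "2.0"
    if msgs == ["/multistream/1.0.0", "na"]:
        return "1.0"
    raise ValueError(f"unexpected response: {msgs!r}")
```

The returned version is what would be stored on the connection so that sub-connections can skip this exchange.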

## Example

So, that was way too much how and not enough why or WTF? Let's try an example
where,

1. The initiator supports TLS1.3 and SECIO.
2. The receiver only supports TLS1.3.
3. They both support yamux.
4. They both support DHT.
5. secio and tls have multicodecs but yamux and dht don't.

If we're still in the transition period, the initiator would start off by sending:

```
<len>/multistream/1.0.0\n
<len>/multistream/2.0.0\n
```

If the receiver DOES NOT support multistream 2.0, it will reply with:

```
<len>/multistream/1.0.0\n
<len>na\n
```

At this point, the initiator will fall back on multistream 1.0.

Otherwise, the receiver will send back...

```
<len>/multistream/1.0.0\n
<len>/multistream/2.0.0\n
```

...to complete the upgrade.

We're now in multistream 2.0 land. Once we're done with the transition period,
we'll start here to skip a round-trip.

Now that we're using multistream 2.0, the initiator will send, in a single
packet:

```
<multistream/use (multicodec)><serial-stream (multicodec)> // use serial-stream to make the stream recoverable
<len> // serial-stream message framing
<multistream/use (multicodec)><multistream/advertise (multicodec)> // select advertise protocol
supported security protocols... //
-1 // return to multistream (EOF)

<multistream/use (multicodec)><serial-stream (multicodec)> // open a new serial-stream
<len>
<multistream/use (multicodec)><tls (multicodec)> // select TLS
<initial tls packet...> // initiate TLS
```

The receiver will respond with:

```
<multistream/use (multicodec)><serial-stream (multicodec)> // respond to serial stream
<len>
<multistream/use (multicodec)><multistream/advertise (multicodec)> // select advertise protocol
security protocols...
-1 // return to multistream (EOF)

<multistream/use (multicodec)><serial-stream (multicodec)> // respond to second serial stream
0 // transition to a normal stream.
<multistream/use (multicodec)><tls (multicodec)> // select TLS
<response tls packet...> // complete TLS handshake
```

This:

1. Responds to the advertisement, also advertising available security protocols.
2. Accepts the TLS stream.
3. Finishes the TLS handshake.

If the receiver had *not* supported TLS, it would have reset the serial-stream.
In that case, the initiator would have used the protocols advertised by the
receiver to select an appropriate security protocol.
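The fallback step could be as simple as the sketch below: walk the initiator's preference list and take the first protocol the receiver advertised that has not already been reset. The function name and the short protocol labels are illustrative only.

```python
from typing import AbstractSet, Optional, Sequence


def pick_security(
    preferred: Sequence[str],
    advertised: AbstractSet[str],
    rejected: AbstractSet[str] = frozenset(),
) -> Optional[str]:
    """First locally preferred protocol the receiver advertised and has not reset."""
    for proto in preferred:
        if proto in advertised and proto not in rejected:
            return proto
    return None


# Hypothetical variant of the scenario: the initiator optimistically tried
# secio, the receiver reset the serial-stream, and only tls was advertised.
choice = pick_security(["secio", "tls"], {"tls"}, rejected={"secio"})
```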

Finally, the initiator will finish the TLS negotiation, send an advertise packet,
*optimistically* negotiate yamux, and send the DHT request.

```
0 // transition to a normal stream.

<tls client auth...> // finish TLS

<multistream/use (multicodec)><serial-stream (multicodec)> // use serial-stream to make the stream recoverable
<len> // serial-stream message framing
<multistream/use (multicodec)><multistream/advertise (multicodec)> // select advertise protocol
<advertise data> // complete advertise information (protocols, etc.)
-1 // return to multistream (EOF)

<multistream/use (multicodec)><serial-stream (multicodec)> // open a new serial-stream
<len>
<multistream/use (multicodec)><multistream/dynamic (multicodec)> // select multistream/dynamic
<len>/yamux/1.0.0 // select yamux
<new yamux stream> // create the stream
<multistream/use (multicodec)><multistream/dynamic (multicodec)> // select multistream/dynamic
<len>/ipfs/kad/1.0.0 // select kad dht 1.0
<dht request...> // send the DHT request
```

And the receiver will send:

```
<multistream/use (multicodec)><serial-stream (multicodec)> // use serial-stream to make the stream recoverable
<len> // serial-stream message framing
<multistream/use (multicodec)><multistream/advertise (multicodec)> // select advertise protocol
<advertise data> // complete advertise information (protocols, etc.)
-1 // return to multistream (EOF)

<multistream/use (multicodec)><serial-stream (multicodec)> // open a new serial-stream
0 // transition to that stream (we speak yamux)

<multistream/use (multicodec)><multistream/dynamic (multicodec)> // select multistream/dynamic
<len>/yamux/1.0.0 // select yamux
<yamux stream 1> // respond to the new yamux stream
<multistream/use (multicodec)><multistream/dynamic (multicodec)> // select multistream/dynamic
<len>/ipfs/kad/1.0.0 // select kad dht
<dht response...> // send the DHT response
```

Note: Ideally, we'd be able to avoid the optimistic yamux negotiation. However,
to do that, some protocol information will have to be embedded in the TLS
negotiation and exposed through a connection-level `Stat` method.

Alternatively, we could choose to include this information in the advertisement
sent *before* the security transport. However, that has some security
implications.