Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Stream Migration spec #406

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

MarcoPolo
Copy link
Contributor

Initial pass at creating a spec for #328. I'll start a PoC as well to start fleshing out some of the ideas and check my understanding.

@MarcoPolo
Copy link
Contributor Author

r? @marten-seemann for the first pass, I'll tag more folks as this progresses. (also I don't think I have permissions to add reviewers to the PR).

Copy link
Contributor

@marten-seemann marten-seemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start!

A few thoughts:

  • We need to define how the stream labels (A and B) are assigned (multistream).
  • I’m wondering if each side should use their own namespace for streams, or if they should share a namespace (e.g. by say that client/server use even/odd stream IDs). Maybe then it would be possible to migrate streams that the peer initiated (is this even desireable?).

connections/stream-migration.md Outdated Show resolved Hide resolved
connections/stream-migration.md Outdated Show resolved Hide resolved
connections/stream-migration.md Outdated Show resolved Hide resolved
plantuml stream-migration.md -o stream-migration -tsvg
```
</details>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the peer misbehaves, e.g. when B just doesn’t send an EOF on stream B? While A can “consider” the stream as closed, it still needs to be closed explicitly, otherwise the stream multiplexer can’t garbage-collect the stream after it has been EOFed from both sides.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if the Responder misbehaves and doesn't close B then the stream migration hasn't finished since the new stream isn't in the same state as the old stream. I'm not sure what else we can do besides consider it not spec-compliant.

connections/stream-migration.md Outdated Show resolved Hide resolved
@MarcoPolo
Copy link
Contributor Author

I’m wondering if each side should use their own namespace for streams, or if they should share a namespace (e.g. by say that client/server use even/odd stream IDs). Maybe then it would be possible to migrate streams that the peer initiated (is this even desireable?).

Can we rely on the underlying stream to give us a stream ID we can use for identifying streams?

What do you mean by migrating streams that the peer initiated? If I'm migrating a stream from a connection a remote peer started to a connection I started should that matter? I'm just changing how the bytes are getting there but everything else should be the same? Maybe there's something I'm missing here.

@marten-seemann
Copy link
Contributor

Can we rely on the underlying stream to give us a stream ID we can use for identifying streams?

Probably not. While stream IDs are unique for every, they're not across muxers. It's probably easier if we introduce a new ID here.

What do you mean by migrating streams that the peer initiated? If I'm migrating a stream from a connection a remote peer started to a connection I started should that matter? I'm just changing how the bytes are getting there but everything else should be the same?

Maybe I'm just confused by the word "initiator" here. Do we mean the initiator of the connection or the initiator of the stream?

@MarcoPolo
Copy link
Contributor Author

I think this finally clicked for me yesterday. The stream migration protocol is a prefix on top of another protocol. That’s how you can send the identify this stream as id=A message. I’ll update the spec on Monday.

@MarcoPolo MarcoPolo marked this pull request as draft April 16, 2022 15:26
### Stream migration protocol id

The stream migration protocol id should follow the format of
`/streamMigration/1.0.0/<streamID>`. The stream should be an int.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/libp2p/stream-migration/1.0.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/libp2p/stream-migration/, we can still version this later if necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The protocol name can't include the stream ID. The stream ID is the payload of this protocol.

Copy link
Contributor Author

@MarcoPolo MarcoPolo Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The protocol name can't include the stream ID. The stream ID is the payload of this protocol.

Why not? This way you wouldn't need a separate payload message.

The initiator would send the stream-migration protocol ID+stream id, then can send the underlying protocol right away without having to wait for a response.

The responder would read the stream migration protocol and ack it (by echoing back as in multistream select. Note this doc doesn't say this part yet.), then continue negotiating the underlying protocol. If the other node for some reason doesn't support stream-migration (even if we thought it did), it would echo back na, and continue as before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, it could be because we cache the protocols seen on the other side, correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. Even worse, we send an Identify Delta messages for every new protocol that we add.

Also, logically speaking, the stream ID is not part of the protocol ID, but it's a payload of the stream migration protocol. Counting the bytes on the wire, it makes no (or at least not more than a few byte) difference if it's in the protocol name or in the payload.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this was mentioned before: For the sake of evolvability of the protocol in the future, can the stream ID be send embedded in a Protobuf? That way we can add new fields in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call out. I did this in the poc, but I’ll update the spec to include this.

connections/stream-migration.md Outdated Show resolved Hide resolved
connections/stream-migration.md Outdated Show resolved Hide resolved
connections/stream-migration.md Outdated Show resolved Hide resolved
### Stream migration protocol id

The stream migration protocol id should follow the format of
`/streamMigration/1.0.0/<streamID>`. The stream should be an int.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The protocol name can't include the stream ID. The stream ID is the payload of this protocol.

match the swarm's behavior in best connection selection.

Note that it's not required that all implementations (and all versions) follow
the same heuristics since the initiator is driving the migration and specifies
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below I am assuming that Initiator refers to the connection initiator, not the stream initiator. Please correct me in case I am misunderstanding this @MarcoPolo.

Say there are two nodes A and B. Connection AB is initiated by A to B. Conneciton BA is initiated by B to A.

Say that A and B follow different heuristics to pick the best connection. A chooses AB as the best connection, B chooses BA as the best connection.

If I understand the above correctly, this would result in A moving all the streams it created to AB and B moving all the streams it created to BA. Both connections would thus stay alive.

Potential solution: Instead of allowing both A and B to migrate streams, how about delegating the decision making to the peer with the lower peer ID, e.g. in this case A?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

I was thinking about this this weekend and came up with a similar solution. Glad to see you also came to the same conclusion. I'll update this spec to make this explicit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few thoughts on this:

  1. Currently each peer chooses its own IDs for streams, i.e. there are two distinct spaces of stream IDs. If we want to allow the receiver of a stream to migrate that stream, we need a single stream ID space. One way to realize this would be to mandate the client (roles as seen by the stream muxer) to use odd and the server to use even stream IDs.
  2. I don't think this document should describe how peers would choose AB over BA. This document should only describe how to migrate one libp2p stream from one muxer stream to another. For all that this spec cares about, those streams might (or might not) live on the same underlying connection. We can then use the stream migration protocol as a building block to converge onto one connection (and the peer ID comparison is quite a neat idea, I like it!), but that should probably be described in a different document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes. I was considering adding a boolean flag that would indicate which peer initially identified the stream it's referencing when migrating (was I the initiator of the from stream?). This is the same as using even and odd numbers, since that scheme effectively encodes this boolean in the least significant bit. I'm fine with either way. Maybe it's a little easier to think about even and odds, so I'll do that.

  2. Agreed that describing how to sort connections is out of the scope of this document (I imagine that spec to iterate more and and possibly have more subtle details). But I do think this spec should define who is responsible for doing the stream migration. If we end up in the situation where we have two identical connections (A dialed B and B dialed A at roughly the same time) we should describe who is in charge of doing the stream migration. By defining which node starts the stream migration we simplify this protocol and also avoid having to handle cases where both sides start stream migration at the same time.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Potential solution: Instead of allowing both A and B to migrate streams, how about delegating the decision making to the peer with the lower peer ID, e.g. in this case A?"

Wouldn't this create a biais toward lower peerIDs? Maybe we can hash the concatenation of the two peerIDs (the lower first). If the hash is even, use the lowest. Else use the highest. That way it is deterministic but no peerID is systematically favored over another.

Copy link
Contributor Author

@MarcoPolo MarcoPolo Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not worth the extra complexity though since for a random ID A there's a 50% odd that it's less than another random id B. (since it's equiprobable that B is smaller).

Copy link
Contributor Author

@MarcoPolo MarcoPolo Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah on second thought I agree with Max. I actually don't see the benefits since it's still 50% odds either way. Updated 49c0597

Copy link
Contributor

@marten-seemann marten-seemann Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not convinced that we should specify anything here at all. Stream migration is a general feature, a building block.

The use case we have in mind now is migrating all streams from one connection to another, but we might come up with other use cases in the future. I'd prefer to have stream migration just be a thing that any node, regardless of its peer ID, can use in principle.
For the specific use case of converging on a single connection, comparing peer IDs seems reasonable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try to rephrase to see if I understand:

  1. The protocol should only specify how a node would perform a stream migration.
  2. It doesn't define who starts the migration or why.
  3. Consolidating connections would be a layer on top of this that defines which nodes is in charge of migrating streams to empty and close connections.

We just have to make sure that what we design here doesn't block point 3.

If that seems accurate, then I agree we don't need this in here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also @marten-seemann to highlight a recent change re:

Currently each peer chooses its own IDs for streams, i.e. there are two distinct spaces of stream IDs. If we want to allow the receiver of a stream to migrate that stream, we need a single stream ID space. One way to realize this would be to mandate the client (roles as seen by the stream muxer) to use odd and the server to use even stream IDs.

I've specced something similar, except the lower peer id node uses even and the higher peer id node uses odds. This let's us avoid having to rely on the stream muxer to give us this role. And it also works across connections (it gets confusing if the stream muxer says we are the client on connection and the server on the other).

@MarcoPolo MarcoPolo marked this pull request as ready for review April 27, 2022 10:10
oneof type {
Label label = 1;
Migrate migrate = 2;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need an AckMigrate option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, thank you

initial stream `1` was half closed, then the final migrated stream `2` should
also be half closed. Note this may involve an extra step by one of the nodes.
If a node, had closed writes to its old stream before migration it should also
close writes to the new stream after migration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably call out what happens if a node sends new data if the stream was already closed in that direction: that is a connection error.

state of the old stream.

The protocol should only be used when the initiator knows the responder
understands the stream-migration protocol. Otherwise we waste 1 round trip.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
understands the stream-migration protocol. Otherwise we waste 1 round trip.
understands the stream-migration protocol. In that case the negotiation of the stream migration protocol can be pipelined with the negotiation of the application protocol, and therefore doesn't cost any additional round trips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🥞 Todo
Status: Triage
Development

Successfully merging this pull request may close these issues.

5 participants