Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add start/stop extension markers for fragment chain #1597

Merged
merged 11 commits into from
Dec 5, 2024

Conversation

wyfo
Copy link
Contributor

@wyfo wyfo commented Nov 15, 2024

A fragment chain must begin with a start marker, otherwise the whole chain will be dropped.

When a fragmented message is dropped by congestion control, a final fragment with stop marker must be emitted. Because no batch is available, an ephemeral batch, i.e. not recycled in the pipeline after, is created.

A fragment chain must begin with a start marker, otherwise the whole chain will be dropped.

When a fragmented message is dropped by congestion control, a final fragment with stop marker must be emitted.
Because no batch is available, an ephemeral batch, i.e. not recycled in the pipeline after, is created.
Copy link

PR missing one of the required labels: {'documentation', 'internal', 'new feature', 'dependencies', 'breaking-change', 'bug', 'enhancement'}

@wyfo wyfo added the enhancement Existing things could work better label Nov 15, 2024
I've checked, it doesn't change the size of `WBatch`
@wyfo wyfo force-pushed the feat/fragment_marker branch from 028fd8e to ad8d0a4 Compare November 15, 2024 16:25
@wyfo
Copy link
Contributor Author

wyfo commented Nov 15, 2024

@Mallets I've moved ephemeral flag out of BatchConfig, see the second commit.

@Mallets
Copy link
Member

Mallets commented Nov 18, 2024

I think for backward compatibility we still need to address two main points:

  1. Zenoh-Pico should be updated accordingly and implement the start and stop markers
  2. To avoid protocol breaking, the RX side should be able to handle also the case where no start marker is sent. This could be done lazily.

Finally, I wonder if we could add the information about the total message length in the start marker. The total length could be eventually used to preallocate a single buffer to copy the message on. I understand this would require changing quite a bit the deserialisation code. As well, it's still unclear to me the real impact on performance.

@wyfo
Copy link
Contributor Author

wyfo commented Nov 18, 2024

To avoid protocol breaking, the RX side should be able to handle also the case where no start marker is sent. This could be done lazily.

It's handled by dropping the fragment. I don't see a way to provide backward compatibility here, that was my main concern about this change.

Finally, I wonder if we could add the information about the total message length in the start marker. The total length could be eventually used to preallocate a single buffer to copy the message on.

I don't think it's a good idea regarding routing, because the message would be refragmented right after. However, on the client side, it would allow to avoid ZBytes fragmentation, which may be a big plus. But if you want to avoid copy, the deserialization code will indeed need a deep refactor.

@wyfo
Copy link
Contributor Author

wyfo commented Nov 18, 2024

Still, adding the size on start extension doesn't cost anything and is still a nice information to have when debugging, so why not adding it.

@Mallets
Copy link
Member

Mallets commented Nov 18, 2024

It's handled by dropping the fragment. I don't see a way to provide backward compatibility here, that was my main concern about this change.

It's about improving the behaviour for future Zenoh versions (especially in patch release) without requiring dropping old versions (i.e. that would require bumping the protocol version). A viable option is to start by default in no start marker mode and as soon as start marker is detected we switch mode. Alternatevely, we could handle this in a Init/Open extension.

@Mallets
Copy link
Member

Mallets commented Nov 18, 2024

Still, adding the size on start extension doesn't cost anything and is still a nice information to have when debugging, so why not adding it.

It costs bytes on the network. If we don't have a use for it, let's not send it.

@wyfo
Copy link
Contributor Author

wyfo commented Nov 18, 2024

A viable option is to start by default in no start marker mode and as soon as start marker is detected we switch mode. Alternatevely, we could handle this in a Init/Open extension.

I prefer an Init extension here, otherwise, if you loose the first fragment, it's already too late. Do I implement it?

@Mallets
Copy link
Member

Mallets commented Nov 18, 2024

A viable option is to start by default in no start marker mode and as soon as start marker is detected we switch mode. Alternatevely, we could handle this in a Init/Open extension.

I prefer an Init extension here, otherwise, if you loose the first fragment, it's already too late. Do I implement it?

After a second thought I also prefer the Init/Open approach. Go for it :)

@Mallets Mallets mentioned this pull request Nov 18, 2024
wyfo added a commit to ZettaScaleLabs/zenoh-pico that referenced this pull request Nov 25, 2024
@wyfo wyfo requested a review from Mallets November 25, 2024 06:44
@wyfo
Copy link
Contributor Author

wyfo commented Nov 25, 2024

@Mallets I came to the conclusion that both ends don't need to agree on a patch version, i.e. selecting the minimum of both. In fact, each end can just keep the other's patch version, and let the code do the rest.
For example, if A has patch = 1 and B has patch = 0, after init syn/ack exchange, PatchType::has_fragmentation_start_stop will return false on A, and on B side, there will be no PatchType, so both "agree" to not use fragmentation markers.
On the other hand, if B has patch = 2 this time, then PatchType::has_fragmentation_start_stop will return true on both sides, and an hypothetical PatchType::has_new_feature will return false on B, so again, there is a de facto "agreement" on the patch version.

@Mallets
Copy link
Member

Mallets commented Nov 25, 2024

@Mallets I came to the conclusion that both ends don't need to agree on a patch version, i.e. selecting the minimum of both. In fact, each end can just keep the other's patch version, and let the code do the rest. For example, if A has patch = 1 and B has patch = 0, after init syn/ack exchange, PatchType::has_fragmentation_start_stop will return false on A, and on B side, there will be no PatchType, so both "agree" to not use fragmentation markers. On the other hand, if B has patch = 2 this time, then PatchType::has_fragmentation_start_stop will return true on both sides, and an hypothetical PatchType::has_new_feature will return false on B, so again, there is a de facto "agreement" on the patch version.

I agree with the reasoning. However, the Init/Open role is to also negotiate parameters and setting the minimum value in the message will make clear which version is chosen also when looking at Wireshark. Also, in the Cookie the agreed patch version should stored, so it's more about being explicit rather than implicit. Does it make sense?

@wyfo
Copy link
Contributor Author

wyfo commented Nov 25, 2024

I agree with the reasoning. However, the Init/Open role is to also negotiate parameters and setting the minimum value in the message will make clear which version is chosen also when looking at Wireshark. Also, in the Cookie the agreed patch version should stored, so it's more about being explicit rather than implicit. Does it make sense?

It does. I will roll back the minimum check. Then I will have to find a proper error code in zenoh-pico...

@wyfo wyfo requested a review from Mallets November 25, 2024 11:52
@wyfo wyfo force-pushed the feat/fragment_marker branch from 784cb4c to e4883f4 Compare November 25, 2024 11:54
@wyfo
Copy link
Contributor Author

wyfo commented Nov 25, 2024

@Mallets Should we store minimum patch version in TransportMulticastPeer, to be "consistent" with the unicast workflow?

@Mallets
Copy link
Member

Mallets commented Nov 25, 2024

@Mallets Should we store minimum patch version in TransportMulticastPeer, to be "consistent" with the unicast workflow?

I'd say yes.

@Mallets
Copy link
Member

Mallets commented Nov 27, 2024

This PR LGTM. However, I prefer to wait for having eclipse-zenoh/zenoh-pico#813 tested and approved before merging.

wyfo added a commit to ZettaScaleLabs/zenoh-pico that referenced this pull request Dec 4, 2024
wyfo added a commit to ZettaScaleLabs/zenoh-pico that referenced this pull request Dec 4, 2024
@wyfo
Copy link
Contributor Author

wyfo commented Dec 5, 2024

The lint error came from the recently 1.83 release. I will fix it while resolving the conflict. By the way, zeno-pico PR has been approved.

wyfo added 2 commits December 5, 2024 07:45
# Conflicts:
#	io/zenoh-transport/src/common/pipeline.rs
@Mallets Mallets merged commit 64e8caa into eclipse-zenoh:main Dec 5, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Existing things could work better
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants