Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[multistream-select] Require remaining negotiation data to be flushed. #1781

Merged
merged 4 commits into from
Oct 1, 2020

Conversation

romanb
Copy link
Contributor

@romanb romanb commented Sep 30, 2020

There appears to still be an edge-case in multistream-select whereby the remaining data to send w.r.t. protocol is successfully written before a poll_read on a Negotiated stream, but where the subsequent poll_flush() is pending. Now remaining is empty and the next poll_read() will go straight to reading from the underlying I/O stream, despite the flush not having happened
yet, which can lead to a form of deadlock during protocol negotiation. This seems to be the cause for the sporadic upgrade timeouts in #1629.

It was a happy coincidence that I looked into #1629 again just before async-io-1.1.5 was released, because up to 1.1.4, write operations would occasionally yield without really needing to, like this. 1.1.5 removed that random yielding for write operations but it is exactly these random yields that provoke the issue in the setup provided by #1629 (i.e. this code).

Rather than complicating the existing code further in order to accommodate for this case (e.g. by wrapping the remaining bytes in an Option to distinguish whether they have been flushed or not), it seems preferable to simplify the code by giving up on this optimisation that only affects the last negotiation protocol message sent by the "listener". So we give up on the ability to combine data sent by the "listener" immediately after protocol negotiation together with the final negotiation frame in the same transport-level frame/packet. That seems acceptable. The code simplifications should be convincing.

Roman S. Borschel added 2 commits September 30, 2020 15:20
There appears to still be an edge-case whereby the
`remaining` data to send w.r.t. protocol negotiation to send
is successfully written before a `poll_read` on a `Negotiated` stream,
but where the subsequent `poll_flush()` is pending.
Now `remaining` is empty and the next `poll_read()`
will go straight to reading from the underlying
I/O stream, despite the flush not having happened
yet, which can lead to a form of deadlock during
protocol negotiation.

Rather than complicating the existing code further in
order to accommodate for this case, it seems preferable
to simplify the code by giving up on this optimisation
that only affects the last negotiation protocol message
sent by the "listener". So we give up on the ability
to combine data sent by the "listener" immediately
after protocol negotiation together with the final
negotiation frame in the same transport-level frame/packet.
@romanb
Copy link
Contributor Author

romanb commented Sep 30, 2020

There appears to still be an edge-case in multistream-select whereby the remaining data to send w.r.t. protocol is successfully written before a poll_read on a Negotiated stream, but where the subsequent poll_flush() is pending. Now remaining is empty and the next poll_read() will go straight to reading from the underlying I/O stream, despite the flush not having happened
yet, which can lead to a form of deadlock during protocol negotiation. This seems to be the cause for the sporadic upgrade timeouts in #1629.

To make this a bit more concrete: If Negotiated::poll_read still has remaining data to send despite being in state Completed, it calls Negotiated::poll() to complete protocol negotiation. Then we try to flush() which writes the remaining negotiation data. However, if the final flush is Pending, remaining is already empty and on the next Negotiated::poll_read() we go straight to reading from the underlying I/O stream, but the remote may still be waiting for the remaining negotiation data/frame before sending and that still sits around unflushed in the buffer. The sporadic, random yields by async-io <= 1.1.4 for write operations, including poll_flush() is what made this more likely to occur every now and then.

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch! Implementation looks good to me.

So we give up on the ability to combine data sent by the "listener" immediately after protocol negotiation together with the final negotiation frame in the same transport-level frame/packet.

For e.g. TCP connections where TCP_NODELAY is not set, hence Nagle's algorithm still being active, final negotiation frames might still be bundled with upper layer payloads in the same transport-level frame, no?

@romanb
Copy link
Contributor Author

romanb commented Oct 1, 2020

For e.g. TCP connections where TCP_NODELAY is not set, hence Nagle's algorithm still being active, final negotiation frames might still be bundled with upper layer payloads in the same transport-level frame, no?

I think so, yes.

@romanb romanb merged commit 8cec457 into libp2p:master Oct 1, 2020
@romanb romanb deleted the multiselect-flush-remaining branch October 1, 2020 10:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants