Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[QUIC] Call silent shutdown in case CloseAsync failed. #96807

Merged
merged 8 commits into from
Jan 26, 2024

Conversation

ManickaP
Copy link
Member

  1. Makes cancellation in ValueTaskSource temporary, which enables receiving the expected MsQuic signal even after cancellationToken fired. In this case SHUTDOWN_COMPLETE.
  2. If ValueTaskSource is cancelled, remember what happened to it before the caller read the OCE so that it can be restored to the task afterwards.
  3. When the OCE is delivered, restore ValueTaskSource to the remembered state (awaiting or completed with the result).
  4. Lastly, examine the state of _shutdownTcs in Dispose and decide whether the silent shutdown needs to be called.

Changes are in ValueTaskSource and QuicConnection. Other changes are cosmetic, not functional (comments, code formatting).

Fixes #78641

I also did a manual testing with cancellation between CloseAsync --> connection.Shutdown and SHUTDOWN_COMPLETED by inserting delay to the event handler, but this as this is timing sensitive it's impossible to do reliably in a test.

@ghost
Copy link

ghost commented Jan 10, 2024

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details
  1. Makes cancellation in ValueTaskSource temporary, which enables receiving the expected MsQuic signal even after cancellationToken fired. In this case SHUTDOWN_COMPLETE.
  2. If ValueTaskSource is cancelled, remember what happened to it before the caller read the OCE so that it can be restored to the task afterwards.
  3. When the OCE is delivered, restore ValueTaskSource to the remembered state (awaiting or completed with the result).
  4. Lastly, examine the state of _shutdownTcs in Dispose and decide whether the silent shutdown needs to be called.

Changes are in ValueTaskSource and QuicConnection. Other changes are cosmetic, not functional (comments, code formatting).

Fixes #78641

I also did a manual testing with cancellation between CloseAsync --> connection.Shutdown and SHUTDOWN_COMPLETED by inserting delay to the event handler, but this as this is timing sensitive it's impossible to do reliably in a test.

Author: ManickaP
Assignees: -
Labels:

area-System.Net

Milestone: -

@ghost
Copy link

ghost commented Jan 10, 2024

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details
  1. Makes cancellation in ValueTaskSource temporary, which enables receiving the expected MsQuic signal even after cancellationToken fired. In this case SHUTDOWN_COMPLETE.
  2. If ValueTaskSource is cancelled, remember what happened to it before the caller read the OCE so that it can be restored to the task afterwards.
  3. When the OCE is delivered, restore ValueTaskSource to the remembered state (awaiting or completed with the result).
  4. Lastly, examine the state of _shutdownTcs in Dispose and decide whether the silent shutdown needs to be called.

Changes are in ValueTaskSource and QuicConnection. Other changes are cosmetic, not functional (comments, code formatting).

Fixes #78641

I also did a manual testing with cancellation between CloseAsync --> connection.Shutdown and SHUTDOWN_COMPLETED by inserting delay to the event handler, but this as this is timing sensitive it's impossible to do reliably in a test.

Author: ManickaP
Assignees: ManickaP
Labels:

area-System.Net.Quic

Milestone: -

@rzikm
Copy link
Member

rzikm commented Jan 15, 2024

I can still get DisposeAsync to throw like this.

var (clientOptions, _, listenerOptions) = RuntimeSandbox.Helpers.GetDefaultOptions();

await using var pair = await RuntimeSandbox.Helpers.GetConnectedPair(clientOptions, listenerOptions);
var (client, server) = pair;

var cts = new CancellationTokenSource();
var task = client.CloseAsync(0).AsTask();

cts.Cancel(); // throws even without this line
await client.DisposeAsync();
// Unhandled exception. System.InvalidOperationException: Operation is not valid due to the current state of the object.
//    at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore`1.OnCompleted(Action`1 continuation, Object state, Int16 token, ValueTaskSourceOnCompletedFlags flags)
//    at System.Net.Quic.ValueTaskSource.System.Threading.Tasks.Sources.IValueTaskSource.OnCompleted(Action`1 continuation, Object state, Int16 token, ValueTaskSourceOnCompletedFlags flags)
//    at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AwaitUnsafeOnCompleted[TAwaiter](TAwaiter& awaiter, IAsyncStateMachineBox box)
// --- End of stack trace from previous location ---
//    at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__128_1(Object state)
//    at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
//    at System.Threading.ThreadPoolWorkQueue.Dispatch()
//    at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
//    at System.Threading.Thread.StartCallback()

Do we care? If not then LGTM.

@CarnaViire
Copy link
Member

CarnaViire commented Jan 15, 2024

@rzikm it seems like it's a different case, not related to this cancellation issue? (do I understand correctly that this "Operation is not valid due to the current state of the object" happens in main, and continues to happen after the PR)?

Ok, I've re-read the stack-trace, and now I don't know if it's related or not 😅 but I'm curious whether this was happening before.

But I think we should care 🤔 I don't think it's ok to throw if dispose and/or close are called in parallel.

@ManickaP
Copy link
Member Author

I can still get DisposeAsync to throw like this.

Just FYI, this happens without this change as well. I'll try looking into it if it's something small that I could put into this PR and if not I'll file separate issue.

@rzikm
Copy link
Member

rzikm commented Jan 17, 2024

But I think we should care 🤔 I don't think it's ok to throw if dispose and/or close are called in parallel.

I think we should never throw from Dispose(Async)

@ManickaP ManickaP force-pushed the mapichov/quic-vts-cancellation branch from e56c45d to 7438127 Compare January 18, 2024 10:10
@ManickaP
Copy link
Member Author

Radek's problem fixed and added test for that.
I have reverted changes in ValueTaskSource and replaced the _shutdownTcs with TaskCompletionSource as Radek's use-case uncovered that we await the same task from 2 points (CloseAsync and DisposeAsync) which is OK if done serially but not in parallel. The easiest way out was to swap it for ordinary Task.

@ManickaP ManickaP requested a review from CarnaViire January 18, 2024 10:14
@@ -467,7 +467,7 @@ public ValueTask CloseAsync(long errorCode, CancellationToken cancellationToken
{
ObjectDisposedException.ThrowIf(_disposed == 1, this);

if (_shutdownTcs.TryInitialize(out ValueTask valueTask, this, cancellationToken))
if (Interlocked.CompareExchange(ref _shutdownTcs, new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously), null) == null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are creating an instance here every time regardless whether the exchange happened or not... is that expected?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not bother with optimizing it. It'll be at most 2 instances (in Gen0) in vain per connection lifetime. I can change it to use another field as a sentinel (e.g. int shutdownCalled) or to preallocate this in other field for a swap. Or just not guard against multiple Shutdown calls 🤷

Copy link
Member

@CarnaViire CarnaViire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ManickaP
Copy link
Member Author

Just FYI, using ordinary TCS brought back an issue with GC eating non-rooted QuicConnection while waiting on SHUTDOWN_COMPLETE, so I'll be re-thinking this once again. I'll ping you for re-review when the fix is up.

@ManickaP ManickaP force-pushed the mapichov/quic-vts-cancellation branch from e196f02 to 85cd4eb Compare January 26, 2024 11:16
@ManickaP
Copy link
Member Author

ManickaP commented Jan 26, 2024

Failure is unrelated and known: #97546

@ManickaP ManickaP merged commit 2f96e88 into dotnet:main Jan 26, 2024
109 of 111 checks passed
@ManickaP ManickaP deleted the mapichov/quic-vts-cancellation branch January 26, 2024 13:37
@github-actions github-actions bot locked and limited conversation to collaborators Feb 26, 2024
@karelz karelz added this to the 9.0.0 milestone May 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

QuicConnection.DisposeAsync throws OperationCanceledException
4 participants