Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[QUIC] Fix native crashes and heap corruption #74611

Closed
wants to merge 2 commits into from

Conversation

CarnaViire
Copy link
Member

There were races between disposing SendBuffers and MsQuic handles and using them from parallel threads. Most prominent in stress tests in configuration -cancelRate 1 -ops 9 (POST Duplex Dispose).

After the change, I didn't see any crashes in stress in a couple of hours, will keep them running in the meantime.

Change has no impact on perf.

Fixes #72696

@CarnaViire CarnaViire requested a review from a team August 25, 2022 22:12
@ghost ghost assigned CarnaViire Aug 25, 2022
@ghost
Copy link

ghost commented Aug 25, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

There were races between disposing SendBuffers and MsQuic handles and using them from parallel threads. Most prominent in stress tests in configuration -cancelRate 1 -ops 9 (POST Duplex Dispose).

After the change, I didn't see any crashes in stress in a couple of hours, will keep them running in the meantime.

Change has no impact on perf.

Fixes #72696

Author: CarnaViire
Assignees: -
Labels:

area-System.Net.Http

Milestone: -

}
}

public void SafeCall(Action<MsQuicSafeHandle> call)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally should not be need. runtime would automatically grab reference on pinvoke if we pass in safe handle. .... But we don't. It seems like we lost that when switching to generated API. This is all hidden as we pass the safe handle around everywhere and then we don't use it when we should and we pass IntPtr to quic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not thrilled about the fact that this leads to closure allocation per call in many cases (StreamSend, EnableReceive, ....). I think we should consider (possibly in follow-up) either

  • writing wrappers for MsQuicApi which does this for us (i.e. duplicating what P/Invoke does)
  • look for a way to have the source generator generate them for us.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with both comments, and I believe we should look for a way to solve this in generated API, otherwise it is not sustainable. If we would be able to do that in time, it would be ideal.

Copy link
Member

@rzikm rzikm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for now, we can follow up with interop improvements in .NET 8.0

@@ -90,6 +90,7 @@ public sealed partial class QuicStream
}
};
private MsQuicBuffers _sendBuffers = new MsQuicBuffers();
private object _sendBuffersLock = new object();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline; it's either this or make MsQuicBuffers class instead of struct

@CarnaViire
Copy link
Member Author

CarnaViire commented Aug 26, 2022

There's a branch with a different approach to safe calls via "generated-like" interop; discussed with @ManickaP that it will be her choice with which solution to proceed
main...CarnaViire:runtime:quic-crash-fix-v2

UPD: opened alternative PR with that branch: #74669

@ghost
Copy link

ghost commented Aug 26, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

There were races between disposing SendBuffers and MsQuic handles and using them from parallel threads. Most prominent in stress tests in configuration -cancelRate 1 -ops 9 (POST Duplex Dispose).

After the change, I didn't see any crashes in stress in a couple of hours, will keep them running in the meantime.

Change has no impact on perf.

Fixes #72696

Author: CarnaViire
Assignees: CarnaViire
Labels:

area-System.Net.Quic

Milestone: 8.0.0

ManickaP pushed a commit to CarnaViire/runtime that referenced this pull request Aug 31, 2022
CarnaViire added a commit that referenced this pull request Sep 6, 2022
…terop (#74669)

* Send buffers and handles crash fixes

* Add generated-like interop

* Apply PR feedback from #74611

* Change asserts

* Feedback + moved native methods to their own file

* PR feedback

Co-authored-by: ManickaP <mapichov@microsoft.com>
@CarnaViire
Copy link
Member Author

Closing in favor of #74669

@CarnaViire CarnaViire closed this Sep 6, 2022
CarnaViire added a commit to CarnaViire/runtime that referenced this pull request Sep 7, 2022
…terop (dotnet#74669)

* Send buffers and handles crash fixes

* Add generated-like interop

* Apply PR feedback from dotnet#74611

* Change asserts

* Feedback + moved native methods to their own file

* PR feedback

Co-authored-by: ManickaP <mapichov@microsoft.com>
carlossanlop pushed a commit that referenced this pull request Sep 8, 2022
…terop (#74669) (#75192)

* Send buffers and handles crash fixes

* Add generated-like interop

* Apply PR feedback from #74611

* Change asserts

* Feedback + moved native methods to their own file

* PR feedback

Co-authored-by: ManickaP <mapichov@microsoft.com>

Co-authored-by: ManickaP <mapichov@microsoft.com>
@ghost ghost locked as resolved and limited conversation to collaborators Oct 6, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[HTTP/3] SIGABRT in stress tests
5 participants