[.NET 5 regression] Socket.AsyncOperation gets processed too many times #45673
Comments
Tagging subscribers to this area: @dotnet/ncl
Issue Details
@geoffkizer commented in #37974 (comment):
This should have been set to false when it gets called in HandleEvents to avoid processing an AsyncOperation multiple times.
runtime/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncContext.Unix.cs Lines 2091 to 2114 in 1d9e50c
Calling Process multiple times can lead to weird issues. When the first call leads to completing the operation, the instance may be reused for another operation by the time Process gets called the second time.
cc @karelz @antonfirsov @stephentoub @dotnet/ncl
Triage: We will run it by shiproom for backporting. While not super bad, in certain rare situations it can cause quite a lot of headaches and servicing issues. We should check the TechEmpower benchmarks (just in case, given the perf impact).
@karelz Can you point me to the servicing template? I can't remember where to find it.

Details for shiproom:

When we get epoll notifications for a particular socket, we can get either a read notification, a write notification, or both. Each notification will cause us to perform the IO and then invoke the user callback.

Prior to PR #37974, when we got both notifications, we would dispatch one notification to the thread pool so that both user callbacks can be processed concurrently. Unfortunately, #37974 inadvertently broke this behavior, and instead resulted in the notifications being processed in sequence. This means that the second IO operation and callback won't be invoked until the first callback completes, which could in theory take arbitrarily long. This can lead to unexpected behavior and, at worst, deadlocks. It's probably not that common in practice, but it would be extremely hard to diagnose if it was hit.

The fix is straightforward -- we simply revert to the previous correct behavior here, and isolate the changes from #37974 so they do not affect the common path.

I believe we should take this fix for 5.0.
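To make the sequencing problem above concrete, here is a minimal sketch in Python rather than the runtime's C#. It is not the actual SocketAsyncContext code: handle_events, run_scenario, and the choice of which callback goes to the pool are all hypothetical. It models a single notification that carries both a read and a write event, and a user read callback that waits for the write callback to run.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_events(read_cb, write_cb, pool=None):
    """Process one notification that signals both read and write readiness.

    pool=None models the regressed behavior: both callbacks run inline,
    one after the other. With a pool, the write callback is dispatched to
    it so both callbacks can make progress concurrently. Which of the two
    is dispatched is an assumption of this sketch.
    """
    future = pool.submit(write_cb) if pool is not None else None
    read_cb()
    if future is None:
        write_cb()       # only reached after read_cb has returned
    else:
        future.result()  # wait for the pool thread to finish


def run_scenario(pool=None, timeout=0.2):
    """Return True if the read callback observed the write callback."""
    write_done = threading.Event()
    result = {}

    def read_cb():
        # User code that blocks until the write side has completed.
        # The timeout stands in for what would otherwise be a deadlock.
        result["read_saw_write"] = write_done.wait(timeout)

    def write_cb():
        write_done.set()

    handle_events(read_cb, write_cb, pool)
    return result["read_saw_write"]


if __name__ == "__main__":
    print("sequential:", run_scenario())
    with ThreadPoolExecutor(max_workers=1) as pool:
        print("concurrent:", run_scenario(pool))
```

In the sequential case the read callback can never see the write complete, because the write callback is queued behind it; without the timeout it would block forever. With the pool, the write callback runs on another thread and the wait succeeds.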
A recent example: #46598.
Customer Impact

When we get epoll notifications for a particular socket, we can get either a read notification, a write notification, or both. Each notification will cause us to perform the IO and then invoke the user callback.

Prior to PR #37974, when we got both notifications, we would dispatch one notification to the thread pool so that both user callbacks can be processed concurrently. Unfortunately, #37974 inadvertently broke this behavior, and instead resulted in the notifications being processed in sequence. This means that the second IO operation and callback won't be invoked until the first callback completes, which could in theory take arbitrarily long. This can lead to unexpected behavior and, at worst, deadlocks. It's probably not that common in practice, but it would be extremely hard to diagnose if it was hit.

Regression?

Yes, caused by #37974 in 5.0.

Testing

It's not really possible to write a test for this because it requires epoll signalling both the read and write notification in the same notification, which is timing dependent and very difficult to make happen consistently.

Risk

Low. The fix is to basically just revert to the previous correct behavior here, and isolate the changes from #37974 so they do not affect the common path.
I opened #46745 to push this forward. We need a few approvals before the shiproom meeting, and we need to run TechEmpower.
@geoffkizer commented in #37974 (comment):
This should have been set to false when it gets called in HandleEvents to avoid processing an AsyncOperation multiple times.
runtime/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncContext.Unix.cs
Lines 2091 to 2114 in 1d9e50c
Calling Process multiple times can lead to weird issues. When the first call leads to completing the operation, the instance may be reused for another operation by the time Process gets called the second time.
cc @karelz @antonfirsov @stephentoub @dotnet/ncl
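The reuse hazard described above can be sketched as follows. This is an illustration in Python, not the runtime's C# implementation: the AsyncOperation class here is a hypothetical stand-in with made-up start and process methods, and it only models the idea that an operation instance is recycled once it completes.

```python
class AsyncOperation:
    """Hypothetical stand-in for a reusable (pooled) operation object."""

    def __init__(self):
        self.callback = None
        self.completed = False

    def start(self, callback):
        # Reusing the instance resets its state for a new operation.
        self.callback = callback
        self.completed = False

    def process(self):
        # Pretend the IO succeeded and complete whatever operation
        # currently occupies this instance.
        if self.completed:
            return False
        self.completed = True
        self.callback()
        return True


def demo():
    log = []
    op = AsyncOperation()

    op.start(lambda: log.append("first completed"))
    op.process()  # intended call: completes the first operation

    # The owner reuses the instance as soon as the first operation is done.
    op.start(lambda: log.append("second completed"))

    # Stray second call that was meant for the first operation. It now
    # acts on the second operation, which nobody asked to process yet.
    op.process()
    return log


if __name__ == "__main__":
    print(demo())
```

Note that the per-instance completed flag does not protect against the stray call, because reuse resets it. In this sketch the duplicate call has to be prevented at the call site.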