crossbeam::channel::Receiver::try_recv can block forever if sending thread is blocked #997
Comments
Also note that a similar block on the sender happens in the bounded channel. Stack trace of sender:
Stack trace of receiver:
FWIW, blocking behavior was detected by a user on wasm32 builds (using I tracked it down to
We've seen this happen as well on Linux (it hangs for ~10 seconds) using In our case the sending thread was the same as the receiver thread, but at an earlier point (I am not sure why the code was written that way tbh...).
@rc-andres was your case using a similar thread priority setup?
See rust-lang/rust#112723 (comment), I think it's unlikely that
Running on the #1105 branch I no longer see any long
When I run the reproducer (https://github.com/benhansen-io/mpsc_deadlock_reproducer) on that branch I no longer see the issue! Thanks.
Using unbounded channels, try_recv can block indefinitely if the sender is stopped at a particular point. The std mpsc channel exhibits the same behavior, and in the issue there (rust-lang/rust#112723) I was asked to verify that the issue also exists in crossbeam (it does). I'll update the backtraces and code for crossbeam below.
Running the following code (using the crossbeam feature):
Based on the documentation of try_recv:
I would not expect try_recv to block but I see output like the following:
During a period of deadlock I get the following backtraces:
Sending thread:
Receiving thread:
try_recv calls read, which calls wait_write, so try_recv ends up waiting on the sender. That seems fundamentally wrong.
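To make the call chain above concrete, here is a toy model of the interaction, written against std only. It is a sketch of the general pattern (the sender reserves a slot, then writes it; the receiver sees the reservation and spins until the write lands), not crossbeam's actual code. `ToySlot`, `toy_send`, `toy_try_recv`, and `demo` are hypothetical names, and the `thread::sleep` stands in for the sender being descheduled between the two steps.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Toy single-slot "channel" (hypothetical; not crossbeam's data structure).
struct ToySlot {
    reserved: AtomicUsize, // set by the sender before the value is written
    written: AtomicBool,   // set by the sender after the value is written
    value: AtomicUsize,
}

/// Sender: reserve the slot, stall for `stall`, then publish the value.
fn toy_send(slot: &ToySlot, v: usize, stall: Duration) {
    slot.reserved.store(1, Ordering::SeqCst);
    thread::sleep(stall); // assumption: models the sender losing the CPU here
    slot.value.store(v, Ordering::SeqCst);
    slot.written.store(true, Ordering::SeqCst);
}

/// Receiver: returns None only when nothing is reserved. Once a slot is
/// reserved, it spins until the sender finishes the write.
fn toy_try_recv(slot: &ToySlot) -> Option<usize> {
    if slot.reserved.load(Ordering::SeqCst) == 0 {
        return None;
    }
    while !slot.written.load(Ordering::SeqCst) {
        thread::yield_now(); // plays the role of wait_write in this model
    }
    Some(slot.value.load(Ordering::SeqCst))
}

/// Runs the scenario and reports the value and how long toy_try_recv took.
fn demo(stall: Duration) -> (Option<usize>, Duration) {
    let slot = Arc::new(ToySlot {
        reserved: AtomicUsize::new(0),
        written: AtomicBool::new(false),
        value: AtomicUsize::new(0),
    });
    let s = Arc::clone(&slot);
    let sender = thread::spawn(move || toy_send(&s, 42, stall));

    // Wait until the sender has reserved the slot but not yet written it.
    while slot.reserved.load(Ordering::SeqCst) == 0 {
        thread::yield_now();
    }

    let start = Instant::now();
    let got = toy_try_recv(&slot);
    let elapsed = start.elapsed();
    sender.join().unwrap();
    (got, elapsed)
}

fn main() {
    let (got, elapsed) = demo(Duration::from_millis(200));
    println!("toy_try_recv returned {:?} after {:?}", got, elapsed);
}
```

In this model the "non-blocking" receive takes roughly as long as the sender stays stalled. If the receiver has strictly higher priority and the scheduler never runs the sender, the spin loop has no way to finish.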
The reproducer code above was run on Linux. Full crate code is available at https://github.com/benhansen-io/mpsc_deadlock_reproducer
The issue was originally discovered on a real-time OS where the receiver has a higher priority than the sender (which makes sense for that application). On Linux, the sending thread eventually gets a time slice, so the stall is not permanent, but on the real-time OS the higher-priority receiver keeps the sender from ever running, so the blocking lasts forever.
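For anyone who wants to check whether their platform is affected, one option is to time every try_recv call and track the worst case. The sketch below uses std::sync::mpsc, which shows the same behavior according to the linked Rust issue. It is not the reproducer from the repository above, and `timed_try_recv` and `poll_and_measure` are hypothetical helper names.

```rust
use std::sync::mpsc::{channel, Receiver, TryRecvError};
use std::thread;
use std::time::{Duration, Instant};

/// Calls try_recv once and reports how long the call itself took.
fn timed_try_recv<T>(rx: &Receiver<T>) -> (Result<T, TryRecvError>, Duration) {
    let start = Instant::now();
    let result = rx.try_recv();
    (result, start.elapsed())
}

/// Polls until all senders are gone. Returns the number of messages
/// received and the longest single try_recv call observed.
fn poll_and_measure(rx: &Receiver<u64>) -> (u64, Duration) {
    let mut count = 0u64;
    let mut worst = Duration::ZERO;
    loop {
        let (result, took) = timed_try_recv(rx);
        worst = worst.max(took);
        match result {
            Ok(_) => count += 1,
            Err(TryRecvError::Empty) => thread::yield_now(),
            Err(TryRecvError::Disconnected) => break,
        }
    }
    (count, worst)
}

fn main() {
    let (tx, rx) = channel::<u64>();
    // The sender is dropped when the closure ends, which ends the poll loop.
    let sender = thread::spawn(move || {
        for i in 0..100_000 {
            tx.send(i).unwrap();
        }
    });
    let (count, worst) = poll_and_measure(&rx);
    sender.join().unwrap();
    println!("received {} messages, slowest try_recv: {:?}", count, worst);
}
```

A worst-case figure far above the typical call time would suggest the receiver waited on a stalled sender. On a lightly loaded Linux machine the stall may be rare, so a single clean run does not rule the problem out.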