Failing future_result nondeterminism violates exclusive_unwrap_conflict's assumptions #4689
Comments
Alright, I don't suppose you'd happen to have any logs of the times it failed? I haven't been able to reproduce the test failing (i.e., the test function terminating successfully). Looking over the unwrap implementation I think it's basically impossible for the
so I think it is safe to say that exactly one of the tasks will hit the
What's more, the test case itself has the parent thread (the one the test harness will block on) block on its child using a future_result. Either the parent will hit the die above (in which case the test case will surely pass), or the child will hit it, in which case the child's failure should link to the parent through the parent's
So far I've experimented with adding delays into various parts of the unwrap implementation to see which paths this test actually exercises (the state space is not very large, so it is not a Research Problem :P ), but have not seen the bug in any yet. Will continue the search another day.
EDIT: reproduction successful. (I'd forgotten that there was a difference in failure semantics between when the tasks are in the main/root taskgroup and when they're running as a #[test]! Argh.) More later.
Would you believe this is the minimized test case for this bug? No unwrap involved at all.
Depending on the sleep, the test will either work or not (i.e., the problem is that when a failing task sends to its future_result before the parent even starts to receive on it, the linked failure doesn't take effect). I'm not sure if this is an intended nondeterminism of
Anyway, one possible way to fix this is to change the last line of the test case
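As a loose analogy in present-day Rust, the ordering described above can be sketched with std::thread and std::sync::mpsc. This is not the 2013 task and future_result API under discussion, and modern Rust has no linked failure; the function name child_sends_then_fails and its delay parameter are made up for illustration. The child sends its result and then fails, and the parent starts receiving either right away or only after the child is already dead. In this sketch the message is delivered in both orderings, and the failure is only visible through a separate path (the join handle), never through the channel itself.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper (not from the original test case): spawns a child
/// that sends a value and then panics. `parent_delay_ms` decides whether the
/// parent starts receiving before or after the child has sent and died.
/// Returns (value received, whether the child panicked).
fn child_sends_then_fails(parent_delay_ms: u64) -> (Option<i32>, bool) {
    let (tx, rx) = mpsc::channel();
    let child = thread::spawn(move || {
        // The send succeeds because the parent still holds the receiver.
        tx.send(42).unwrap();
        panic!("child fails after sending");
    });

    // Stand-in for the sleep in the discussion above.
    thread::sleep(Duration::from_millis(parent_delay_ms));

    // A buffered message is still delivered after the sender is gone.
    let received = rx.recv().ok();
    // The failure itself is only observable by joining the child.
    let child_panicked = child.join().is_err();
    (received, child_panicked)
}

fn main() {
    for delay in [0, 50] {
        let (received, panicked) = child_sends_then_fails(delay);
        println!("delay={delay}ms received={received:?} child_panicked={panicked}");
    }
}
```

The point of the comparison is only that a channel carries values, not failure: whether the parent notices the child dying depends on some other mechanism, which in the 2013 runtime was linked failure and here is an explicit join.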
OK, what's going on here is linked failure doesn't propagate through a
As another example, consider this program, whose main task always succeeds. "Fixing" the nondeterminism above would make this program sometimes succeed (if the parent recvs before the child sends) and sometimes fail (if the child sends and fails before the parent recvs).
Introducing the extra killed check wouldn't require taking a lock (see #3213). But, I've always punted on the semantics of when a linked failure killing takes effect, and adding this check feels like a patchwork special-case rather than a principled semantics change. @eholk might have input on the matter?
Adding the
@bblum Thanks for looking into this. I understand how
This is similar to a problem I frequently run into where I expect test cases like this to work (by failing):
Yeah, those are annoying cases. It might be nice to have a spawn interface that, while letting the parent go off and do its own thing, also makes the parent block on the child (on all such children spawned this way) before actually task-exiting. This could be done with
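For comparison, present-day Rust's std::thread::scope behaves much like the interface wished for here: the parent keeps doing its own work while children run, the scope does not return until every child spawned in it has finished, and a child's panic is re-raised in the parent. A minimal sketch, assuming modern std rather than the 2013 task API; run_children is a hypothetical name, and catch_unwind is used only so the result can be inspected instead of aborting the example.

```rust
use std::panic;
use std::thread;

/// Hypothetical helper: runs one child inside a scope while the parent does
/// unrelated work. Returns the parent's result, or Err(()) if the child
/// panicked.
fn run_children(child_should_fail: bool) -> Result<u32, ()> {
    panic::catch_unwind(|| {
        thread::scope(|s| {
            s.spawn(move || {
                if child_should_fail {
                    panic!("child failed");
                }
            });
            // The parent goes off and does its own thing here. Once this
            // closure returns, the scope blocks until every spawned child
            // has finished, and panics if any of them panicked.
            (1..=10u32).sum::<u32>()
        })
    })
    .map_err(|_| ())
}

fn main() {
    println!("{:?}", run_children(false)); // prints Ok(55)
    println!("{:?}", run_children(true)); // prints Err(())
}
```

With this shape, a test whose child fails cannot finish "successfully" just because the parent reached the end first, which is the kind of case described in the comment above.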
* Backport PR rust-lang#4730 that fixes issue rust-lang#4689
* Test files for each Version One and Two
* Simplify per review comment - use defer and matches!
* Changes per reviewer comments for reducing indentations
In core, private::tests::exclusive_unwrap_conflict fails maybe a few times per month. Linux x86_64. I have no further information.