remove "unhandled task failure" message printing #27722
So I guess the flipside is that we should just make it easy for users to launch tasks with wrappers around them; e.g. there should be a |
Or do something like trio: #6283 (comment). Basic premise: tasks must form a tree and all child tasks must terminate before their parent task does. There's an escape hatch, however, where you can spawn a task that outlives the current task by spawning it explicitly into a "nursery" which then becomes responsible for that task. This way errors can propagate up the task tree in a well-defined manner and there's always some task that's responsible for it. Note this comment from the trio docs:
Yes, that's exactly the property we want. It would also mean that cancelling tasks would make sense without chaos: when you cancel a task it cancels all its child tasks; it doesn't matter how many subtasks some task spawned in order to do its work, if you cancel a task they're all neatly killed.
+1. I think we need a better approach to launching tasks, as Stefan suggested above, before we remove this.
In considering this, I'm not sure if we'd need a "nursery" object type since we have first-class tasks: if you want to spawn a task that outlives the current task, you could be required to explicitly give it a parent task which is responsible for it (gets its unhandled exceptions, cancels it if cancelled). All you really need is a task tree which exceptions propagate up and cancellations propagate down. With normal blocking APIs, the parent is the caller; with some APIs you can give an explicit parent task.
I believe the way Trio ensures exceptions are handled is roughly that it requires all tasks to be spawned inside the equivalent of our Put differently, the only way Trio argues against this change is if it has a SUPPRESS_EXCEPTION_PRINTING flag, and that flag has generally been considered a good API. |
My understanding is that if a child task dies, the error propagates into the parent. If the parent handles it and continues, all is well (the child is still dead, but everything else continues). If the parent doesn't handle the child's exception, then it too dies, which kills off all its children (i.e. the sibling tasks of the original child task), and the error propagates further up to the grandparent of the original child. If no task ever handles the exception, it eventually reaches the root of the task tree and the entire process terminates with the child's exception as the cause. Hence the claim "Trio never discards exceptions": either some task handles an exception or it kills the process.
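To make the propagation rules above concrete, here is a minimal sketch in Python using only the standard-library asyncio module. It is not Trio's implementation and not a proposal for the Julia API; `run_children`, `worker`, `failing_child`, `parent` and `root` are hypothetical names invented for this illustration. The assumption is that a parent may only return once all of its children have ended, that a child failure cancels the siblings and is re-raised in the parent, and that cancelling the parent cancels its children.

```python
import asyncio


async def run_children(*coros):
    """Run coros as child tasks; return only once every child has ended."""
    children = [asyncio.ensure_future(c) for c in coros]
    if not children:
        return []
    try:
        # Wake up when all children finish or as soon as one of them fails.
        _, pending = await asyncio.wait(
            children, return_when=asyncio.FIRST_EXCEPTION
        )
    except asyncio.CancelledError:
        # The parent itself was cancelled: cancellation propagates down.
        for child in children:
            child.cancel()
        await asyncio.gather(*children, return_exceptions=True)
        raise

    # Either everything finished or a child failed: stop surviving siblings.
    for child in pending:
        child.cancel()
    await asyncio.gather(*pending, return_exceptions=True)

    errors = [
        c.exception()
        for c in children
        if not c.cancelled() and c.exception() is not None
    ]
    if errors:
        raise errors[0]  # the failure propagates up into the parent
    return [c.result() for c in children]


async def worker(name, log):
    """A long-running child that records when it gets cancelled."""
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        log.append(name + " cancelled")
        raise


async def failing_child():
    await asyncio.sleep(0.01)
    raise RuntimeError("child failed")


async def parent(log):
    # The parent does not handle the child's error, so it dies as well.
    await run_children(worker("sibling", log), failing_child())


async def root(log):
    # The error reaches this level and cancels the parent's own sibling.
    await run_children(worker("uncle", log), parent(log))


if __name__ == "__main__":
    events = []
    try:
        asyncio.run(root(events))
    except RuntimeError as exc:
        print("root saw:", exc)
    print(events)
```

Note that in this sketch nothing is printed when the child dies. The error only surfaces where some task fails to handle it, which at the root means the program terminates with it.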
I assume that happens when the parent task exits? I.e. when a task exits, it waits on all of its child tasks, unless they were explicitly "detached" somehow. That might be ok. But it still doesn't argue for printing an exception as soon as a task dies before anybody has called
I'm not really arguing against this PR but I do think that we should do something so that every exception must be handled. It's true that the trio approach does not entail a failed task printing its error as soon as it ends, it just has a comparable effect because if no one catches the error then the root task prints its error and quits.
I'm not 100% clear on that. The parent may be "blocked" until all of its children return which is a somewhat different model than what we've had. But I could be wrong.
I'm fine with something like this in theory. But the whole problem is defining what it means for nobody to catch the error --- how long do you wait for somebody to catch the error? And Trio does provide an escape hatch:
I'm not sure I fully understand this, but I believe it means you can have a "rogue" nursery sitting around with possibly-failed tasks in it, waiting (possibly forever) for somebody to look at it.
Does anybody consider this a breaking API change? Seems like a borderline case.
Triage seems to be ok with this. We can try to do something better in the future, possibly handling it in the gc.
:,( The end of a useful feature that has helped find numerous bugs and broken tests. I will impatiently await the day when we get that something better in the future!
Are you ok with printing on gc? Or anything that's not a race condition? The race condition is really what I object to most here.
I think printing on GC seems like a good compromise.
Yes, I agree that it is nice to remove the race condition, although since it only happens when the code fails, it seems like it would be rare to observe in practice. Handling it in the GC (or at least, spawning a task to do the printing) seems good.
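For comparison, Python's asyncio already takes the "report on GC" route: a failed task whose exception nobody retrieved is reported through the event loop's exception handler only when the task object is destroyed. The sketch below illustrates that behaviour under CPython; `failing` and `run_demo` are hypothetical names, and the custom handler just collects the report instead of logging it. It says nothing about how Julia's GC would do this.

```python
import asyncio
import gc


async def failing():
    raise ValueError("boom")


async def run_demo():
    reports = []
    loop = asyncio.get_running_loop()
    # Collect reports ourselves instead of letting the default handler log them.
    loop.set_exception_handler(
        lambda _loop, context: reports.append(context["message"])
    )

    task = asyncio.ensure_future(failing())
    await asyncio.sleep(0)  # give the task a chance to run and fail
    reported_while_alive = len(reports)

    # Nobody waited on the task. Drop the last reference and collect;
    # only now does the unretrieved failure get reported.
    del task
    gc.collect()
    return reported_while_alive, reports


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

The point is that the report cannot race with code that is still about to wait on the task: as long as somebody holds a reference and might handle the error, nothing is reported.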
This is a repeat of #12736; I'm going to propose this again.
task_done_hook entirely. It's bad because every task has to pay for everything in that hook whether it needs the feature or not. Instead there can be just one mechanism (waiting in a task's completion queue) you can use to do whatever you need when a task exits. I plan to remove TASKDONE_HOOKS as well in a future PR.
SUPPRESS_EXCEPTION_PRINTING flag sometimes is just silly. No such thing should ever exist.
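As a rough illustration of "one mechanism, paid for only by those who use it", here is a sketch in Python's asyncio, where the closest analogue to waiting in a task's completion queue is registering a done callback on a single task. `watch` and `on_failure` are hypothetical names for this example only; nothing here reflects the actual Julia internals.

```python
import asyncio


def watch(task, on_failure):
    """Opt in to failure reporting for one task; other tasks pay nothing."""

    def _done(finished):
        if finished.cancelled():
            return
        exc = finished.exception()
        if exc is not None:
            on_failure(finished, exc)

    task.add_done_callback(_done)
    return task


async def demo():
    seen = []

    async def bad():
        raise KeyError("missing")

    async def good():
        return 1

    # Only the first task is watched; the second runs with no extra work.
    t1 = watch(
        asyncio.ensure_future(bad()),
        lambda _task, exc: seen.append(type(exc).__name__),
    )
    t2 = asyncio.ensure_future(good())
    await asyncio.wait([t1, t2])
    await asyncio.sleep(0)  # let any remaining callbacks run
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A wrapper like this is also one way to launch tasks with reporting attached at the call site, rather than having every task run a global hook.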