Runtime finalization assumes all other threads have exited. #80657
Comments
Among the first 3 things that happen in Py_FinalizeEx() are, in order:
At that point the only remaining Python threads are:
The next time any of those threads (aside from main) acquires the GIL, we expect that it will exit via a call to PyThread_exit_thread() (caveat: issue bpo-36475). However, we have no guarantee on when that will happen, if ever. Such lingering threads can cause problems, including crashes and deadlocks (see issue bpo-36469). I don't know what else we can do, beyond what we're already doing. Any ideas?
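For illustration, a minimal sketch (the function name and timings are just for the example) of the kind of lingering daemon thread described above; whether it hangs, exits silently, or crashes depends on the Python version and on what the thread touches once finalization has started:

```python
import threading
import time

def worker():
    # Pure-Python busy loop: the thread periodically drops and reacquires
    # the GIL, so once finalization starts, its next attempt to reacquire
    # the GIL is where the runtime has to stop it.
    while True:
        sum(range(1000))

threading.Thread(target=worker, daemon=True).start()

time.sleep(0.1)
# The script ends here; Py_FinalizeEx() runs while the daemon thread is
# still executing.
```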
FYI, I've opened bpo-36477 to deal with the subinterpreters case.
Adding to the list:
Lingering threads during/after runtime finalization continue to be a problem. I'm going to use this issue as the focal point for efforts to resolve this. Related issues:
Analysis by @pconnell:
tl;dr daemon threads and external C-API access during/after runtime finalization are causing crashes.
To put it another way (from bpo-33608#msg358748):
Adding to that list:
So I see 3 things to address here:
Possible solutions (starting point for discussion):
Regarding daemon threads, the docs already say "Daemon threads are abruptly stopped at shutdown." [1] So let's force them to stop. Can we do that? If we *can* simply kill the threads, can we do so without leaking resources?

Regardless, the mechanism we currently use (check for finalizing each(?) time through the eval loop) mostly works fine. The problem is when C code called from Python in a daemon thread blocks long enough that it makes C-API calls (or even re-enters the eval loop) *after* we've started cleaning up the runtime state. So if there were a way to interrupt that blocking code, that would probably be good enough.

The other two possible solutions are, I suppose, a bit more drastic. What are the alternatives?

[1] https://docs.python.org/3/library/threading.html#thread-objects
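As a concrete sketch of that blocking pattern (assuming only stdlib names; the timings are illustrative): a daemon thread sitting in a long C-level call releases the GIL, the interpreter finalizes while it is blocked, and when the call returns the thread has to reacquire the GIL and make C-API calls against a runtime that is being (or has been) torn down:

```python
import threading
import time

def worker():
    # time.sleep() blocks in C with the GIL released.  If it returns only
    # after finalization has begun, the thread re-enters the C API against
    # runtime state that is already being cleaned up.
    time.sleep(5)
    print("woke up after finalization started")

threading.Thread(target=worker, daemon=True).start()
time.sleep(0.1)
# Script ends; finalization starts while the worker is still blocked in C.
```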
I think the answer is to document a bit more clearly that they can pose all kinds of problems. Perhaps we could even display a visible warning when people create daemon threads.
We could run the "join non-daemon threads" routine a *second time* after atexit handlers have been called. It probably can't hurt (unless people do silly things?).
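A small sketch of the ordering gap that a second pass would close, assuming (as the suggestion above implies) that the normal "join non-daemon threads" step has already run by the time atexit handlers fire:

```python
import atexit
import threading
import time

def late_work():
    time.sleep(1)
    print("late thread finished")  # may race with interpreter teardown

@atexit.register
def spawn_late_thread():
    # This non-daemon thread is started after the join pass, so nothing
    # currently waits for it; a second join pass after atexit would.
    threading.Thread(target=late_work).start()
```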
This one I don't know how to handle. By construction, a non-Python thread can do anything it wants, and we cannot add guards against this at the beginning of each C API function. I think that when someone calls the C API, we're clearly in the realm of "consenting adults".
Perhaps we need a threading.throw() API, similar to the one we have for generators and coroutines? If we had that, then Py_FinalizeEx() could gain a few new features:
Adding that would require an enhancement to the PendingCall machinery, though, since most pending calls are only processed in the main thread (there's no way to route them to specific child threads).

A simpler alternative would be to have an atomic "terminate_threads" counter in the ceval runtime state that gets incremented to 1 to request that SystemExit be raised in daemon threads, and then to 2 to request that SystemExit be raised in all still-running threads. When a thread received that request to exit, it would set a new flag in its thread state to indicate it was terminating, and then raise SystemExit. (The thread running Py_FinalizeEx would set that flag in advance so it wasn't affected, and other threads would use it to ensure they only raised SystemExit once.) The runtime cost of this would just be another _Py_atomic_load_relaxed call in the eval_breaker branch. (Probably inside
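Not the mechanism proposed above, but for reference the nearest existing primitive: PyThreadState_SetAsyncExc() schedules an exception in another thread, and it is only delivered the next time that thread executes Python bytecode, so a thread blocked in a C call is never interrupted. A rough sketch of driving it from Python via ctypes (helper name is made up for the example):

```python
import ctypes
import threading

def raise_system_exit_in(thread: threading.Thread) -> None:
    """Ask the interpreter to raise SystemExit in another thread."""
    tid = thread.ident
    if tid is None:
        raise ValueError("thread has not been started")
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(tid), ctypes.py_object(SystemExit))
    if res == 0:
        raise ValueError("invalid thread id")
    if res > 1:
        # More than one thread state was affected; revoke the request.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_ulong(tid), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")
```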
Thinking about that idea further, I don't think that change would help much, since the relevant operations should already be checking for thread termination when they attempt to reacquire the GIL. That means what we're missing is:
Related to the above-linked issues and ongoing thoughts about how horrible daemon threads are and wishing they didn't exist... I was pondering whether we could inject the equivalent of an uncatchable exception (meaning even a bare except: would not handle it) into all daemon and non-Python threads via their tstate, so that their code would unceremoniously exit no matter what. That's as close to a SIGKILL for a thread as we can cleanly get. Perhaps this only makes sense to do after Nick's idea of a SystemExit in the threads and waiting a little while.
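One reason a plain (catchable) SystemExit is not enough, and why something uncatchable is attractive: a broad exception handler in the target thread simply swallows the injected exception and keeps running. A minimal illustration:

```python
import time

def stubborn_worker():
    while True:
        try:
            time.sleep(1)
        except BaseException:
            # An injected SystemExit lands here and is swallowed, so the
            # thread never unwinds its stack or exits.
            continue
```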
You mean in addition to the trick we do in ceval_gil.c, with
I was thinking of whether we could do something before we get to the point of destroying the thread states - the only safe exit for a thread is for it to actually return all the way up its call stack on its own.