deadlocked child process after forking on pystate.c's head_mutex #74580
Comments
A forked process (via os.fork) can inherit a locked
Child Process (deadlocked):
#0 0x00007f1a4da82e3c in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f1a4c2964e0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
The parent process has a race between one thread calling
The path from PyGILState_Ensure -> head_mutex looks like this:
#0 new_threadstate (interp=0x7fb5fd483d80, init=init@entry=1) at Python/pystate.c:183
----
Possible fix? A simple fix would be to, inside PyOS_AfterFork, reset/unlock pystate.c's head_mutex if it's already locked.
Unclear if this is related to: https://bugs.python.org/issue28812 |
You cannot safely use Python's os.fork() in a process that has threads, because of POSIX. The CPython interpreter is by definition not async-signal-safe, so no Python code may safely be executed after the fork() system call if the process contained _any_ threads at the time of that call. The only reliable recommendation: *Never use os.fork()* (if your process _might ever_ contain any threads, now or in the future, started by you or any of your dependencies). The closest thing to a fix is a bunch of band-aids: set up atfork functions to clear the state of known locks in the forked child. But even that can't guarantee anything, because it can't know about all locks. The C standard library can contain locks (malloc(), for example). Any C/C++ extension module can contain locks of its own. The problem isn't solvable within CPython. Adding a clear of pystate.c's head_mutex after forking makes sense. That may even get you further. But you will encounter other problems of the same nature in the future. Related: https://bugs.python.org/issue6721 and https://bugs.python.org/issue16500 |
Thanks to everyone jumping in. I need no convincing that mixing forks and threads isn't just a problem but a problem factory. Given that the rest of this code seems to try to avoid similar deadlocks with similar mutexes, I figured we'd want to include this mutex to make a best-effort at being safe here. That made it worth reporting. To be sure, I still believe that the application code that led us here needs deeper fixes to address the fork/thread problems. |
+ head_mutex = NULL; Shouldn't we free memory here? |
Would PyThread_free_lock (effectively sem_destroy()) work without (additional) problems? |
Gregory P. Smith added the comment: Would PyThread_free_lock (effectively sem_destroy()) work without If I recall correctly, no, you can get issues if the lock is still |
Another alternative *might* be to check whether the lock is locked (non-blocking acquire?) and release it if so, under the normal assumption that we are the only thread running immediately post-fork(). I'm not sure that can be guaranteed reliable, given that other C code could have used pthread_atfork to register an "after" fork handler that spawns threads. But that should be rare, and nothing here can really fix the underlying "programmer has mixed fork and threads" issue, only ameliorate it. But I'm not sure this post-fork memory leak is really a problem, other than potentially showing up in memory sanitizer runs involving forked children. The scenario where it would be an actual leak is a process that does serial fork() calls with the parent(s) dying, rather than forking new children from a common parent. That would grow the leak, as each child would have an additional lock allocated (a few bytes). I don't _believe_ that kind of process pattern is common (but people do everything). |
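To make the "non-blocking acquire, then release" alternative concrete, here is a rough Python-level sketch. The real change would be in C against the PyThread lock API; `force_release` is a hypothetical helper name, and it relies on the stated assumption that no other thread is running when it is called.

```python
import threading


def force_release(lock):
    """Leave `lock` unlocked whether or not it was held on entry.

    Only safe under the assumption that the caller is the sole running
    thread, e.g. immediately after fork() in the child.
    """
    # Succeeds only if the lock was free; the result is ignored on purpose.
    lock.acquire(blocking=False)
    # Either we just took the lock, or it was inherited in a locked state.
    # A plain threading.Lock may be released by any thread, so this unlocks
    # it in both cases.
    lock.release()
```

Compared with replacing the lock, this reuses the existing allocation, so nothing leaks, but it depends on the lock's internal state still being consistent after the fork.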
If you'd like to fix the minuscule leak this introduces, feel free, but I don't think it's worth the additional complexity. Closing this for now, as it solved an issue for us internally and we haven't observed any memory-related issues due to it. |