Subprocess.wait_for_exit never resolves if process terminated before it is called #3364

Open
meffmadd opened this issue Feb 29, 2024 · 3 comments

@meffmadd

The wait_for_exit method eventually calls os.waitpid, which raises a ChildProcessError if no child process with the specified pid exists. This exception is caught and the function returns silently, so the Future returned by wait_for_exit never resolves.

import asyncio
from tornado.process import Subprocess

async def f():
    p = Subprocess("ls")
    # Reap the child directly, before Tornado gets a chance to call os.waitpid.
    p.proc.wait()
    # os.waitpid now raises ChildProcessError internally, so this never resolves.
    return await p.wait_for_exit()

if __name__ == '__main__':
    asyncio.run(f())

Instead, the process could be retrieved from the _waiting dict and the return code could be accessed from the object directly.
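
A minimal sketch of that idea, written outside Tornado so it can be run on its own. `reap_or_fallback` is a hypothetical helper, not part of Tornado's API, and the sketch assumes a POSIX system and Python 3.9+ (for `os.waitstatus_to_exitcode`). It only illustrates that when `os.waitpid` raises `ChildProcessError` because the child was already reaped through `Popen.wait()`, the `Popen` object still holds the return code:

```python
import os
import subprocess
import sys


def reap_or_fallback(proc):
    """Hypothetical helper: return the child's exit code, or None if unknown.

    Mirrors the non-blocking os.waitpid call described above, but falls back
    to the code cached on the Popen object when the child is already reaped.
    """
    try:
        ret_pid, status = os.waitpid(proc.pid, os.WNOHANG)
    except ChildProcessError:
        # The child was reaped elsewhere. If that happened via proc.wait() or
        # proc.poll(), Popen has cached the code; otherwise this is None.
        return proc.returncode
    if ret_pid == 0:
        # Child is still running.
        return None
    # Negative values mean "killed by signal", matching Popen.returncode.
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "raise SystemExit(3)"])
    child.wait()  # reaps the child before reap_or_fallback runs
    print(reap_or_fallback(child))
```

The fallback only helps when the child was reaped through the same `Popen` object. If some other code called `os.wait` directly, `proc.returncode` is still `None` and the exit status is lost.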

@bdarnell
Member

bdarnell commented Mar 3, 2024

As long as Tornado is the only thing that touches the SIGCHLD handler, the child process is guaranteed to exist (in a zombie state) until os.waitpid (or another wait function) is called on it once. Could something else be installing a SIGCHLD handler, ignoring SIGCHLD, or calling os.wait directly? (In this example it seems wrong to use both p.proc.wait() and p.wait_for_exit(), but if you've seen it in the wild, I doubt it's that simple.)
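
The "reaped exactly once" behavior can be seen with the standard library alone. This is a small sketch, assuming a POSIX system, Python 3.9+, and that SIGCHLD is not being ignored in the process; `reap_twice` is a name made up for the illustration:

```python
import os
import sys


def reap_twice():
    """Spawn a trivial child, reap it, then try to reap it a second time."""
    # P_NOWAIT returns the child's pid without waiting for it.
    pid = os.spawnv(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "pass"])

    # First wait: blocks until the child exits, then removes the zombie.
    _, status = os.waitpid(pid, 0)
    exit_code = os.waitstatus_to_exitcode(status)

    # Second wait: the pid is no longer a child of this process.
    try:
        os.waitpid(pid, 0)
    except ChildProcessError:
        return exit_code, True
    return exit_code, False


if __name__ == "__main__":
    print(reap_twice())
```

Whichever caller waits first gets the status; every later wait on the same pid fails, which is the situation the report describes.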

@meffmadd
Author

meffmadd commented Mar 4, 2024

Ok, interesting! As far as I understand, nothing else is happening. We were creating the subprocess (calling git init) and immediately called wait_for_exit on the following line.

However, we used the Subprocess class outside the tornado process (in a Celery worker instance). Could this have anything to do with it? We have now switched to Popen and wait, which works as intended. The docs don't mention anything about this.

@bdarnell
Member

> However, we used the Subprocess class outside the tornado process (in a Celery worker instance).

I'm confused (maybe because I've never used Celery): there's still a Tornado IOLoop (or asyncio event loop) running in the worker process, right? If not, you wouldn't be able to await p.wait_for_exit().

Celery does touch the SIGCLD handler (SIGCLD is an alias for SIGCHLD on Linux), although I'm not sure if it's possible for this to be run in a way that would clobber Tornado's SIGCHLD handler. It doesn't appear to call asyncio's set_child_watcher function, but if something in the stack does, it could also cause this problem.
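
One way to check whether something in the worker has changed the SIGCHLD disposition is to inspect it from inside the process. The sketch below uses only the standard `signal` module; `describe_sigchld_handler` is a hypothetical helper name, and what it reports for a healthy Tornado or asyncio setup depends on the versions in use, so treat the output as a hint and not a verdict:

```python
import signal


def describe_sigchld_handler():
    """Return a short description of the current SIGCHLD disposition."""
    handler = signal.getsignal(signal.SIGCHLD)
    if handler == signal.SIG_DFL:
        return "default"
    if handler == signal.SIG_IGN:
        # Ignoring SIGCHLD makes the kernel reap children automatically,
        # so a later os.waitpid raises ChildProcessError.
        return "ignored"
    if handler is None:
        # The handler was installed by non-Python code.
        return "installed outside Python"
    return f"python handler: {handler!r}"


if __name__ == "__main__":
    print(describe_sigchld_handler())
```

Calling this right before creating the Subprocess and again right before wait_for_exit would show whether the handler changed in between.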
