Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docker signal #522

Closed
wants to merge 14 commits into from
Closed

Docker signal #522

wants to merge 14 commits into from

Conversation

euri10
Copy link
Member

@euri10 euri10 commented Dec 11, 2019

Attempt to fix #364 by keeping a list of child pids and killing them gracefully

The test added is ugly for several reasons, it has cache issues, it sleeps, well it's not good imho and served more as a debug tool, I kept it just in case you want to try but would remove it for sure.
Should you want me to keep it it would need some more work, both in travis and in the test. So it runs locally but will make travis fail, let me know if you want me to add it for real.

with the change a container using the --reload flag is gracefully stopped with ctrl+c
same for the --worker flag

@tomchristie
Copy link
Member

Ok. Let’s start by getting this PR without the test, and can review from there.

@euri10
Copy link
Member Author

euri10 commented Dec 12, 2019

maybe a test could be like this for instance @tomchristie

def test_multiprocess_run2():
    config = Config(app=None, workers=2)
    supervisor = Multiprocess(config, target=run, sockets=[])
    supervisor.startup()
    os.kill(supervisor.pid, signal.SIGINT)
    supervisor.shutdown()

@euri10
Copy link
Member Author

euri10 commented Dec 12, 2019

we could also make use of a more detailed signal_handler like this

    def signal_handler(self, sig, frame):
        """
        A signal handler that is registered with the parent process.
        """
        logger.setLevel(logging.DEBUG)
        for child_pid in self.child_pids:
            logger.debug(f"Attempt at killing child PID {child_pid}")
            try:
                os.kill(child_pid, signal.SIGINT)
                (pid, status) = os.waitpid(child_pid, 0)
                if pid == child_pid:
                    logger.debug(f"{pid}: {status}")
                    if os.WIFEXITED(status):
                        logger.debug("process returning status exited via the exit() system call")
                    elif os.WIFSIGNALED(status):
                        logger.debug("process returning status was terminated by a signal")
                    elif os.WIFSTOPPED(status):
                        logger.debug("process returning status was stopped")
            except Exception as e:
                logger.error(f"Cant kill child PID {child_pid}: {e}")
        self.should_exit.set()

@gnat
Copy link

gnat commented Dec 20, 2019

LGTM with or without. Excited to use this in mainline.

for child_pid in self.child_pids:
try:
os.kill(child_pid, signal.SIGINT)
finished = os.waitpid(child_pid, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we don't need the finished = part of this, right? (Since we're not using it anywhere?)

Copy link
Member Author

@euri10 euri10 Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's been a while but I think we need it, see this comment in the blog post that inspired me to tackle this (https://technology.amis.nl/2019/06/06/graceful-shutdown-of-forked-workers-in-python-and-javascript-running-in-docker-containers/):

Do not forget to wait until the worker is finished with finished = os.waitpid(worker_pid, 0) or else the master might be finished before the worker causing the worker to be killed in a not so graceful matter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or even better we could replace it with what's above #522 (comment) which would help debugging corner cases

Copy link
Member

@florimondmanca florimondmanca Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Tom meant we don't need to store the result in a variable, since we don't do anything with it — os.waitpid(child_pid, 0) should be enough.

Copy link
Member

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

uvicorn/supervisors/multiprocess.py Outdated Show resolved Hide resolved
os.kill(child_pid, signal.SIGINT)
finished = os.waitpid(child_pid, 0)
except Exception as e:
logger.error(f"Cant kill child PID {child_pid}: {e}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance this logic can be factored out to a utility, eg shutdown_children(pids)? Or maybe even a ProcessTracker helper class that both supervisors would use. Suggesting this because the implementation here and in the multi process supervisor are basically duplicates.

Copy link
Member Author

@euri10 euri10 Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it, still pass the tests on the docker containers, not sure about the super() calls etc, it always hurts my brain

edit: left 2 versions of signal_handler just in case, will remove the one not chosen

Copy link
Member

@florimondmanca florimondmanca Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I'm not sure about the refactor. In fact, I actually don't think we need any other abstraction than a shutdown_subprocess() helper.

Examining the existing source code, I also think we don't even need a new child_pids list — multiprocess has the list of processes, and reloaders store the reload process, so we can use those directly.

Here's the diff patch I ended up with working through this locally, WDYT?

diff --git a/uvicorn/subprocess.py b/uvicorn/subprocess.py
index 735a513..f583db2 100644
--- a/uvicorn/subprocess.py
+++ b/uvicorn/subprocess.py
@@ -5,6 +5,7 @@ starting child processes.
 import multiprocessing
 import os
+import signal
 import sys
 
 multiprocessing.allow_connection_pickling()
 spawn = multiprocessing.get_context("spawn")
@@ -59,3 +60,13 @@ def subprocess_started(config, target, sockets, stdin_fileno):
 
     # Now we can call into `Server.run(sockets=sockets)`
     target(sockets=sockets)
+
+
+def shutdown_subprocess(pid):
+    """
+    Helper to attempt cleanly shutting down a subprocess. May fail with an exception.
+
+    * pid - Process identifier.
+    """
+    os.kill(pid, signal.SIGINT)
+    os.waitpid(pid, 0)
diff --git a/uvicorn/supervisors/basereload.py b/uvicorn/supervisors/basereload.py
index 7dbe4dd..21b82c0 100644
--- a/uvicorn/supervisors/basereload.py
+++ b/uvicorn/supervisors/basereload.py
@@ -5,7 +5,7 @@ import threading
 
 import click
 
-from uvicorn.subprocess import get_subprocess
+from uvicorn.subprocess import get_subprocess, shutdown_subprocess
 
 HANDLED_SIGNALS = (
     signal.SIGINT,  # Unix signal 2. Sent by Ctrl+C.
@@ -27,6 +27,11 @@ class BaseReload:
         """
         A signal handler that is registered with the parent process.
         """
+        try:
+            shutdown_subprocess(self.process.pid)
+        except Exception as exc:
+            logger.error(f"Could not stop reload process {self.process.pid}: {exc}")
+
         self.should_exit.set()
 
     def run(self):
diff --git a/uvicorn/supervisors/multiprocess.py b/uvicorn/supervisors/multiprocess.py
index 94f7238..2dce1bc 100644
--- a/uvicorn/supervisors/multiprocess.py
+++ b/uvicorn/supervisors/multiprocess.py
@@ -5,7 +5,7 @@ import threading
 
 import click
 
-from uvicorn.subprocess import get_subprocess
+from uvicorn.subprocess import get_subprocess, shutdown_subprocess
 
 HANDLED_SIGNALS = (
     signal.SIGINT,  # Unix signal 2. Sent by Ctrl+C.
@@ -28,6 +28,12 @@ class Multiprocess:
         """
         A signal handler that is registered with the parent process.
         """
+        for process in self.processes:
+            try:
+                shutdown_subprocess(process.pid)
+            except Exception as exc:
+                logger.error(f"Could not stop child process {process.pid}: {exc}")
+
         self.should_exit.set()
 
     def run(self):

You can copy this and git apply file.patch to get the changes. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry @florimondmanca but I really think maintaing child pids is key in the above issue
I'll rebase off master and put a rather annoying to run test that wont pass with the above patch but succeeds with changes I propose

@florimondmanca
Copy link
Member

florimondmanca commented Apr 9, 2020

@euri10 There are merge conflicts due to the recent merging of #482 (adds BaseReloader), maybe that's why CI didn't trigger. My patch in #522 (comment) was made from master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

docker-compose graceful shutdown of uvicorn --reload
4 participants