Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close listening socket immediately after stop signal received #922

Closed
tahajahangir opened this issue Oct 25, 2014 · 14 comments
Closed

Close listening socket immediately after stop signal received #922

tahajahangir opened this issue Oct 25, 2014 · 14 comments

Comments

@tahajahangir
Copy link

When stopping gunicorn, server should wait for current requests to complete, but this may take several seconds (even 30 seconds, with the default timeout). Clients connecting in this interval, will wait forever, and will receive connection-reset error after server successfully stopped.

When using two or more gunicorn servers behind a reverse proxy (like nginx), this behavior leads to unreasonable raise in total upstream time (normally, real clients won't wait for 30 seconds, and will close tab, or hit refresh).

This also reduces impact of #901 (locked connections). Also related to #908

@tilgovi
Copy link
Collaborator

tilgovi commented Oct 27, 2014

Short upstream timeouts reduce the impact, but is not ideal.

As @benoitc has pointed out, we cannot simply close the socket on any graceful shutdown because the worker might be shutting down after a new master is spawned with USR2.

@tahajahangir
Copy link
Author

Normally, spawning (forking) new master should be before stopping current master. After fork, it's totally safe to close listening socket. The socket will remain open in the forked child. (sample code)

@tilgovi
Copy link
Collaborator

tilgovi commented Oct 27, 2014

True enough. I'll try to dig up the original conversation, in case there's
something else we've forgotten.

Thanks, tahajahangir

Do you want to try writing a patch? I'm happy to review. Otherwise, I can
maybe look next week.
On Oct 26, 2014 11:37 PM, "Taha Jahangir" notifications@github.com wrote:

Normally, spawning (forking) new master should be before stopping current
master. After fork, it's totally safe to close listening socket. The socket
will remain open in the forked child. (sample code
http://pastebin.com/g0ewbLux)


Reply to this email directly or view it on GitHub
#922 (comment).

@tahajahangir
Copy link
Author

Unfortunately, I'm not familiar with gunicorn structure, additionally we moved to uwsgi in production, and I don't have enough free time to dig into gunicorn.

@benoitc
Copy link
Owner

benoitc commented Nov 23, 2014

So the socket will indeed remains in the forked child but we should make sure to kill the socket in the parent only on final exit (ie when the the master is finally asked to exit) during USR2. I will work on such patch asap.

@pikeas
Copy link

pikeas commented May 18, 2015

Was this ever fixed?

@benoitc
Copy link
Owner

benoitc commented May 18, 2015

I din't tested it yet. thanks for the head-up it's on the todo for today
now!

On Mon, May 18, 2015 at 9:55 AM Aris Pikeas notifications@github.com
wrote:

Was this ever fixed?


Reply to this email directly or view it on GitHub
#922 (comment).

@benoitc benoitc self-assigned this May 18, 2015
@mleventi
Copy link

+1 on getting this in

@benoitc benoitc modified the milestone: R19.4 Nov 23, 2015
@tilgovi tilgovi added this to the 20.0.0 milestone Dec 31, 2015
@benoitc
Copy link
Owner

benoitc commented Dec 31, 2015

Re-reading this issue, I think we are missing some details, was the server closed using SIGQUIT or SIGTERM? It's totally expected to not close immediately the socket on SIGTERM while SIGQUIT close it immediately. What did I miss?

@tilgovi
Copy link
Collaborator

tilgovi commented Dec 31, 2015

We're not closing the socket immediately which I think means that a
downstream proxy will keep trying to pass connections to gunicorn. I think
the request here is that during shutdown we close the listening socket
quickly.

On Thu, Dec 31, 2015, 02:16 Benoit Chesneau notifications@github.com
wrote:

Re-reading this issue, I think we are missing some details, was the server
closed using SIGQUIT or SIGTERM? It's totally expected to not close
immediately the socket on SIGTERM while SIGQUIT close it immediately. What
did I miss?


Reply to this email directly or view it on GitHub
#922 (comment).

tilgovi added a commit that referenced this issue Mar 13, 2016
Close all the listeners when the arbiter shuts down. By doing so,
workers can close the socket at the beginning of a graceful shut
down thereby informing the operating system that the socket can
be cleaned up. With this change, graceful exits with such workers
will refuse new connections while draining, allowing load balancers
to respond more quickly and avoiding leaving connections dangling
in the listen backlog, unaccepted.

Ref #922
tilgovi added a commit that referenced this issue Mar 13, 2016
tilgovi added a commit that referenced this issue Mar 13, 2016
Close all the listeners when the arbiter shuts down. By doing so,
workers can close the socket at the beginning of a graceful shut
down thereby informing the operating system that the socket can
be cleaned up. With this change, graceful exits with such workers
will refuse new connections while draining, allowing load balancers
to respond more quickly and avoiding leaving connections dangling
in the listen backlog, unaccepted.

Ref #922
tilgovi added a commit that referenced this issue Mar 13, 2016
tilgovi added a commit that referenced this issue Mar 13, 2016
tilgovi added a commit that referenced this issue Mar 13, 2016
@tilgovi
Copy link
Collaborator

tilgovi commented Mar 14, 2016

@tilgovi tilgovi assigned tilgovi and unassigned benoitc Mar 14, 2016
tilgovi added a commit that referenced this issue Mar 21, 2016
The run loop has to change slightly to support graceful shutdown.
There is no way to interrupt a call to `futures.wait` so instead
the pattern, used by the async workers, is to sleep for only one
second at the most. The poll is extended to a one second timeout
to match.

Since threads are preemptively scheduled, it's possible that the
listener is closed when the request is actually handled. For this
reason it is necessary to slightly refactor the TConn class to store
the listening socket name. The name is checked once at the start of
the worker run loop.

Ref #922
@tilgovi
Copy link
Collaborator

tilgovi commented Mar 21, 2016

I believe that this can be marked complete as soon as the two PRs I have open for eventlet and the threaded worker are merged.

@benoitc
Copy link
Owner

benoitc commented Mar 22, 2016

@tilgovi 👍

@tilgovi
Copy link
Collaborator

tilgovi commented Mar 25, 2016

Closing as all my open PRs have been merged. Opened #1236 to track future work arising from this discussion and related PRs.

@tilgovi tilgovi closed this as completed Mar 25, 2016
mjjbell pushed a commit to mjjbell/gunicorn that referenced this issue Mar 16, 2018
Close all the listeners when the arbiter shuts down. By doing so,
workers can close the socket at the beginning of a graceful shut
down thereby informing the operating system that the socket can
be cleaned up. With this change, graceful exits with such workers
will refuse new connections while draining, allowing load balancers
to respond more quickly and avoiding leaving connections dangling
in the listen backlog, unaccepted.

Ref benoitc#922
mjjbell pushed a commit to mjjbell/gunicorn that referenced this issue Mar 16, 2018
mjjbell pushed a commit to mjjbell/gunicorn that referenced this issue Mar 16, 2018
The run loop has to change slightly to support graceful shutdown.
There is no way to interrupt a call to `futures.wait` so instead
the pattern, used by the async workers, is to sleep for only one
second at the most. The poll is extended to a one second timeout
to match.

Since threads are preemptively scheduled, it's possible that the
listener is closed when the request is actually handled. For this
reason it is necessary to slightly refactor the TConn class to store
the listening socket name. The name is checked once at the start of
the worker run loop.

Ref benoitc#922
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants