Graceful shutdown for Nginx #1257
@maxlaverse I am not sure we should add this to the interface. We cannot assume the implementations are running a process.
// TODO: If we keep the modified Controller interface
// should this be moved as a generic timeout control for the backend to stop
// on time ?
timer := time.NewTimer(60 * time.Second)
I don't think we can rely on a timer: if the user sets terminationGracePeriodSeconds: 5, this will not work.
You're right. The value should then be configurable.
While running more tests this morning I realized that Nginx already has a timeout that kicks in for that scenario: worker_shutdown_timeout.
So this timer might not be needed at all.
@maxlaverse also please check https://github.com/kubernetes/ingress/blob/master/controllers/nginx/pkg/cmd/controller/nginx.go#L179
@maxlaverse please test quay.io/aledbf/nginx-ingress-controller:0.198
Coverage decreased (-0.2%) to 43.734% when pulling f8c4d6373506effa1f242e18cfef627725092861 on maxlaverse:graceful_shutdown into 7844415 on kubernetes:master.
Coverage decreased (-0.2%) to 43.709% when pulling f8c4d6373506effa1f242e18cfef627725092861 on maxlaverse:graceful_shutdown into 7844415 on kubernetes:master.
@aledbf I changed the description of this PR to explain how I test it. I pushed some changes. I had a look at your pull request and made some comments. It's not working because of the way we detect that Nginx has stopped after we gracefully ask it to. Checking whether it's answering is probably not an option, as it's no longer accepting incoming connections. I'm not sure what the best way is to disable the automatic restart process and get notified that Nginx has been stopped, unless we find another way to detect that Nginx is properly shut down.
We can use https://github.com/mitchellh/go-ps to check if there's a nginx process running.
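To make the suggestion concrete, here is a minimal sketch of polling the process list until no nginx process is left, with an upper bound on the wait. It is not the code from this PR. The names processLister, isRunning, waitForExit and errStillRunning are hypothetical, and the process listing is abstracted behind a function so the sketch stays stdlib-only. In the controller that function could wrap ps.Processes() from go-ps and collect each Process.Executable().

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// processLister returns the executable names of the running processes.
// Hypothetical abstraction: in the controller it could wrap ps.Processes()
// from github.com/mitchellh/go-ps and collect each Process.Executable().
type processLister func() ([]string, error)

// errStillRunning is returned when the process outlives the timeout.
var errStillRunning = errors.New("process still running after timeout")

// isRunning reports whether a process with the given executable name exists.
func isRunning(list processLister, name string) (bool, error) {
	names, err := list()
	if err != nil {
		return false, err
	}
	for _, n := range names {
		if n == name {
			return true, nil
		}
	}
	return false, nil
}

// waitForExit polls until no process called name is left or timeout elapses.
func waitForExit(list processLister, name string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		running, err := isRunning(list, name)
		if err != nil {
			return err
		}
		if !running {
			return nil
		}
		if time.Now().After(deadline) {
			return errStillRunning
		}
		time.Sleep(interval)
	}
}

func main() {
	// Fake lister for demonstration: nginx disappears on the third poll.
	polls := 0
	fake := func() ([]string, error) {
		polls++
		if polls < 3 {
			return []string{"nginx", "nginx-ingress-controller"}, nil
		}
		return []string{"nginx-ingress-controller"}, nil
	}
	err := waitForExit(fake, "nginx", 10*time.Millisecond, time.Second)
	fmt.Println("nginx stopped:", err == nil)
}
```

The timeout would have to stay below the pod's terminationGracePeriodSeconds, otherwise the kubelet kills the container before the wait returns.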
Merging. I will fix the dependencies issue in another PR.
@maxlaverse thanks!
Issue
Nginx is not gracefully shut down when an Nginx Ingress Controller container is stopped, which leads to broken connections. It looks like the SIGTERM sent to the controller to stop it is simply forwarded to the Nginx server.
The problem is that SIGTERM means "Exit as quick as you can" for Nginx (see the Nginx doc). For a graceful shutdown we should send a SIGQUIT instead.
A solution to this issue is to prevent the Nginx Ingress Controller from forwarding every signal it receives to Nginx, which lets us control what is sent to the Nginx process. When shutting down, the controller would then send a SIGQUIT to Nginx. Nginx itself will wait 10 seconds (worker_shutdown_timeout) for the workers to shut down.
Testing
You need a backend exposed on the Ingress that responds slowly. I used one with a 10-second response time. Trigger an Ingress deployment and call this slow URL. If the slow URL gets an answer before Nginx is torn down, call it again.
We expect this request to be answered. You should see in the Ingress logs that the controller does not exit immediately but waits for the request to finish.
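For anyone who wants to reproduce the test, a slow backend along these lines should be enough. This is only a sketch, not the backend that was actually used: the /slow path, port 8080 and the slowHandler name are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

// slowHandler returns a handler that waits for delay before answering.
func slowHandler(delay time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(delay):
			fmt.Fprintln(w, "done")
		case <-r.Context().Done():
			// The client (or the proxy in front) went away before the delay elapsed.
			log.Printf("request aborted while waiting: %v", r.Context().Err())
		}
	})
}

func main() {
	// Assumed path and port; adjust them to match the Ingress rule under test.
	http.Handle("/slow", slowHandler(10*time.Second))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

An "aborted" log line on the backend while the controller is shutting down is a hint that the connection was cut instead of being drained.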
Additionally, you can strace the Nginx master process to make sure that SIGQUIT is being sent instead of SIGTERM.