
Ability to send Unix Signals to processes Nomad is running #817

Closed

phinze opened this issue Feb 18, 2016 · 29 comments
@phinze
Contributor

phinze commented Feb 18, 2016

Many of our services respond to Unix Signals (SIGHUP, SIGUSR1, etc.) to change their behavior. The most important one is to allow a service to "drain" (stop accepting new work, complete current in-flight work) before it is fully shut down.

The question is how this will work once Nomad is running our services. How can we send signals or achieve the equivalent behavior?

Happy to provide more detail and discuss as needed. 👍

@diptanu
Contributor

diptanu commented Feb 18, 2016

@phinze We can do this for our exec based drivers. Docker doesn't have any API AFAIK to send signals to a running pid inside a container.

@c4milo
Contributor

c4milo commented Feb 18, 2016

@diptanu
Contributor

diptanu commented Feb 18, 2016

@c4milo The kill API you referenced assumes that the user wants to kill the container by sending a signal, and the call waits until the container exits.

The use case for this ticket is that a user might want to send an arbitrary signal to a pid asynchronously. So the kill API won't work in this case.

@lord2800

@diptanu it looks to me, from the docs, like you can send arbitrary signals via the signal query parameter?

@skozin

skozin commented Feb 18, 2016

@diptanu, Docker makes no assumption that a signal will stop the process inside a container. We use the Docker kill API extensively to send signals to processes.

@diptanu
Contributor

diptanu commented Feb 18, 2016

@skozin @lord2800 Yes, just re-read the docs. It looks like the Docker kill API waits for the container to exit only when the SIGTERM signal is passed. So we should be able to use this API to pass an arbitrary signal to the pid inside a Docker container.
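
For reference, the Docker remote API takes the signal as a query parameter on the kill endpoint (a signal name or number), so sending a non-fatal signal is a single HTTP call, along these lines (container name illustrative):

    POST /containers/{id}/kill?signal=SIGHUP

    # e.g. against the local daemon socket:
    curl --unix-socket /var/run/docker.sock -X POST \
      "http://localhost/containers/nginx/kill?signal=SIGHUP"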

@dadgar
Contributor

dadgar commented Feb 19, 2016

@phinze Is there any reason you can't put an HTTP API on top of the services? It is nice to limit the surface area of the scheduler to just what is required. We already send a soft kill (SIGTERM) before SIGKILL to let you do cleanup work, and the kill timeout between these two is configurable.

@phinze
Contributor Author

phinze commented Feb 19, 2016

@dadgar Definitely understand the desire to keep the scheduler simple!

So I'm coming from the opposite direction: trying to minimize the number of per-service changes I have to make as I lift an existing microservices architecture into Nomad. The idea of sweeping through two dozen services to switch behavior driven by upstart and Unix signals over to an HTTP API doesn't sound super appealing at first glance. 😀

We might be able to make things work with the TERM/KILL window, though. (We have some very long-running jobs (6-12 hours) that need to drain from some of our services.)

From where I sit, the interrupt-driven use cases of pausing/resuming a service, immediate config reload, and other behavior triggered by Unix signals seem core enough to warrant inclusion in Nomad. But happy to discuss further!

@skozin

skozin commented Feb 19, 2016

@dadgar, I agree with @phinze here. The ability to send signals to a task process would be a really useful addition to Nomad.

For example, NGINX, like a lot of other widely used software, uses signals for a number of essential actions: HUP for zero-downtime config reloading, QUIT and WINCH for graceful shutdown, USR2 for upgrading the executable, etc. And this is not configurable.

One could certainly put some kind of HTTP API on top of NGINX by running a coprocess that listens for HTTP requests and communicates with NGINX using signals, but that would require non-trivial effort.

@diptanu
Contributor

diptanu commented Feb 19, 2016

@skozin So I think in the world of cluster schedulers some of the use cases you have described change.
For example, if you want to upgrade the NGINX binary, you will probably deploy a new Docker container, or change the artifact source of your exec-driver-based task and do a rolling upgrade. So the need to send USR2 to upgrade the executable goes away.

On the topic of config reloading: if you use something like consul-template or any other co-process that re-generates the NGINX config, I would imagine it is the co-process, and not the operator, that sends the signal to the NGINX pid to reload the config.

I don't disagree that sending signals can sometimes be handy, but I agree with @dadgar that in an environment where services run on cluster schedulers, the need to send signals to processes diminishes.

@steve-jansen
Contributor

@diptanu we're attempting to co-schedule a task group with two docker tasks: a consul-template container and an nginx container. I'm curious as to your statement:

I would imagine that the co-process is going to send a signal to the Nginx pid to reload the config and not the operator.

It seems a bit tricky in this scenario for the consul-template container to send a signal to the nginx container.

Might a proper Nomad HTTP API for sending signals simplify this problem?

Fictional example: Nomad injects metadata (env var) for a unique endpoint to POST a signal to specific sibling tasks in the group. My consul-template task might then curl -X POST -d SIGHUP ${NOMAD_signal_frontend} to signal nginx in the "frontend" task to reload.
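
As an aside for later readers: Nomad's template block ended up covering this pattern directly, rendering the template inside the task and delivering a signal on change, with no sibling consul-template container required. A minimal sketch, with paths and signal illustrative:

    task "frontend" {
      driver = "docker"

      template {
        source        = "local/nginx.conf.tpl"
        destination   = "local/nginx.conf"
        change_mode   = "signal"
        change_signal = "SIGHUP"  # sent to the task whenever the template re-renders
      }
    }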

@JensRantil

@dadgar, you wrote

kill timeout between these two is configurable.

Could you maybe post a reference to this? I couldn't find anything in https://www.nomadproject.io/docs/drivers/docker.html.

@iverberk
Contributor

iverberk commented May 29, 2016

@JensRantil I think you can add a kill_timeout parameter on the task object. Docs can be found here: https://www.nomadproject.io/docs/jobspec/index.html#kill_timeout. It is not docker specific.
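
For completeness, a minimal sketch of what that looks like in a job file (duration illustrative):

    task "worker" {
      driver = "exec"

      # How long Nomad waits between the soft kill signal and SIGKILL.
      kill_timeout = "90s"
    }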

@maruina

maruina commented Jun 7, 2016

@diptanu, you wrote

We can do this for our exec based drivers

Any news on that? I can't find any reference in the documentation.

@dadgar
Contributor

dadgar commented Jun 11, 2016

@maruina I think he meant that in the abstract. We haven't done this because it is not only driver-specific but also operating-system-specific. It requires more thought as to whether we want to support this.

@ashald

ashald commented Feb 9, 2017

We would be happy to be able to send signals to jobs/groups/individual tasks (via an HTTP API, as described in one of the comments above) as well!

@stefreak

stefreak commented Mar 6, 2017

I'd propose adding a kill_signal parameter, analogous to the template update signal.

The background is that different signals lead to different exit behaviour; in my case, for example, I want to send the gitlab-ci runner SIGQUIT instead of SIGINT: https://gitlab.com/gitlab-org/gitlab-ci-multi-runner/blob/master/docs/commands/README.md#signals
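
(As later comments note, kill_signal did land as a task-level parameter. A sketch for the GitLab runner case, values illustrative:)

    task "gitlab-runner" {
      driver = "exec"

      kill_signal  = "SIGQUIT"  # sent first, instead of the default SIGINT
      kill_timeout = "5m"       # grace period before SIGKILL
    }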

@dadgar
Contributor

dadgar commented Mar 6, 2017

@stefreak #1755

@schmichael
Member

Closing. Nomad v0.9.2 added the nomad alloc signal ... command and corresponding API via #5515.

Feel free to open a new issue if there are use cases we didn't cover. Thanks and sorry for the delay in closing this issue!
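
Usage is roughly as follows, with the signal selected via -s (allocation ID and task name illustrative):

    # Send SIGHUP to the "frontend" task of allocation 8a3cf7f6
    nomad alloc signal -s SIGHUP 8a3cf7f6 frontend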

@rmlsun

rmlsun commented Jul 10, 2020

@schmichael @dadgar Sorry to comment on a closed ticket, but I have a use case that would really benefit from this capability.

The use case is a runtime controlled by the Nomad client that benefits from a chance to shut down cleanly when its machine is being decommissioned. Specifically, it's a Kafka broker process run as a Nomad job on an AWS EC2 machine (Amazon Linux 2, with the Nomad client run as a systemd unit).

When the EC2 machine gets killed without Nomad being in the loop (say, because of degraded hardware, AWS scheduled maintenance, etc.), the Nomad job-level kill_signal and Nomad's migration feature cannot help. Such cases would benefit from a clean, controlled shutdown process initiated by the Nomad client on the affected machine.

We can use an ExecStop hook in the Nomad client systemd unit to initiate such a shutdown process. But without the ability to send a Unix signal as the trigger (I understand this is asking for a *nix-specific feature that does not fit well in the Windows world), we'd have to resort to the node-draining API. To enable that, we'd have to grant Nomad worker nodes the ability to obtain a Nomad token with the "node:write" ACL. If we turned that into a generic clean-shutdown config for all worker nodes, any Nomad node could drain/purge/toggle the eligibility of any other node in the same cluster, which sounds like an unnecessary escalation of privilege IMHO.

On the other hand, with the ability to send the Nomad client a signal as a trigger, each node can only trigger such draining for itself.

I understand not every system NEEDS such clean-shutdown capability (I'm a fan of crash-only systems, in fact). But our production Kafka on Nomad would really benefit from such local node draining for unexpected machine-shutdown cases.

@tgross
Member

tgross commented Jul 13, 2020

@rmlsun you might be able to cover this case with the stop_after_client_disconnect stanza and an appropriately configured kill signal. If that won't do the trick, please feel free to open a new issue. Thanks!
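
A sketch of that combination; stop_after_client_disconnect sits at the group level, and the durations and signal here are illustrative:

    group "kafka" {
      # If this client cannot reach the servers for 5 minutes, stop the
      # allocation locally; the task still receives its kill_signal.
      stop_after_client_disconnect = "5m"

      task "broker" {
        driver       = "exec"
        kill_signal  = "SIGTERM"  # the broker's clean-shutdown signal
        kill_timeout = "2m"
      }
    }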

@schmichael
Member

@rmlsun Hm, what you're describing sounds like #2052 ?

Would a client.drain_shutdown = true agent configuration parameter fit your use case? The idea being that when the nomad client received the signal to shutdown it would block exiting until it had drained all running allocations?

If so please leave a comment over on that issue. If not please file a new issue like @tgross said.

@rmlsun

rmlsun commented Jul 13, 2020

@rmlsun you might be able to cover this case with the stop_after_client_disconnect stanza and an appropriately configured kill signal. If that won't do the trick, please feel free to open a new issue. Thanks!

@tgross thanks for the pointer. In most of our cases we want the exact opposite of the stop_after_client_disconnect behavior: leave the task runtimes alone instead of taking them down. We did quite a bit of destructive testing of Nomad and were very happy to observe that task runtimes continue to run in the face of Nomad server failures, network partitions between Nomad servers and clients, etc. To that end, I really like what @schmichael wrote in #2052:

A guiding principle in Nomad's design is in the face of errors: do not stop user services! Nomad downtime should prevent further scheduling, but it should avoid causing service downtime as much as possible.

@rmlsun

rmlsun commented Jul 13, 2020

@schmichael thanks for the pointer. Yes, I think that kind of configurable Nomad client shutdown behavior would be helpful in this particular case:

Would a client.drain_shutdown = true agent configuration parameter fit your use case? The idea being that when the nomad client received the signal to shutdown it would block exiting until it had drained all running allocations?

Basically what we want is: if Nomad itself runs into unexpected issues, leave the task runtimes alone and confine a Nomad issue to being just a Nomad issue as much as possible (the smallest blast radius possible). On the other hand, if it's an intentional shutdown of the Nomad client, provide a way to trigger a clean shutdown of the task runtimes.

@rmlsun

rmlsun commented Jul 13, 2020

I think there might be a fine line here, @schmichael.

Ideally, if the Nomad client itself crashes or shuts down for reasons that are not operator-initiated, it should not trigger task shutdown. Only an operator-initiated shutdown should trigger (and wait for the completion of) a clean shutdown of all tasks.

So would a signal be a good way to indicate an intentional shutdown? Like, instead of having client.drain_shutdown = true, how about client.drain_shutdown_signal = SIGINT, something along those lines?
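
To make that concrete, the suggestion amounts to an agent configuration along these lines. This is a hypothetical sketch of the proposal, not an option Nomad actually has:

    client {
      enabled = true

      # Hypothetical: on receiving this signal, drain this node's
      # allocations before the agent exits; any other termination
      # leaves tasks untouched.
      drain_shutdown_signal = "SIGINT"
    }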

@axsuul
Contributor

axsuul commented Sep 30, 2022

Need additional clarification on this: does draining a node always force-quit tasks, or will it use the configured kill_signal?

@schmichael
Member

Need additional clarification on this: does draining a node always force-quit tasks, or will it use the configured kill_signal?

Draining a node stops tasks in the same way nomad alloc stop or a deployment stops tasks:

  1. Services are deregistered from Consul/Nomad
  2. After shutdown_delay the kill_signal is sent to the task
  3. After kill_timeout the task is force killed

The drain command's -force option ignores the migrate block and skips the drain -deadline. Force draining does not change how tasks are killed.

A long shutdown_delay can be bypassed by manually issuing a nomad alloc stop -no-shutdown-delay while the drain is running.

The only way to bypass shutdown_delay and kill_timeout is to use nomad alloc signal or nomad alloc exec to manually kill the task once the drain has begun.
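
Mapped onto a job file, the knobs in that sequence look roughly like this (durations and signal illustrative):

    task "nginx" {
      driver = "docker"

      shutdown_delay = "10s"     # step 2: pause after deregistration before signaling
      kill_signal    = "SIGQUIT" # the signal sent at step 2
      kill_timeout   = "30s"     # step 3: force kill after this grace period
    }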

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 29, 2023