Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provide -no-shutdown-delay flag on job/alloc stop to bypass kill timeout #11448

Closed
tgross opened this issue Nov 4, 2021 · 3 comments · Fixed by #11596
Closed

provide -no-shutdown-delay flag on job/alloc stop to bypass kill timeout #11448

tgross opened this issue Nov 4, 2021 · 3 comments · Fixed by #11596
Assignees
Milestone

Comments

@tgross
Copy link
Member

tgross commented Nov 4, 2021

Some operators use long kill timeouts so that tasks have a long window of time to shutdown cleanly. During incident response, these long kill timeouts can make it hard to mitigate the incident. Provide an -immediate flag on nomad job stop and nomad alloc stop that bypasses the kill timeout. We'll still run allocrunner/taskrunner prestop hooks (because the client state needs that). Maybe an open question would be what should we do with tasks with the poststop lifecycle configuration?

@tgross
Copy link
Member Author

tgross commented Nov 29, 2021

Some interesting bits to figure out here:

  • Neither nomad job stop or nomad alloc stop communicates with the clients. Like everything else they update the state store and then create evals for the scheduler to act upon:
    • nomad job stop applies a JobDeregisterRequestType to the FSM, which either deletes the job entirely from the state store (on -purge) or marks the job's Stop field true. In either code path, we create a new eval which the scheduler uses to sync the job and the clients.
    • nomad alloc stop applies a UpdateAllocsDesiredTransitions to the FSM, which updates the alloc with a DesiredTransition to "migrate", "reschedule", or "force reschedule". We create a new eval which the scheduler uses to sync the job and the clients.
  • There is both a group-level shutdown_delay and a task-level shutdown_delay. We'll need to override both behaviors, but it looks like get merged for the task runner anyways.

My initial thought here for implementation was to update the job so that the shutdown_delay is set to zero. But we end up persisting that change in the jobspec. In the case where the operator runs nomad job run again, they'll have the jobspec file with the original shutdown_delay value locally and that's fine. But in the web UI we have a Start button that sets the definition from .Stop = true to .Stop = false before submitting the job. This would restart the job with the modified shutdown_delay value, which would be bad. (Also, we'd hit the same problem with some future hypothetical nomad job restart command.)

I think the approach here will be to add a new DesiredTransition field (Immediate?) that's set on all allocations touched by the RPC that's been given the -immediate flag. That'll get passed along as usual as part of the Allocation struct to the client, which can read that value at any place it'd be looking at Task.ShutdownDelay. I'll give that a try and see how it looks.

(Also, we might want to name this flag -no-shutdown-delay rather than -immediate so that it's not conflated with the eval priority.)

@tgross
Copy link
Member Author

tgross commented Nov 29, 2021

Getting close to the end of my day here so leaving notes for myself: I've got an almost-working branch of nomad alloc stop -immediate up at f-immediate-stop but I've run into a problem with the group shutdown_delay. It turns out the group group service hook does not get merged with the task shutdown_delay but serves a different purpose entirely.

The group service hook waiting on the shutdown_delay was introduced in Nomad 0.10.4 #6746 to allow time for tasks to drain their connections after being deregistered from Consul but before being shut down.

Currently alloc runner hooks don't get their Update hook fired for terminal allocs (ref), so I can't simply add an isImmediate flag to the group service hook and then update its value. But looking at the existing alloc hooks, there are potentially others that could benefit from the group shutdown_delay hook, so I'll probably push that TerminalStatus() check down into the hooks so they can make that decision themselves or react differently in the case where they're terminal.

Edit: a much easier approach will be to expose the allocation update to the alloc runner hook PreKill method which is only implemented by the group service hook so there's very little disruption. I've pushed up a hacky version of that to my branch and should be able to wrap it up Weds.

@tgross tgross changed the title provide -immediate flag on job/alloc stop to bypass kill timeout provide -no-shutdown-delay flag on job/alloc stop to bypass kill timeout Dec 1, 2021
@tgross tgross added this to the 1.2.4 milestone Dec 13, 2021
@github-actions
Copy link

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant