Evaluation broker control API #11638

tgross · 2021-12-07T19:23:13Z

The Nomad engineering team would love to get some community feedback on a proposed set of features!

During incident response for Nomad or for Nomad workloads, operators may find that the scheduler can compound the ongoing incident by pushing forward with evaluations as best it can. We've recently been working on a bunch of improvements for incident response like bypassing shutdown_delay (#11448), making num_schedulers tunable via API (#11449), overriding evaluation priority on job registration (#11434), and an API to halt job registration (#11450).

Operators may want to have some sort of top-level "maintenance mode" where evaluation processing is paused or otherwise controlled while a problem is being debugged.

Eval Broker Pause/Resume

Evaluations are the unit of work for the Nomad scheduler. They are created...

When you register, dispatch, or scale a job.
When clients update the server with information about failed allocations.
When the scheduler can't place all allocations in a job and creates a "blocked" eval to try again later.
When the leader schedules a periodic job or garbage collection internal job.

The eval broker is the component on the leader that takes evaluations that have been written to raft and queues them for scheduling on one of the scheduler workers. Pausing the eval broker would prevent the scheduler queues from receiving new work and let the workers catch up, reducing CPU, memory, and disk usage. But note that pausing also stops any scheduling work needed to reschedule failed allocations and run periodic tasks! This isn't something to be done lightly; it's an emergency measure for operators.

For implementing Pause/Resume, there's an existing enabled flag which gets toggled whenever a server steps up/down from leadership (the eval broker only runs on the leader). We check this flag in a few places:

When evals get upserted, they're enqueued from the FSM only on the leader, that is, when enabled = true (ref fsm.go#L746)
Likewise, when an eval is dequeued we check the enabled flag (ref eval_broker.go#L374-L377)
The leader has an eval restorer that enqueues all the pending evals (ref leader.go#L491-L517)
The leader has an eval reaper that dequeues dead evals; it will probably need to be turned off so that it doesn't generate new evals (ref leader.go#L781)

So if we write a new Scheduler Configuration entry to raft, then whenever the leader handles that RPC or a server assumes leadership, it can check that value and call SetEnabled only if the eval broker should be enabled.
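To make the gating concrete, here is a minimal sketch of the idea, not the actual Nomad code. The Broker type, the PauseEvalBroker field, and the reconcileBroker helper are all hypothetical names invented for illustration; only the enabled flag and SetEnabled come from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// SchedulerConfig is a hypothetical stand-in for the scheduler configuration
// entry stored in raft. PauseEvalBroker is an assumed field name.
type SchedulerConfig struct {
	PauseEvalBroker bool
}

// Broker is a toy eval broker that only accepts work while enabled.
type Broker struct {
	mu      sync.Mutex
	enabled bool
	queue   []string
}

// SetEnabled toggles whether the broker accepts new evaluations.
func (b *Broker) SetEnabled(enabled bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.enabled = enabled
}

// Enqueue adds an eval ID to the queue and reports whether it was accepted.
func (b *Broker) Enqueue(evalID string) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if !b.enabled {
		return false
	}
	b.queue = append(b.queue, evalID)
	return true
}

// reconcileBroker (hypothetical) would run on a leadership change or a
// config update: the broker is enabled only on an unpaused leader.
func reconcileBroker(b *Broker, isLeader bool, cfg SchedulerConfig) {
	b.SetEnabled(isLeader && !cfg.PauseEvalBroker)
}

func main() {
	b := &Broker{}

	reconcileBroker(b, true, SchedulerConfig{PauseEvalBroker: true})
	fmt.Println(b.Enqueue("eval-1")) // false: leader, but paused

	reconcileBroker(b, true, SchedulerConfig{})
	fmt.Println(b.Enqueue("eval-1")) // true: leader and unpaused
}
```

The point of routing both triggers through one function is that a newly elected leader cannot accidentally re-enable a broker that an operator paused.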

Evaluation Purge/Delete

Another idea that's been considered is a nomad eval purge command, but this has a few sharp edges. If we flush the eval broker's queue, it'll immediately get filled back up again by the eval restorer on the leader. But if we delete all the pending/blocked evals from the state store, what happens to those that are in flight? So I think to do this safely we'd need to lock the eval broker (pausing it), find all the evals on the queue, delete them from the state store, and then flush the queue before unlocking it (unpausing it). That'll leave whatever evals are still in-flight in the scheduler, and any new evals/reblocks that come from the scheduler will block on Enqueue because we're holding the lock.
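The lock, delete, flush, unlock sequence could look roughly like the sketch below. This is a toy model under stated assumptions: StateStore, Broker, and Purge are hypothetical in-memory stand-ins, and a real implementation would delete through raft rather than from a map.

```go
package main

import (
	"fmt"
	"sync"
)

// StateStore is a hypothetical in-memory stand-in for the eval table.
type StateStore struct {
	evals map[string]string // eval ID -> status
}

// Broker is a toy eval broker with a single queue guarded by a mutex.
type Broker struct {
	mu    sync.Mutex
	queue []string
}

// Enqueue blocks while Purge holds the lock, which is how new evals and
// reblocks from the scheduler would wait during a purge.
func (b *Broker) Enqueue(id string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.queue = append(b.queue, id)
}

// Purge holds the broker lock for the whole operation: it deletes every
// queued eval from the state store, then flushes the queue. Evals that
// were already dequeued (in flight) are not touched. It returns the
// number of evals purged.
func (b *Broker) Purge(s *StateStore) int {
	b.mu.Lock()
	defer b.mu.Unlock()

	n := len(b.queue)
	for _, id := range b.queue {
		delete(s.evals, id)
	}
	b.queue = nil
	return n
}

func main() {
	store := &StateStore{evals: map[string]string{
		"queued-1":   "pending",
		"queued-2":   "pending",
		"inflight-1": "pending", // already handed to a worker, not on the queue
	}}
	b := &Broker{}
	b.Enqueue("queued-1")
	b.Enqueue("queued-2")

	fmt.Println(b.Purge(store), len(store.evals)) // 2 1
}
```

Deleting from the state store before flushing, all under one lock, is what keeps a restorer-style routine from refilling the queue with evals that are about to disappear.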

Another option (or perhaps in addition?) would be to only allow deleting a single eval with a command like nomad eval delete :eval_id. If we ensure that nil evaluations are handled safely, we could delete the eval from the state store and the eval broker's queue, and it would be a no-op for the scheduler. This would be useful in the case where a particular job's evals are a "poison pill" that generates high scheduler workloads (e.g. a job with a very large jobspec or dispatch payload).
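As a rough illustration of "nil evaluations are handled safely", the sketch below shows a worker step that treats a missing eval as a no-op. Everything here is hypothetical (Evaluation, StateStore, EvalByID, DeleteEval, process); it only models the behavior described above and is not the scheduler's real code path.

```go
package main

import "fmt"

// Evaluation is a pared-down, hypothetical eval record.
type Evaluation struct {
	ID    string
	JobID string
}

// StateStore is a hypothetical in-memory stand-in for the eval table.
type StateStore struct {
	evals map[string]*Evaluation
}

// EvalByID returns nil when the eval does not exist (for example, because
// an operator deleted it).
func (s *StateStore) EvalByID(id string) *Evaluation {
	return s.evals[id]
}

// DeleteEval removes a single eval from the store.
func (s *StateStore) DeleteEval(id string) {
	delete(s.evals, id)
}

// process models one worker step: look the eval up, and if it is gone,
// skip it instead of dereferencing a nil pointer.
func process(s *StateStore, id string) string {
	eval := s.EvalByID(id)
	if eval == nil {
		return "skipped"
	}
	return "scheduled " + eval.JobID
}

func main() {
	store := &StateStore{evals: map[string]*Evaluation{
		"e1": {ID: "e1", JobID: "big-job"},
	}}

	fmt.Println(process(store, "e1")) // scheduled big-job

	store.DeleteEval("e1")
	fmt.Println(process(store, "e1")) // skipped
}
```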

Evaluation Force/Priority

A third idea we've considered is the option to force an evaluation to be bumped in the priority queue so that it's evaluated ahead of other evaluations. This could be a nomad eval priority command that updates the Priority field and raft indexes and forces it to be re-enqueued. A hypothetical nomad eval force command would be the same thing except less fine-grained; it would simply update to the highest priority.
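A small sketch of what "bumping" could mean for a priority queue, using Go's container/heap. The Eval type, the helper names, and the maxPriority value of 100 are assumptions made for illustration; the raft index update mentioned above is left out entirely.

```go
package main

import (
	"container/heap"
	"fmt"
)

// Eval is a hypothetical queued evaluation. index tracks its heap position
// so its priority can be changed in place.
type Eval struct {
	ID       string
	Priority int
	index    int
}

// evalHeap is a max-heap on Priority.
type evalHeap []*Eval

func (h evalHeap) Len() int           { return len(h) }
func (h evalHeap) Less(i, j int) bool { return h[i].Priority > h[j].Priority }
func (h evalHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
	h[i].index = i
	h[j].index = j
}
func (h *evalHeap) Push(x interface{}) {
	e := x.(*Eval)
	e.index = len(*h)
	*h = append(*h, e)
}
func (h *evalHeap) Pop() interface{} {
	old := *h
	n := len(old)
	e := old[n-1]
	old[n-1] = nil
	*h = old[:n-1]
	return e
}

// maxPriority is an assumed upper bound used by force.
const maxPriority = 100

func enqueue(h *evalHeap, e *Eval) { heap.Push(h, e) }
func dequeue(h *evalHeap) *Eval    { return heap.Pop(h).(*Eval) }

// setPriority models "nomad eval priority": update the field, then restore
// heap order, which has the same effect as re-enqueueing the eval.
func setPriority(h *evalHeap, e *Eval, p int) {
	e.Priority = p
	heap.Fix(h, e.index)
}

// force models "nomad eval force": jump straight to the highest priority.
func force(h *evalHeap, e *Eval) { setPriority(h, e, maxPriority) }

func main() {
	h := &evalHeap{}
	a := &Eval{ID: "a", Priority: 50}
	b := &Eval{ID: "b", Priority: 70}
	c := &Eval{ID: "c", Priority: 30}
	enqueue(h, a)
	enqueue(h, b)
	enqueue(h, c)

	force(h, c)
	fmt.Println(dequeue(h).ID) // c
}
```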


pikeas · 2022-06-21T04:40:04Z

Some eval control would be nice. I have an eval scheduled for the future:

$ nomad eval list|grep pending
ffb463ed  50        alloc-failure       <job>           pending   false

I've resolved the issue that caused this allocation to fail and would like a way to unblock re-deploying the job. Currently, there's no way to purge this pending eval, re-submitting the job is a no-op, setting count = 0 on the group is invalid, and stopping/starting the job via the UI only restarts the other groups.

jrasell · 2022-07-06T15:10:49Z

Closing this issue as eval broker pause/un-pause as well as eval delete will ship in the next release.

github-actions · 2022-12-22T02:14:08Z

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

tgross added type/enhancement theme/scheduling labels Dec 7, 2021

tgross self-assigned this Dec 7, 2021

tgross mentioned this issue Dec 21, 2021

workers should check eval token before submitting Ack/Nack #11727

Open

tgross removed their assignment Mar 18, 2022

jrasell self-assigned this May 12, 2022

jrasell mentioned this issue May 17, 2022

core: allow pausing and un-pausing of leader broker routine #13045

Merged

mmcquillan added this to the 1.3.x milestone May 17, 2022

hc-github-team-nomad-core mentioned this issue Jul 6, 2022

Backport of core: allow pausing and un-pausing of leader broker routine into release/1.3.x #13614

Merged

jrasell mentioned this issue Jul 6, 2022

core: allow deleting of evaluations #13492

Merged

jrasell closed this as completed Jul 6, 2022

lgfa29 modified the milestones: 1.3.x, 1.3.2 Aug 24, 2022

github-actions bot locked as resolved and limited conversation to collaborators Dec 22, 2022
