
eval broker: shed all but one blocked eval per job after ack #14621

Merged · 3 commits · Nov 16, 2022

Conversation

tgross (Member) commented Sep 19, 2022:

When an evaluation is acknowledged by a scheduler, the resulting plan is guaranteed to cover up to the ModifyIndex ("wait index") set by the worker based on the most recent evaluation for that job in the state store. At that point, we no longer need to retain blocked evaluations in the broker that are older than that index.

Move these stale evals into a canceled set. When the Eval.Ack RPC returns from the eval broker it will retrieve a batch of cancelable evals to write to raft. This paces the cancelations by how frequently the schedulers are acknowledging evals, which should reduce the risk of cancelations overwhelming raft relative to scheduler progress.

Note that the evals will still need to be deleted during garbage collection, but there's not much we can do about that without preventing the evals from being created in the first place.
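A rough sketch of that ack-side flow, for orientation only: the helper body, batch size, and raft-apply call below are illustrative assumptions, not the merged implementation.

// Sketch only: after a successful Ack, drain a batch of cancelable evals and
// commit them to raft in one batched update, so cancelation work is paced by
// scheduler acks rather than by node churn.
func cancelCancelableEvals(srv *Server) error {
    const batchSize = 1000 // illustrative; the real batch size may differ

    cancelable := srv.evalBroker.Cancelable(batchSize)
    if len(cancelable) == 0 {
        return nil
    }
    // Hypothetical raft helper: mark the whole batch canceled in a single
    // raft entry instead of one entry per eval.
    return srv.applyEvalCancelBatch(cancelable)
}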

original approach

When a node updates its fingerprint or status, we need to create new evaluations to ensure that jobs waiting for resources get a chance to be evaluated. But in the case of a cluster with a large backlog of evaluations and flapping nodes, we can get a large number of evaluations for the same job. Most of these will be canceled, but we still need to write each evaluation to raft and then write its deletion to raft.

This changeset proposes that we avoid creating evals for jobs that already have a blocked eval in the eval broker. A blocked eval means that the broker already has work in-flight and work waiting to be re-enqueued, so it's safe to drop the evaluation.

Comment on lines 879 to 885
// HasBlockedEval is used to check if the eval broker already has a blocked eval for this job.
func (b *EvalBroker) HasBlockedEval(namespacedJobID structs.NamespacedID) bool {
    b.l.Lock()
    defer b.l.Unlock()
    _, ok := b.blocked[namespacedJobID]
    return ok
}
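For context, a hedged sketch of how the node-update code path might have used this helper under the original approach; the surrounding names (jobsOnNode, newNodeUpdateEval) are paraphrased placeholders, not the actual Nomad source.

// Sketch only: when creating evals in response to a node update, skip jobs
// that already have a blocked eval waiting in the broker.
for _, job := range jobsOnNode {
    id := structs.NamespacedID{Namespace: job.Namespace, ID: job.ID}
    if n.srv.evalBroker.HasBlockedEval(id) {
        continue // the broker already has work queued for this job
    }
    evals = append(evals, newNodeUpdateEval(job, node)) // hypothetical constructor
}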
tgross (Member, Author) commented:

Node update evals always come in with their priority set to the job priority. Maybe we should check if the blocked eval has a lower priority here as well, so that we can bump-over an eval that was submitted by a user with a low priority?

tgross (Member, Author) commented Sep 19, 2022:

My initial concern with this idea is that fields like SnapshotIndex and NodeModifyIndex won't get updated values on new evals, but NodeModifyIndex is just informational (never consumed by Nomad) and SnapshotIndex is set at scheduling time, so there's nothing for us to screw up there either -- none of the evals in the PendingEvals list we call "blocked" here will have SnapshotIndex set either.

(ref #6480 (comment) for the terrible confusion between a "blocked" eval and an eval with EvalStatusBlocked)

Member commented:

Great analysis. I had assumed {Job,Node}ModifyIndex would be used for the minimum index of the state store when initially dequeuing the eval in the worker, but that doesn't appear to be the case. The index is the max ModifyIndex of all evals for the job. ‼️

(That actually seems like an overly pessimistic index to use as it ensures the oldest eval for a job evaluates against the same index as the latest eval for a job... why even bother processing those subsequent evals then since there's no guarantee the state will be later than when the first eval was committed? If a new eval comes in in the mean time, it would need to get processed.)

Unfortunately I think this logic poses a problem for your otherwise fantastic patch: you will no longer be creating an eval with a potentially higher ModifyIndex than all of the blocked evals for the job. So the job may end up not getting evaluated against the latest index... which is particularly a problem if the whole point is to reevaluate the job against this index of the node.

To try to illustrate: The old behavior ("Blocked" means "dupe pending eval for a job" -- BlockedEvals don't factor in here AFAICT)

  1. Eval E1 for Job J1 created @ Index 10
  2. Eval E2 for Job J1 created and blocked @ Index 20
  3. Worker dequeues E1, gets a WaitIndex of 20
  4. Node.Update occurs, updates Node and creates Eval E3 for J1 at Index 30
  5. Worker dequeues E2, gets a WaitIndex of 30
  6. Worker dequeues E3, gets a WaitIndex of 30

With this PR:

  1. Eval E1 for Job J1 created @ Index 10
  2. Eval E2 for Job J1 created and blocked @ Index 20
  3. Worker dequeues E1, gets a WaitIndex of 20
  4. Node.Update occurs, updates Node at index 30, sees E2 blocked and skips eval creation
  5. Worker dequeues E2, gets a WaitIndex of 20

So we may never evaluate the node at its latest ModifyIndex (30).

I could be wrong though... lots of moving parts here. Honestly I'm not sure why we calculate WaitIndex the way we do... using Eval.{Job,Node}ModifyIndex (based on TriggeredBy) initially seems correct. Using max(Eval.ModifyIndex, Eval.SnapshotIndex) always seems safe and potentially optimal though (I would assume Eval.ModifyIndex is always >= its {Job,Node}ModifyIndex)

I almost want to remove {Job,Node}ModifyIndex if they're not used. I can't imagine them being useful for end users. 🤷

If my assessment of the current behavior is correct, then obviously the eval you're skipping creating is useless to actually process, but it is necessary to ensure the modify index is high enough.

tgross (Member, Author) commented:

Yeah. ☹️ I hit this problem from a different direction last night.

I was trying to work up a model of the schedulers/broker/raft inspired by the AWS S3 lightweight formal methods paper (which is an excellent read, btw). But I needed an invariant to test: we want any change to cluster state (i.e. state at a given index) to be eventually reflected in the last evaluation processed. But without writing the evaluation to raft we don't have a valid index to compare against anywhere in the scheduler.

So for some reason I had it in my head we only had a wait index concept in the planner but it's obvious in retrospect that we'd need it in the scheduler too.

tgross (Member, Author) commented:

(That actually seems like an overly pessimistic index to use as it ensures the oldest eval for a job evaluates against the same index as the latest eval for a job... why even bother processing those subsequent evals then since there's no guarantee the state will be later than when the first eval was committed? If a new eval comes in in the mean time, it would need to get processed.)

This observation could inform an approach to shed duplicates from the eval broker, even if we can't prevent them from getting written to raft. Don't try to avoid adding them to the blocked list of PendingEvals. But if we get an ack for an eval at ModifyIndex = n, then I think we can immediately discard all evals in PendingEvals for that job with ModifyIndex < n and batch cancel them for efficiency.

tgross (Member, Author) commented:

Hacking on this idea over in https://github.com/hashicorp/nomad/compare/main..f-evaluation-load-shedding-take-2 but will update this PR with the results once it's at all suitable for discussion.

tgross (Member, Author) commented:

Something I've run into here is that when we heap.Pop the blocked queue, we're always getting the oldest evaluation instead of the newest one. So when it later gets ack'd, it's never newer than the others. The PendingEvals type sorts by Priority and then CreateIndex FIFO, which is fine for pending (we probably need to do that for fairness across jobs). But for the blocked queue we want to sort by Priority but CreateIndex LIFO for this idea to work. Will have to split the types and see where that goes next.
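A quick sketch of the LIFO ordering being proposed here; the type name anticipates the split discussed later in the thread, and the exact tie-break is illustrative rather than the merged code.

// Sketch only: a heap ordering for a single job's blocked evals that prefers
// higher priority first, then the newest eval (CreateIndex LIFO), so Pop
// returns the eval most likely to carry the highest indexes.
func (b BlockedEvaluations) Less(i, j int) bool {
    if b[i].Priority != b[j].Priority {
        return b[i].Priority > b[j].Priority
    }
    return b[i].CreateIndex > b[j].CreateIndex
}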

schmichael (Member) commented Sep 22, 2022:

we're always getting the oldest evaluation instead of the newest one. So when it later gets ack'd, it's never newer than the others.

While this is true, I don't think it matters, because the WaitIndex given to workers as the minimum index to work from is calculated to be the max modify index of all evals for the job: https://github.com/hashicorp/nomad/blob/v1.4.0-beta.1/nomad/eval_endpoint.go#L195-L208

So for a given job it doesn't matter which of its evals we process first, we'll always process it against the latest known index at the time of dequeue. I think when completing an eval we could go ahead and cancel all other evals for the same job with a ModifyIndex <= the plan's SnapshotIndex, as that's guaranteed to be >= the WaitIndex.

Put another way, if you have the following evals for the same job:

  1. node-update @ 100
  2. preemption @ 200

Then even though the node-update @ 100 is dequeued first, the WaitIndex will be 200. The resulting Plan's SnapshotIndex will be >= 200 so if the node-update @ 100 eval is complete, the preemption @ 200 eval can be canceled as well.

Oldest Eval ModifyIndex <= WaitIndex at Dequeue time <= Worker's Snapshot Index
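A paraphrased sketch of that WaitIndex calculation (linked above), assuming a state-store accessor along the lines of EvalsByJob; see the linked source for the exact code.

// Sketch: the worker must wait for a snapshot at least as new as the highest
// ModifyIndex of any eval for the job, so it never schedules against state
// older than the newest eval that triggered work.
func getWaitIndex(store *state.StateStore, namespace, jobID string, evalModifyIndex uint64) (uint64, error) {
    evals, err := store.EvalsByJob(nil, namespace, jobID)
    if err != nil {
        return 0, err
    }
    waitIndex := evalModifyIndex
    for _, e := range evals {
        if e.ModifyIndex > waitIndex {
            waitIndex = e.ModifyIndex
        }
    }
    return waitIndex, nil
}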

tgross (Member, Author) commented:

Ok, that behavior doesn't show up at a unit testing level because we're not running a real worker, just making Dequeue and Ack calls, but that eliminates some extra code I've had to write here. Just need to update the tests to match that behavior and I think I've actually got this working.

tgross (Member, Author) commented Sep 19, 2022:

diagram for the original approach to this PR, no longer applicable
sequenceDiagram
    Node RPC-->>Broker: node-update 1: has blocked eval? (no)
    Node RPC->>Raft: node-update 1
    Raft-->>Broker: enqueue eval
    Node RPC-->>Broker: node-update 2: has blocked eval? (no)
    Node RPC->>Raft: node-update 2
    Raft-->>Broker: can't enqueue eval (blocked)
    Node RPC-->>Broker: node-update 3: has blocked eval? (yes)    
    Broker-->>Scheduler: GetEval
    Node RPC-->>Broker: node-update 4: has blocked eval? (no)
    Node RPC->>Raft: node-update 4
    Raft-->>Broker: can't enqueue eval (blocked)

tgross (Member, Author) commented Sep 23, 2022:

@schmichael @lgfa29 @shoenig I've now got this into working shape (and green on CI except for the usual test flakes in unrelated areas of code). I'm reasonably comfortable with the idea of landing this in 1.4.0-rc1 for next week but it's probably worth having a discussion about how good we feel about this change coming so late.

lgfa29 (Contributor) left a comment:

Great job!

I was trying to think whether there was another place where the batch raft update could happen, to keep the logic closer together (like some kind of canceled-evals GC). I was also a bit confused initially about why the endpoint was calling Cancelable(1000), since it would return evals not necessarily related to the eval being ACK'd, but I don't think the broker has access to raft.
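For illustration, the broker-side accessor the endpoint calls might look roughly like this; the cancelable field and locking details are assumptions about the broker's internals, not the merged code.

// Sketch only: return up to batchSize evals from the broker's cancelable set,
// removing them so each is only written to raft once.
func (b *EvalBroker) Cancelable(batchSize int) []*structs.Evaluation {
    b.l.Lock()
    defer b.l.Unlock()

    if batchSize > len(b.cancelable) {
        batchSize = len(b.cancelable)
    }
    batch := b.cancelable[:batchSize]
    b.cancelable = b.cancelable[batchSize:]
    return batch
}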

schmichael (Member) left a comment:

I've spent most of my day trying to reason about this and my brain is fully and completely melted. 😅

I think everything you have done is correct and safe, but I don't think it's optimal. The current approach may not cancel any evals in realistic situations, but I'm not sure about that.

@@ -559,6 +564,7 @@ func (b *EvalBroker) Ack(evalID, token string) error {
         return fmt.Errorf("Token does not match for Evaluation ID")
     }
     jobID := unack.Eval.JobID
+    recentlyAckedIndex := unack.Eval.ModifyIndex
Member commented:

Update: See my comment below about canceling all but 1 eval. I don't think the rest of this comment matters at all, but I spent like 4 hours on it so by golly I'm leaving it. 😵‍💫

tl;dr -> I'm not sure this will cancel any evals. I think having the worker put Eval.SnapshotIndex into structs.AckRequest would allow the broker to cancel the maximum number of evals safely. This gets extremely difficult to reason about because every layer of the scheduler stack (broker, RPC, worker, scheduler) seems to do its own funny little index handling, so please check my work! 😬

AFAICT Eval.ModifyIndex does not include the index at which the eval was updated after committing a plan. unack is pulled out of EvalBroker.unack which contains the original copy of the eval from when it was first inserted into the broker as ready.

This means that Eval.ModifyIndex is pessimistic in the sense that it is the minimum possible value the Worker's snapshot was at when this evaluation was processed. Luckily we have a whole smorgasbord of indexes in the worker to choose from:

  • EvalDequeueResponse.WaitIndex - the maximum Eval.ModifyIndex of all evals for this job when it was dequeued.
  • Eval.SnapshotIndex - the actual Raft index the worker's state snapshot used. Only available on the updated Eval.
  • PlanResult.AllocIndex / PlanResponse.Index - returned by the Plan.Submit RPC to the worker who submitted the plan, but never used. I think the worker only ever uses the PlanResult.RefreshIndex and that index never bubbles up to where the Worker calls Ack.

I think the following holds as an invariant:

PlanResult.AllocIndex > {Updated}Eval.SnapshotIndex >= EvalDequeueResponse.WaitIndex >= {Broker}Eval.ModifyIndex
          ^                          ^                               ^                              ^
          |                          |                               |                              |
          |    no point in processing new evals < this index         |                 this is what's being used
          |                                                          |
too new! cluster may have changed prior to this                      |
                                                                     |      
     safe choice as if it's > ModifyIndex that indicates there are pending evals that can be optimized out!

I think Eval.SnapshotIndex is precisely what we're looking for. There's no point processing a new eval for the same job if we're not going to process it at a higher index than this because we will come up with the same result as before since we haven't allowed the cluster state to change!

Since there could be updates between PR.AllocIndex and SnapshotIndex, we can't drop any evals based on AllocIndex.

Comment on lines 988 to 1036
// MarkForCancel is used to remove any evaluations older than the index from the
// blocked list and returns a list of cancelable evals so that Eval.Ack RPCs can
// write batched raft entries to cancel them. This must be called inside the
// broker's lock.
func (p *PendingEvaluations) MarkForCancel(index uint64) PendingEvaluations {

    // In pathological cases, we can have a large number of blocked evals but
    // will want to cancel most of them. Using heap.Remove requires we re-sort
    // for each eval we remove. Because we expect to have very few if any
    // blocked remaining, we'll just create a new heap.
    retain := PendingEvaluations{}
    cancelable := PendingEvaluations{}

    for _, eval := range *p {
        if eval.ModifyIndex >= index {
            heap.Push(&retain, eval)
        } else {
            heap.Push(&cancelable, eval)
        }
    }

    *p = retain
    return cancelable
}
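To connect this with the recentlyAckedIndex diff above, here is a hedged sketch of how Ack might have invoked it; this index-based variant was later replaced by the all-but-one approach, and the cancelable field is an assumption.

// Sketch only: inside Ack, shed blocked evals already covered by the acked
// eval's ModifyIndex and queue them for batched cancelation.
// namespacedID is built from the acked eval's Namespace and JobID.
if pending, ok := b.blocked[namespacedID]; ok {
    cancelable := pending.MarkForCancel(recentlyAckedIndex)
    b.blocked[namespacedID] = pending
    b.cancelable = append(b.cancelable, cancelable...) // assumed broker field
}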
Member commented:

Hold on... due to my "favorite" logic in the Eval.Dequeue RPC: https://github.com/hashicorp/nomad/blob/v1.4.1/nomad/eval_endpoint.go#L185-L211

...I think we can cancel every eval except the one with the highest ModifyIndex?!

Since on Dequeue we always set WaitIndex = max(ModifyIndex) it doesn't matter which eval we dequeue next for a job: we'll always process it against a new enough snapshot to capture every cluster change that triggered evals!

...it almost makes me think we could cancel these evals in the FSM, but we'd still need to cancel them in EvalBroker since the broker isn't a "live" view of the evals in the statestore... might not be impossible, but when in doubt I like to keep the FSM simple.

tgross (Member, Author) commented Nov 4, 2022:

I think you're right on this, but I'm going to work up a diagram of what indexes we have and what eval states we have in the broker just to double-check.

My next instinct here is to mark any blocked evals as cancelable at the time we enqueue rather than at the time we ack. Have EvalBroker.blocked contain a spot for at most one eval, and have EvalBroker.Enqueue bump the blocked eval into the cancelable immediately if it's got a newer one. (As you've said, I'd love to be able to have the FSM detect that and delete the eval right away as well, but the followers wouldn't see the eval broker state. This would be the next best thing.) Let me work up that diagram first and we can come back to this discussion. Edit: from my diagram below I think this doesn't work because we don't drop blocked evals from the broker on Nack, so let's put a pin in this

tgross (Member, Author) commented:

I think one of the things that's misleading about the existing eval broker code is the PendingEvaluations sorting by CreateIndex. The PendingEvaluations alias over a slice of evals gets used by both blocked and ready but only ready actually cares about the index b/c it contains evals from multiple jobs. The blocked slice is full of evals from the same job, so the order we pick them off doesn't matter.

tgross (Member, Author) commented Nov 4, 2022:

Well merging a state diagram together with the sequence diagram for the RPCs turned out to be a bad idea. But after a couple hours re-re-reading this code I've got a state diagram here that makes most of the problem clear, which I can annotate.

stateDiagram-v2

    isPending: check if jobEvals\n(pending JobID) is empty
    state if_state <<choice>>

    [*] --> isPending: Enqueue
    isPending --> if_state
    if_state --> ready: no pending JobID.\nready is sorted\nby priority and CreateIndex
    if_state --> blocked: pending JobID

    blocked --> ready: Ack

    ready --> unack: Dequeue
    unack --> [*]: Ack

    unack --> isPending: Nack

When we enqueue, we check jobEvals (better described as "pending jobs") to see if it's non-empty.

  • If it's empty, we insert the job ID into jobEvals and push the eval onto the ready heap.
  • If it's non-empty, we insert the eval into blocked and return.

When we dequeue, we:

  • scanForSchedulers:
    • peeks evals from ready for each scheduler
    • finds the eval with the highest priority (or picks between ones with the same)
    • in dequeueForSched, pops the selected eval from ready and adds it to unack
  • getWaitIndex(eval.ns, eval.jobid, eval.ModifyIndex) for the eval we got:
    • gets the highest modify index for any eval for the job, or the dequeued eval's modify index, whichever is greater. (In practice I don't think we'll ever use the dequeued eval's modify index here if that eval was ever blocked.)

When we submit the plan:

// updateEvalModifyIndex is used to update the modify index of an evaluation that has been
// through a scheduler pass. This is done as part of plan apply. It ensures that when a subsequent
// scheduler workers process a re-queued evaluation it sees any partial updates from the plan apply.

When we ack the eval:

  • remove from unack
  • delete jobID from jobEvals so the job isn't marked "pending"
  • enqueue top blocked item (which will mark us as "pending" again)

So ready sorting is the only thing in the eval broker that cares about CreateIndex, and only getWaitIndex checks the modify index. As long as we're always setting the eval with the highest modify index as ready, we're good. I think we can cancel the entire blocked list when we do so.

stateDiagram-v2

    isPending: check if jobEvals\n(pending JobID) is empty
    state if_state <<choice>>

    [*] --> isPending: Enqueue
    isPending --> if_state
    if_state --> ready: no pending JobID.\nready is sorted\nby priority and CreateIndex
    if_state --> blocked: pending JobID

    blocked --> ready: on Ack\neval w/ highest ModifyIndex
    blocked --> canceled: cancel all the rest

    ready --> unack: Dequeue
    unack --> [*]: Ack

    unack --> isPending: Nack

Alternately, if we wanted to be extra paranoid to ensure we haven't missed something we could cancel all but one eval out of blocked. That still helps us for the "large number of blocked evals" scenario but makes for bimodal operation where we won't hit this new cancel workflow except in cases where there's a backlog, which I don't love. Edit: this ends up being the safe route, so I've done that.

tgross (Member, Author) commented:

I'll implement this and make sure we've got an integration test that exercises it.

tgross (Member, Author) commented:

Started on implementation and immediately remembered that evals have their own priority, so we can't just naively use the highest modify index either: we could get a higher-priority eval (e.g. 100) and then a lower-priority one (e.g. 50). If we just use the highest modify index, the lower-priority eval would be the one that gets set to ready and we'd discard the high-priority one. Then that low-priority eval could get starved by middling-priority evals for other jobs (e.g. 70). We'll need to account for both fields.

tgross (Member, Author) commented:

I've solved this by having a separate PendingEvaluations and BlockedEvaluations with different sort criteria.
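A hedged sketch of what the resulting ack-time shedding might look like with the split types; the helper name, field names, and use of an internal enqueue helper are assumptions, not the merged code.

// Sketch only: on Ack, promote the best blocked eval (highest priority, then
// newest) to ready and move every other blocked eval for the job into the
// cancelable set for batched raft cancelation.
func (b *EvalBroker) shedBlockedAfterAck(id structs.NamespacedID) {
    blocked := b.blocked[id]
    if len(blocked) == 0 {
        return
    }
    keep := heap.Pop(&blocked).(*structs.Evaluation)

    // Everything left is redundant once the promoted eval is processed.
    b.cancelable = append(b.cancelable, blocked...)
    delete(b.blocked, id)

    b.enqueueLocked(keep, keep.Type) // assumed internal enqueue helper
}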

@tgross tgross marked this pull request as draft November 7, 2022 16:23
When an evaluation is acknowledged by a scheduler, the resulting plan is
guaranteed to cover up to the `waitIndex` set by the worker based on the most
recent evaluation for that job in the state store. At that point, we no longer
need to retain blocked evaluations in the broker that are older than that index.

Move all but the highest priority / highest `ModifyIndex` blocked eval into a
canceled set. When the `Eval.Ack` RPC returns from the eval broker it will
retrieve a batch of cancelable evals to write to raft. This paces the
cancelations by how frequently the schedulers are acknowledging evals, which
should reduce the risk of cancelations overwhelming raft relative to
scheduler progress. In order to avoid straggling batches when the cluster is
quiet, we also include a periodic sweep through the cancelable list.
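A hedged sketch of that periodic sweep: the signature, interval, and logging below are assumptions (the test notes further down mention a roughly 5-second cadence), and the helper is the one sketched near the top of this thread.

// Sketch only: a leader goroutine that periodically drains the cancelable set
// even when no acks are arriving to drive the work.
func (s *Server) reapCancelableEvaluations(stopCh <-chan struct{}) {
    ticker := time.NewTicker(5 * time.Second) // assumed interval
    defer ticker.Stop()
    for {
        select {
        case <-stopCh:
            return
        case <-ticker.C:
            // Drain a batch of cancelable evals and commit them to raft as
            // canceled, so a quiet cluster still makes progress.
            if err := cancelCancelableEvals(s); err != nil {
                s.logger.Warn("eval cancelation failed", "error", err)
            }
        }
    }
}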
tgross (Member, Author) commented Nov 9, 2022:

I've run a middling-scale performance test of this PR and the results look pretty good. I stood up a cluster of 3 t3.large instances on AWS (8GiB RAM).

  • Disable the schedulers on all 3 instances by setting server.num_schedulers = 0 and reloading.
  • Deploy 10 system jobs.
  • Deploy 5000 simulated client nodes via nomad-nodesim.
  • Wait until all 5000 nodes are registered.

At this point, there are 49730 pending evaluations in the state store and in the blocked queue on the broker. After commenting-out server.num_schedulers = 0 and reloading, the schedulers are restarted and begin processing (19:30).

After 15s (19:30):

  • The blocked queue is empty.
  • Because some evals have been acked, we've already updated 16159 evals to canceled in raft, with 33694 evals in the cancelable queue on the broker.
  • At this point, the eval broker has no more work in-flight (in the ready queue).
  • No more acks are happening to drive cancelations, so we've fallen back to moving batches of evals out of the cancelable list and writing them to raft as canceled every 5 sec.

After ~2m (19:32), there are still 16950 evals in the pending state (still on the cancelable queue).

At this point, I stopped the 5000 client nodes. Within 10s, all remaining evals on the cancelable queue have been written to raft as canceled.

As each of the 5000 nodes miss heartbeats they spawn 10 evaluations each. We can see from the broker metrics that at no point do we have any backlog on the blocked list, because we're immediately moving batches off to cancelable and clearing them to canceled on the next ack. So long as you've got new evals coming in, the cancelable list is getting quickly cleared out.

[Screenshot: eval broker metrics during the test, 2022-11-09]

schmichael (Member) commented:

@tgross That test is phenomenal! Any chance you have the terraform or whatever laying around to run it again against 1.4.2? I'd love to compare the graphs.

tgross (Member, Author) commented Nov 15, 2022:

I re-ran the test with 1.4.2 and it takes about 25m to reach the state we were able to reach in 5m previously.

Here's the chart for 1.4.2. A few notes:

  • We don't have the new Eval.Count here and I didn't notice until a little while into the test, so I had to dump nomad eval list and post-process to fill in that data. So that's got some trendlines over some of the data points, but the broker_blocked line shows the general progress.
  • If you look at the source data the number of client nodes isn't steadily 5001 -- doing this work loads the servers so much that we miss some heartbeats!
  • I stopped the nodes right around the same point as the original tests in terms of amount of outstanding work, but of course that was quite a bit later in terms of elapsed time.

[Screenshot: 1.4.2 baseline chart, 2022-11-15]

And here's the same chart from above, with the time normalized to an elapsed time so we can compare apples-to-apples with the 1.4.2 chart. Conclusion: we're handling the same work in 20% of the time.

[Screenshot: this PR's chart with time normalized for comparison, 2022-11-15]

schmichael (Member) left a comment:

No blockers or even logic changes! Merge away and fantastic work. This is the most significant eval broker improvement in years. 🎉

Comment on lines +122 to +128
// PendingEvaluations is a list of ready evaluations across multiple jobs. We
// implement the container/heap interface so that this is a priority queue.
type PendingEvaluations []*structs.Evaluation

// BlockedEvaluations is a list of blocked evaluations for a given job. We
// implement the container/heap interface so that this is a priority queue.
type BlockedEvaluations []*structs.Evaluation
Member commented:

I have a slight preference for renaming [Bb]locked -> [Pp]ending everywhere and using ReadyEvaluations for the ready queue, but your naming scheme makes sense and changes less code which is nice for such a critical PR.

If we do want to rename for future readers, let's do it in a followup.

tgross (Member, Author) commented:

Agreed.

tgross (Member, Author) commented:

Plus I still need to fix the metric name for #6480 so we can do that at the same time.

@@ -230,7 +230,8 @@ func (e *Eval) Ack(args *structs.EvalAckRequest,
     if err := e.srv.evalBroker.Ack(args.EvalID, args.Token); err != nil {
         return err
     }
-    return nil
+
+    return cancelCancelableEvals(e.srv)
Member commented:

Hm, so this introduces a raft operation in Ack(...) which didn't have one before.

Since cancelable evals are already moved out of the blocked queue, we don't need to commit them to make the optimization work. However, the longer we wait between updating the Ack'd eval and committing canceled evals, the longer the statestore will confusingly have ready evals that aren't being processed because they're sitting in the cancelable queue... 🤔

Since your benchmarking demonstrated this approach is already a massive improvement, let's leave this be.

However, I think we could make this async (maybe by just making it wakeup the reapCancelableEvaluations loop?) and potentially speed things up further.

If I'm right it might be worth commenting, because it will be very hard for future folks to load the context to make this determination:

Suggested change:
-    return cancelCancelableEvals(e.srv)
+    // It's not necessary to cancel evals before Ack returns, but it's done here to
+    // commit canceled evals as close to the Ack'd eval being committed as possible.
+    return cancelCancelableEvals(e.srv)

tgross (Member, Author) commented Nov 16, 2022:

However, I think we could make this async (maybe by just making it wakeup the reapCancelableEvaluations loop?) and potentially speed things up further.

Yeah, that's part of why the reapCancelableEvaluations loop runs fairly frequently. I like the idea of having it nudge the reap loop. I can do that under a separate PR.

In the meantime, I've made the comment fix.

tgross (Member, Author) commented:

Actually, the wake-up channel is probably a safer implementation -- if it's a buffered channel that'll ensure that Ack never gets blocked on raft. I've done that here. That also lets me move the cancelCancelableEvals into leader.go, as it now has only that one caller.
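For illustration, a common Go pattern for this kind of wake-up is a non-blocking send on a capacity-1 buffered channel; this is a sketch of the idea only, and as the next comments note, the merged code still applied synchronously until a follow-up change.

// Sketch only: nudge the reap loop if it isn't already scheduled to wake up,
// without ever letting Ack block on the channel.
select {
case s.reapCancelableEvalsCh <- struct{}{}:
default:
    // a wake-up is already pending; the reaper will drain everything anyway
}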

Member commented:

Did you mean to change this call to use the wakeup chan? It's still performing the raft apply synchronously on main.

I don't think it's a big deal, especially because your benchmark demonstrated it's an immense speedup already.

tgross (Member, Author) commented Nov 17, 2022:

🤦 Yes. Thanks for catching that. This line should now be

s.reapCancelableEvalsCh <- struct{}{}
return nil

tgross (Member, Author) commented:

Fixed in #15294

@tgross tgross merged commit 1c4307b into main Nov 16, 2022
@tgross tgross deleted the f-evaluation-load-shedding branch November 16, 2022 21:10
tgross added a commit that referenced this pull request Nov 17, 2022
In #14621 we added an eval cancelation reaper goroutine with a channel that
allowed us to wake it up. But we forgot to actually send on this channel from
`Eval.Ack` and are still committing the cancelations synchronously. Fix this by
sending on the buffered channel to wake up the reaper instead.
tgross added a commit that referenced this pull request Nov 17, 2022
In #14621 we added an eval cancelation reaper goroutine with a channel that
allowed us to wake it up. But we forgot to actually send on this channel from
`Eval.Ack` and are still committing the cancelations synchronously. Fix this by
sending on the buffered channel to wake up the reaper instead.
github-actions (bot) commented:

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 18, 2023