
scheduler: fix job update placement on prev node penalized #6781

Merged · 1 commit merged into master from gh5856_node_penalty on Dec 3, 2019

Conversation

tgross (Member) commented on Nov 26, 2019:

Fixes #5856

When the scheduler looks for a placement for an allocation that's replacing another allocation, it's supposed to penalize the previous node if the allocation had been rescheduled or failed. But we're currently always penalizing the node, which leads to unnecessary migrations on job update.

This commit leaves in place the existing behavior where if the previous alloc was itself rescheduled, its previous nodes are also penalized. This is conservative but the right behavior especially on larger clusters where a group of hosts might be having correlated trouble (like an AZ failure).
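
For illustration, a minimal, self-contained sketch of the placement-penalty behavior described above. This is not the Nomad scheduler source: the names allocation, rescheduleEvent, and penaltyNodes are stand-ins invented for the example, and the real logic lives in scheduler/generic_sched.go.

```go
// Illustrative stand-ins only; not Nomad's actual scheduler structures.
package main

import "fmt"

const allocClientStatusFailed = "failed" // stand-in for structs.AllocClientStatusFailed

type rescheduleEvent struct {
	PrevNodeID string // node a prior attempt of this alloc ran on
}

type allocation struct {
	NodeID           string
	ClientStatus     string
	RescheduleEvents []rescheduleEvent // stand-in for the alloc's reschedule history
}

// penaltyNodes returns the set of node IDs to penalize when placing a
// replacement for prev. A healthy alloc replaced by a plain job update
// produces no penalties, so the new alloc may land on its current node.
func penaltyNodes(prev *allocation) map[string]struct{} {
	penalized := map[string]struct{}{}
	if prev == nil {
		return penalized
	}
	// Only penalize the previous node if the alloc actually failed there.
	if prev.ClientStatus == allocClientStatusFailed {
		penalized[prev.NodeID] = struct{}{}
	}
	// If the alloc was itself a reschedule, keep penalizing the nodes its
	// earlier attempts ran on (the conservative behavior this PR keeps).
	for _, ev := range prev.RescheduleEvents {
		penalized[ev.PrevNodeID] = struct{}{}
	}
	return penalized
}

func main() {
	updated := &allocation{NodeID: "node-1", ClientStatus: "running"}
	failed := &allocation{
		NodeID:           "node-2",
		ClientStatus:     allocClientStatusFailed,
		RescheduleEvents: []rescheduleEvent{{PrevNodeID: "node-3"}},
	}
	fmt.Println(penaltyNodes(updated)) // map[] -> no forced migration on a plain update
	fmt.Println(penaltyNodes(failed))  // node-2 and node-3 both penalized (map order may vary)
}
```

With this shape, a plain job update (previous alloc still running) yields an empty penalty set, so the replacement can stay on its current node; only failed or previously rescheduled allocs push placement away from their old nodes.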

Review thread on scheduler/generic_sched.go (outdated, resolved):

// If alloc failed, penalize the node it failed on to encourage
// rescheduling on a new node.
if prevAllocation.ClientStatus == "failed" {
Contributor commented:

I think it's OK to omit "lost", but that might mean the whole node reboots and the job gets scheduled back onto the same node when it becomes available again; I'm not sure we want to apply a penalty there. Also, for google-fu, I think we should use structs.AllocClientStatusFailed.

Member replied:

Yeah, I couldn't think of a reason to penalize lost -- if the node comes back in time for the scheduler to place it there (seems extremely unlikely), what's the harm? If it's a flaky node I guess that's a problem, but now we're compounding unlikely scenarios.

If there's a chance the scheduler worker's statestore is using a snapshot that doesn't yet see the node as lost, that's a good reason to penalize it, I suppose. That could cause a pathological situation where many lost allocations are placed back on the lost node, only to fail at placement and go through scheduling all over again with a fresher snapshot. That too seems pretty unlikely to me, and it's impossible if we ensure the snapshot is at least as fresh as the evaluation being processed in the first place.

Member commented:

+1 to using structs.AllocClientStatusFailed
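
For reference, a small sketch of where the thread lands, assuming the status strings and the structs.AllocClientStatusFailed constant are as named above; the constants below are local stand-ins rather than imports from Nomad's structs package.

```go
package main

import "fmt"

// Local stand-ins for Nomad's client status values; the real constants live
// in the nomad/structs package (e.g. structs.AllocClientStatusFailed).
const (
	allocClientStatusFailed = "failed"
	allocClientStatusLost   = "lost"
)

// shouldPenalizePrevNode reflects the decision in this thread: penalize only
// on failure, and deliberately do not penalize "lost" (e.g. a node that
// rebooted and may come back).
func shouldPenalizePrevNode(clientStatus string) bool {
	return clientStatus == allocClientStatusFailed
}

func main() {
	fmt.Println(shouldPenalizePrevNode(allocClientStatusFailed)) // true
	fmt.Println(shouldPenalizePrevNode(allocClientStatusLost))   // false
	fmt.Println(shouldPenalizePrevNode("running"))               // false
}
```

Comparing against a named constant rather than the raw "failed" literal is the greppability point raised above, and returning false for "lost" reflects the decision not to penalize a node that merely went away.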

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
tgross merged commit 3716a67 into master on Dec 3, 2019
tgross deleted the gh5856_node_penalty branch on December 3, 2019, 14:14
github-actions bot commented:

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

The github-actions bot locked this pull request as resolved and limited the conversation to collaborators on Jan 24, 2023.
Successfully merging this pull request may close these issues:

Job update always lead to allocations migrate (node-reschedule-penalty)

4 participants