
scheduler: fix job update placement on prev node penalized #6781

Merged · 1 commit merged into master from gh5856_node_penalty on Dec 3, 2019

Conversation

tgross (Member) commented on Nov 26, 2019:

Fixes #5856

When the scheduler looks for a placement for an allocation that's replacing another allocation, it's supposed to penalize the previous node if the allocation had been rescheduled or failed. But we're currently always penalizing the node, which leads to unnecessary migrations on job update.

This commit leaves in place the existing behavior where if the previous alloc was itself rescheduled, its previous nodes are also penalized. This is conservative but the right behavior especially on larger clusters where a group of hosts might be having correlated trouble (like an AZ failure).
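
For illustration, a minimal, self-contained sketch of the placement-penalty behavior described above. This is not the Nomad scheduler source: the names allocation, rescheduleEvent, and penaltyNodes are stand-ins invented for the example, and the real logic lives in scheduler/generic_sched.go.

```go
// Illustrative stand-ins only; not Nomad's actual scheduler structures.
package main

import "fmt"

const allocClientStatusFailed = "failed" // stand-in for structs.AllocClientStatusFailed

type rescheduleEvent struct {
	PrevNodeID string // node a prior attempt of this alloc ran on
}

type allocation struct {
	NodeID           string
	ClientStatus     string
	RescheduleEvents []rescheduleEvent // stand-in for the alloc's reschedule history
}

// penaltyNodes returns the set of node IDs to penalize when placing a
// replacement for prev. A healthy alloc replaced by a plain job update
// produces no penalties, so the new alloc may land on its current node.
func penaltyNodes(prev *allocation) map[string]struct{} {
	penalized := map[string]struct{}{}
	if prev == nil {
		return penalized
	}
	// Only penalize the previous node if the alloc actually failed there.
	if prev.ClientStatus == allocClientStatusFailed {
		penalized[prev.NodeID] = struct{}{}
	}
	// If the alloc was itself a reschedule, keep penalizing the nodes its
	// earlier attempts ran on (the conservative behavior this PR keeps).
	for _, ev := range prev.RescheduleEvents {
		penalized[ev.PrevNodeID] = struct{}{}
	}
	return penalized
}

func main() {
	updated := &allocation{NodeID: "node-1", ClientStatus: "running"}
	failed := &allocation{
		NodeID:           "node-2",
		ClientStatus:     allocClientStatusFailed,
		RescheduleEvents: []rescheduleEvent{{PrevNodeID: "node-3"}},
	}
	fmt.Println(penaltyNodes(updated)) // map[] -> no forced migration on a plain update
	fmt.Println(penaltyNodes(failed))  // node-2 and node-3 both penalized (map order may vary)
}
```

With this shape, a plain job update (previous alloc still running) yields an empty penalty set, so the replacement can stay on its current node; only failed or previously rescheduled allocs push placement away from their old nodes.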

Review thread on scheduler/generic_sched.go (outdated, resolved):

// If alloc failed, penalize the node it failed on to encourage
// rescheduling on a new node.
if prevAllocation.ClientStatus == "failed" {
Contributor commented:

I think it's OK to omit "lost", but that might mean the whole node reboots and the job gets scheduled back onto the same node when it becomes available again; I'm not sure we want to apply a penalty there. Also, for google-fu, I think we should use structs.AllocClientStatusFailed.

Member replied:

Yeah, I couldn't think of a reason to penalize lost -- if the node comes back in time for the scheduler to place it there (seems extremely unlikely), what's the harm? If it's a flaky node I guess that's a problem, but now we're compounding unlikely scenarios.

If there's a chance the scheduler worker's statestore is using a snapshot that doesn't yet see the node as lost, that's a good reason to penalize it, I suppose. That could cause a pathological situation where many lost allocations are placed back on the lost node, only to fail at placement and go through scheduling all over again with a fresher snapshot. That too seems pretty unlikely to me, and it's impossible if we ensure the snapshot is at least as fresh as the evaluation being processed in the first place.

Member commented:

+1 to using structs.AllocClientStatusFailed
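
For reference, a small sketch of where the thread lands, assuming the status strings and the structs.AllocClientStatusFailed constant are as named above; the constants below are local stand-ins rather than imports from Nomad's structs package.

```go
package main

import "fmt"

// Local stand-ins for Nomad's client status values; the real constants live
// in the nomad/structs package (e.g. structs.AllocClientStatusFailed).
const (
	allocClientStatusFailed = "failed"
	allocClientStatusLost   = "lost"
)

// shouldPenalizePrevNode reflects the decision in this thread: penalize only
// on failure, and deliberately do not penalize "lost" (e.g. a node that
// rebooted and may come back).
func shouldPenalizePrevNode(clientStatus string) bool {
	return clientStatus == allocClientStatusFailed
}

func main() {
	fmt.Println(shouldPenalizePrevNode(allocClientStatusFailed)) // true
	fmt.Println(shouldPenalizePrevNode(allocClientStatusLost))   // false
	fmt.Println(shouldPenalizePrevNode("running"))               // false
}
```

Comparing against a named constant rather than the raw "failed" literal is the greppability point raised above, and returning false for "lost" reflects the decision not to penalize a node that merely went away.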

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
tgross merged commit 3716a67 into master on Dec 3, 2019
tgross deleted the gh5856_node_penalty branch on December 3, 2019, 14:14
github-actions bot commented:

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

The github-actions bot locked this pull request as resolved and limited the conversation to collaborators on Jan 24, 2023.
Successfully merging this pull request may close these issues:

Job update always lead to allocations migrate (node-reschedule-penalty)

4 participants