scheduler: create placements for non-register MRD #15325

lgfa29 · 2022-11-19T00:47:20Z

For multiregion jobs, the scheduler does not create placements on registration because the deployment must wait for the other regions. One of these regions will then trigger the deployment to run.

Currently, this is done in the scheduler by considering any eval for a multiregion job as "paused" since it's expected that another region will eventually unpause it.

This becomes a problem when an eval is not triggered by a job registration, such as on a node update. These regional changes have no other regions waiting to progress the deployment, so they never resulted in placements.

The fix is to create a deployment at job registration time. This additional piece of state allows the scheduler to differentiate a multiregion change, where other regions are engaged in the deployment and no placements are required, from a regional change, where the scheduler does need to create placements.
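To make the distinction concrete, here is a minimal sketch of the decision described above. It is not the code in this PR: the `Deployment` type, the `holdPlacements` helper, and the exact set of statuses treated as "waiting" are simplified assumptions for illustration.

```go
package main

import "fmt"

// Deployment is a hypothetical, stripped-down stand-in for a job deployment.
// It is not Nomad's actual struct.
type Deployment struct {
	Status string
}

// holdPlacements reports whether an eval for a job should skip creating
// placements. active is the job's in-flight deployment, or nil when no
// deployment is in progress. The status names are assumptions for this sketch.
func holdPlacements(multiregion bool, active *Deployment) bool {
	if !multiregion || active == nil {
		// No cross-region deployment is in flight, so this is a regional
		// change (for example a node update) and placements are needed.
		return false
	}
	switch active.Status {
	case "initializing", "pending", "paused":
		// Another region is expected to progress this deployment.
		return true
	default:
		return false
	}
}

func main() {
	// Eval while the registration-time deployment is still waiting.
	fmt.Println(holdPlacements(true, &Deployment{Status: "pending"}))
	// Eval from a node update with no deployment in flight.
	fmt.Println(holdPlacements(true, nil))
}
```

The point of the sketch is that the decision keys off deployment state rather than off the job being multiregion, which is what lets a node-update eval produce placements.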

This deployment starts in the new "initializing" status to signal to the scheduler that it needs to compute the initial deployment state. The multiregion deployment will wait until this deployment state is persisted and its status is set to "pending". Without this state transition it's possible to hit a race condition where the plan applier and the deployment watcher step on each other and overwrite each other's changes.
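The ordering can be sketched as a small state machine. This is only an illustration under stated assumptions: `persistInitialState`, `watcherMayAct`, and the `StatePersisted` field are hypothetical names, not identifiers from the Nomad code base, and the real coordination goes through the plan applier and the state store rather than through direct struct mutation.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	statusInitializing = "initializing"
	statusPending      = "pending"
)

// Deployment is a hypothetical, simplified deployment record.
type Deployment struct {
	Status         string
	StatePersisted bool // set once the scheduler's initial state is stored
}

var errNotInitializing = errors.New("deployment is not initializing")

// persistInitialState models the scheduler side: it stores the computed
// initial deployment state and only then moves the status to "pending".
func persistInitialState(d *Deployment) error {
	if d.Status != statusInitializing {
		return errNotInitializing
	}
	d.StatePersisted = true
	d.Status = statusPending
	return nil
}

// watcherMayAct models the watcher side: it does not touch the deployment
// until the initial state is persisted and the status is "pending".
func watcherMayAct(d *Deployment) bool {
	return d.Status == statusPending && d.StatePersisted
}

func main() {
	d := &Deployment{Status: statusInitializing}
	fmt.Println("watcher may act before persist:", watcherMayAct(d))
	if err := persistInitialState(d); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("watcher may act after persist:", watcherMayAct(d))
}
```

Because the watcher only acts once the status is "pending", the two writers never update the same deployment concurrently in this model.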

pkazmierczak

LGTM! Great work Luiz!

github-actions · 2023-03-26T02:14:35Z

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

vercel bot deployed to Preview – nomad-storybook-and-ui November 19, 2022 00:51 View deployment

changelog: add entry for #15325

7d21ad8

vercel bot deployed to Preview – nomad-storybook-and-ui November 19, 2022 01:15 View deployment

lgfa29 mentioned this pull request Nov 19, 2022

cli: improve errors for multiregion deployments #15326

Merged

lgfa29 added the backport/1.2.x, backport/1.3.x, and backport/1.4.x labels Nov 19, 2022

lgfa29 requested review from pkazmierczak and tgross November 19, 2022 01:28

pkazmierczak approved these changes Nov 21, 2022

lgfa29 mentioned this pull request Nov 22, 2022

scheduler: handle MRD jobs correctly in computeDeploymentPaused #14649

Closed

hc-github-team-nomad-core mentioned this pull request Nov 23, 2022

Backport of cli: improve errors for multiregion deployments into release/1.4.x #15375

Merged

lgfa29 merged commit a55c124 into main Nov 25, 2022

lgfa29 deleted the b-fix-mrd-node-update branch November 25, 2022 17:45

github-actions bot locked as resolved and limited conversation to collaborators Mar 26, 2023
