
core: reject plans which include or cause duplicate alloc indexes. #18376

Closed
jrasell wants to merge 4 commits

Conversation

@jrasell (Member) commented Sep 1, 2023

Nomad allocation indexes are documented as being unique across all allocations within a single version of a job. However, this is not enforced at any level, and users have reported seeing duplicate indexes used for allocations running the same job version.

In order to ensure this cannot happen, the plan applier will now perform a check and reject any plans which include, or would cause, duplicate indexes.

closes #10727
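
For illustration only, here is a minimal sketch of how duplicate index detection for a single job version could look. This is not the actual patch: the Allocation type is a simplified stand-in for Nomad's structs.Allocation, and findDuplicateIndexes is a hypothetical helper.

package main

import "fmt"

// Allocation is a simplified stand-in for Nomad's structs.Allocation, keeping
// only the fields relevant to the duplicate-index check.
type Allocation struct {
	JobVersion uint64
	TaskGroup  string
	Index      uint
}

// findDuplicateIndexes returns any index that appears more than once for the
// same task group and job version.
func findDuplicateIndexes(allocs []*Allocation) []uint {
	seen := make(map[string]map[uint]bool)
	var dups []uint

	for _, alloc := range allocs {
		key := fmt.Sprintf("%s-%d", alloc.TaskGroup, alloc.JobVersion)
		if seen[key] == nil {
			seen[key] = make(map[uint]bool)
		}
		if seen[key][alloc.Index] {
			dups = append(dups, alloc.Index)
		}
		seen[key][alloc.Index] = true
	}
	return dups
}

func main() {
	allocs := []*Allocation{
		{JobVersion: 3, TaskGroup: "web", Index: 0},
		{JobVersion: 3, TaskGroup: "web", Index: 0}, // duplicate index
		{JobVersion: 3, TaskGroup: "web", Index: 1},
	}
	fmt.Println(findDuplicateIndexes(allocs)) // prints [0]
}

A real check in the plan applier would additionally need to account for allocations already in state, not just those in the submitted plan.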

Test trigger with count 100
$ for i in {1..100}
do
go test -count 1 -run Test_evaluatePlanAllocIndexes .
done
ok  	github.com/hashicorp/nomad/nomad	0.562s
ok  	github.com/hashicorp/nomad/nomad	0.513s
ok  	github.com/hashicorp/nomad/nomad	0.558s
ok  	github.com/hashicorp/nomad/nomad	0.558s
ok  	github.com/hashicorp/nomad/nomad	0.546s
ok  	github.com/hashicorp/nomad/nomad	0.544s
ok  	github.com/hashicorp/nomad/nomad	0.530s
ok  	github.com/hashicorp/nomad/nomad	0.540s
ok  	github.com/hashicorp/nomad/nomad	0.555s
ok  	github.com/hashicorp/nomad/nomad	0.561s
ok  	github.com/hashicorp/nomad/nomad	0.552s
ok  	github.com/hashicorp/nomad/nomad	0.555s
ok  	github.com/hashicorp/nomad/nomad	0.559s
ok  	github.com/hashicorp/nomad/nomad	0.566s
ok  	github.com/hashicorp/nomad/nomad	0.556s
ok  	github.com/hashicorp/nomad/nomad	0.533s
ok  	github.com/hashicorp/nomad/nomad	0.544s
ok  	github.com/hashicorp/nomad/nomad	0.543s
ok  	github.com/hashicorp/nomad/nomad	0.544s
ok  	github.com/hashicorp/nomad/nomad	0.535s
ok  	github.com/hashicorp/nomad/nomad	0.557s
ok  	github.com/hashicorp/nomad/nomad	0.549s
ok  	github.com/hashicorp/nomad/nomad	0.529s
ok  	github.com/hashicorp/nomad/nomad	0.544s
ok  	github.com/hashicorp/nomad/nomad	0.547s
ok  	github.com/hashicorp/nomad/nomad	0.548s
ok  	github.com/hashicorp/nomad/nomad	0.533s
ok  	github.com/hashicorp/nomad/nomad	0.553s
ok  	github.com/hashicorp/nomad/nomad	0.555s
ok  	github.com/hashicorp/nomad/nomad	0.546s
ok  	github.com/hashicorp/nomad/nomad	0.542s
ok  	github.com/hashicorp/nomad/nomad	0.547s
ok  	github.com/hashicorp/nomad/nomad	0.538s
ok  	github.com/hashicorp/nomad/nomad	0.555s
ok  	github.com/hashicorp/nomad/nomad	0.545s
ok  	github.com/hashicorp/nomad/nomad	0.549s
ok  	github.com/hashicorp/nomad/nomad	0.536s
ok  	github.com/hashicorp/nomad/nomad	0.561s
ok  	github.com/hashicorp/nomad/nomad	0.543s
ok  	github.com/hashicorp/nomad/nomad	0.553s
ok  	github.com/hashicorp/nomad/nomad	0.553s
ok  	github.com/hashicorp/nomad/nomad	0.578s
ok  	github.com/hashicorp/nomad/nomad	0.547s
ok  	github.com/hashicorp/nomad/nomad	0.546s
ok  	github.com/hashicorp/nomad/nomad	0.546s
ok  	github.com/hashicorp/nomad/nomad	0.554s
ok  	github.com/hashicorp/nomad/nomad	0.556s
ok  	github.com/hashicorp/nomad/nomad	0.563s
ok  	github.com/hashicorp/nomad/nomad	0.558s
ok  	github.com/hashicorp/nomad/nomad	0.569s
ok  	github.com/hashicorp/nomad/nomad	0.561s
ok  	github.com/hashicorp/nomad/nomad	0.559s
ok  	github.com/hashicorp/nomad/nomad	0.570s
ok  	github.com/hashicorp/nomad/nomad	0.547s
ok  	github.com/hashicorp/nomad/nomad	0.557s
ok  	github.com/hashicorp/nomad/nomad	0.538s
ok  	github.com/hashicorp/nomad/nomad	0.546s
ok  	github.com/hashicorp/nomad/nomad	0.561s
ok  	github.com/hashicorp/nomad/nomad	0.557s
ok  	github.com/hashicorp/nomad/nomad	0.566s
ok  	github.com/hashicorp/nomad/nomad	0.555s
ok  	github.com/hashicorp/nomad/nomad	0.569s
ok  	github.com/hashicorp/nomad/nomad	0.569s
ok  	github.com/hashicorp/nomad/nomad	0.554s
ok  	github.com/hashicorp/nomad/nomad	0.555s
ok  	github.com/hashicorp/nomad/nomad	0.555s
ok  	github.com/hashicorp/nomad/nomad	0.545s
ok  	github.com/hashicorp/nomad/nomad	0.563s
ok  	github.com/hashicorp/nomad/nomad	0.573s
ok  	github.com/hashicorp/nomad/nomad	0.532s
ok  	github.com/hashicorp/nomad/nomad	0.566s
ok  	github.com/hashicorp/nomad/nomad	0.566s
ok  	github.com/hashicorp/nomad/nomad	0.560s
ok  	github.com/hashicorp/nomad/nomad	0.539s
ok  	github.com/hashicorp/nomad/nomad	0.557s
ok  	github.com/hashicorp/nomad/nomad	0.554s
ok  	github.com/hashicorp/nomad/nomad	0.566s
ok  	github.com/hashicorp/nomad/nomad	0.558s
ok  	github.com/hashicorp/nomad/nomad	0.549s
ok  	github.com/hashicorp/nomad/nomad	0.563s
ok  	github.com/hashicorp/nomad/nomad	0.547s
ok  	github.com/hashicorp/nomad/nomad	0.544s
ok  	github.com/hashicorp/nomad/nomad	0.568s
ok  	github.com/hashicorp/nomad/nomad	0.569s
ok  	github.com/hashicorp/nomad/nomad	0.548s
ok  	github.com/hashicorp/nomad/nomad	0.554s
ok  	github.com/hashicorp/nomad/nomad	0.551s
ok  	github.com/hashicorp/nomad/nomad	0.554s
ok  	github.com/hashicorp/nomad/nomad	0.574s
ok  	github.com/hashicorp/nomad/nomad	0.559s
ok  	github.com/hashicorp/nomad/nomad	0.551s
ok  	github.com/hashicorp/nomad/nomad	0.568s
ok  	github.com/hashicorp/nomad/nomad	0.556s
ok  	github.com/hashicorp/nomad/nomad	0.527s
ok  	github.com/hashicorp/nomad/nomad	0.548s
ok  	github.com/hashicorp/nomad/nomad	0.576s
ok  	github.com/hashicorp/nomad/nomad	0.571s
ok  	github.com/hashicorp/nomad/nomad	0.566s
ok  	github.com/hashicorp/nomad/nomad	0.559s
ok  	github.com/hashicorp/nomad/nomad	0.557s

@jrasell self-assigned this Sep 1, 2023
@tgross (Member) left a comment

This is looking like it's moving in the right direction, @jrasell. I've left some comments about the logic, which I think we'll want to verify with tests.

[Review threads on nomad/plan_apply.go (resolved)]
tgross previously approved these changes Oct 2, 2023

@tgross (Member) left a comment

LGTM 👍

Comment on lines +1454 to +1456
// Delete the alloc state entry, so it doesn't impact any subsequent
// tests.
must.NoError(t, testState.DeleteEval(30, []string{}, []string{canaryAlloc.ID}, false))

Should this (and similar bits in these tests) happen in a t.Cleanup so that subsequent tests can run even if a test fails?
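
If that suggestion were taken, a rough sketch might register the deletion as follows. This is a fragment, using only the identifiers from the quoted snippet above (testState, canaryAlloc, and the DeleteEval arguments), not a statement about how the tests are actually structured.

// Register the state cleanup so it runs even if the subtest fails before
// reaching the end of the test body.
t.Cleanup(func() {
	must.NoError(t, testState.DeleteEval(30, []string{}, []string{canaryAlloc.ID}, false))
})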

Comment on lines +1613 to +1615
// Perform a test where duplicates already exist within state. This is
// possible on all but fresh clusters.
t.Run("duplicate exist in state", func(t *testing.T) {

I'm really glad we have this test.

This also raises a question: what happens if we have existing duplicate allocations and the scheduler submits a plan that ignores them? For example, if I have alloc[0] and alloc[0] and then scale up the job, the existing allocs will be untouched. Does it become impossible to scale up/down and create that kind of plan for these jobs? (Maybe that's OK, if the user can then destroy one of the allocations out of band?)
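
To make the concern concrete, here is how the scenario might look against the hypothetical findDuplicateIndexes helper sketched earlier, assuming the check considers existing state allocations together with the plan's new ones (an assumption about wiring, not a statement about the patch):

// Two existing allocations already share index 0 for the same job version.
existing := []*Allocation{
	{JobVersion: 3, TaskGroup: "web", Index: 0},
	{JobVersion: 3, TaskGroup: "web", Index: 0}, // pre-existing duplicate in state
	{JobVersion: 3, TaskGroup: "web", Index: 1},
}

// A scale-up plan that only adds index 2 and leaves the existing allocs untouched.
planned := []*Allocation{{JobVersion: 3, TaskGroup: "web", Index: 2}}

if dups := findDuplicateIndexes(append(existing, planned...)); len(dups) > 0 {
	// The plan would still be rejected even though it introduced no new duplicates.
	fmt.Println("plan rejected, duplicate indexes:", dups)
}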

Comment on lines +860 to +861
// ensure correct removal. This path is most common when a node goes down
// and a plan is generated to replace the failed allocations.

Or any time we reschedule, right? (e.g., an allocation failure)

@tgross dismissed their stale review October 2, 2023 18:43

probably shouldn't have approved; needs more discussion around livelock

@schmichael (Member) commented

From a Slack discussion: I don't think we can take this approach. #10727 makes it appear that Nomad repeatedly selects overlapping alloc indexes over multiple deployments. While this could just be "bad luck," it could also be an indication that the scheduler is deterministically choosing overlapping indexes due to a bug.

If so, we need to fix the bug in the scheduler before we add this check to the plan applier. If we add the plan applier check first, we risk creating an unschedulable job, which is far worse than overlapping indexes (which most users probably don't even notice). This is effectively a livelock for a scheduler: it keeps doing the same work over and over but is unable to make progress, because the plan applier check would keep rejecting the buggy index selection.

@jrasell marked this pull request as draft October 3, 2023 06:47
@jrasell (Member, Author) commented Oct 25, 2023

I am closing this PR as the approach was deemed unsuitable to move forward with. I will link to this PR from any future PR that attempts to close the linked issue.

@jrasell closed this Oct 25, 2023

Successfully merging this pull request may close these issues.

NOMAD_ALLOC_INDEX is not always unique within a single service job version