Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto promote canaries #5719

Merged
merged 32 commits into from
May 22, 2019
Merged

auto promote canaries #5719

merged 32 commits into from
May 22, 2019

Conversation

langmartin
Copy link
Contributor

Add a new configuration option, auto_promote, that causes deployments with healthy canary allocations to be promoted automatically. This fixes #3636

Copy link
Member

@schmichael schmichael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the core logic looks good! An e2e test would be nice or at least an integration test possibly in the nomad/ server package to test without so much mocking.

Great work!

Backwards compat

I think it's possible for HealthyCanaries to be permanently inaccurate if servers are upgraded mid-deploy. However those deploys could never have AutoPromote=true so I don't think it matters.

api/jobs.go Outdated Show resolved Hide resolved
nomad/deploymentwatcher/deployment_watcher.go Outdated Show resolved Hide resolved
nomad/deploymentwatcher/deployment_watcher.go Outdated Show resolved Hide resolved
nomad/structs/structs.go Outdated Show resolved Hide resolved
nomad/structs/structs.go Show resolved Hide resolved
nomad/structs/structs.go Show resolved Hide resolved
nomad/deploymentwatcher/deployments_watcher_test.go Outdated Show resolved Hide resolved
nomad/deploymentwatcher/deployments_watcher_test.go Outdated Show resolved Hide resolved
@langmartin langmartin force-pushed the f-auto-promote-canaries branch 3 times, most recently from c02910f to dbee31a Compare May 20, 2019 19:00
Copy link
Member

@schmichael schmichael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto promote! E2E improvements! Typo fixes! So much to enjoy in one nice concise PR. My last 2 comments are blockers. Merge whenever you feel it's ready!

Copy link
Contributor

@notnoop notnoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Love the tests.

Have a question about autopromotion with multi-taskgroup jobs and few other nitpicks.

e2e/deployment/deployment.go Show resolved Hide resolved
e2e/deployment/input/deployment_auto0.nomad Show resolved Hide resolved
e2e/e2eutil/utils.go Show resolved Hide resolved
@@ -30,6 +30,7 @@ IMPROVEMENTS:
* core: Add `-verbose` flag to `nomad status` wrapper command [[GH-5516](https://github.com/hashicorp/nomad/pull/5516)]
* core: Reduce the size of the raft transaction for plans by only sending fields updated by the plan applier [[GH-5602](https://github.com/hashicorp/nomad/pull/5602)]
* core: Add ability to filter job deployments by most recent version of job [[GH-5702](https://github.com/hashicorp/nomad/issues/5702)]
* core: Add job update `auto_promote` flag, which causes deployments to promote themselves when all canaries become healthy [[GH-5719](https://github.com/hashicorp/nomad/pull/5719)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest updating changelog after merging PR as the changelog is common cause of conflicts.

TASKGROUP:
for tk, tv := range d.TaskGroups {
if !tv.AutoPromote || tv.DesiredCanaries != len(tv.PlacedCanaries) {
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind if you elaborate on auto promotion and multiple task groups? I would have expected auto promotion of a job to take place if all its task group were set for AutoPromote or if AutoPromote was a job level property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto_promote will only update the job if all the task groups are healthy and marked for auto_promote, it will otherwise stay in the pending state

Copy link
Contributor

@notnoop notnoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question about job and task group update merging - but it's lgtm otherwise.

api/jobs.go Outdated
@@ -433,6 +435,9 @@ func (u *UpdateStrategy) Copy() *UpdateStrategy {
copy.Canary = intToPtr(*u.Canary)
}

// boolPtrs need to preserve nil unset value
copy.AutoPromote = u.AutoPromote
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow this comment and why the copying here is different?

I presume that the rest of assignments used boolToPtr to avoid having two structures share the same reference when deep copying the structure; but I don't know for sure and we don't really do reference manipulation here afaik.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, I thought about that wrong. copy and merge should match the old behavior

au.Merge(taskGroup.Update)
} else {
au = taskGroup.Update
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect the merging of jobUpdate might be a subtle change in behavior. How about use taskGroup.Update if it's not nil, otherwise use jobUpdate? Either way, we should note it in the docs/changelog/release notes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can go either way on this, the current behavior does not match what the docs claim: "If specified at the job level, the configuration will apply to all groups within the job. If multiple update stanzas are specified, they are merged with the group stanza taking the highest precedence and then the job" I think if it's safer to keep the existing behavior, we should change the docs, and if we keep this change we can leave them as is.

Copy link
Contributor

@notnoop notnoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome!

@github-actions
Copy link

github-actions bot commented Feb 9, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatic deployment promotion of canary updates
4 participants