-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ineffective handling of canary edge case on service de-registration #10842
Labels
stage/accepted
Confirmed, and intend to work on. No timeline committment though.
theme/consul
type/bug
Comments
shoenig
added
type/bug
theme/consul
stage/accepted
Confirmed, and intend to work on. No timeline committment though.
labels
Jul 2, 2021
shoenig
added a commit
that referenced
this issue
Jul 6, 2021
There are bits of logic in callers of RemoveWorkload on group/task cleanup hooks which call RemoveWorkload with the "Canary" version of the workload, in case the alloc is marked as a Canary. This logic triggers an extra sync with Consul, and also doesn't do the intended behavior - for which no special casing is necessary anyway. When the workload is marked for removal, all associated services and checks will be removed regardless of the Canary status, because the service and check IDs do not incorporate the canary-ness in the first place. The only place where canary-ness matters is when updating a workload, where we need to compute the hash of the services and checks to determine whether they have been modified, the Canary flag of which is a part of that. Fixes #10842
shoenig
added a commit
that referenced
this issue
Jul 6, 2021
There are bits of logic in callers of RemoveWorkload on group/task cleanup hooks which call RemoveWorkload with the "Canary" version of the workload, in case the alloc is marked as a Canary. This logic triggers an extra sync with Consul, and also doesn't do the intended behavior - for which no special casing is necessary anyway. When the workload is marked for removal, all associated services and checks will be removed regardless of the Canary status, because the service and check IDs do not incorporate the canary-ness in the first place. The only place where canary-ness matters is when updating a workload, where we need to compute the hash of the services and checks to determine whether they have been modified, the Canary flag of which is a part of that. Fixes #10842
shoenig
added a commit
that referenced
this issue
Jul 6, 2021
There are bits of logic in callers of RemoveWorkload on group/task cleanup hooks which call RemoveWorkload with the "Canary" version of the workload, in case the alloc is marked as a Canary. This logic triggers an extra sync with Consul, and also doesn't do the intended behavior - for which no special casing is necessary anyway. When the workload is marked for removal, all associated services and checks will be removed regardless of the Canary status, because the service and check IDs do not incorporate the canary-ness in the first place. The only place where canary-ness matters is when updating a workload, where we need to compute the hash of the services and checks to determine whether they have been modified, the Canary flag of which is a part of that. Fixes #10842
shoenig
added a commit
that referenced
this issue
Jul 6, 2021
There are bits of logic in callers of RemoveWorkload on group/task cleanup hooks which call RemoveWorkload with the "Canary" version of the workload, in case the alloc is marked as a Canary. This logic triggers an extra sync with Consul, and also doesn't do the intended behavior - for which no special casing is necessary anyway. When the workload is marked for removal, all associated services and checks will be removed regardless of the Canary status, because the service and check IDs do not incorporate the canary-ness in the first place. The only place where canary-ness matters is when updating a workload, where we need to compute the hash of the services and checks to determine whether they have been modified, the Canary flag of which is a part of that. Fixes #10842
shoenig
added a commit
that referenced
this issue
Jul 6, 2021
There are bits of logic in callers of RemoveWorkload on group/task cleanup hooks which call RemoveWorkload with the "Canary" version of the workload, in case the alloc is marked as a Canary. This logic triggers an extra sync with Consul, and also doesn't do the intended behavior - for which no special casing is necessary anyway. When the workload is marked for removal, all associated services and checks will be removed regardless of the Canary status, because the service and check IDs do not incorporate the canary-ness in the first place. The only place where canary-ness matters is when updating a workload, where we need to compute the hash of the services and checks to determine whether they have been modified, the Canary flag of which is a part of that. Fixes #10842
shoenig
added a commit
that referenced
this issue
Jul 6, 2021
There are bits of logic in callers of RemoveWorkload on group/task cleanup hooks which call RemoveWorkload with the "Canary" version of the workload, in case the alloc is marked as a Canary. This logic triggers an extra sync with Consul, and also doesn't do the intended behavior - for which no special casing is necessary anyway. When the workload is marked for removal, all associated services and checks will be removed regardless of the Canary status, because the service and check IDs do not incorporate the canary-ness in the first place. The only place where canary-ness matters is when updating a workload, where we need to compute the hash of the services and checks to determine whether they have been modified, the Canary flag of which is a part of that. Fixes #10842
Merged
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Labels
stage/accepted
Confirmed, and intend to work on. No timeline committment though.
theme/consul
type/bug
In hooks
https://github.com/hashicorp/nomad/blob/v1.1.2/client/allocrunner/groupservice_hook.go#L212
and
https://github.com/hashicorp/nomad/blob/v1.1.2/client/allocrunner/taskrunner/service_hook.go#L177
we attempt a little trick to incur a de-registration of potential canary versions of services registered for a group or task. In reality this is unnecessary - the Consul sync is based on the computed IDs of the service or check. This computed ID does not incorporate whether the service or check is part of a Canary workload. All services of the workload are removed regardless if the alloc is tagged as a canary.
MakeAllocServiceID
MakeCheckID
The extra call to
RemoveWorkload
does cause an extra sync with Consul, exacerbating #10797The text was updated successfully, but these errors were encountered: