Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trhooks: Add TaskStopHook interface to services #5821

Merged
merged 2 commits into from
Jun 12, 2019
Merged

Conversation

endocrimes
Copy link
Contributor

@endocrimes endocrimes commented Jun 12, 2019

We currently only run cleanup Service Hooks when a task is either
Killed, or Exited. However, due to the implementation of a task runner,
tasks are only Exited if they every correctly started running, which is
not true when you recieve an error early in the task start flow, such as
not being able to pull secrets from Vault.

This updates the service hook to also call consul deregistration
routines during a task Stop lifecycle event, to ensure that any
registered checks and services are cleared in such cases.

fixes #5770

@endocrimes endocrimes force-pushed the dani/b-5770 branch 3 times, most recently from 036f939 to ec18a80 Compare June 12, 2019 13:57
We currently only run cleanup Service Hooks when a task is either
Killed, or Exited. However, due to the implementation of a task runner,
tasks are only Exited if they every correctly started running, which is
not true when you recieve an error early in the task start flow, such as
not being able to pull secrets from Vault.

This updates the service hook to also call consul deregistration
routines during a task Stop lifecycle event, to ensure that any
registered checks and services are cleared in such cases.

fixes #5770
Copy link
Contributor

@notnoop notnoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change lgtm.

It's quite reasonable to have Stop here deregister as a catch-all case as Stop() is called after alloc is marked as dead in

// Mark the task as dead
tr.UpdateState(structs.TaskStateDead, nil)
// Run the stop hooks
if err := tr.stop(); err != nil {
tr.logger.Error("stop failed", "error", err)
}
. This may lead to multiple deregistering invocation, but we already deal with that in PreKill and Exited cases.

It would be great to have an integration test that ensures that service is deregistered when prestart hook fails.


// Removing canary and non-canary entries on stop
require.Equal(t, "remove", consulOps[1].Op)
require.Equal(t, "remove", consulOps[2].Op)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean 6 and 7 here?

@endocrimes endocrimes merged commit f6757ff into master Jun 12, 2019
@endocrimes endocrimes deleted the dani/b-5770 branch June 12, 2019 15:30
@github-actions
Copy link

github-actions bot commented Feb 8, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nomad not cleaning consul services/healthcheck after failed deploy [0.9.x]
2 participants