Delayed evaluations for stop_after_client_disconnect can cause unwanted extra followup evaluations around job garbage collection #8098

Closed
langmartin opened this issue Jun 2, 2020 · 1 comment · Fixed by #8099

Comments

@langmartin
Contributor

stop_after_client_disconnect was introduced in Nomad 0.11.2. If a Nomad client is separated from the network, causing the scheduler to delay the evaluation, and the job is subsequently garbage collected, the followup evaluation creates two more followup evaluations, each with a WaitUntil of 0. Each of those in turn creates two more, so the number of pending evaluations doubles every round until the cluster leader becomes unresponsive.

This failure mode requires all three of the following: the job opts in to the new feature, the feature is used to delay rescheduling, and the job is garbage collected (or removed with stop -purge).
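
To illustrate why this growth overwhelms the leader so quickly, here is a minimal Go sketch of the doubling described above. It is a toy model, not Nomad's scheduler code: the `eval` type and the `processRound` and `pendingAfter` functions are hypothetical names introduced only for this example, and the model assumes every followup is processed once per round.

```go
package main

import "fmt"

// eval is a hypothetical stand-in for a scheduler evaluation.
// It is NOT Nomad's real evaluation struct.
type eval struct {
	waitUntil int // 0 means "eligible to run immediately"
}

// processRound models the faulty behavior from the report: every followup
// evaluation for a garbage-collected job produces two new followups,
// each with a zero wait.
func processRound(pending []eval) []eval {
	next := make([]eval, 0, len(pending)*2)
	for range pending {
		next = append(next, eval{waitUntil: 0}, eval{waitUntil: 0})
	}
	return next
}

// pendingAfter returns how many evaluations are pending after the given
// number of rounds, starting from a single delayed followup evaluation.
func pendingAfter(rounds int) int {
	pending := []eval{{waitUntil: 0}}
	for i := 0; i < rounds; i++ {
		pending = processRound(pending)
	}
	return len(pending)
}

func main() {
	for _, r := range []int{1, 5, 10, 20} {
		fmt.Printf("after %2d rounds: %d pending evaluations\n", r, pendingAfter(r))
	}
}
```

Under these assumptions the count after n rounds is 2^n. Because each followup has a zero wait, nothing slows the rounds down, which is consistent with the leader becoming unresponsive.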

@github-actions

github-actions bot commented Nov 6, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 6, 2022