Make Job + Eval submitting atomic #8219

Closed
schmichael opened this issue Jun 19, 2020 · 1 comment · Fixed by #8435

Comments

@schmichael
Member

Currently a job can be submitted successfully while Nomad fails to create an evaluation for it. The job then stays in the pending state indefinitely, until an operator notices and manually creates a new evaluation.

An orphaned job can be created by a server crash, a leader election, or a backup that happens after the job has been committed to Raft but before the corresponding evaluation has been committed. While this should be exceedingly rare, it does happen.

It's especially problematic with periodic jobs as the only indication of a failure is a log line on the current leader:

nomad.periodic: failed to dispatch job ...

Failures when submitting a job through the API would return a similar error to the user, so they would have immediate feedback and could resubmit the job.

Solution

Submit the job and its eval in a single Raft log entry. This ensures that either both are fully committed or the whole operation fails, leaving no job and no Raft log entry.
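
To illustrate the idea, here is a minimal sketch of an all-or-nothing apply for a combined entry. It is not Nomad's actual code: `Job`, `Eval`, `RegisterEntry`, and `State` are hypothetical, pared-down types, and a real state store would apply the entry inside a transaction.

```go
package main

import "errors"

// Job and Eval are hypothetical, pared-down stand-ins for the real structs.
type Job struct{ ID string }

type Eval struct {
	ID    string
	JobID string
}

// RegisterEntry is a hypothetical single log entry carrying both objects.
type RegisterEntry struct {
	Job  *Job
	Eval *Eval
}

// State is a toy in-memory state store (an assumption for this sketch).
type State struct {
	Jobs  map[string]*Job
	Evals map[string]*Eval
}

func NewState() *State {
	return &State{Jobs: map[string]*Job{}, Evals: map[string]*Eval{}}
}

// Apply validates the whole entry before mutating anything, so the job and
// its eval are either both stored or neither is.
func (s *State) Apply(e RegisterEntry) error {
	if e.Job == nil || e.Eval == nil {
		return errors.New("entry must carry both a job and an eval")
	}
	if e.Eval.JobID != e.Job.ID {
		return errors.New("eval does not reference the job")
	}
	s.Jobs[e.Job.ID] = e.Job
	s.Evals[e.Eval.ID] = e.Eval
	return nil
}
```

Because the job and the eval travel in one entry, there is no window between two separate commits in which a crash or leader election could leave the job without an eval.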

In the case of a leader election during periodic job dispatching, the newly elected leader should notice the missing invocation and create it successfully.
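
One way a new leader could notice a missing invocation is sketched below. This is only an illustration under simplifying assumptions: `PeriodicJob` and `MissedLaunch` are hypothetical names, and the job is assumed to fire on a fixed interval rather than on a cron-style schedule.

```go
package main

import "time"

// PeriodicJob is a hypothetical, simplified periodic job: it fires every
// Interval, and LastLaunch is the last invocation recorded in cluster state.
type PeriodicJob struct {
	ID         string
	Interval   time.Duration
	LastLaunch time.Time
}

// MissedLaunch returns the most recent launch time that was due after
// j.LastLaunch and at or before now, and whether such a launch exists.
func MissedLaunch(j PeriodicJob, now time.Time) (time.Time, bool) {
	if j.Interval <= 0 {
		return time.Time{}, false
	}
	elapsed := now.Sub(j.LastLaunch)
	if elapsed < j.Interval {
		// The next launch is not due yet, so nothing was missed.
		return time.Time{}, false
	}
	// Number of whole intervals that have passed since the last launch.
	periods := int64(elapsed / j.Interval)
	return j.LastLaunch.Add(time.Duration(periods) * j.Interval), true
}
```

On gaining leadership, a server could run a check like this for each periodic job and dispatch the invocation, together with its eval, whenever a launch is reported as missed.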

@github-actions

github-actions bot commented Nov 4, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 4, 2022