Job history and auto_revert fail when version exceeds 255 #3357

Closed
hobochili opened this issue Oct 11, 2017 · 3 comments · Fixed by #3372

@hobochili
Contributor

hobochili commented Oct 11, 2017

Nomad version

Confirmed in 0.6.2, 0.6.3, and v0.7.0-beta1

Operating system and Environment details

Reproduced in included dev Vagrant VM (Linux)

Issue

With auto_revert set to true and a job version greater than 255, a failed deploy will revert to the last stable job with version less than or equal to 255, even when there is a stable version greater than 255.

Two other issues, probably related: nomad job history <job> doesn't display versions past 255, and nomad job history -version <version> <job> outputs a stack trace if the specified version doesn't exist. Let me know if I should file these separately.

Reproduction steps

  • Create a stable job in Nomad with auto_revert=true
  • Update that job until its version exceeds 255
  • Tweak the job so that it will fail on the next deploy
  • Deploy the faulty job

If the last stable job was 256 and job 257 fails, Nomad should revert back to 256. Instead, it will revert to 255 (assuming 255 was stable).

Here's a gist of the script I used to generate 256 stable jobs and a 257th unstable job: https://gist.github.com/hobochili/c714d246d20b8b3c0bf985b2b9a54b5a

In this test, the failing 257th job reverts back to 255 rather than 256. This issue persists for all subsequent failed jobs.

Nomad Server logs

https://gist.github.com/hobochili/762f8ebade8288a7322f81ab7931cf1d

Nomad Client logs

==> Monitoring evaluation "3dea4483"
    Evaluation triggered by job "test"
    Allocation "3d1e439d" created: node "1b8ac397", group "fail"
    Evaluation within deployment: "1b0f1af6"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "3dea4483" finished with status "complete"

Job file

https://gist.github.com/hobochili/c714d246d20b8b3c0bf985b2b9a54b5a

@hobochili hobochili reopened this Oct 11, 2017
@hobochili hobochili changed the title from "job history and auto_revert failing beyond the 255th deployment" to "Job history and auto_revert fail when version exceeds 255" Oct 11, 2017
@dadgar
Contributor

dadgar commented Oct 11, 2017

Thanks for the detailed report. I will try to reproduce soon and will update with findings!

dadgar added a commit that referenced this issue Oct 12, 2017
Fixes an issue in which the versions were improperly sorted which would
cause pruning of the wrong job version. This essentially meant that job
versions above 255 would be dropped from the job version table (note
this was due to the prefix walk crossing from the 1-byte to 2-byte
threshold).

Fixes #3357
@dadgar
Contributor

dadgar commented Oct 12, 2017

@hobochili Thanks for the issue! Got it fixed, and the fix will be out with 0.7.

@github-actions

github-actions bot commented Dec 7, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 7, 2022