CLI: Failed job run with successful rollback should still result in rc != 0 #10994

Closed
peterfroehlich opened this issue Aug 3, 2021 · 4 comments · Fixed by #11550
Comments

@peterfroehlich

Nomad version, OS version

Nomad v1.1.2 (60638a0)
Centos 7.9

Issue

When a deployment started with "nomad job run" fails and is rolled back, the command still exits with return code 0. This is an error imho, as the original intention of the command (deploying the new version) failed.

Sample output:

$ nomad run -var version=1 server-staging.nomad
==> 2021-08-02T18:57:12+02:00: Monitoring evaluation "d26d2bf1"
    2021-08-02T18:57:12+02:00: Evaluation triggered by job "jobspec-staging"
    2021-08-02T18:57:12+02:00: Evaluation within deployment: "c0b8a749"
    2021-08-02T18:57:12+02:00: Allocation "2edcde23" created: node "c152d901", group "jobspec"
    2021-08-02T18:57:12+02:00: Evaluation status changed: "pending" -> "complete"
==> 2021-08-02T18:57:12+02:00: Evaluation "d26d2bf1" finished with status "complete"
==> 2021-08-02T18:57:12+02:00: Monitoring deployment "c0b8a749"
  ⠦ Deployment "c0b8a749" failed
    2021-08-02T19:02:12+02:00
    ID          = c0b8a749
    Job ID      = jobspec-staging
    Job Version = 23
    Status      = failed
    Description = Failed due to progress deadline - rolling back to job version 22
    Deployed
    Task Group  Auto Revert  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    jobspec     true         1        3       0        3          2021-08-02T19:02:12+02:00
  ⠙ Deployment "769e7273" successful
    2021-08-02T19:02:47+02:00
    ID          = 769e7273
    Job ID      = jobspec-staging
    Job Version = 24
    Status      = successful
    Description = Deployment completed successfully
    Deployed
    Task Group  Auto Revert  Desired  Placed  Healthy  Unhealthy  Progress Deadline
    jobspec     true         1        1       1        0          2021-08-02T19:07:45+02:00

$ echo $?
0

Reproduction steps

Configure the job-wide update stanza with "auto_revert = true" and deploy a job version that fails to become healthy:

    update {
      max_parallel      = 1
      min_healthy_time  = "10s"
      healthy_deadline  = "2m"
      progress_deadline = "5m"
      auto_revert       = true
      stagger           = "30s"
    }

Expected Result

Because the initial update failed, the CLI should exit with rc != 0 so that automation systems (like Jenkins) can react without parsing the output or requesting further status.
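For illustration, this is roughly the kind of output parsing an automation wrapper would otherwise need. It is a minimal sketch that assumes the "Status = ..." lines look like those in the sample output above; `deployment_statuses` and `exit_code_from_output` are hypothetical helper names, not part of Nomad.

```python
import re
from typing import List

# Matches lines such as "    Status      = failed" in the deployment summary
# printed by the CLI (format assumed from the sample output in this report).
STATUS_LINE = re.compile(r"^\s*Status\s*=\s*(\w+)\s*$")


def deployment_statuses(cli_output: str) -> List[str]:
    """Return every deployment status found in the output, in order."""
    statuses = []
    for line in cli_output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            statuses.append(match.group(1))
    return statuses


def exit_code_from_output(cli_output: str) -> int:
    """Hypothetical helper: 1 if any monitored deployment failed, else 0."""
    return 1 if "failed" in deployment_statuses(cli_output) else 0
```

A wrapper built this way breaks as soon as the output format changes, which is why a non-zero return code from the CLI itself is preferable.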

Actual Result

The "nomad" CLI exits with rc=0. I guess this is the case because the rollback succeeded.

@lgfa29
Contributor

lgfa29 commented Aug 19, 2021

Thanks @peterfroehlich. I think it's a tricky situation where the deployment "failed successfully" 😅

We'll discuss further how to better handle this situation.

@muuki88

muuki88 commented Nov 8, 2021

Hi there 😃

Is there any update on this? Anything we can provide to help?

@tgross tgross moved this from Needs Triage to Needs Roadmapping in Nomad - Community Issues Triage Nov 9, 2021
Nomad - Community Issues Triage automation moved this from Needs Roadmapping to Done Dec 9, 2021
@tgross
Member

tgross commented Dec 9, 2021

Fixed in #11550 and will ship in the next planned patch release.
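With a CLI that returns a non-zero code for a failed deployment, as requested in this issue, an automation step only needs to pass the return code along. The sketch below assumes exactly that behaviour; the thread does not state which non-zero value is used, and `run_step` is a hypothetical helper name.

```python
import subprocess
import sys
from typing import List


def run_step(command: List[str]) -> int:
    """Run a command, echo its output, and return its exit code unchanged."""
    result = subprocess.run(command, capture_output=True, text=True)
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    return result.returncode
```

A pipeline step could then call `sys.exit(run_step(["nomad", "run", "-var", "version=1", "server-staging.nomad"]))`, using the command line from the sample output in this report, and let the CI system mark the step as failed on any non-zero code.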

@tgross tgross added this to the 1.2.3 milestone Dec 9, 2021
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 13, 2022