
Prevent kill_timeout greater than progress_deadline #8487

Closed

threemachines opened this issue Jul 21, 2020 · 4 comments
Labels
good first issue, help-wanted, stage/accepted, theme/jobspec, type/bug

Comments


threemachines commented Jul 21, 2020

Nomad version

Nomad v0.12.0 (8f7fbc8)

Operating system and Environment details

Ubuntu 18.04 on AWS

Issue

If a task's kill_timeout is greater than the job's progress_deadline, old allocations may keep running (after receiving the initial kill signal) long enough that the deployment fails. However, the new allocations scheduled for that deployment will still be pending after it fails, and will be placed and started once the previous allocations finally exit.

I'm honestly split on whether this is a bug or a feature request. I can't say for certain that the current behavior is wrong, but we found it surprising and unpleasant. We hit it through an accidental misconfiguration, and I'm hard-pressed to imagine a situation where you'd really want your job to work this way, so why not add a guard-rail?

The alternative fix I can imagine is for the progress_deadline timer to not start until the new allocations have been placed, but that would have other problems.

Reproduction steps

  1. Deploy the job file below. Should be fine.
  2. Change the environment variable to force a redeploy and deploy the job again. The deploy will fail on timeout.
  3. Do it again, just for fun. (Change the environment variable again.)
  4. Wait until five minutes after the second deploy.

You now have fifteen allocations from deploy 0, and five allocations from deploy 2, which all started several minutes after deploy 2 failed. (The five allocations from deploy 1 went directly from pending to completed.)

Job file

job "example" {
  datacenters = ["sandbox"]
  type = "service"
  update {
    max_parallel = 5
    min_healthy_time = "10s"
    healthy_deadline = "30s"
    progress_deadline = "1m"
    auto_revert = false
    canary = 0
  }
  migrate {
    max_parallel = 5
    health_check = "checks"
    min_healthy_time = "10s"
    healthy_deadline = "5m"
  }
  group "cache" {
    count = 20
    restart {
      attempts = 2
      interval = "30m"
      delay = "15s"
      mode = "fail"
    }
    ephemeral_disk {
      size = 300
    }
    task "redis" {
      driver = "docker"
      ## send redis a sighup so it doesn't actually exit
      ## this lets us play with the kill_timeouts
      kill_signal = "SIGHUP"
      kill_timeout = "5m"
      config {
        image = "redis:3.2"
        port_map {
          db = 6379
        }
      }
      env {
        asdf = "asdf"
      }
      resources {
        cpu    = 500
        memory = 256
        network {
          mbits = 10
          port "db" {}
        }
      }
      service {
        name = "redis-cache"
        tags = ["global", "cache"]
        port = "db"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}

One obvious modification is to set auto_revert = true. Although that does eventually result in a better final state (all 20 allocations from the same job and the same version), the way it gets there is very alarming, and you end up with some weird time-traveling jobs (version 1 finished a minute after version 2, and so on). If you stay in this jobs-pending state for more than a few minutes (we, uh, had a 24h kill_timeout...), your allocations can become nightmarish. (I think there are additional failure modes when you stack enough of those up, but I haven't tried to reproduce them yet.)


tgross commented Dec 16, 2020

Hi @threemachines, sorry no one got back to you on this one. I agree that this is always going to be a misconfiguration, so we should put some validation of the jobspec in place for it. That sort of makes this issue a "bug-hancement", but I'll mark it as accepted so it gets on our roadmap.
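
For illustration, the guard-rail being discussed boils down to comparing two durations from the jobspec. The sketch below is not Nomad's actual validation code; UpdateStrategy, Task, and validateKillTimeout are simplified, hypothetical stand-ins for the real jobspec types.

package main

import (
	"fmt"
	"time"
)

// Simplified, hypothetical stand-ins for the relevant jobspec fields; the
// real Nomad structs carry far more than these two durations.
type UpdateStrategy struct {
	ProgressDeadline time.Duration // update { progress_deadline = ... }
}

type Task struct {
	Name        string
	KillTimeout time.Duration // task { kill_timeout = ... }
}

// validateKillTimeout flags the misconfiguration from this issue: a task
// whose kill_timeout exceeds the group's progress_deadline can keep old
// allocations alive long enough to fail the deployment.
func validateKillTimeout(u *UpdateStrategy, tasks []*Task) error {
	for _, t := range tasks {
		if t.KillTimeout > u.ProgressDeadline {
			return fmt.Errorf("task %q has kill_timeout (%s) longer than progress_deadline (%s)",
				t.Name, t.KillTimeout, u.ProgressDeadline)
		}
	}
	return nil
}

func main() {
	// The values from the job file above: progress_deadline = "1m", kill_timeout = "5m".
	update := &UpdateStrategy{ProgressDeadline: time.Minute}
	tasks := []*Task{{Name: "redis", KillTimeout: 5 * time.Minute}}
	if err := validateKillTimeout(update, tasks); err != nil {
		fmt.Println("validation error:", err)
	}
}

Run against the job file above, a check like this would reject the job at submission time instead of letting the deployment fail minutes later.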

@pavanrangain

I wonder why this got backported in a patch release. We were waiting for the fix for #17079, but this backport is preventing us from moving to 1.5.6, since many jobs may now need a change.
Isn't it considered a breaking change to have this backported to a patch release?

Also, if I check the changes in https://github.com/hashicorp/nomad/pull/17207/files, does it mean we can no longer set progress_deadline to 0 for tasks that take longer to become healthy, and always have to set some higher value?

https://developer.hashicorp.com/nomad/docs/job-specification/update#progress_deadline - this does not explicitly say so, but if I read it right, 0 is treated as infinite, i.e. the deployment only fails once an alloc becomes unhealthy.


Juanadelacuesta commented May 24, 2023

Hello pavanrangain, I'm sorry this issue is delaying your progress. This change was not intended as a breaking one; it is just meant to flag a misconfiguration that can cause your cluster to act in unexpected ways: when updating a task, the "old" allocations might take too long to exit gracefully and won't allow the new allocations to succeed, causing a failed deployment.

Addressing your concern about not being able to set progress_deadline to 0 for slow tasks: that might be an unintended consequence of the bug fix, and we will be looking into it.


lgfa29 commented May 29, 2023

Closed by #16761.

The incorrect validation when progress_deadline is 0 is fixed in #17342.
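
For illustration, relative to the hypothetical sketch shown earlier in this thread (and not the actual change in #17342), the follow-up fix amounts to treating a zero progress_deadline as "no deadline" and skipping the comparison entirely:

// Revised version of the earlier hypothetical sketch: a progress_deadline of 0
// means "no deadline" (the deployment only fails once an alloc becomes
// unhealthy), so it must not be compared against every kill_timeout.
func validateKillTimeout(u *UpdateStrategy, tasks []*Task) error {
	if u == nil || u.ProgressDeadline == 0 {
		return nil // unlimited deadline: nothing to enforce
	}
	for _, t := range tasks {
		if t.KillTimeout > u.ProgressDeadline {
			return fmt.Errorf("task %q has kill_timeout (%s) longer than progress_deadline (%s)",
				t.Name, t.KillTimeout, u.ProgressDeadline)
		}
	}
	return nil
}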

lgfa29 closed this as completed May 29, 2023