
resource stanza completely ignored during first plan #8544

Closed

shantanugadgil opened this issue Jul 27, 2020 · 3 comments · Fixed by #8567

Comments

shantanugadgil (Contributor) commented Jul 27, 2020

Nomad version

Nomad v0.12.1 (14a6893)

Operating system and Environment details

CentOS 7/8, Ubuntu 16/18

Issue

The resource stanza seems to be ignored during the first plan (i.e. before the job is submitted)

Reproduction steps

Use the example job file below and note the large amount of memory requested; there is no machine in the network with that much memory.

Running nomad plan should catch this and show the usual "memory exhausted ..." message.

Job file (if appropriate)

job "example" {
  datacenters = ["dc1"]

  group "cache" {
    count = 1

    scaling {
      enabled = true
      min = 1
      max = 1
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"

        port_map {
          db = 6379
        }
      }

      resources {
        cpu    = 512
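        # intentionally oversized (2,560,000 MB ≈ 2.5 TB) so no node can satisfy the request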
        memory = 2560000

        network {
          mbits = 10
          port "db" {}
        }
      }
    }
  }
}

Output of nomad plan ... 😖

nomad plan example.nomad
+ Job: "example"
+ Task Group: "cache"
  + Task: "redis" (forces create)

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 0
To submit the job with version verification run:

nomad job run -check-index 0 example.nomad

Of course, nomad run makes the job go into 'Pending' state...


jrasell (Member) commented Jul 29, 2020

Hi @shantanugadgil, thanks for raising this with the reproduction steps.

Running this locally, I found that the regression was introduced in the 0.11.0 release. In the 0.10.5 release the plan warns as expected (shown below).

Scheduler dry-run:
- WARNING: Failed to place all allocations.
  Task Group "cache" (failed to place 1 allocation):
    * Resources exhausted on 1 nodes
    * Dimension "memory" exhausted on 1 nodes

notnoop pushed a commit that referenced this issue Jul 30, 2020
Fixes #8544

This PR fixes a bug where `nomad job plan ...` always reports no change if the submitted job contains a scaling stanza.

The issue has three contributing factors:
1. The plan endpoint doesn't populate the required scaling policy ID; unlike the job register endpoint
2. The plan endpoint suppresses errors on job insertion - the job insertion fails here, because the scaling policy is missing the required ID
3. The scheduler reports no update necessary when the relevant job isn't in store (because the insertion failed)

This PR fixes the first two factors.  Changing the scheduler to be more strict might make sense, but may violate some idempotency invariant or make the scheduler more brittle.
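For illustration, here is a minimal, self-contained Go sketch of the pattern behind the fix; the types and helpers are hypothetical stand-ins, not Nomad's actual internals. The idea is that the plan path assigns any missing scaling policy IDs before inserting the job into its state snapshot, and it surfaces the insertion error rather than swallowing it.

package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// ScalingPolicy and Job are simplified stand-ins for the real Nomad structs.
type ScalingPolicy struct {
	ID string
}

type Job struct {
	Name            string
	ScalingPolicies []*ScalingPolicy
}

// state mimics the state snapshot that the plan endpoint inserts the job into.
type state struct {
	jobs map[string]*Job
}

// upsertJob rejects scaling policies that are missing an ID, which is what made
// the plan-time insertion fail before the fix.
func (s *state) upsertJob(job *Job) error {
	for _, p := range job.ScalingPolicies {
		if p.ID == "" {
			return errors.New("scaling policy is missing an ID")
		}
	}
	s.jobs[job.Name] = job
	return nil
}

// newID generates a random identifier for the sketch; real code would use a UUID.
func newID() string {
	b := make([]byte, 16)
	_, _ = rand.Read(b)
	return hex.EncodeToString(b)
}

// planJob shows the two corrected behaviours: populate missing scaling policy IDs
// (as the register endpoint already does) and propagate the insertion error instead
// of continuing the dry run against a snapshot that never received the job.
func planJob(s *state, job *Job) error {
	for _, p := range job.ScalingPolicies {
		if p.ID == "" {
			p.ID = newID()
		}
	}
	if err := s.upsertJob(job); err != nil {
		return fmt.Errorf("plan: failed to insert job into snapshot: %w", err)
	}
	return nil
}

func main() {
	s := &state{jobs: map[string]*Job{}}
	job := &Job{Name: "example", ScalingPolicies: []*ScalingPolicy{{}}}
	if err := planJob(s, job); err != nil {
		fmt.Println("plan error:", err)
		return
	}
	fmt.Println("job inserted for dry-run scheduling:", s.jobs["example"].Name)
}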
github-actions bot commented Nov 4, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Nov 4, 2022