
Job Evaluations are not correctly adjusting for dead worker nodes #1663

Closed
BSick7 opened this issue Aug 30, 2016 · 6 comments

Comments


BSick7 commented Aug 30, 2016

Nomad version

Clients and Servers

$ nomad -v
Nomad v0.4.1

Operating system and Environment details

Our clients have the following set:

leave_on_interrupt = true
leave_on_terminate = true

We run Nomad workers on immutable infrastructure.
We have two workgroups (blue and green), which lets us upgrade worker boxes without downtime.
Our Nomad jobs target both workgroups using node-class constraints.
Our job status may look something like this:

$ nomad status deploy
ID          = deploy
Name        = deploy
Type        = system
Priority    = 50
Datacenters = us-east-1a,us-east-1b,us-east-1d
Status      = running
Periodic    = false

Summary
Task Group    Queued  Starting  Running  Failed  Complete  Lost
deploy-blue   0       0         3        0       0         0
deploy-green  0       0         3        0       0         0

Allocations
ID        Eval ID   Node ID   Task Group    Desired  Status   Created At
0e2eee8f  c340a16a  82e0e895  deploy-green  run      running  08/29/16 13:54:58 UTC
23884cb5  fcc2b1fc  92264826  deploy-green  run      running  08/29/16 13:54:55 UTC
12f809a5  4e1fee7c  11cdc54f  deploy-green  run      running  08/29/16 13:54:52 UTC
006ab90f  93168101  ec3aa0f7  deploy-blue   run      running  08/29/16 12:22:24 UTC
75712d72  93168101  19dea94b  deploy-blue   run      running  08/29/16 12:22:24 UTC
8d55a1bf  93168101  6ff73ef5  deploy-blue   run      running  08/29/16 12:22:24 UTC

Issue

The issue arises when we upgrade our nodes.
This job is scheduled as a system job, so we expect it to run on every live worker node.

Instead, the job appears to be placed by comparing the total number of live workers against the total number of allocations.
From the job status above, all 6 allocations are on worker nodes that are now down.

$ nomad node-status
ID        DC          Name              Class  Drain  Status
0d72463c  us-east-1a  worker-green-17   green  false  ready
772737ae  us-east-1b  worker-green-69   green  false  ready
f2424785  us-east-1d  worker-green-148  green  false  ready
2f1c84e1  us-east-1b  worker-green-69   green  false  down
0683e842  us-east-1a  worker-green-19   green  false  down
f26a326a  us-east-1d  worker-green-158  green  false  down
8a2b653d  us-east-1a  worker-green-8    green  false  down
a9fc0246  us-east-1b  worker-green-86   green  false  down
b88c0861  us-east-1d  worker-green-133  green  false  down
11cdc54f  us-east-1b  worker-green-69   green  false  down
82e0e895  us-east-1d  worker-green-140  green  false  down
92264826  us-east-1a  worker-green-22   green  false  down
ec3aa0f7  us-east-1b  worker-blue-85    blue   false  down
6ff73ef5  us-east-1d  worker-blue-139   blue   false  down
19dea94b  us-east-1a  worker-blue-20    blue   false  down

Since a system job should run on every live node, we would expect 3 running allocations, one per live node.
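To cross-check the mismatch, each allocation above can be looked up against the node it was placed on (a rough sketch using the 0.4.1 CLI; the short IDs come from the output above, and the CLI accepts ID prefixes):

$ nomad alloc-status 0e2eee8f   # one of the allocations reported as "running"
$ nomad node-status 82e0e895    # the node it was placed on, which node-status reports as "down"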

Reproduction steps

  1. Start with a Nomad cluster that has a single worker node.
  2. Run a system job that places on that node.
  3. Start another worker node that satisfies the same constraint.
  4. Stop the original worker node and re-check the job status (a rough CLI sketch of these steps follows below).
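
A rough shell sketch of those steps (the config paths and file names here are assumptions for illustration, not taken from the report):

# 1. Start a cluster with a single worker node (server plus one client)
$ nomad agent -config /etc/nomad/server.hcl
$ nomad agent -config /etc/nomad/client-green-1.hcl

# 2. Register the system job so it places on that node
$ nomad run deploy.nomad

# 3. Start a second worker node that satisfies the same node.class constraint
$ nomad agent -config /etc/nomad/client-green-2.hcl

# 4. Stop the original worker node, then re-check placement
$ nomad status deploy
$ nomad node-status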

dadgar commented Aug 30, 2016

In your example there are only 3 nodes up but the status says it is running on 6?


dadgar commented Aug 30, 2016

Would it be possible to share two node configs and a job file that expose this behavior?


BSick7 commented Aug 31, 2016

There are 3 live nodes (think of these as worker-green v2). The 6 running allocations are placed on 3 worker-green v1 nodes (dead) and 3 worker-blue v1 nodes (dead).

Deploy Job Spec

NOTE: I dropped Config (docker config), Env, Services, and Resources from the definition.

{
    "Job": {
        "Region": "us-east",
        "ID": "deploy",
        "Name": "deploy",
        "Type": "system",
        "Priority": 50,
        "AllAtOnce": false,
        "Datacenters": [
            "us-east-1b",
            "us-east-1e"
        ],
        "Constraints": null,
        "TaskGroups": [
            {
                "Name": "deploy-blue",
                "Count": 1,
                "Constraints": [
                    {
                        "LTarget": "${node.class}",
                        "RTarget": "blue",
                        "Operand": "="
                    }
                ],
                "Tasks": [
                    {
                        "Name": "deploy",
                        "Driver": "docker",
                        "User": "",
                        "Config": {},
                        "Constraints": null,
                        "Services": [],
                        "Env": {},
                        "Resources": {},
                        "Meta": null,
                        "KillTimeout": 5000000000,
                        "LogConfig": {
                            "MaxFiles": 10,
                            "MaxFileSizeMB": 10
                        },
                        "Artifacts": null
                    }
                ],
                "RestartPolicy": {
                    "Interval": 60000000000,
                    "Attempts": 1,
                    "Delay": 15000000000,
                    "Mode": "delay"
                },
                "Meta": null
            },
            {
                "Name": "deploy-green",
                "Count": 1,
                "Constraints": [
                    {
                        "LTarget": "${node.class}",
                        "RTarget": "green",
                        "Operand": "="
                    }
                ],
                "Tasks": [
                    {
                        "Name": "deploy",
                        "Driver": "docker",
                        "User": "",
                        "Config": {},
                        "Constraints": null,
                        "Services": [],
                        "Env": {},
                        "Resources": {},
                        "Meta": null,
                        "KillTimeout": 5000000000,
                        "LogConfig": {
                            "MaxFiles": 10,
                            "MaxFileSizeMB": 10
                        },
                        "Artifacts": null
                    }
                ],
                "RestartPolicy": {
                    "Interval": 60000000000,
                    "Attempts": 1,
                    "Delay": 15000000000,
                    "Mode": "delay"
                },
                "Meta": null
            }
        ],
        "Update": {
            "Stagger": 0,
            "MaxParallel": 0
        },
        "Periodic": null,
        "Meta": null,
        "Status": "running",
        "StatusDescription": "",
        "CreateIndex": 313764,
        "ModifyIndex": 313766,
        "JobModifyIndex": 313764
    }
}
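
For completeness, a spec in this JSON form can be (re)registered directly through the HTTP API (a sketch, assuming the document above is saved as deploy.json and NOMAD_ADDR points at a server):

$ curl -X PUT -d @deploy.json ${NOMAD_ADDR}/v1/job/deploy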

Sample Green Worker Config

data_dir = "/var/lib/nomad"
leave_on_interrupt = true
leave_on_terminate = true
disable_update_check = true
datacenter = "<scrubbed>"
region = "<scrubbed>"
bind_addr = "<scrubbed>"
client {
  node_class = "green"
}
consul {
  address = "<scrubbed>"
}

Sample Blue Worker Config

data_dir = "/var/lib/nomad"
leave_on_interrupt = true
leave_on_terminate = true
disable_update_check = true
datacenter = "<scrubbed>"
region = "<scrubbed>"
bind_addr = "<scrubbed>"
client {
  node_class = "blue"
}
consul {
  address = "<scrubbed>"
}

dadgar added this to the v0.5.0 milestone Aug 31, 2016
@steve-jansen

@dadgar

This problem appears to be limited to system jobs.

It's also worth noting that the nodes remain listed in nomad node-status after they have been terminated in AWS, even after we issue a curl -X PUT ${NOMAD_ADDR}/v1/system/gc to garbage-collect dead nodes.

It's unclear why system jobs remain "running" on a ghost node.
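
One way to inspect those ghost placements is to ask the HTTP API for the allocations still recorded against a down node (a sketch; NODE_ID stands in for the full ID of one of the down nodes from the node-status output above, and the gc call is the one mentioned above):

# Allocations the scheduler still tracks for a node that is marked down
$ curl -s ${NOMAD_ADDR}/v1/node/${NODE_ID}/allocations

# Force a garbage collection pass and re-check
$ curl -X PUT ${NOMAD_ADDR}/v1/system/gc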


dadgar commented Sep 12, 2016

@steve-jansen Thanks for the additional detail. Will get this fixed before releasing 0.5!

github-actions bot locked as resolved and limited conversation to collaborators Dec 19, 2022