dead jobs switch their state back to pending when you shutdown client node #1178

Closed
cigdono opened this issue May 17, 2016 · 2 comments · Fixed by #1205

Comments


cigdono commented May 17, 2016

Nomad Version

nomad version
Nomad v0.3.2

Operating System and Environment Details

CentOS7 3.10.0-327.13.1.el7.x86_64

Issue

If I shut down client nodes, jobs that were in the dead state switch back to the pending state.

Reproduction Steps

1. Configure a 3-node server cluster.
2. Bring up a single client node.
3. Submit a single batch job with a task count of 1 using the Docker driver.
4. Let the job run to completion. Running nomad status at this point shows the job in the dead state.
5. Shut down the client node. The job switches back to the pending state.

Note: I did not test this with other job types or drivers. May be specific to this combination... or may not be.
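
For anyone trying to confirm the behaviour, here is a minimal Python sketch that flags the dead-to-pending transition described above. It assumes the job's state can be read from the Status field returned by GET /v1/job/<id> on the Nomad HTTP API, and that an agent is reachable at http://127.0.0.1:4646; both the address and the helper names (status_regressions, fetch_job_status) are illustrative and not part of the original report.

```python
import json
import urllib.request

# State that a completed batch job is expected to stay in.
TERMINAL = "dead"


def status_regressions(observed):
    """Return (index, previous, current) for every step where a job
    that was observed as dead is later observed in another state."""
    regressions = []
    for i in range(1, len(observed)):
        prev, cur = observed[i - 1], observed[i]
        if prev == TERMINAL and cur != TERMINAL:
            regressions.append((i, prev, cur))
    return regressions


def fetch_job_status(job_id, address="http://127.0.0.1:4646"):
    """Read the job's Status field from the HTTP API.

    Assumption: the agent address is hypothetical; point it at a
    server or client in your own cluster.
    """
    with urllib.request.urlopen(f"{address}/v1/job/{job_id}") as resp:
        return json.load(resp)["Status"]
```

To use it, call fetch_job_status("test-01") at intervals before and after shutting down the client node, collect the results in a list, and pass that list to status_regressions. A non-empty result means the job left the dead state.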

Nomad Server Config (other 2 are identical except for advertise IP)

bind_addr = "0.0.0.0"
log_level = "DEBUG"

data_dir = "/var/log/nomad/data_dir"
disable_update_check = true
region = "global"
datacenter = "dc1"
leave_on_terminate = true
leave_on_interrupt = true

telemetry {
    statsd_address = "127.0.0.1:8125"
}

advertise {
    http = "10.132.0.9:4646"
    rpc  = "10.132.0.9:4647"
    serf = "10.132.0.9:4648"
}

server {
    enabled = true
    bootstrap_expect = 3
    node_gc_threshold = "1m"
    retry_join = ["nomad-svr-global-01","nomad-svr-global-02","nomad-svr-global-03"]
}

Nomad Client Config

bind_addr = "0.0.0.0"
log_level = "DEBUG"
region = "global"
datacenter = "dc1"

data_dir = "/var/log/nomad/data_dir"
disable_update_check = true
leave_on_terminate = true
leave_on_interrupt = true

telemetry {
   statsd_address = "127.0.0.1:8125"
}

client {
   enabled = true
   servers = ["nomad-svr-global-01","nomad-svr-global-02","nomad-svr-global-03"]
   options = {
      "driver.raw_exec.enable" = "1"
      "docker.cleanup.image" = false
   }
}

Job File

{
    "Job": {
        "Region": "global",
        "ID": "test-01",
        "Name": "test-01",
        "Type": "batch",
        "Priority": 50,
        "Datacenters": [
            "dc1"
        ],
        "TaskGroups": [
            {
                "Name": "test-group",
                "Count": 1,
                "Tasks": [
                    {
                        "Name": "hello-world",
                        "Driver": "docker",
                        "Config": {
                            "image": "https://docker-cache.service.consul:5000/cdi/nomad-test:v0.0.9",
                            "command": "/opt/test/bin/test_batch.py",
                            "args": ["-t","30"],
                            "network_mode": "host"
                        },
                        "Resources": {
                            "CPU": 2500,
                            "MemoryMB": 256,
                            "DiskMB": 300,
                            "IOPS": 0
                        },
                        "LogConfig": {
                           "MaxFiles": 10,
                           "MaxFileSizeMB": 10
                        }
                    }
                ]
            }
        ]
    }
}
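
As a hedged illustration of how a job file in this shape could be submitted, the Python sketch below builds (but does not send) a registration request. It assumes the JSON above is saved to a local file and that jobs are registered with a PUT to /v1/jobs on the HTTP API; the agent address and the helper names (check_payload, build_register_request) are assumptions for illustration only.

```python
import json
import urllib.request


def check_payload(payload):
    """Verify the payload has the top-level "Job" wrapper used above."""
    if not isinstance(payload, dict) or "Job" not in payload:
        raise ValueError('job payload must be an object with a "Job" key')
    return payload


def build_register_request(payload, address="http://127.0.0.1:4646"):
    """Build a PUT request for /v1/jobs without sending it.

    Assumption: the address is hypothetical; use your own agent's
    HTTP address.
    """
    body = json.dumps(check_payload(payload)).encode("utf-8")
    return urllib.request.Request(
        url=f"{address}/v1/jobs",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to urllib.request.urlopen would send the request. Keeping construction separate from sending makes it possible to inspect the payload first.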
@dverbeek84

I see a similar sort of problem with the Docker driver.

When the Docker container is dead and you restart Nomad, Nomad will not remove the dead container with nomad stop <job>.
