
Segmentation fault after submitting a job on v0.9.2 #5810

Closed · pznamensky opened this issue Jun 11, 2019 · 1 comment · Fixed by #5812

Nomad version

Nomad v0.9.2 (0283266)

Operating system and Environment details

  • CentOS Linux release 7.6.1810
  • docker-ce 18.03

Issue

The Nomad agent crashes with a segmentation fault after a job is submitted.

Reproduction steps

  1. Upgrade the Nomad agent from v0.8.4 to v0.9.2 (a fresh install of v0.9.2 will likely reproduce it as well)
  2. Submit a job, for example with `nomad job run` (see the job file below)

Job file (if appropriate)

job "some-task" {
  datacenters = ["stage"]
  type = "service"

  update {
    max_parallel = 1
    min_healthy_time = "10s"
    healthy_deadline = "3m"
  }

  group "some-task" {
    count = 1
    restart {
      attempts = 10
      interval = "5m"
      delay = "20s"
      mode = "delay"
    }

    task "some-task" {
      driver = "docker"
      user = "some-task"
      kill_timeout = "61s"
      config {
        logging {
            type = "json-file"
        }
        image = "docker-image"
        force_pull = true
        network_mode = "host"
      }

      env {
        PORT = "${NOMAD_HOST_PORT_http}"
      }

      resources {
        cpu    = 300 # MHz
        memory = 512 # MB
        network {
          port "http" {}
        }
      }

      service {
        name = "some-task"
        tags = ["http", "some-task"]
        port = "http"
        check {
          type     = "tcp"
          interval = "10s"
          timeout  = "5s"

          check_restart {
            limit = 10
            grace = "3m"
          }
        }
      }
    }
  }
}

Nomad Client logs

==> Loaded configuration from /etc/nomad.d/nomad-client.hcl
==> Starting Nomad agent...
==> Nomad agent configuration:

       Advertise Addrs: HTTP: [ip_addr_here]:4646
            Bind Addrs: HTTP: [::]:4646
                Client: true
             Log Level: DEBUG
                Region: global (DC: stage)
                Server: false
               Version: 0.9.2

==> Nomad agent started! Log data will stream in below:

    2019-06-11T16:30:49.923+0300 [WARN ] agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/var/lib/nomad/plugins
    2019-06-11T16:30:49.928+0300 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/var/lib/nomad/plugins
    2019-06-11T16:30:49.928+0300 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/var/lib/nomad/plugins
    2019-06-11T16:30:49.928+0300 [INFO ] agent: detected plugin: name=rkt type=driver plugin_version=0.1.0
    2019-06-11T16:30:49.928+0300 [INFO ] agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
    2019-06-11T16:30:49.928+0300 [INFO ] agent: detected plugin: name=exec type=driver plugin_version=0.1.0
    2019-06-11T16:30:49.928+0300 [INFO ] agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
    2019-06-11T16:30:49.928+0300 [INFO ] agent: detected plugin: name=java type=driver plugin_version=0.1.0
    2019-06-11T16:30:49.928+0300 [INFO ] agent: detected plugin: name=docker type=driver plugin_version=0.1.0
    2019-06-11T16:30:49.928+0300 [INFO ] agent: detected plugin: name=nvidia-gpu type=device plugin_version=0.1.0
    2019-06-11T16:30:49.929+0300 [INFO ] client: using state directory: state_dir=/var/lib/nomad/client
    2019-06-11T16:30:49.929+0300 [INFO ] client: using alloc directory: alloc_dir=/var/lib/nomad/alloc
    2019-06-11T16:30:49.931+0300 [DEBUG] client.fingerprint_mgr: built-in fingerprints: fingerprinters="[arch cgroup consul cpu host memory network nomad signal storage vault env_aws env_gce]"
    2019-06-11T16:30:49.932+0300 [INFO ] client.fingerprint_mgr.cgroup: cgroups are available
    2019-06-11T16:30:49.932+0300 [DEBUG] client.fingerprint_mgr: fingerprinting periodically: fingerprinter=cgroup period=15s
    2019-06-11T16:30:49.936+0300 [INFO ] client.fingerprint_mgr.consul: consul agent is available
    2019-06-11T16:30:49.936+0300 [DEBUG] client.fingerprint_mgr.cpu: detected cpu frequency: MHz=2095
    2019-06-11T16:30:49.936+0300 [DEBUG] client.fingerprint_mgr.cpu: detected core count: cores=2
    2019-06-11T16:30:49.939+0300 [DEBUG] client.fingerprint_mgr: fingerprinting periodically: fingerprinter=consul period=15s
    2019-06-11T16:30:49.944+0300 [DEBUG] client.fingerprint_mgr.network: link speed detected: interface=eth1 mbits=1000
    2019-06-11T16:30:49.944+0300 [DEBUG] client.fingerprint_mgr.network: detected interface IP: interface=eth1 IP=192.168.122.166
    2019-06-11T16:30:49.946+0300 [DEBUG] client.fingerprint_mgr: fingerprinting periodically: fingerprinter=vault period=15s
    2019-06-11T16:30:51.947+0300 [DEBUG] client.fingerprint_mgr.env_aws: error querying AWS Metadata URL, skipping
    2019-06-11T16:30:53.020+0300 [DEBUG] client.fingerprint_mgr.env_gce: could not read value for attribute: attribute=machine-type error="Get http://169.254.169.254/computeMetadata/v1/instance/machine-type: dial tcp 169.254.169.254:80: connect: no route to host"
    2019-06-11T16:30:53.020+0300 [DEBUG] client.fingerprint_mgr.env_gce: error querying GCE Metadata URL, skipping
    2019-06-11T16:30:53.020+0300 [DEBUG] client.fingerprint_mgr: detected fingerprints: node_attrs="[arch cgroup consul cpu host network nomad signal storage]"
    2019-06-11T16:30:53.020+0300 [INFO ] client.plugin: starting plugin manager: plugin-type=driver
    2019-06-11T16:30:53.020+0300 [INFO ] client.plugin: starting plugin manager: plugin-type=device
    2019-06-11T16:30:53.024+0300 [DEBUG] client.plugin: waiting on plugin manager initial fingerprint: plugin-type=driver
    2019-06-11T16:30:53.024+0300 [DEBUG] client.plugin: waiting on plugin manager initial fingerprint: plugin-type=device
    2019-06-11T16:30:53.025+0300 [DEBUG] client.plugin: finished plugin manager initial fingerprint: plugin-type=device
    2019-06-11T16:30:53.025+0300 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=rkt health=undetected description="Failed to execute rkt version: exec: "rkt": executable file not found in $PATH"
    2019-06-11T16:30:53.035+0300 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=raw_exec health=undetected description=disabled
    2019-06-11T16:30:53.037+0300 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=qemu health=undetected description=
    2019-06-11T16:30:53.037+0300 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=exec health=healthy description=Healthy
    2019-06-11T16:30:53.061+0300 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=docker health=healthy description=Healthy
    2019-06-11T16:30:53.065+0300 [DEBUG] client.server_mgr: new server list: new_servers=[2001:1af8:4700:a06f:1:0:cccc:2]:4647,[2001:1af8:4700:a06f:1:0:cccc:3]:4647,[2001:1af8:4700:a06f:2:0:cccc:2]:4647,[2001:1af8:4700:a06f:2:0:cccc:3]:4647 old_servers=
    2019-06-11T16:30:53.128+0300 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=java health=healthy description=Healthy
    2019-06-11T16:30:53.128+0300 [DEBUG] client.driver_mgr: detected drivers: drivers="map[undetected:[rkt raw_exec qemu] healthy:[exec docker java]]"
    2019-06-11T16:30:53.128+0300 [DEBUG] client.plugin: finished plugin manager initial fingerprint: plugin-type=driver
    2019-06-11T16:30:53.135+0300 [WARN ] client.driver_mgr.docker: failed to reattach to docker logger process: driver=docker error="failed to reattach to docker logger process: Reattachment process not found"
    2019-06-11T16:30:53.135+0300 [DEBUG] client.driver_mgr.docker.docker_logger: starting plugin: driver=docker path=/usr/bin/nomad args="[/usr/bin/nomad docker_logger]"
    2019-06-11T16:30:53.137+0300 [DEBUG] client.driver_mgr.docker.docker_logger: plugin started: driver=docker path=/usr/bin/nomad pid=25228
    2019-06-11T16:30:53.137+0300 [DEBUG] client.driver_mgr.docker.docker_logger: waiting for RPC address: driver=docker path=/usr/bin/nomad
    2019-06-11T16:30:53.156+0300 [INFO ] client: node registration complete
    2019-06-11T16:30:53.167+0300 [DEBUG] client.driver_mgr.docker.docker_logger: using plugin: driver=docker version=2
    2019-06-11T16:30:53.168+0300 [DEBUG] client.driver_mgr.docker.docker_logger.nomad: plugin address: driver=docker @module=docker_logger address=/tmp/plugin218997716 network=unix timestamp=2019-06-11T16:30:53.167+0300
    2019-06-11T16:30:53.169+0300 [DEBUG] client.driver_mgr.docker.docker_logger.nomad: using client connection initialized from environment: driver=docker @module=docker_logger timestamp=2019-06-11T16:30:53.169+0300
    2019-06-11T16:30:53.184+0300 [ERROR] client.driver_mgr.docker: failed to reattach to executor: driver=docker error="error creating rpc client for executor plugin: Reattachment process not found" task_name=some-task
    2019-06-11T16:30:53.184+0300 [INFO ] client: started client: node_id=f122509a-cdb5-fa86-74be-4a239c5b1096
    2019-06-11T16:30:53.191+0300 [DEBUG] client: updated allocations: index=3703755 total=6 pulled=1 filtered=5
    2019-06-11T16:30:53.192+0300 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=0346e3fc-3253-26f7-b813-361ffb05f65a task=some-task
    2019-06-11T16:30:53.192+0300 [INFO ] client.gc: marking allocation for GC: alloc_id=0346e3fc-3253-26f7-b813-361ffb05f65a
    2019-06-11T16:30:53.192+0300 [DEBUG] client: allocation updates: added=0 removed=0 updated=1 ignored=5
    2019-06-11T16:30:53.194+0300 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=2954ae12-81ab-9c24-0d48-32beefa14452 task=some-task
    2019-06-11T16:30:53.195+0300 [INFO ] client.gc: marking allocation for GC: alloc_id=2954ae12-81ab-9c24-0d48-32beefa14452
    2019-06-11T16:30:53.195+0300 [WARN ] client.alloc_runner.task_runner.task_hook: failed to reattach to logmon process: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task error="Reattachment process not found"
    2019-06-11T16:30:53.196+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: starting plugin: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task path=/usr/bin/nomad args="[/usr/bin/nomad logmon]"
    2019-06-11T16:30:53.197+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin started: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task path=/usr/bin/nomad pid=25240
    2019-06-11T16:30:53.197+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: waiting for RPC address: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task path=/usr/bin/nomad
    2019-06-11T16:30:53.198+0300 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=56d2e783-52c1-c394-e704-df6b820766bc task=some-task
    2019-06-11T16:30:53.199+0300 [INFO ] client.gc: marking allocation for GC: alloc_id=56d2e783-52c1-c394-e704-df6b820766bc
    2019-06-11T16:30:53.201+0300 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=6c660e0e-ec61-5eec-7db6-b49f95445a26 task=some-task
    2019-06-11T16:30:53.201+0300 [INFO ] client.gc: marking allocation for GC: alloc_id=6c660e0e-ec61-5eec-7db6-b49f95445a26
    2019-06-11T16:30:53.204+0300 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=e7236208-ef4d-2ec4-984a-203343b20431 task=some-task
    2019-06-11T16:30:53.204+0300 [INFO ] client.gc: marking allocation for GC: alloc_id=e7236208-ef4d-2ec4-984a-203343b20431
    2019-06-11T16:30:53.205+0300 [DEBUG] client: evaluations triggered by node update: num_evals=1
    2019-06-11T16:30:53.205+0300 [DEBUG] client: state updated: node_status=ready
    2019-06-11T16:30:53.206+0300 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=1 ignored=5 errors=0
    2019-06-11T16:30:53.234+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.nomad: plugin address: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task @module=logmon address=/tmp/plugin979365883 network=unix timestamp=2019-06-11T16:30:53.234+0300
    2019-06-11T16:30:53.235+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: using plugin: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task version=2
    2019-06-11T16:30:53.243+0300 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task @module=logmon path=/var/lib/nomad/alloc/14055ea2-8d05-1a15-0595-41f310182554/alloc/logs/.some-task.stdout.fifo timestamp=2019-06-11T16:30:53.242+0300
    2019-06-11T16:30:53.244+0300 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task @module=logmon path=/var/lib/nomad/alloc/14055ea2-8d05-1a15-0595-41f310182554/alloc/logs/.some-task.stderr.fifo timestamp=2019-06-11T16:30:53.242+0300
    2019-06-11T16:30:53.531+0300 [DEBUG] client: updated allocations: index=3704020 total=6 pulled=0 filtered=6
    2019-06-11T16:30:53.532+0300 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=6
    2019-06-11T16:30:53.533+0300 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=6 errors=0
    2019-06-11T16:30:53.746+0300 [ERROR] client.driver_mgr.docker.docker_logger.nomad: reading plugin stderr: driver=docker error="read |0: file already closed"
    2019-06-11T16:30:53.746+0300 [DEBUG] client.driver_mgr.docker.docker_logger: plugin process exited: driver=docker path=/usr/bin/nomad pid=25228
    2019-06-11T16:30:53.746+0300 [DEBUG] client.driver_mgr.docker.docker_logger: plugin exited: driver=docker
    2019-06-11T16:30:53.760+0300 [WARN ] client.driver_mgr.docker: RemoveImage on non-referenced counted image id: driver=docker image_id=sha256:c7c69ab545451fb9ba8b01ecc7f0571214b23c161c567340fff00ced2bcc3dbb
    2019-06-11T16:30:53.762+0300 [INFO ] client.gc: marking allocation for GC: alloc_id=14055ea2-8d05-1a15-0595-41f310182554
    2019-06-11T16:30:53.764+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin process exited: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task path=/usr/bin/nomad pid=25240
    2019-06-11T16:30:53.764+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin exited: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task
    2019-06-11T16:30:53.765+0300 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=14055ea2-8d05-1a15-0595-41f310182554 task=some-task
    2019-06-11T16:30:53.938+0300 [DEBUG] client: updated allocations: index=3704021 total=6 pulled=0 filtered=6
    2019-06-11T16:30:53.938+0300 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=6
    2019-06-11T16:30:53.938+0300 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=6 errors=0
    2019-06-11T16:31:00.062+0300 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=1.680443ms
    2019-06-11T16:31:00.571+0300 [DEBUG] http: request complete: method=GET path=/v1/metrics?format=prometheus duration=2.303221ms
    2019-06-11T16:31:02.772+0300 [DEBUG] client: state changed, updating node and re-registering
    2019-06-11T16:31:02.824+0300 [INFO ] client: node registration complete
    2019-06-11T16:31:03.808+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=2.545694ms
    2019-06-11T16:31:04.872+0300 [DEBUG] http: request complete: method=GET path=/v1/metrics?format=prometheus duration=1.724549ms
    2019-06-11T16:31:10.072+0300 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=271.092µs
    2019-06-11T16:31:11.785+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=415.135µs
    2019-06-11T16:31:12.665+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=1.010621ms
    2019-06-11T16:31:13.287+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=301.019µs
    2019-06-11T16:31:16.103+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=385.118µs
    2019-06-11T16:31:16.972+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=406.317µs
    2019-06-11T16:31:19.850+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=322.139µs
    2019-06-11T16:31:20.084+0300 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=257.515µs
    2019-06-11T16:31:22.023+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=464.485µs
    2019-06-11T16:31:30.094+0300 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=288.736µs
    2019-06-11T16:31:30.561+0300 [DEBUG] http: request complete: method=GET path=/v1/metrics?format=prometheus duration=2.002077ms
    2019-06-11T16:31:33.890+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=632.66µs
    2019-06-11T16:31:34.858+0300 [DEBUG] http: request complete: method=GET path=/v1/metrics?format=prometheus duration=2.451902ms
    2019-06-11T16:31:40.105+0300 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=299.138µs
    2019-06-11T16:31:41.759+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=330.836µs
    2019-06-11T16:31:42.634+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=491.786µs
    2019-06-11T16:31:43.324+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=334.907µs
    2019-06-11T16:31:46.117+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=312.227µs
    2019-06-11T16:31:47.028+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=298.833µs
    2019-06-11T16:31:49.848+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=424.32µs
    2019-06-11T16:31:50.116+0300 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=343.534µs
    2019-06-11T16:31:52.049+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=361.498µs
    2019-06-11T16:32:00.129+0300 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=411.154µs
    2019-06-11T16:32:00.562+0300 [DEBUG] http: request complete: method=GET path=/v1/metrics?format=prometheus duration=2.679342ms
    2019-06-11T16:32:03.787+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=667.186µs
    2019-06-11T16:32:04.857+0300 [DEBUG] http: request complete: method=GET path=/v1/metrics?format=prometheus duration=2.697152ms
    2019-06-11T16:32:10.139+0300 [DEBUG] http: request complete: method=GET path=/v1/agent/health?type=client duration=358.334µs
    2019-06-11T16:32:11.754+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=1.378745ms
    2019-06-11T16:32:12.174+0300 [DEBUG] client: updated allocations: index=3704032 total=6 pulled=6 filtered=0
    2019-06-11T16:32:12.176+0300 [DEBUG] client: allocation updates: added=0 removed=0 updated=6 ignored=0
    2019-06-11T16:32:12.188+0300 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=6 ignored=0 errors=0
    2019-06-11T16:32:12.607+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=366.989µs
    2019-06-11T16:32:13.278+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=420.553µs
    2019-06-11T16:32:16.114+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=558.399µs
    2019-06-11T16:32:16.946+0300 [DEBUG] http: request complete: method=GET path=/v1/client/stats duration=427.045µs
    2019-06-11T16:32:18.037+0300 [DEBUG] client: updated allocations: index=3704036 total=7 pulled=7 filtered=0
    2019-06-11T16:32:18.038+0300 [DEBUG] client: allocation updates: added=1 removed=0 updated=6 ignored=0
    2019-06-11T16:32:18.052+0300 [DEBUG] client: allocation updates applied: added=1 removed=0 updated=6 ignored=0 errors=0
    2019-06-11T16:32:18.060+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: starting plugin: alloc_id=0a55f940-abe8-c194-c3f9-4887643ee03a task=some-task path=/usr/bin/nomad args="[/usr/bin/nomad logmon]"
    2019-06-11T16:32:18.061+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin started: alloc_id=0a55f940-abe8-c194-c3f9-4887643ee03a task=some-task path=/usr/bin/nomad pid=25584
    2019-06-11T16:32:18.061+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: waiting for RPC address: alloc_id=0a55f940-abe8-c194-c3f9-4887643ee03a task=some-task path=/usr/bin/nomad
    2019-06-11T16:32:18.091+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.nomad: plugin address: alloc_id=0a55f940-abe8-c194-c3f9-4887643ee03a task=some-task @module=logmon address=/tmp/plugin181061252 network=unix timestamp=2019-06-11T16:32:18.090+0300
    2019-06-11T16:32:18.091+0300 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: using plugin: alloc_id=0a55f940-abe8-c194-c3f9-4887643ee03a task=some-task version=2
    2019-06-11T16:32:18.093+0300 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=0a55f940-abe8-c194-c3f9-4887643ee03a task=some-task @module=logmon path=/var/lib/nomad/alloc/0a55f940-abe8-c194-c3f9-4887643ee03a/alloc/logs/.some-task.stdout.fifo timestamp=2019-06-11T16:32:18.092+0300
    2019-06-11T16:32:18.094+0300 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=0a55f940-abe8-c194-c3f9-4887643ee03a task=some-task path=/var/lib/nomad/alloc/0a55f940-abe8-c194-c3f9-4887643ee03a/alloc/logs/.some-task.stderr.fifo @module=logmon timestamp=2019-06-11T16:32:18.094+0300
    2019-06-11T16:32:18.104+0300 [DEBUG] client.driver_mgr.docker: force pulling image instead of inspecting local: driver=docker image_ref=docker-image
    2019-06-11T16:32:18.199+0300 [DEBUG] client.driver_mgr.docker: docker pull succeeded: driver=docker image_ref=docker-image
    2019-06-11T16:32:18.204+0300 [DEBUG] client.driver_mgr.docker: image reference count incremented: driver=docker image_name=docker-image image_id=sha256:c7c69ab545451fb9ba8b01ecc7f0571214b23c161c567340fff00ced2bcc3dbb references=1
    2019-06-11T16:32:18.204+0300 [DEBUG] client.driver_mgr.docker: configured resources: driver=docker task_name=some-task memory=536870912 cpu_shares=300 cpu_quota=0 cpu_period=0
    2019-06-11T16:32:18.204+0300 [DEBUG] client.driver_mgr.docker: binding directories: driver=docker task_name=some-task binds="[]string{"/var/lib/nomad/alloc/0a55f940-abe8-c194-c3f9-4887643ee03a/alloc:/alloc", "/var/lib/nomad/alloc/0a55f940-abe8-c194-c3f9-4887643ee03a/some-task/local:/local", "/var/lib/nomad/alloc/0a55f940-abe8-c194-c3f9-4887643ee03a/some-task/secrets:/secrets"}"
    2019-06-11T16:32:18.204+0300 [DEBUG] client.driver_mgr.docker: allocated mapped port: driver=docker task_name=some-task ip=192.168.122.166 port=22333
    2019-06-11T16:32:18.204+0300 [DEBUG] client.driver_mgr.docker: exposed port: driver=docker task_name=some-task port=22333
    2019-06-11T16:32:18.204+0300 [DEBUG] client.driver_mgr.docker: setting container name: driver=docker task_name=some-task container_name=some-task-0a55f940-abe8-c194-c3f9-4887643ee03a
    2019-06-11T16:32:18.293+0300 [INFO ] client.driver_mgr.docker: created container: driver=docker container_id=e4a26ae7517a5c9951727da8a51f3afd5d15b5f5e363f8fed64fbc2f220eec75
    2019-06-11T16:32:18.340+0300 [DEBUG] client: updated allocations: index=3704038 total=7 pulled=6 filtered=1
    2019-06-11T16:32:18.340+0300 [DEBUG] client: allocation updates: added=0 removed=0 updated=6 ignored=1
    2019-06-11T16:32:18.353+0300 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=6 ignored=1 errors=0
    2019-06-11T16:32:18.523+0300 [INFO ] client.driver_mgr.docker: started container: driver=docker container_id=e4a26ae7517a5c9951727da8a51f3afd5d15b5f5e363f8fed64fbc2f220eec75
    2019-06-11T16:32:18.523+0300 [DEBUG] client.driver_mgr.docker.docker_logger: starting plugin: driver=docker path=/usr/bin/nomad args="[/usr/bin/nomad docker_logger]"
    2019-06-11T16:32:18.523+0300 [DEBUG] client.driver_mgr.docker.docker_logger: plugin started: driver=docker path=/usr/bin/nomad pid=25638
    2019-06-11T16:32:18.523+0300 [DEBUG] client.driver_mgr.docker.docker_logger: waiting for RPC address: driver=docker path=/usr/bin/nomad
    2019-06-11T16:32:18.557+0300 [DEBUG] client.driver_mgr.docker.docker_logger.nomad: plugin address: driver=docker network=unix @module=docker_logger address=/tmp/plugin054488067 timestamp=2019-06-11T16:32:18.556+0300
    2019-06-11T16:32:18.557+0300 [DEBUG] client.driver_mgr.docker.docker_logger: using plugin: driver=docker version=2
    2019-06-11T16:32:18.559+0300 [DEBUG] client.driver_mgr.docker.docker_logger.nomad: using client connection initialized from environment: driver=docker @module=docker_logger timestamp=2019-06-11T16:32:18.559+0300
    2019-06-11T16:32:18.588+0300 [DEBUG] consul.health: watching check: alloc_id=0a55f940-abe8-c194-c3f9-4887643ee03a task=some-task check="service: "some-task" check"
    2019-06-11T16:32:18.649+0300 [DEBUG] consul.sync: sync complete: registered_services=1 deregistered_services=0 registered_checks=1 deregistered_checks=0
    2019-06-11T16:32:18.712+0300 [DEBUG] client: updated allocations: index=3704039 total=7 pulled=6 filtered=1
    2019-06-11T16:32:18.712+0300 [DEBUG] client: allocation updates: added=0 removed=0 updated=6 ignored=1
    2019-06-11T16:32:18.725+0300 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=6 ignored=1 errors=0
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x111ad21]

goroutine 551 [running]:
github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*TaskRunner).setGaugeForMemory(0xc000478b40, 0xc00071d9c0)
        /opt/gopath/src/github.com/hashicorp/nomad/client/allocrunner/taskrunner/task_runner.go:1184 +0x51
github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*TaskRunner).emitStats(0xc000478b40, 0xc00071d9c0)
        /opt/gopath/src/github.com/hashicorp/nomad/client/allocrunner/taskrunner/task_runner.go:1259 +0x74
github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*TaskRunner).UpdateStats(0xc000478b40, 0xc00071d9c0)
        /opt/gopath/src/github.com/hashicorp/nomad/client/allocrunner/taskrunner/task_runner.go:1176 +0x8c
github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*statsHook).collectResourceUsageStats(0xc000767200, 0x24468c0, 0xc000085500, 0x2429780, 0xc000085440)
        /opt/gopath/src/github.com/hashicorp/nomad/client/allocrunner/taskrunner/stats_hook.go:144 +0x1aa
created by github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*statsHook).Poststart
        /opt/gopath/src/github.com/hashicorp/nomad/client/allocrunner/taskrunner/stats_hook.go:64 +0xf7
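
For context, the trace shows the stats path calling setGaugeForMemory and dereferencing a nil pointer. A self-contained Go illustration of that class of failure (plain Go, not Nomad code; the type and field names below are made up for the example):

package main

// Illustration only: a pointer field that was never populated (as with data
// coming from an older server) is dereferenced without a nil check and
// panics with the same SIGSEGV / nil pointer dereference as above.
type allocatedResources struct {
	MemoryMB int64
}

type allocation struct {
	AllocatedResources *allocatedResources // newer field, may be nil
}

func main() {
	alloc := &allocation{}                // AllocatedResources left nil
	_ = alloc.AllocatedResources.MemoryMB // panic: nil pointer dereference
}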
notnoop pushed a commit that referenced this issue Jun 11, 2019
When a client is running against an old server (e.g. running 0.8),
`alloc.AllocatedResources` may be nil, and we need to check the
deprecated `alloc.TaskResources` instead.

Fixes #5810
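
The guard described in the commit message amounts to preferring the newer field and falling back to the deprecated one. A minimal sketch of that pattern, not the actual patch in #5812; the helper name and exact struct layout are assumptions based on the commit message and the Nomad structs of that era:

// taskMemoryMB is a hypothetical helper: prefer alloc.AllocatedResources,
// fall back to the deprecated alloc.TaskResources that pre-0.9 servers
// populate, and never dereference a nil pointer.
func taskMemoryMB(alloc *structs.Allocation, taskName string) int64 {
	if alloc.AllocatedResources != nil {
		if tr, ok := alloc.AllocatedResources.Tasks[taskName]; ok && tr != nil {
			return tr.Memory.MemoryMB
		}
	}
	if res, ok := alloc.TaskResources[taskName]; ok && res != nil {
		return int64(res.MemoryMB) // deprecated path for old servers
	}
	return 0
}

With a guard like this, the stats path can emit zero (or skip the memory gauge) instead of crashing when the client is paired with a 0.8 server.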
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Nov 21, 2022