A nomad jobspec provided as JSON results in a panic when the jobspec does not contain an ID property #17418

Closed
rcousens opened this issue Jan 13, 2023 · 1 comment · Fixed by #17689
Labels: stage/accepted, theme/api, theme/crash, type/bug

rcousens commented Jan 13, 2023

locals {
  NOMAD_JOB = {
    Job = {
      Name = "test"
      ....
    }
  }
}

resource "nomad_job" "nomad-spring" {
  jobspec = jsonencode(local.NOMAD_JOB)
  json    = true
}

Debug Output

Gist with debug output

Panic Output

See Gist with debug output

Expected Behavior

Nomad should have rejected the jobspec as invalid and reported the missing ID field.

Actual Behavior

The provider crashed with a nil pointer dereference panic.

Steps to Reproduce

  1. Define an otherwise valid JSON jobspec that has no ID property
  2. Try to apply it

Important Factoids

The crash ultimately originates in the Nomad API client, though I'm not sure it is the API client's responsibility to validate that the job has a non-nil ID property: nomad API client

job.ID is a pointer, and it is dereferenced and path-escaped without a nil check. When the spec contains no "ID" property, the pointer is nil and the dereference crashes the process.

References

Source of crash in provider

lgfa29 commented Jun 3, 2023

Thanks for the report @rcousens!

This is actually a bug in the Nomad API client, where a nil check is missing before we get here:

wm, err := j.client.put("/v1/job/"+url.PathEscape(*job.ID)+"/plan", req, &resp, q)

I was able to reproduce this by using a sample JSON file like this:

JSON job
{
    "Job": {
        "Affinities": null,
        "AllAtOnce": false,
        "Constraints": null,
        "ConsulNamespace": "",
        "ConsulToken": "",
        "CreateIndex": 14,
        "Datacenters": [
            "dc1"
        ],
        "DispatchIdempotencyToken": "",
        "Dispatched": false,
        "JobModifyIndex": 14,
        "Meta": null,
        "Migrate": null,
        "ModifyIndex": 14,
        "Multiregion": null,
        "Name": "foo",
        "Namespace": "default",
        "NodePool": "default",
        "NomadTokenID": "",
        "ParameterizedJob": null,
        "ParentID": "",
        "Payload": null,
        "Periodic": null,
        "Priority": 50,
        "Region": "global",
        "Reschedule": null,
        "Spreads": null,
        "Stable": false,
        "Status": "pending",
        "StatusDescription": "",
        "Stop": false,
        "SubmitTime": 1685759953739812000,
        "TaskGroups": [
            {
                "Affinities": null,
                "Constraints": [
                    {
                        "LTarget": "${attr.consul.version}",
                        "Operand": "semver",
                        "RTarget": ">= 1.7.0"
                    }
                ],
                "Consul": {
                    "Namespace": ""
                },
                "Count": 1,
                "EphemeralDisk": {
                    "Migrate": false,
                    "SizeMB": 300,
                    "Sticky": false
                },
                "MaxClientDisconnect": null,
                "Meta": null,
                "Migrate": {
                    "HealthCheck": "checks",
                    "HealthyDeadline": 300000000000,
                    "MaxParallel": 1,
                    "MinHealthyTime": 10000000000
                },
                "Name": "foo",
                "Networks": null,
                "ReschedulePolicy": {
                    "Attempts": 0,
                    "Delay": 30000000000,
                    "DelayFunction": "exponential",
                    "Interval": 0,
                    "MaxDelay": 3600000000000,
                    "Unlimited": true
                },
                "RestartPolicy": {
                    "Attempts": 2,
                    "Delay": 15000000000,
                    "Interval": 1800000000000,
                    "Mode": "fail"
                },
                "Scaling": null,
                "Services": [
                    {
                        "Address": "192.168.1.2",
                        "AddressMode": "auto",
                        "CanaryMeta": null,
                        "CanaryTags": null,
                        "CheckRestart": null,
                        "Checks": null,
                        "Connect": null,
                        "EnableTagOverride": false,
                        "Meta": null,
                        "Name": "test",
                        "OnUpdate": "require_healthy",
                        "PortLabel": "mysql",
                        "Provider": "consul",
                        "TaggedAddresses": null,
                        "Tags": null,
                        "TaskName": ""
                    }
                ],
                "ShutdownDelay": null,
                "Spreads": null,
                "StopAfterClientDisconnect": null,
                "Tasks": [
                    {
                        "Affinities": null,
                        "Artifacts": null,
                        "Config": {
                            "args": [
                                "1"
                            ],
                            "command": "/bin/sleep"
                        },
                        "Constraints": null,
                        "DispatchPayload": null,
                        "Driver": "raw_exec",
                        "Env": null,
                        "Identity": null,
                        "KillSignal": "",
                        "KillTimeout": 5000000000,
                        "Kind": "",
                        "Leader": false,
                        "Lifecycle": null,
                        "LogConfig": {
                            "Disabled": false,
                            "Enabled": null,
                            "MaxFileSizeMB": 10,
                            "MaxFiles": 3
                        },
                        "Meta": null,
                        "Name": "foo",
                        "Resources": {
                            "CPU": 20,
                            "Cores": 0,
                            "Devices": null,
                            "DiskMB": 0,
                            "IOPS": 0,
                            "MemoryMB": 10,
                            "MemoryMaxMB": 0,
                            "Networks": null
                        },
                        "RestartPolicy": {
                            "Attempts": 2,
                            "Delay": 15000000000,
                            "Interval": 1800000000000,
                            "Mode": "fail"
                        },
                        "ScalingPolicies": null,
                        "Services": null,
                        "ShutdownDelay": 0,
                        "Templates": null,
                        "User": "",
                        "Vault": null,
                        "VolumeMounts": null
                    }
                ],
                "Update": {
                    "AutoPromote": false,
                    "AutoRevert": false,
                    "Canary": 0,
                    "HealthCheck": "checks",
                    "HealthyDeadline": 300000000000,
                    "MaxParallel": 1,
                    "MinHealthyTime": 10000000000,
                    "ProgressDeadline": 600000000000,
                    "Stagger": 30000000000
                },
                "Volumes": null
            }
        ],
        "Type": "service",
        "Update": {
            "AutoPromote": false,
            "AutoRevert": false,
            "Canary": 0,
            "HealthCheck": "",
            "HealthyDeadline": 0,
            "MaxParallel": 1,
            "MinHealthyTime": 0,
            "ProgressDeadline": 0,
            "Stagger": 30000000000
        },
        "VaultNamespace": "",
        "VaultToken": "",
        "Version": 0
    }
}

Running nomad job plan -json job.json against this file then results in this panic:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x1009e1f68]

goroutine 1 [running]:
github.com/hashicorp/nomad/api.(*Jobs).PlanOpts(0x14000d1fc58, 0x14000643860, 0x14000d1fbbe, 0x101ce2235?)
        github.com/hashicorp/nomad/api@v0.0.0-20221006174558-2aa7e66bdb52/jobs.go:438 +0x98
github.com/hashicorp/nomad/command.(*JobPlanCommand).Run(0x1400040e600, {0x1400010e080, 0x2, 0x2})
        github.com/hashicorp/nomad/command/job_plan.go:250 +0x718
github.com/mitchellh/cli.(*CLI).Run(0x14000bee000)
        github.com/mitchellh/cli@v1.1.5/cli.go:262 +0x4a8
main.Run({0x1400010e060, 0x4, 0x4})
        github.com/hashicorp/nomad/main.go:107 +0x29c
main.main()
        github.com/hashicorp/nomad/main.go:77 +0x50

Since this is an issue with Nomad I will move this to the hashicorp/nomad repo.

Thanks again for the report!

lgfa29 transferred this issue from hashicorp/terraform-provider-nomad on Jun 3, 2023
lgfa29 added the theme/api, theme/crash, stage/accepted, and type/bug labels on Jun 3, 2023
lgfa29 added this to Needs Triage in Nomad - Community Issues Triage via automation on Jun 3, 2023
lgfa29 self-assigned this on Jun 3, 2023
lgfa29 moved this from Needs Triage to In Progress in Nomad - Community Issues Triage on Jun 3, 2023
Nomad - Community Issues Triage automation moved this from In Progress to Done on Jun 23, 2023