Dynamic blocks do not render all fields in content block. #9871

Closed · BlizzTom opened this issue Jan 21, 2021 · 7 comments · Fixed by #9921

Labels: stage/accepted (Confirmed, and intend to work on. No timeline commitment though.), theme/hcl, type/bug

BlizzTom commented Jan 21, 2021

Nomad version

Verified in both versions below

Nomad v1.0.1 (c9c68aa55a7275f22d2338f2df53e67ebfcb9238)
Nomad v1.0.2 (4c1d4fc6a5823ebc8c3e748daec7b4fda3f11037)

Operating system and Environment details

Ubuntu 20.04.1 LTS

Issue

Dynamic blocks do not render all fields, so unevaluated local HCL expressions are transmitted to the server verbatim.

Reproduction steps

Use a dynamic block for groups and attempt to use the iterator value in a Docker image name.

Job file

variable "instances" {
    type = list(object({
        name    = string
        version = string
        count   = number
        driver  = string
        cpu     = number
    }))

    default = [
        {
            name = "test01"
            version = "alpine"
            count   = 1
            driver  = "docker"
            cpu     = 500
        },
        {
            name = "test02"
            version = "alpine"
            count   = 2
            driver  = "podman"
            cpu     = 250
        }
    ]
}

job "playground-hcl" {
    type        = "service"
    datacenters = ["dc1"]

    dynamic "group" {
        for_each = var.instances
        iterator = group
        labels = ["thing-${group.value.name}"]

        content {
            # This works
            count = group.value.count

            # Not populated; renders as "${group.value.version}"
            meta {
                VERSION = group.value.version
            }

            task "task" {
                # This works
                driver = group.value.driver

                env {
                    # Not populated; renders as "${group.value.version}"
                    VERSION = group.value.version
                }

                config {
                    # Not populated; renders as "redis:${group.value.version}"
                    image = format("redis:%s", group.value.version)
                }

                resources {
                    # This works
                    cpu    = group.value.cpu
                    memory = 128
                }
            }
        }
    }
}

Generated JSON sent to the server:

{
  "Stop": false,
  "Region": "global",
  "Namespace": "default",
  "ID": "playground-hcl",
  "ParentID": "",
  "Name": "playground-hcl",
  "Type": "service",
  "Priority": 50,
  "AllAtOnce": false,
  "Datacenters": [
    "dc1"
  ],
  "Constraints": null,
  "Affinities": null,
  "Spreads": null,
  "TaskGroups": [
    {
      "Name": "thing-test01",
      "Count": 1,
      "Update": {
        "Stagger": 30000000000,
        "MaxParallel": 1,
        "HealthCheck": "checks",
        "MinHealthyTime": 10000000000,
        "HealthyDeadline": 300000000000,
        "ProgressDeadline": 600000000000,
        "AutoRevert": false,
        "AutoPromote": false,
        "Canary": 0
      },
      "Migrate": {
        "MaxParallel": 1,
        "HealthCheck": "checks",
        "MinHealthyTime": 10000000000,
        "HealthyDeadline": 300000000000
      },
      "Constraints": null,
      "Scaling": null,
      "RestartPolicy": {
        "Attempts": 2,
        "Interval": 1800000000000,
        "Delay": 15000000000,
        "Mode": "fail"
      },
      "Tasks": [
        {
          "Name": "task",
          "Driver": "docker",
          "User": "",
          "Config": {
            "image": "redis:${group.value.version}"
          },
          "Env": {
            "VERSION": "${group.value.version}"
          },
          "Services": null,
          "Vault": null,
          "Templates": null,
          "Constraints": null,
          "Affinities": null,
          "Resources": {
            "CPU": 500,
            "MemoryMB": 128,
            "DiskMB": 0,
            "IOPS": 0,
            "Networks": null,
            "Devices": null
          },
          "RestartPolicy": {
            "Attempts": 2,
            "Interval": 1800000000000,
            "Delay": 15000000000,
            "Mode": "fail"
          },
          "DispatchPayload": null,
          "Lifecycle": null,
          "Meta": null,
          "KillTimeout": 5000000000,
          "LogConfig": {
            "MaxFiles": 10,
            "MaxFileSizeMB": 10
          },
          "Artifacts": null,
          "Leader": false,
          "ShutdownDelay": 0,
          "VolumeMounts": null,
          "ScalingPolicies": null,
          "KillSignal": "",
          "Kind": "",
          "CSIPluginConfig": null
        }
      ],
      "EphemeralDisk": {
        "Sticky": false,
        "SizeMB": 300,
        "Migrate": false
      },
      "Meta": {
        "VERSION": "${group.value.version}"
      },
      "ReschedulePolicy": {
        "Attempts": 0,
        "Interval": 0,
        "Delay": 30000000000,
        "DelayFunction": "exponential",
        "MaxDelay": 3600000000000,
        "Unlimited": true
      },
      "Affinities": null,
      "Spreads": null,
      "Networks": null,
      "Services": null,
      "Volumes": null,
      "ShutdownDelay": null,
      "StopAfterClientDisconnect": null
    },
    {
      "Name": "thing-test02",
      "Count": 2,
      "Update": {
        "Stagger": 30000000000,
        "MaxParallel": 1,
        "HealthCheck": "checks",
        "MinHealthyTime": 10000000000,
        "HealthyDeadline": 300000000000,
        "ProgressDeadline": 600000000000,
        "AutoRevert": false,
        "AutoPromote": false,
        "Canary": 0
      },
      "Migrate": {
        "MaxParallel": 1,
        "HealthCheck": "checks",
        "MinHealthyTime": 10000000000,
        "HealthyDeadline": 300000000000
      },
      "Constraints": null,
      "Scaling": null,
      "RestartPolicy": {
        "Attempts": 2,
        "Interval": 1800000000000,
        "Delay": 15000000000,
        "Mode": "fail"
      },
      "Tasks": [
        {
          "Name": "task",
          "Driver": "podman",
          "User": "",
          "Config": {
            "image": "redis:${group.value.version}"
          },
          "Env": {
            "VERSION": "${group.value.version}"
          },
          "Services": null,
          "Vault": null,
          "Templates": null,
          "Constraints": null,
          "Affinities": null,
          "Resources": {
            "CPU": 250,
            "MemoryMB": 128,
            "DiskMB": 0,
            "IOPS": 0,
            "Networks": null,
            "Devices": null
          },
          "RestartPolicy": {
            "Attempts": 2,
            "Interval": 1800000000000,
            "Delay": 15000000000,
            "Mode": "fail"
          },
          "DispatchPayload": null,
          "Lifecycle": null,
          "Meta": null,
          "KillTimeout": 5000000000,
          "LogConfig": {
            "MaxFiles": 10,
            "MaxFileSizeMB": 10
          },
          "Artifacts": null,
          "Leader": false,
          "ShutdownDelay": 0,
          "VolumeMounts": null,
          "ScalingPolicies": null,
          "KillSignal": "",
          "Kind": "",
          "CSIPluginConfig": null
        }
      ],
      "EphemeralDisk": {
        "Sticky": false,
        "SizeMB": 300,
        "Migrate": false
      },
      "Meta": {
        "VERSION": "${group.value.version}"
      },
      "ReschedulePolicy": {
        "Attempts": 0,
        "Interval": 0,
        "Delay": 30000000000,
        "DelayFunction": "exponential",
        "MaxDelay": 3600000000000,
        "Unlimited": true
      },
      "Affinities": null,
      "Spreads": null,
      "Networks": null,
      "Services": null,
      "Volumes": null,
      "ShutdownDelay": null,
      "StopAfterClientDisconnect": null
    }
  ],
  "Update": {
    "Stagger": 30000000000,
    "MaxParallel": 1,
    "HealthCheck": "",
    "MinHealthyTime": 0,
    "HealthyDeadline": 0,
    "ProgressDeadline": 0,
    "AutoRevert": false,
    "AutoPromote": false,
    "Canary": 0
  },
  "Multiregion": null,
  "Periodic": null,
  "ParameterizedJob": null,
  "Dispatched": false,
  "Payload": null,
  "Meta": null,
  "ConsulToken": "",
  "VaultToken": "",
  "VaultNamespace": "",
  "NomadTokenID": "",
  "Status": "pending",
  "StatusDescription": "",
  "Stable": false,
  "Version": 0,
  "SubmitTime": 1611272612461914600,
  "CreateIndex": 11,
  "ModifyIndex": 15,
  "JobModifyIndex": 11
}

Nomad Server logs

2 errors occurred:
        * failed to parse config:
        * Unknown variable: There is no variable named "group".
tgross (Member) commented Jan 22, 2021

Another example in https://discuss.hashicorp.com/t/hcl2-temporary-iterator-variable-interpolation/20053/2

jobspec
variables {
  netscaler_endpoints = ["1.2.3.4", "4.5.6.7", "8.9.10.11"]
}

job "netscaler-prometheus" {
  datacenters = ["dc1"]

  constraint {
    attribute = "${attr.nomad.version}"
    operator  = "semver"
    value     = ">= 1.0.0"
  }

  dynamic "group" {
    for_each = var.netscaler_endpoints
    #iterator = ip
    labels   = ["prom-exporter-${group.value}"]

    content {

      network {
        port "http" {}

        dns {
          servers = ["10.10.10.10"]
        }
      }

      task "prom-exporter" {
        driver = "docker"

        config {
          image = "quay.io/citrix/citrix-adc-metrics-exporter:1.4.6"
          args = [
            format("--target-nsip=%s", group.value),
            "--port=${NOMAD_PORT_http}"
          ]
          ports = ["http"]
        }

        service {
          name = "prom-exporter"
          port = "http"

          check {
            type     = "http"
            port     = "http"
            path     = "/"
            interval = "5s"
            timeout  = "3s"
          }
        }

        resources {
          cpu    = 500
          memory = 256
        }

      } # task ends here
    }   # content block ends here
  }     # dynamic block ends here
}       # job ends here

The results from that are interesting: the group label is interpolated correctly for server validation, but once the allocation lands on the client we get:

$ nomad alloc status :alloc_id
...
Recent Events:
Time                  Type               Description
2021-01-22T20:40:29Z  Killing            Sent interrupt. Waiting 5s before force killing
2021-01-22T20:40:27Z  Alloc Unhealthy    Unhealthy because of failed task
2021-01-22T20:40:27Z  Not Restarting     Error was unrecoverable
2021-01-22T20:40:27Z  Failed Validation  2 errors occurred:
        * failed to parse config:
        * Unknown variable: There is no variable named "group".
2021-01-22T20:40:27Z  Task Setup         Building Task Directory
2021-01-22T20:40:27Z  Received           Task received by client

tgross self-assigned this Jan 27, 2021
tgross (Member) commented Jan 28, 2021

I've put together a failing unit test that demonstrates the bug pretty clearly and hints that the problem is in how we handle blocks that are read as map[string]interface{} because they have either user-defined keys (like env and meta) or plugin-defined keys (like task.config):

$ go test -v ./jobspec2 -run TestParseDynamic
=== RUN   TestParseDynamic
    parse_test.go:332:
                Error Trace:    parse_test.go:332
                Error:          Not equal:
                                expected: "groupA"
                                actual  : "${group.value.name}"

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1 @@
                                -groupA
                                +${group.value.name}
                Test:           TestParseDynamic
    parse_test.go:333:
                Error Trace:    parse_test.go:333
                Error:          Not equal:
                                expected: "1"
                                actual  : "${group.value.idx}"

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1 @@
                                -1
                                +${group.value.idx}
                Test:           TestParseDynamic
    parse_test.go:334:
                Error Trace:    parse_test.go:334
                Error:          Not equal:
                                expected: "id:1"
                                actual  : "id:${group.value.idx}"

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1 @@
                                -id:1
                                +id:${group.value.idx}
                Test:           TestParseDynamic
--- FAIL: TestParseDynamic (0.00s)
FAIL
FAIL    github.com/hashicorp/nomad/jobspec2     0.047s
FAIL

From https://github.com/hashicorp/nomad/compare/b-9871-dynamic-blocks:

failing unit test
diff --git a/jobspec2/parse_test.go b/jobspec2/parse_test.go
index e54d50cfa..5215eed02 100644
--- a/jobspec2/parse_test.go
+++ b/jobspec2/parse_test.go
@@ -7,6 +7,7 @@ import (

        "github.com/hashicorp/nomad/api"
        "github.com/hashicorp/nomad/jobspec"
+       "github.com/stretchr/testify/assert"
        "github.com/stretchr/testify/require"
 )

@@ -277,21 +278,41 @@ job "example" {
 }

 func TestParseDynamic(t *testing.T) {
-       hcl := `
-job "example" {
-
-dynamic "group" {
-  for_each = ["groupA", "groupB", "groupC"]
-  labels   = [group.value]
+       hcl := `locals {
+  instances = [
+    { name = "groupA", idx = 1 },
+    { name = "groupB", idx = 2 },
+    { name = "groupC", idx = 3 },
+  ]
+}

-  content {
-    task "simple" {
-      driver = "raw_exec"
+job "example" {

+  dynamic "group" {
+    for_each = local.instances
+    labels   = [group.value.name]
+
+    content {
+      count = group.value.idx
+
+      task "simple" {
+        driver  = "raw_exec"
+        config {
+          command = group.value.name
+        }
+        meta {
+          VERSION = group.value.idx
+        }
+        env {
+          ID = format("id:%s", group.value.idx)
+        }
+        resources {
+          cpu = group.value.idx
+        }
+      }
     }
   }
 }
-}
 `
        out, err := ParseWithConfig(&ParseConfig{
                Path:    "input.hcl",
@@ -305,6 +326,12 @@ dynamic "group" {
        require.Equal(t, "groupA", *out.TaskGroups[0].Name)
        require.Equal(t, "groupB", *out.TaskGroups[1].Name)
        require.Equal(t, "groupC", *out.TaskGroups[2].Name)
+       require.Equal(t, 1, *out.TaskGroups[0].Tasks[0].Resources.CPU)
+
+       // TODO: map[string]interface{} blocks all fail
+       assert.Equal(t, "groupA", out.TaskGroups[0].Tasks[0].Config["command"])
+       assert.Equal(t, "1", out.TaskGroups[0].Tasks[0].Meta["VERSION"])
+       assert.Equal(t, "id:1", out.TaskGroups[0].Tasks[0].Env["ID"])
 }

 func TestParse_InvalidScalingSyntax(t *testing.T) {
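
For illustration, here is a minimal sketch of why blocks like env hit that code path: decoding an HCL body into a Go map goes through JustAttributes(), since the keys are user-defined rather than schema-defined. This uses plain gohcl for the decode rather than Nomad's actual jobspec2 decoder, and the tiny env schema is illustrative only; it assumes github.com/hashicorp/hcl/v2 with its gohcl and hclsyntax packages.

package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/gohcl"
	"github.com/hashicorp/hcl/v2/hclsyntax"
)

const src = `
env {
  VERSION = "1.0.2"
}
`

func main() {
	file, diags := hclsyntax.ParseConfig([]byte(src), "env.hcl", hcl.InitialPos)
	if diags.HasErrors() {
		panic(diags)
	}

	content, diags := file.Body.Content(&hcl.BodySchema{
		Blocks: []hcl.BlockHeaderSchema{{Type: "env"}},
	})
	if diags.HasErrors() {
		panic(diags)
	}

	// A map target has no fixed schema, so gohcl reads the block body with
	// JustAttributes() -- the same call that dynblock's expanded bodies
	// pass straight through to the original body.
	var env map[string]string
	if diags := gohcl.DecodeBody(content.Blocks[0].Body, nil, &env); diags.HasErrors() {
		panic(diags)
	}
	fmt.Println(env["VERSION"]) // prints 1.0.2
}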

tgross (Member) commented Jan 28, 2021

A patch that seems to fix the problem in the HCL library (hat-tip to @notnoop who discovered this):

diff --git a/vendor/github.com/hashicorp/hcl/v2/ext/dynblock/expand_body.go b/vendor/github.com/hashicorp/hcl/v2/ext/dynblock/expand_body.go
index 65a9eab2d..94dd35c45 100644
--- a/vendor/github.com/hashicorp/hcl/v2/ext/dynblock/expand_body.go
+++ b/vendor/github.com/hashicorp/hcl/v2/ext/dynblock/expand_body.go
@@ -251,10 +251,9 @@ func (b *expandBody) expandChild(child hcl.Body, i *iteration) hcl.Body {
 }

 func (b *expandBody) JustAttributes() (hcl.Attributes, hcl.Diagnostics) {
-       // blocks aren't allowed in JustAttributes mode and this body can
-       // only produce blocks, so we'll just pass straight through to our
-       // underlying body here.
-       return b.original.JustAttributes()
+       attrs, diags := b.original.JustAttributes()
+       attrs = b.prepareAttributes(attrs)
+       return attrs, diags
 }

 func (b *expandBody) MissingItemRange() hcl.Range {

Turns out "this body can only produce blocks" is not true, at least for Nomad.

We use a vendored branch of github.com/hashicorp/hcl/v2, so we'll need to get that patch added to that branch and then brought over to this repo. I want to do a bit more testing of this with some complex jobs to make sure we haven't broken anything unexpectedly in the process.
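
To see the failure mode end-to-end, here is a minimal standalone sketch (not Nomad code; the block names and tiny schemas are illustrative) that drives dynblock directly. On an unpatched dynblock the Value() call below fails with the same `Unknown variable: There is no variable named "group"` error reported above; with the patch, JustAttributes() carries the iteration variables through:

package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2"
	"github.com/hashicorp/hcl/v2/ext/dynblock"
	"github.com/hashicorp/hcl/v2/hclsyntax"
)

const src = `
dynamic "group" {
  for_each = ["alpine", "busybox"]
  labels   = ["thing-${group.value}"]

  content {
    env {
      VERSION = group.value
    }
  }
}
`

func main() {
	file, diags := hclsyntax.ParseConfig([]byte(src), "job.hcl", hcl.InitialPos)
	if diags.HasErrors() {
		panic(diags)
	}

	// Expand the dynamic block into concrete, labeled "group" blocks.
	expanded := dynblock.Expand(file.Body, &hcl.EvalContext{})

	content, diags := expanded.Content(&hcl.BodySchema{
		Blocks: []hcl.BlockHeaderSchema{{Type: "group", LabelNames: []string{"name"}}},
	})
	if diags.HasErrors() {
		panic(diags)
	}

	for _, group := range content.Blocks {
		inner, diags := group.Body.Content(&hcl.BodySchema{
			Blocks: []hcl.BlockHeaderSchema{{Type: "env"}},
		})
		if diags.HasErrors() {
			panic(diags)
		}

		// env has user-defined keys, so it is read via JustAttributes() --
		// exactly the method the patch above changes.
		attrs, diags := inner.Blocks[0].Body.JustAttributes()
		if diags.HasErrors() {
			panic(diags)
		}

		val, diags := attrs["VERSION"].Expr.Value(nil)
		if diags.HasErrors() {
			panic(diags) // unpatched dynblock fails here
		}
		fmt.Printf("%s: VERSION=%s\n", group.Labels[0], val.AsString())
	}
}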

tgross added stage/accepted and removed stage/needs-investigation labels Jan 28, 2021
tgross (Member) commented Jan 29, 2021

Hi @BlizzTom! I just merged #9921 and that should ship in the next standard patch version of Nomad.

tgross added this to the 1.0.3 milestone Jan 29, 2021
BlizzTom (Author) commented

Appreciate the update and the fix!

chrisvanmeer (Contributor) commented

Would it also be possible to use a variable in the task name?

github-actions (bot) commented

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Oct 10, 2022