Consul Connect job with exposed check plan diff always shows changes #10099

Closed
evandam opened this issue Feb 26, 2021 · 3 comments · Fixed by #10492
Assignees: shoenig
Labels: stage/accepted (Confirmed, and intend to work on. No timeline commitment though.), theme/consul/connect (Consul Connect integration), type/bug

Comments

evandam commented Feb 26, 2021

Nomad version

Nomad v1.0.4 (9294f35f9aa8dbb4acb6e85fa88e3e2534a3e41a)

Operating system and Environment details

Ubuntu 18.04

Issue

When running a job with Consul Connect enabled and an exposed check, Nomad always sees the check's port as changed in the plan. It also warns that the allocation cannot be placed, but that does not actually appear to be the case.

Reproduction steps

nomad job run countdash.hcl
nomad job plan countdash.hcl

Job file (if appropriate)

job "countdash" {
  datacenters = ["kitchen"]

  group "api" {
    count = 1

    network {
      mode = "bridge"
    }

    service {
      name = "count-api"
      port = "9001"

      connect {
        sidecar_service {}
        sidecar_task {
          resources {
            cpu    = 20
            memory = 20
          }
        }
      }

      check {
        type     = "http"
        name     = "api-health"
        path     = "/health"
        interval = "10s"
        timeout  = "3s"
        expose   = true
      }
    }

    task "web" {
      driver = "docker"

      config {
        image = "hashicorpnomad/counter-api:v3"
      }

      resources {
        cpu = 20
        memory = 50
      }
    }
  }

  group "dashboard" {
    network {
      mode = "bridge"
      port "http" {}
    }

    service {
      name = "count-dashboard"
      port = "http"
      check {
        type     = "http"
        name     = "dashboard-health"
        path     = "/health"
        interval = "10s"
        timeout  = "3s"
      }

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "count-api"
              local_bind_port  = 8080
            }
          }
        }

        sidecar_task {
          resources {
            cpu    = 20
            memory = 50
          }
        }
      }
    }

    task "dashboard" {
      driver = "docker"

      env {
        COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
        PORT = "${NOMAD_PORT_http}"
      }

      config {
        image = "hashicorpnomad/counter-dashboard:v3"
      }

      resources {
        cpu = 20
        memory = 50
      }
    }
  }
}

Nomad Client logs (if appropriate)

❯ nomad job plan countdash.hcl
+/- Job: "countdash"
+/- Task Group: "api" (1 create/destroy update)
  +   Network {
      + MBits: "0"
      + Mode:  "bridge"
      + Dynamic Port {
        + HostNetwork: "default"
        + Label:       "connect-proxy-count-api"
        + To:          "-1"
        }
      + Dynamic Port {
        + HostNetwork: "default"
        + Label:       "svc_count-api_ck_6b034a"
        + To:          "-1"
        }
      }
  -   Network {
      - MBits: "0"
      - Mode:  "bridge"
      - Dynamic Port {
        - HostNetwork: "default"
        - Label:       "connect-proxy-count-api"
        - To:          "-1"
        }
      - Dynamic Port {
        - HostNetwork: "default"
        - Label:       "svc_count-api_ck_8c8e30"
        - To:          "-1"
        }
      }
  +/- Service {
        AddressMode:       "auto"
        EnableTagOverride: "false"
        Name:              "count-api"
        PortLabel:         "9001"
        TaskName:          ""
    +/- Check {
          AddressMode:            ""
          Command:                ""
          Expose:                 "true"
          FailuresBeforeCritical: "0"
          GRPCService:            ""
          GRPCUseTLS:             "false"
          InitialStatus:          ""
          Interval:               "10000000000"
          Method:                 ""
          Name:                   "api-health"
          Path:                   "/health"
      +/- PortLabel:              "svc_count-api_ck_8c8e30" => "svc_count-api_ck_6b034a"
          Protocol:               ""
          SuccessBeforePassing:   "0"
          TLSSkipVerify:          "false"
          TaskName:               ""
          Timeout:                "3000000000"
          Type:                   "http"
        }
    +/- ConsulConnect {
          Native: "false"
      +/- SidecarService {
            Port: ""
        +/- ConsulProxy {
          LocalServiceAddress: ""
          LocalServicePort:    "0"
            }
          }
        }
      }
      Task: "connect-proxy-count-api"
      Task: "web"

    Task Group: "dashboard" (1 in-place update)
      Task: "connect-proxy-count-dashboard"
      Task: "dashboard"

Scheduler dry-run:
- WARNING: Failed to place all allocations.
  Task Group "api" (failed to place 1 allocation):
    * Resources exhausted on 1 nodes
    * Dimension "network: no addresses available for \"\" network" exhausted on 1 nodes

Job Modify Index: 122
To submit the job with version verification run:

nomad job run -check-index 122 debug/countdash.hcl

When running the job with the check-index flag, the job will only be run if the
job modify index given matches the server-side version. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.
shoenig added the stage/accepted, type/bug, and theme/consul/connect labels on Mar 1, 2021
shoenig added this to Needs Triage in Nomad - Community Issues Triage via automation on Mar 1, 2021
shoenig moved this from Needs Triage to In Progress in Nomad - Community Issues Triage on Mar 1, 2021
shoenig self-assigned this on Mar 1, 2021

shoenig (Member) commented Mar 1, 2021

Thanks for reporting, @evandam! Indeed we shouldn't be including that dynamically generated port label in the job diff.

fredwangwang (Contributor) commented

IIRC this has been an issue since the introduction of expose. This is what we've been doing to overcome the issue:

network {
  mode = "bridge"
  port "exposed" {}
}

...

check {
  expose = true
  type = "http"
  port = "exposed" // **hardcode port label exposed**
  path = "/hc"
  interval = "10s"
  timeout = "60s"
}
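
Declaring the port in the group network and pointing the check at it by label means Nomad does not have to inject a randomly labeled port (such as svc_count-api_ck_8c8e30 in the plan output above) for the exposed check, which appears to be why the plan diff stays stable with this workaround.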

shoenig added commits that referenced this issue Apr 30, 2021

This PR uses the checksum of the check for which a dynamic exposed
port is being generated (instead of a UUID prefix) so that the
generated port label is deterministic.

This fixes 2 bugs:
 - 'job plan' output is now idempotent for jobs making use of injected ports
 - tasks will no longer be destructively updated when jobs making use of
   injected ports are re-run without changing any user specified part of
   job config.

Closes: #10099
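
The commit message describes the fix only at a high level. Below is a minimal Go sketch of checksum-derived port labels matching the svc_<service>_ck_<suffix> pattern seen in the plan output; the struct, field set, and choice of MD5 are assumptions for illustration, not Nomad's actual implementation (see the referenced PR for that).

package main

import (
    "crypto/md5"
    "fmt"
)

// exposedCheck holds only user-specified check fields; the type and field
// set here are illustrative, not Nomad's real structs.
type exposedCheck struct {
    Service  string
    Name     string
    Type     string
    Path     string
    Interval string
    Timeout  string
}

// portLabel derives "svc_<service>_ck_<6 hex chars>" from a checksum of the
// check definition. Because the input contains no random component, planning
// the same unchanged job twice yields the same label and an empty diff.
func portLabel(c exposedCheck) string {
    sum := md5.Sum([]byte(fmt.Sprintf("%s|%s|%s|%s|%s|%s",
        c.Service, c.Name, c.Type, c.Path, c.Interval, c.Timeout)))
    return fmt.Sprintf("svc_%s_ck_%x", c.Service, sum[:3])
}

func main() {
    c := exposedCheck{
        Service:  "count-api",
        Name:     "api-health",
        Type:     "http",
        Path:     "/health",
        Interval: "10s",
        Timeout:  "3s",
    }
    fmt.Println(portLabel(c)) // same output on every run, unlike a UUID prefix
}
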
Nomad - Community Issues Triage automation moved this from In Progress to Done May 3, 2021

github-actions (bot) commented

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators on Oct 20, 2022