[question] Constraint "${attr.vault.version} version >= 0.6.1" filtered 1 nodes #4276

Closed
karma0 opened this issue May 10, 2018 · 10 comments

@karma0

karma0 commented May 10, 2018

Nomad version

Nomad v0.8.3 (c85483d)
Vault v0.10.1 ('756fdc4587350daf1c65b93647b2cc31a6f119cd')

Operating system and Environment details

Terraform modules in AWS:
terraform-aws-vault
terraform-aws-nomad

These modules both launch ASGs, totaling 3 EC2 clusters: vault, nomad servers, and nomad clients.

Issue

Attempting to plan or execute a job that uses Vault results in a hidden constraint that filters nodes with Vault version >= 0.6.1.

Reproduction steps

Executing nomad job plan test.nomad for the job below reveals:

Scheduler dry-run:
- WARNING: Failed to place all allocations.
  Task Group "api" (failed to place 1 allocation):
    * No nodes are available in datacenter "us-east-1b"
    * No nodes are available in datacenter "us-east-1c"
    * Constraint "${attr.vault.version} version >= 0.6.1" filtered 1 nodes

Nomad Server logs (if appropriate)

None.

Nomad Client logs (if appropriate)

None.

Job file (if appropriate)

job "test" {
  datacenters = ["us-east-1a", "us-east-1b", "us-east-1c"]
  type = "service"
  update {
    max_parallel = 1
    min_healthy_time = "10s"
    healthy_deadline = "3m"
    auto_revert = false
    canary = 0
  }
  group "api" {
    restart {
      attempts = 10
      interval = "5m"
      delay = "25s"
      mode = "delay"
    }
    ephemeral_disk {
      size = 300
    }
    task "testapi" {
      driver = "docker"
      config {
        image = "https://<REDACTED>"
        port_map {
          http = 8080
        }
      }
      resources {
        memory = 256 # 256MB
        network {
          mbits = 10
          port "http" {}
        }
      }
      service {
        name = "global-testapi-check"
        tags = ["global", "cache"]
        port = "http"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }
      template {
        source      = "nomad-config.json.tpl"
        destination = "config.json"
        perms       = 640
      }
      vault {
        policies      = ["testapi"]
        change_mode   = "signal"
        change_signal = "SIGHUP"
      }
    }
  }
}
@angrycub
Contributor

@karma0, this is the intended behavior. If no client node meets this constraint, the cluster is unable to run the Vault-enabled job. Is there an element of your issue report that I am missing?

@karma0
Author

karma0 commented May 10, 2018

Where is this constraint coming from? Is Nomad not compatible with Vault >= 0.6.1?

@angrycub
Contributor

angrycub commented May 10, 2018

"Constraint "${attr.vault.version} version >= 0.6.1" filtered 1 nodes" indicates that Nomad believes that you don't have Vault greater than or equal to v0.6.1. The constraint text is the expectation. More conversationally, this would be read as: "One node was filtered from the eligibility because it does not meet the constraint that ${attr.vault.version} is greater than or equal to 0.6.1."

Nomad dynamically generates this constraint for Vault-enabled jobs. When it filters nodes, that usually indicates a misconfiguration of, or some other problem with, the Vault integration on those Nomad clients.
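For reference, the implicit constraint behaves as if the following explicit constraint stanza had been added to the job (a sketch of the equivalent; Nomad injects it automatically for any job that declares a vault stanza):

constraint {
  attribute = "${attr.vault.version}"
  operator  = "version"
  value     = ">= 0.6.1"
}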

@dadgar
Contributor

dadgar commented May 10, 2018

@karma0 Closing this as it is expected behavior and @angrycub's explanation is correct. Make sure you have set up your clients to be able to talk to the same Vault as the servers.
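One way to confirm whether a client's Vault fingerprint succeeded (a sketch; the node ID is a placeholder) is to inspect the node's attributes and look for vault.version and vault.accessible. If they are missing, the client could not reach or authenticate to Vault:

# List node IDs with `nomad node status`, then inspect one of them:
nomad node status -verbose <node-id> | grep -i vault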

dadgar closed this as completed May 10, 2018
@angrycub
Contributor

@karma0 I will say that I was just fighting this in my local lab cluster. It turned out that I had accidentally set the address in my vault stanza to https://active.vault.service.consul:8200 when I wasn't running TLS on my Vault server.

@karma0
Author

karma0 commented May 10, 2018

@angrycub What is the difference between active.vault.service.consul and vault.service.consul? Should either work, with one just preferred?

Also, Vault is set up to use certs. Here is the client and server config (vault.hcl) that I'm using for Nomad:

vault {
  enabled          = true
  cert_file        = "/opt/nomad/tls/vault.crt.pem"
  key_file         = "/opt/nomad/tls/vault.key.pem"
  token            = "<REDACTED>"
  address          = "https://vault.services.consul:8200"
  create_from_role = "nomad-cluster"
}

Here is the vault config:

listener "tcp" {
  address         = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8201"
  tls_cert_file   = "/opt/vault/tls/vault.crt.pem"
  tls_key_file    = "/opt/vault/tls/vault.key.pem"
}

storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault/"
  scheme  = "http"
  service = "vault"

  # HA settings
  cluster_addr  = "https://10.10.1.57:8201"
  api_addr      = "https://10.10.1.57:8201"
}

karma0 changed the title Constraint "${attr.vault.version} version >= 0.6.1" filtered 1 nodes to [question] Constraint "${attr.vault.version} version >= 0.6.1" filtered 1 nodes May 10, 2018
@angrycub
Contributor

angrycub commented May 10, 2018

I think I see your issue. You have "https://vault.services.consul:8200" and I believe it should be "https://vault.service.consul:8200" — no trailing s on service.

The only other thing I noticed in the configurations is that you don't supply ca_path, which is fine if the operating system's CA path contains the CA certs used to generate your TLS certs.
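Putting both of those together, a corrected vault stanza might look like the following (a sketch based on the config you posted; the CA file path is an assumption for wherever the CA that signed Vault's certificate lives):

vault {
  enabled          = true
  address          = "https://vault.service.consul:8200"  # "service", not "services"
  ca_file          = "/opt/nomad/tls/ca.crt.pem"          # assumed path; ca_path (a directory of CA certs) also works
  cert_file        = "/opt/nomad/tls/vault.crt.pem"
  key_file         = "/opt/nomad/tls/vault.key.pem"
  token            = "<REDACTED>"
  create_from_role = "nomad-cluster"
}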

As to active.vault.service.consul: a Vault cluster in an HA configuration tags the currently active node with active in Consul. This lets you target the active node specifically, rather than the active node plus any standbys (all of which are registered as providing the vault service). It saves a hop, because standbys forward requests to the active node anyway.
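If you want to see what each name resolves to, you can query the Consul agent's DNS interface directly (default port 8600):

# All Vault nodes (active and standbys):
dig @127.0.0.1 -p 8600 vault.service.consul +short

# Only the currently active node:
dig @127.0.0.1 -p 8600 active.vault.service.consul +short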

Hope this gets you unjammed!

@karma0
Author

karma0 commented May 18, 2018

The issue turned out to be a bit more convoluted. The certs weren't set up with the correct IP addresses and/or DNS names. I was using private-tls-cert, and the certs' IP addresses were set to ["127.0.0.1"] while the DNS names were set to ["*.vault.service.consul", "vault.service.consul"]. The problem was that the Vault clients weren't connecting from localhost or from an IP address that resolved to any of those DNS names.

Adding "*.consul" to the list of domains for the certs allowed the nomad systems to connect because the IP addresses resolved to *.<region>.consul in a reverse lookup.

Thanks for the help!

@angrycub
Contributor

@karma0 Thanks for posting your solution! That's a great note for future explorers.

picatz added a commit to picatz/terraform-google-nomad that referenced this issue Jul 27, 2020
This should have Nomad and Consul deployed and configured with mTLS. ACLs are currently not enabled on Consul, only Nomad.

This should provide a minimal working example using mTLS to get the count dashboard working after a ton of tinkering. 😭

The links I used during my investigation/debugging session:
* hashicorp/nomad#6463
* https://learn.hashicorp.com/nomad/consul-integration/nomad-connect-acl#run-a-connect-enabled-job
* hashicorp/nomad#6594
* hashicorp/nomad#4276
* hashicorp/nomad#7715
* https://www.consul.io/docs/agent/options
* hashicorp/nomad#7602 ⭐