tls verification issues after update to 0.6.2 #3126

Closed
dansteen opened this issue Aug 29, 2017 · 7 comments · Fixed by #3127

dansteen commented Aug 29, 2017

Nomad version

nomad 0.6.2

Operating system and Environment details

server and application clients: debian jessie
client that I am running this from: archlinux

Issue

After upgrading to 0.6.2 I get the following error when getting node and job status:

#> nomad node-status 489da15e
error fetching node stats (HINT: ensure Client.Advertise.HTTP is set): Get https://orgs-0a0.stag.awse:4646/v1/client/stats: x509: certificate is valid for global.nomad, client.global.nomad, localhost, orgs-0a0.stag.awse, not client..nomad
ID      = 489da15e
Name    = orgs-0a0.stag.awse
Class   = <none>
DC      = awse
Drain   = false
Status  = ready
Drivers = exec,java,rkt
Error querying node for running allocations: Get https://nomad.stag.awse:4646/v1/node/489da15e-1278-ce2a-e4e5-21d4f9df19ef/allocations: x509: certificate is valid for global.nomad, localhost, nomad-0ea.stag.awse, nomad.server.consul, nomad.stag.awse, server.global.nomad, not client..nomad

There are two really strange parts to this. In the first error message, which seems to be coming from the client:

error fetching node stats (HINT: ensure Client.Advertise.HTTP is set): Get https://orgs-0a0.stag.awse:4646/v1/client/stats: x509: certificate is valid for global.nomad, client.global.nomad, localhost, orgs-0a0.stag.awse, not client..nomad

Why does the client send the name client..nomad, as opposed to client.global.nomad?

In the second error message:

Error querying node for running allocations: Get https://nomad.stag.awse:4646/v1/node/489da15e-1278-ce2a-e4e5-21d4f9df19ef/allocations: x509: certificate is valid for global.nomad, localhost, nomad-0ea.stag.awse, nomad.server.consul, nomad.stag.awse, server.global.nomad, not client..nomad

Why is it looking for a client certificate from the server?

Note that this has worked for several months without issue prior to the update.

Also, if I pass -tls-skip-verify, everything works fine.
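
Note: both errors expect the same name, client..nomad, which looks like a role.region.nomad pattern with the region left empty. The Python sketch below only illustrates that reading. The helper names `expected_server_name` and `san_matches` are hypothetical, the role.region.nomad pattern is inferred from the names in the error output, and none of this is Nomad's actual code.

```python
from typing import Iterable


def expected_server_name(role: str, region: str) -> str:
    """Build a role.region.nomad name (pattern assumed from the error output)."""
    return f"{role}.{region}.nomad"


def san_matches(expected: str, sans: Iterable[str]) -> bool:
    """Exact, case-insensitive match only; real x509 checks also handle wildcards."""
    return expected.lower() in {name.lower() for name in sans}


# Names listed in the first error message above.
client_sans = ["global.nomad", "client.global.nomad", "localhost", "orgs-0a0.stag.awse"]

empty_region = expected_server_name("client", "")
with_region = expected_server_name("client", "global")

print(empty_region)                            # client..nomad
print(san_matches(empty_region, client_sans))  # False
print(san_matches(with_region, client_sans))   # True
```

Under this assumption, an empty region would be enough to produce the rejected name, even though the certificate itself lists client.global.nomad.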

Nomad client config

bind_addr = "0.0.0.0" # the default
datacenter = "awse"
name = "orgs-0a0.stag.awse"

advertise {
  http = "orgs-0a0.stag.awse"
}

data_dir  = "/var/nomad"

client {
  enabled       = true
  #node_class = "orgs-stag"
  meta {
     env= "stag"
     security_posture= "stag"
     role= "orgs"
  }
  options {
     "driver.blacklist" = "docker,qemu"
     "user.checked_drivers" = "exec,raw_exec"
     "user.blacklist" = "root,admin"
     # move to whitelist once this is resolved: https://github.com/hashicorp/nomad/issues/2158
     "user.whitelist" = "appuser"
  }
}

tls {
  http = true
  rpc  = true
  ca_file = "/etc/ssl/certs/ca.pem"
  cert_file = "/etc/nomad.d/nomad.cert"
  key_file = "/etc/nomad.d/nomad.key"
  verify_server_hostname = false
}

telemetry {
    datadog_address = "127.0.0.1:8125"
}

vault {
  enabled = true
  address = "https://vault.prod.awse:8200"
}

consul {
  client_service_name = "nomad-client-stag"
  client_auto_join = "true"
  server_service_name = "nomad-stag"
}

Server config

bind_addr = "0.0.0.0" # the default
datacenter = "awse"

data_dir  = "/var/nomad"


server {
  enabled          = true
  bootstrap_expect = 3
  encrypt          = "xxxxx"
  rejoin_after_leave = true
}

# we use prod consul here since the staging nomad server is used to run staging services, but we don't want to have to maintain
# duplicate copies of the data, so we hook it to prod consul.
# stag consul is really just for testing consul.
consul {
  token = "xxx-xxxx"
  server_auto_join = "true"
  server_service_name = "nomad-stag"
}

vault {
   enabled = true
   token = "xxxx-xxxx"
   create_from_role = "nomad-cluster-stag"
   address = "https://vault.prod.awse:8200"
   allow_unauthenticated = "false"
}


tls {
  http = true
  rpc  = true
  ca_file = "/etc/ssl/certs/ca.pem"
  cert_file = "/etc/nomad.d/nomad.cert"
  key_file = "/etc/nomad.d/nomad.key"
}

telemetry {
    datadog_address = "127.0.0.1:8125",
}

dadgar (Contributor) commented Aug 29, 2017

@dansteen Can you test this binary:
nomad.zip

You would only need to replace it for the CLI. No need to change server/client binaries.

dansteen (Author) commented

testing

dansteen (Author) commented

@dadgar looks good! Those issues are gone.

dadgar (Contributor) commented Aug 29, 2017

Awesome! Thank you so much for the quick turnaround!

dansteen (Author) commented

thank you!

Labibme commented Dec 5, 2018

Hello,
How do I set "-tls-skip-verify"?
Where do I have to specify it?

Thanks,
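
Note: in the original report, -tls-skip-verify is passed as an option to the nomad CLI command being run. The Python sketch below only shows where the flag sits on the command line. The helper `nomad_command` is hypothetical, and the placement (after the subcommand, before its arguments) assumes the usual `nomad <subcommand> [options] <args>` layout. The command is printed, not executed.

```python
import shlex


def nomad_command(subcommand, *args, skip_verify=False):
    """Build an argv list for the nomad CLI (hypothetical helper, nothing is run)."""
    argv = ["nomad", subcommand]
    if skip_verify:
        # Options go after the subcommand and before its positional arguments.
        argv.append("-tls-skip-verify")
    argv.extend(args)
    return argv


cmd = nomad_command("node-status", "489da15e", skip_verify=True)
print(shlex.join(cmd))  # nomad node-status -tls-skip-verify 489da15e
```

Skipping verification turns off the certificate check entirely, so it is better suited to diagnosis than to regular use.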

github-actions commented

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Nov 27, 2022