consul/connect: use additional constraints in scheduling connect tasks #10702

Merged: 1 commit, Jun 4, 2021

@shoenig shoenig commented Jun 3, 2021

This PR adds two additional constraints on Connect sidecar and gateway tasks,
making sure Nomad schedules them only onto nodes where Connect is actually
enabled on the Consul agent.

Consul requires `connect.enabled = true` and `ports.grpc = <number>` to be
explicitly set in the agent configuration before the Connect APIs will work.
Until now, Nomad only validated a minimum version of Consul, which caused
confusion for users who tried to run Connect tasks on nodes where Consul was
not yet sufficiently configured. These constraints prevent job scheduling on
nodes where Connect is not actually usable.

Closes #10700
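
To make the two checks concrete, here is a minimal Go sketch of the feasibility logic described above. It is not Nomad's scheduler code: the `connectFeasible` helper and the plain `map[string]string` of node attributes are assumptions made for illustration. Only the attribute names (`consul.connect`, `consul.grpc`) and the constraint strings come from this PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// connectFeasible is a hypothetical helper (not part of Nomad). It reports
// whether a node's fingerprinted Consul attributes satisfy the two
// constraints from this PR and, if not, which constraint filtered the node.
func connectFeasible(attrs map[string]string) (bool, string) {
	// Constraint 1: Connect must be enabled on the Consul agent.
	if attrs["consul.connect"] != "true" {
		return false, "${attr.consul.connect} = true"
	}
	// Constraint 2: the gRPC listener must have a positive port.
	// A missing or non-numeric value is treated as not enabled.
	port, err := strconv.Atoi(attrs["consul.grpc"])
	if err != nil || port <= 0 {
		return false, "${attr.consul.grpc} > 0"
	}
	return true, ""
}

func main() {
	node := map[string]string{
		"consul.connect": "true",
		"consul.grpc":    "-1", // gRPC listener disabled
	}
	ok, failed := connectFeasible(node)
	fmt.Printf("feasible=%v failed=%q\n", ok, failed)
}
```

Checking `consul.connect` first is an arbitrary choice in this sketch; it says nothing about the order in which the scheduler evaluates constraints.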

shoenig commented Jun 3, 2021

wait for #10699

Spot checking

all enabled
consul agent -dev
sudo nomad agent -dev-connect
$ nomad node status -self -verbose | grep consul 
consul.connect            = true
consul.datacenter         = dc1
consul.ft.namespaces      = true
consul.grpc               = 8502
consul.revision           = 22ce6c6ad
consul.server             = true
consul.sku                = ent
consul.version            = 1.9.5+ent
unique.consul.name        = x52
$ nomad job run example.nomad
==> Monitoring evaluation "918b7bca"
    Evaluation triggered by job "countdash"
==> Monitoring evaluation "918b7bca"
    Evaluation within deployment: "c096f7cd"
    Allocation "35f48d14" created: node "bbdecb86", group "dashboard"
    Allocation "929ff54a" created: node "bbdecb86", group "api"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "918b7bca" finished with status "complete"

connect not enabled
ports {
  grpc = 8502
}

connect {
  enabled = false
}
consul agent -dev -config-file=consul.hcl
sudo nomad agent -dev-connect
consul.connect            = false
consul.datacenter         = dc1
consul.ft.namespaces      = true
consul.grpc               = 8502
consul.revision           = 22ce6c6ad
consul.server             = true
consul.sku                = ent
consul.version            = 1.9.5+ent
unique.consul.name        = x52
==> Monitoring evaluation "a2cf9065"
    Evaluation triggered by job "countdash"
==> Monitoring evaluation "a2cf9065"
    Evaluation within deployment: "2fc8cb4d"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "a2cf9065" finished with status "complete" but failed to place all allocations:
    Task Group "dashboard" (failed to place 1 allocation):
      * Constraint "${attr.consul.connect} = true": 1 nodes excluded by filter
    Task Group "api" (failed to place 1 allocation):
      * Constraint "${attr.consul.connect} = true": 1 nodes excluded by filter
    Evaluation "d18a96e9" waiting for additional capacity to place remainder

grpc not enabled
ports {
  grpc = -1 # default in non-dev mode
}

connect {
  enabled = true
}
consul agent -dev -config-file=consul.hcl
sudo nomad agent -dev-connect
consul.connect            = true
consul.datacenter         = dc1
consul.ft.namespaces      = true
consul.grpc               = -1
consul.revision           = 22ce6c6ad
consul.server             = true
consul.sku                = ent
consul.version            = 1.9.5+ent
unique.consul.name        = x52
==> Monitoring evaluation "bae7adb9"
    Evaluation triggered by job "countdash"
==> Monitoring evaluation "bae7adb9"
    Evaluation within deployment: "a8d07837"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "bae7adb9" finished with status "complete" but failed to place all allocations:
    Task Group "api" (failed to place 1 allocation):
      * Constraint "${attr.consul.grpc} > 0": 1 nodes excluded by filter
    Task Group "dashboard" (failed to place 1 allocation):
      * Constraint "${attr.consul.grpc} > 0": 1 nodes excluded by filter
    Evaluation "415f6240" waiting for additional capacity to place remainder
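
For repeating this kind of spot check by hand, the `key = value` lines printed by `nomad node status -self -verbose | grep consul` can be read into a map. The following is a rough Go sketch; `parseAttrs` is a hypothetical helper, and the sample input is copied from the "grpc not enabled" output above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseAttrs is a hypothetical helper that turns "key = value" lines, in the
// shape shown in the node status output above, into a map. Lines without an
// "=" are skipped.
func parseAttrs(output string) map[string]string {
	attrs := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), "=", 2)
		if len(parts) != 2 {
			continue
		}
		attrs[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}
	return attrs
}

func main() {
	out := "consul.connect            = true\n" +
		"consul.grpc               = -1\n" +
		"consul.version            = 1.9.5+ent\n"
	attrs := parseAttrs(out)
	// prints: true -1
	fmt.Println(attrs["consul.connect"], attrs["consul.grpc"])
}
```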

@tgross tgross left a comment

LGTM!

@shoenig shoenig merged commit b0ac228 into main Jun 4, 2021
@shoenig shoenig deleted the f-cc-constraints branch June 4, 2021 13:11
shoenig added a commit that referenced this pull request Jun 14, 2021
PR #10702 added 2 new constraints
for connect jobs - one for Consul gRPC listener, and one for Connect being
enabled on Clients. Connect does not need to be enabled on clients, only
on Consul servers. Remove the extra constraint.

Discuss:
https://discuss.hashicorp.com/t/nomad-1-1-1-and-consul-connect-enabled-on-consul-clients/25295
@tgross tgross added this to the 1.1.1 milestone Jun 15, 2021
@github-actions

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 20, 2022