Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

consul: probe consul namespace feature before using namespace api #10715

Merged
merged 2 commits into from
Jun 7, 2021

Conversation

shoenig
Copy link
Member

@shoenig shoenig commented Jun 7, 2021

This PR changes Nomad's wrapper around the Consul NamespaceAPI so that
it will detect if the Consul Namespaces feature is enabled before making
a request to the Namespaces API. Namespaces are not enabled in Consul OSS,
and require a suitable license to be used with Consul ENT.

Previously Nomad would check for a 404 status code when makeing a request
to the Namespaces API to "detect" if Consul OSS was being used. This does
not work for Consul ENT with Namespaces disabled, which returns a 500.

Now we avoid requesting the namespace API altogether if Consul is detected
to be the OSS sku, or if the Namespaces feature is not licensed. Since
Consul can be upgraded from OSS to ENT, or a new license applied, we cache
the value for 1 minute, refreshing on demand if expired.

Fixes https://github.com/hashicorp/nomad-enterprise/issues/575

Note that the ticket originally describes using attributes from #10688.
This turns out not to be possible due to a chicken-egg situation between
bootstrapping the agent and setting up the consul client. Also fun: the
Consul fingerprinter creates its own Consul client, because there is no
currently no way to pass the agent's client through the fingerprint factory.

This PR changes Nomad's wrapper around the Consul NamespaceAPI so that
it will detect if the Consul Namespaces feature is enabled before making
a request to the Namespaces API. Namespaces are not enabled in Consul OSS,
and require a suitable license to be used with Consul ENT.

Previously Nomad would check for a 404 status code when makeing a request
to the Namespaces API to "detect" if Consul OSS was being used. This does
not work for Consul ENT with Namespaces disabled, which returns a 500.

Now we avoid requesting the namespace API altogether if Consul is detected
to be the OSS sku, or if the Namespaces feature is not licensed. Since
Consul can be upgraded from OSS to ENT, or a new license applied, we cache
the value for 1 minute, refreshing on demand if expired.

Fixes hashicorp/nomad-enterprise#575

Note that the ticket originally describes using attributes from #10688.
This turns out not to be possible due to a chicken-egg situation between
bootstrapping the agent and setting up the consul client. Also fun: the
Consul fingerprinter creates its own Consul client, because there is no
[currently] no way to pass the agent's client through the fingerprint factory.
Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I've left a question/suggestion about the SKU and Namespaces function signatures, but not a blocker.

command/agent/consul/catalog_testing.go Show resolved Hide resolved
// potentially change over time.
type Self = map[string]map[string]interface{}

func SKU(info Self) (string, bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the SKU and Namespaces functions were in the fingerprinter, it made sense that they returned (string, bool). But now that they're in a different package entirely, their signature is "weird". For SKU the bool is false iff the string is empty, and for Namespaces we're returning a stringified version of the bool.

Maybe it'd be better if SKU returned just a string and Namespaces returned just a bool, and let their callers in the fingerprint extractors and namespace client sort out returning the values the fingerprinter expects?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I'll do this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this in 4b3ed53

@asyd
Copy link

asyd commented Jun 14, 2021

Hello folks,

sorry to comment here but I have no access to nomad enterprise issues. Is that the root cause of the following terraform issue introduced by 1.1.0:

Error: error applying jobspec: Unexpected response code: 500 (job-submitter consul token denied: consul ACL token cannot use namespace "default")

Thanks

@shoenig
Copy link
Member Author

shoenig commented Jun 14, 2021

@asyd That error message looks like this bug, which should also be fixed in v1.1.1

@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants