From c1b4238f1be96805a0d34f54ecbfd5d695affb23 Mon Sep 17 00:00:00 2001
From: grembo
Date: Mon, 10 Jan 2022 20:41:38 +0100
Subject: [PATCH] Un-break templates when using vault stanza change_mode noop (#11783)

Templates in Nomad jobs use the Vault token defined in the vault stanza
when issuing credentials such as client certificates. When change_mode
is set to "noop" in the vault stanza, consul-template is not informed
when a Vault token is re-issued (which can happen from time to time for
various reasons, as described in
https://www.nomadproject.io/docs/job-specification/vault).

As a result, consul-template keeps using the old Vault token to renew
credentials and, once that token has expired, stops renewing them.

The symptom of this problem is a vault_token file that is newer than the
issued credential (e.g., TLS certificate) in a job's /secrets directory.

This change corrects this so that h.updater.updatedVaultToken(token) is
called, which informs stakeholders about the new token and makes sure
the new token is used by consul-template.

Example job template fragment:

    vault {
      policies = ["nomad-job-policy"]
      change_mode = "noop"
    }

    template {
      data = <<-EOH
        {{ with secret "pki_int/issue/nomad-job" "common_name=myjob.service.consul" "ttl=90m" "alt_names=localhost" "ip_sans=127.0.0.1"}}
        {{ .Data.certificate }}
        {{ .Data.private_key }}
        {{ .Data.issuing_ca }}
        {{ end }}
      EOH
      destination = "${NOMAD_SECRETS_DIR}/myjob.crt"
      change_mode = "noop"
    }

This fix does not alter the meaning of the three change modes of vault

- "noop" - Take no action
- "restart" - Restart the job
- "signal" - Send a signal to the task

as the switch statement following line 232 contains the necessary
logic. It is assumed that "take no action" was never meant to mean
"don't tell consul-template about the new vault token".

Successfully tested in a staging cluster consisting of multiple Nomad
client nodes.
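
The intended behavior described above can be sketched as follows. This
is a simplified, self-contained model and not the Nomad source:
tokenUpdater, handleNewToken, and the string constants are hypothetical
stand-ins for h.updater, the logic in the hook's run loop, and the
structs.VaultChangeMode* constants. The order of the switch and the
updater call is likewise an assumption.

```go
package main

import "fmt"

// Hypothetical stand-ins for the structs.VaultChangeMode* constants.
const (
	changeModeNoop    = "noop"
	changeModeRestart = "restart"
	changeModeSignal  = "signal"
)

// tokenUpdater is a hypothetical stand-in for the component that
// forwards a new Vault token to consumers such as consul-template.
type tokenUpdater struct {
	tokens []string
}

// updatedVaultToken records every token it is told about.
func (u *tokenUpdater) updatedVaultToken(token string) {
	u.tokens = append(u.tokens, token)
}

// handleNewToken models what should happen once a replacement token has
// been obtained: the change mode only selects the action taken on the
// task, while the updater is notified in every mode.
func handleNewToken(u *tokenUpdater, changeMode, token string) string {
	var action string
	switch changeMode {
	case changeModeRestart:
		action = "restart task"
	case changeModeSignal:
		action = "signal task"
	default:
		// "noop": leave the task alone.
		action = "no action"
	}

	// Independent of the change mode, pass the new token on.
	u.updatedVaultToken(token)
	return action
}

func main() {
	for _, mode := range []string{changeModeNoop, changeModeRestart, changeModeSignal} {
		u := &tokenUpdater{}
		action := handleNewToken(u, mode, "new-token")
		fmt.Printf("mode=%s action=%q tokens forwarded=%d\n", mode, action, len(u.tokens))
	}
}
```

In this model the "noop" case still forwards the token, which is the
distinction the commit message draws between "take no action" on the
task and "don't tell consul-template about the new vault token".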
---
 client/allocrunner/taskrunner/vault_hook.go | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/client/allocrunner/taskrunner/vault_hook.go b/client/allocrunner/taskrunner/vault_hook.go
index 016fbf6108ff..af511067478e 100644
--- a/client/allocrunner/taskrunner/vault_hook.go
+++ b/client/allocrunner/taskrunner/vault_hook.go
@@ -276,11 +276,7 @@ OUTER:
 			token = ""
 			h.logger.Error("failed to renew Vault token", "error", err)
 			stopRenewal()
-
-			// Check if we have to do anything
-			if h.vaultStanza.ChangeMode != structs.VaultChangeModeNoop {
-				updatedToken = true
-			}
+			updatedToken = true
 		case <-h.ctx.Done():
 			stopRenewal()
 			return