-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
identity: support jwt expiration and rotation #18262
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few minor nit-picks, but LGTM.
While testing to make sure the Prestart
hook was called on task restore I encountered a panic:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x10 pc=0x10336152c]
goroutine 247 [running]:
github.com/hashicorp/nomad/client/widmgr.(*WIDMgr).SignIdentities(0x0, 0x14000aa8570?, {0x14000710020, 0x1, 0x1})
github.com/hashicorp/nomad/client/widmgr/widmgr.go:57 +0x7c
github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*identityHook).getIdentities(0x14000cc44e0, 0x1400086a600, 0x140001d1800)
github.com/hashicorp/nomad/client/allocrunner/taskrunner/identity_hook.go:141 +0x1a0
github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*identityHook).Prestart(0x104d1bf80?, {0x103de8705?, 0x14000cc44e0?}, 0x140009fc820, 0x1791ffd12?)
github.com/hashicorp/nomad/client/allocrunner/taskrunner/identity_hook.go:59 +0x3c
github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*TaskRunner).prestart(0x14000cb9880)
github.com/hashicorp/nomad/client/allocrunner/taskrunner/task_runner_hooks.go:267 +0x608
github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*TaskRunner).Run(0x14000cb9880)
github.com/hashicorp/nomad/client/allocrunner/taskrunner/task_runner.go:594 +0x5dc
created by github.com/hashicorp/nomad/client/allocrunner.(*allocRunner).runTasks in goroutine 186
github.com/hashicorp/nomad/client/allocrunner/alloc_runner.go:400 +0x58
But looking at the stack trace, it's not related to these changes, and it happens on main
as well, so I opened #18356 to fix it.
task *structs.Task | ||
tokenDir string | ||
envBuilder *taskenv.Builder | ||
tr tokenSetter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tr tokenSetter | |
ts tokenSetter |
var retry uint64 | ||
|
||
for err := h.stopCtx.Err(); err == nil; { | ||
h.logger.Trace("waiting to renew identities", "num", len(reqs), "wait", wait) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
h.logger.Trace("waiting to renew identities", "num", len(reqs), "wait", wait) |
This seems a bit repetitive with the Debug
log message in the end of the loop. Since they are at the extremes they always run one after the other:
2023-08-29T10:23:22.631-0400 [DEBUG] client.alloc_runner.task_runner.task_hook.identity: waiting to renew workloading identities: alloc_id=49c8438f-a91e-701c-2abd-e91750bc03f0 task=hello next=10.716433902s
2023-08-29T10:23:22.631-0400 [TRACE] client.alloc_runner.task_runner.task_hook.identity: waiting to renew identities: alloc_id=49c8438f-a91e-701c-2abd-e91750bc03f0 task=hello num=1 wait=10.716433902s
2023-08-29T10:23:33.349-0400 [DEBUG] client.alloc_runner.task_runner.task_hook.identity: getting new signed identities: alloc_id=49c8438f-a91e-701c-2abd-e91750bc03f0 task=hello num=1
2023-08-29T10:23:33.352-0400 [TRACE] client: next heartbeat: period=12.479408493s
2023-08-29T10:23:33.352-0400 [DEBUG] client.alloc_runner.task_runner.task_hook.identity: waiting to renew workloading identities: alloc_id=49c8438f-a91e-701c-2abd-e91750bc03f0 task=hello next=10.997138552s
2023-08-29T10:23:33.352-0400 [TRACE] client.alloc_runner.task_runner.task_hook.identity: waiting to renew identities: alloc_id=49c8438f-a91e-701c-2abd-e91750bc03f0 task=hello num=1 wait=10.997138552s
Just noting here that with #18363 merged the new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
b8bae3e
to
dbf8f7b
Compare
This reverts commit dbf8f7b.
I was defeated by HCL parsing and didn't want to block this PR until I figured it out. Created #18438 Sorry :( |
The initial intention behind the `vault.use_identity` configuration was to indicate to Nomad servers that they would need to sign a workload identities for allocs with a `vault` block. But in order to support identity renewal, #18262 and #18431 moved the token signing logic to the alloc runner since a new token needs to be signed prior to the TTL expiring. So #18343 implemented `use_identity` as a flag to indicate that the workload identity JWT flow should be used when deriving Vault tokens for tasks. But this configuration value is set on servers so it is not available to clients at the time of token derivation, making its meaning not clear: a job may end up using the identity-based flow even when `use_identity` is `false`. The only reliable signal available to clients at token derivation time is the presence of an `identity` block for Vault, and this is already configured with the `vault.default_identity` configuration block, making `vault.use_identity` redundant. This commit removes the `vault.use_identity` configuration and simplifies the logic on when an implicit Vault identity is injected into tasks.
The initial intention behind the `vault.use_identity` configuration was to indicate to Nomad servers that they would need to sign a workload identities for allocs with a `vault` block. But in order to support identity renewal, #18262 and #18431 moved the token signing logic to the alloc runner since a new token needs to be signed prior to the TTL expiring. So #18343 implemented `use_identity` as a flag to indicate that the workload identity JWT flow should be used when deriving Vault tokens for tasks. But this configuration value is set on servers so it is not available to clients at the time of token derivation, making its meaning not clear: a job may end up using the identity-based flow even when `use_identity` is `false`. The only reliable signal available to clients at token derivation time is the presence of an `identity` block for Vault, and this is already configured with the `vault.default_identity` configuration block, making `vault.use_identity` redundant. This commit removes the `vault.use_identity` configuration and simplifies the logic on when an implicit Vault identity is injected into tasks.
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
What is in this PR:
WorkloadIdentity.TTL
field for allowing users to specify the lifetime of a specific jwtWhat's not in this PR:
WorkloadIdentity.ChangeMode
- it shouldn't be necessary for the Consul/Vault refresh as they will use internal plumbing. Still planning on adding ASAP.I felt like this was a nice sized PR and will work on root key rotation next.