Use bool instead of storage call to update NumUses #4479
vault/token_store.go
@@ -1131,16 +1132,9 @@ func (ts *TokenStore) revokeSalted(ctx context.Context, saltedID string) (ret er
	lock.Lock()
	defer lock.Unlock()

	// Lookup the token again to make sure something else didn't
	// revoke in the interim
	entry, err := ts.lookupSalted(ctx, saltedID, true)
I believe removing this call introduces a race condition. If another thread removes the token before we grab the write lock above we will restore the deleted entry.
Ah, yeah good point.
@briankassouf I moved the deletion to be inside the defer (the bool now marks it for deletion) so that the lock is held throughout the deletion and
	defer func() {
		if ret != nil {
		if deleteEntry || ret != nil {
This is still potentially racy (if `ts.view.Delete` returns an error it doesn't grab a lock). Gotta think through this a bit more.
Actually this should be alright, since the lock will be held whenever `deleteEntry` is true.
You could just shortcut this -- if deleteEntry is not true, and ret is nil, just return.
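A minimal sketch of the shortcut suggested above, written as a guard clause at the top of the deferred function. `runRevoke` and its `cleanedUp` result are hypothetical names used only for illustration; the real `revokeSalted` body is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// runRevoke is a hypothetical stand-in for revokeSalted. It reports whether
// the deferred cleanup branch ran, so the guard clause can be observed.
func runRevoke(deleteEntry bool, fail bool) (cleanedUp bool, ret error) {
	defer func() {
		// Shortcut: nothing to do when the entry is not marked for
		// deletion and the function is returning without an error.
		if !deleteEntry && ret == nil {
			return
		}
		// Assumption: in the real code this is where the entry would be
		// deleted or marked as failed.
		cleanedUp = true
	}()

	if fail {
		return false, errors.New("revocation failed")
	}
	return false, nil
}

func main() {
	for _, c := range []struct{ del, fail bool }{{false, false}, {true, false}, {false, true}} {
		cleaned, err := runRevoke(c.del, c.fail)
		fmt.Println(c.del, c.fail, cleaned, err)
	}
}
```

The guard keeps the common no-op path visible at the top of the deferred function instead of nesting the cleanup inside a combined condition.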
	// Lookup the token again to make sure something else didn't
	// revoke in the interim
	entry, err := ts.lookupSalted(ctx, saltedID, true)
I've been thinking about this, and I think removing this is okay. Marking useCount as -2 is in a protected critical path, so the only way we should be here is if the token was set to -2 by this goroutine.
Although this actually raises another question: do we even need to grab the lock here, then? Lookup won't return the token if use count is negative, so locking when the Delete happens shouldn't matter. Nothing else should be operating on it because at that point useCount is -2, so any other revocation will skip being run. If the Delete fails, we update the entry with -3, but the timing of that also doesn't matter: anything that tries to revoke before that won't work (it'll hit the -2), and anything that tries after will keep going, but this goroutine will have exited.
I think we need to still grab the lock since the `entry.NumUses == tokenRevocationInProgress` check happens before the write. There could theoretically be a case where 2 goroutines calling `revokeSalted` have checked that `entry.NumUses` is not -2 before either one got to update it, and so the `revokeSalted` operation would go through twice.
// On failure we write -3, so if we hit -2 here we're already running a
// revocation operation. This can happen due to e.g. recursion into this
// function via the expiration manager's RevokeByToken.
I wonder if we need `tokenRevocationInProgress` since `RevokeByToken` does not remove the actual token due to more recent changes (`m.revokeCommon(tokenLeaseID, false, true)` in here) so the recursion in the comment above might no longer be an issue.
Closing in favor of #4512.
This change updates the deferred function to use a local bool as a check instead of making a storage call. Since we know the state after `ts.view.Delete(ctx, path)` at the end of `revokeSalted` (if deletion fails the bool is not updated), we can use the bool to determine whether the delete was successful. This eliminates the need for a storage call, which is problematic if that request fails (e.g. because of a connectivity issue): the token will not be marked as `tokenRevocationFailed`, so subsequent calls to revoke this token (including revoke-force) will simply short-circuit and return with no errors due to https://github.com/hashicorp/vault/blob/master/vault/token_store.go#L1110-L1112.