Use bool instead of storage call to update NumUses #4479

Closed
wants to merge 3 commits into from

Contributor

@calvn calvn commented Apr 27, 2018

This change updates the deferred func to check a local bool instead of making a storage call. The bool is set only after ts.view.Delete(ctx, path) succeeds at the end of revokeSalted (if the deletion fails, the bool is not updated), so it tells us whether the delete was successful. This eliminates the storage call, which can be problematic: if that call fails (e.g. because of a connectivity issue), the token is not marked as tokenRevocationFailed, and subsequent calls to revoke it (including revoke-force) simply short-circuit and return with no error due to https://github.com/hashicorp/vault/blob/master/vault/token_store.go#L1110-L1112.

@calvn calvn added this to the 0.10.2 milestone Apr 27, 2018
@calvn calvn requested a review from jefferai April 27, 2018 22:05
@@ -1131,16 +1132,9 @@ func (ts *TokenStore) revokeSalted(ctx context.Context, saltedID string) (ret er
lock.Lock()
defer lock.Unlock()

// Lookup the token again to make sure something else didn't
// revoke in the interim
entry, err := ts.lookupSalted(ctx, saltedID, true)
Contributor

I believe removing this call introduces a race condition. If another thread removes the token before we grab the write lock above, we will restore the deleted entry.
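
To make the concern above concrete, here is a minimal, deterministic sketch of how writing back a stale copy can restore an entry that someone else already deleted, and how re-reading under the lock avoids it. The `store` type and its methods are hypothetical and only simulate the interleaving; they are not the token store's API.

```go
package main

import (
	"fmt"
	"sync"
)

type entry struct{ NumUses int }

// store is a hypothetical in-memory stand-in for the storage view.
type store struct {
	mu      sync.Mutex
	entries map[string]entry
}

func (s *store) remove(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.entries, id)
}

// taintWithoutLookup writes back a copy obtained before the lock was taken.
func (s *store) taintWithoutLookup(id string, stale entry) {
	s.mu.Lock()
	defer s.mu.Unlock()
	stale.NumUses = -3
	s.entries[id] = stale // restores the entry even if it was deleted meanwhile
}

// taintWithLookup re-reads under the lock and writes only if still present.
func (s *store) taintWithLookup(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.entries[id]
	if !ok {
		return // already revoked by someone else; nothing to taint
	}
	e.NumUses = -3
	s.entries[id] = e
}

func main() {
	s := &store{entries: map[string]entry{"t1": {}}}
	stale := s.entries["t1"] // copy taken before acquiring the lock
	s.remove("t1")           // another caller revokes the token first

	s.taintWithLookup("t1")
	_, present := s.entries["t1"]
	fmt.Println("with lookup, entry present:", present)

	s.taintWithoutLookup("t1", stale)
	_, present = s.entries["t1"]
	fmt.Println("without lookup, entry present:", present)
}
```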

Contributor Author

Ah, yeah good point.

@calvn
Contributor Author

calvn commented Apr 28, 2018

@briankassouf I moved the deletion inside the defer (the bool now marks the entry for deletion) so that the lock is held throughout the deletion and, if the deletion fails, the entry.NumUses update. Let me know if that works.

 defer func() {
-	if ret != nil {
+	if deleteEntry || ret != nil {
Contributor Author

This is still potentially racy (if ts.view.Delete returns an error it doesn't grab a lock). Gotta think through this a bit more.

Contributor Author

Actually this should be alright, since the lock will be held whenever deleteEntry is true.

Member

You could just shortcut this -- if deleteEntry is not true, and ret is nil, just return.


// Lookup the token again to make sure something else didn't
// revoke in the interim
entry, err := ts.lookupSalted(ctx, saltedID, true)
Member

I've been thinking about this, and I think removing this is okay. Marking useCount as -2 is in a protected critical path, so the only way we should be here is if the token was set to -2 by this goroutine.

Member

Although this actually raises another question: do we even need to grab the lock here, then? Lookup won't return the token if the use count is negative, so locking when the Delete happens shouldn't matter. Nothing else should be operating on it, because at that point useCount is -2 and any other revocation will skip being run. If the Delete fails, we update the entry with -3, but the timing of that also doesn't matter: anything that tries to revoke before that won't work (it'll hit the -2), and anything that tries after will keep going, but this goroutine will have exited.
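
The state reasoning above can be sketched as follows. This is an illustrative model only: the `lookup` helper and its `includeTainted` parameter are assumptions made for the example, not the signature of the real lookup function.

```go
package main

import "fmt"

const (
	tokenRevocationInProgress = -2
	tokenRevocationFailed     = -3
)

type entry struct{ NumUses int }

// lookup hides entries with a negative use count unless includeTainted is
// set (hypothetical parameter, used here only for illustration).
func lookup(entries map[string]*entry, id string, includeTainted bool) *entry {
	e, ok := entries[id]
	if !ok {
		return nil
	}
	if e.NumUses < 0 && !includeTainted {
		return nil
	}
	return e
}

// shouldRevoke reports whether a new revocation attempt would run: it is
// skipped while another attempt is in progress (-2) and retried after a
// failed one (-3).
func shouldRevoke(entries map[string]*entry, id string) bool {
	e := lookup(entries, id, true)
	return e != nil && e.NumUses != tokenRevocationInProgress
}

func main() {
	entries := map[string]*entry{
		"active":     {NumUses: 0},
		"inProgress": {NumUses: tokenRevocationInProgress},
		"failed":     {NumUses: tokenRevocationFailed},
	}
	for _, id := range []string{"active", "inProgress", "failed"} {
		usable := lookup(entries, id, false) != nil
		fmt.Println(id, "usable:", usable, "revoke runs:", shouldRevoke(entries, id))
	}
}
```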

Contributor Author

I think we still need to grab the lock, since the entry.NumUses == tokenRevocationInProgress check happens before the write. There could theoretically be a case where two goroutines calling revokeSalted have both checked that entry.NumUses is not -2 before either one got to update it, and so the revokeSalted operation would go through twice.

	// On failure we write -3, so if we hit -2 here we're already running a
	// revocation operation. This can happen due to e.g. recursion into this
	// function via the expiration manager's RevokeByToken.

I wonder whether we still need tokenRevocationInProgress at all, since RevokeByToken no longer removes the actual token after more recent changes (m.revokeCommon(tokenLeaseID, false, true) in here), so the recursion described in the comment above might no longer be an issue.
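
The check-then-write concern raised in this comment can be sketched as below: holding a mutex across both the check and the write means only one of several concurrent callers proceeds. The `token` type, `beginRevocation`, and `countWinners` are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

const tokenRevocationInProgress = -2

// token is a hypothetical stand-in for a token entry plus its lock.
type token struct {
	mu      sync.Mutex
	numUses int
}

// beginRevocation performs the check and the write under one lock, so two
// goroutines cannot both observe "not in progress" and both proceed.
func (t *token) beginRevocation() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.numUses == tokenRevocationInProgress {
		return false
	}
	t.numUses = tokenRevocationInProgress
	return true
}

// countWinners starts n concurrent revocations of one token and returns how
// many of them were allowed to proceed.
func countWinners(n int) int {
	t := &token{numUses: 1}
	var wg sync.WaitGroup
	var resultMu sync.Mutex
	winners := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if t.beginRevocation() {
				resultMu.Lock()
				winners++
				resultMu.Unlock()
			}
		}()
	}
	wg.Wait()
	return winners
}

func main() {
	fmt.Println("revocations that proceeded:", countWinners(50))
}
```

Without the lock around the check and the write, more than one goroutine could pass the check before any of them stores -2.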

@calvn
Contributor Author

calvn commented May 4, 2018

Closing in favor of #4512.

@calvn calvn closed this May 4, 2018
@calvn calvn deleted the ts-revoke-mark branch September 19, 2018 23:51