-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Pace requests to token server for new auth tokens #55
Conversation
core/cp4d_authenticator.go
Outdated
} | ||
|
||
var cp4dMuOne sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to use locks instead of goroutines for achieving the synchronous design due to the simplicity. I created two mutexes because I couldn't find a straight forward answer on whether or not I can use the same lock in two places at once, though I don't think the two locks will ever be used at the same time the way the code is setup
return "", err | ||
} else if authenticator.tokenData != nil && authenticator.needsRefresh() { | ||
// If refresh needed, kick off a go routine in the background to get a new token | ||
ch := make(chan error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a goroutine to achieve the "multi-threading" we want. The select case below it should ensure that no other routine sits around and waits for the newly spawned routine to finish
core/cp4d_authenticator.go
Outdated
} | ||
|
||
// If the refresh time isn't set, then set it | ||
if authenticator.tokenRefreshTime == 0 && &authenticator.tokenData.ExpiresIn != nil && &authenticator.tokenData.Expiration != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mkistler @padamstx - now that I'm looking at this again, should I always be setting the correct refresh time when we request a new token? Currently, it will only be set when you request the first token, and when you refresh the token where we push it forward 1 min.. I think this will result in more token requests than we'd like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes ... you should be updating tokenRefreshTime whenever you get a new token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that as I created the PR 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good start but I think there are some issues to be worked out.
core/cp4d_authenticator.go
Outdated
authenticator.tokenData, err = newCp4dTokenData(tokenResponse) | ||
if err != nil { | ||
return "", err | ||
} else if authenticator.tokenData != nil && authenticator.needsRefresh() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to test authenticator.tokenData != nil
here -- it should never be nil here because of the check at line 157.
if err != nil { | ||
return "", err | ||
} | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need to update tokenRefreshTime
here? Otherwise you'll continually refresh the token.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think I do need to re-calculate the refresh time when we get a new token. Not sure why I kept that out originally.. I think I was following the java core too much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In java, we don't pre-compute the refresh time... the Token object's "needsRefresh()" method will compute it on the fly if needed, and then determine whether or not the token needs to be refreshed.
core/cp4d_authenticator.go
Outdated
return nil | ||
} | ||
|
||
err := authenticator.getTokenData() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is very dangerous to perform a network call while holding a mutex. We should find some other solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this further ... I suppose the other languages must do something similar. Hmmm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Java core, we do this inside the TokenRequestBasedAuthenticator's getToken() method:
- if we have no token at all, or the current token has expired, then we invoke a blocking request (synchronized) to the token server
- otherwise we have a valid token and we use that as-is, but if the token is within the refresh window (last 20% of lifetime I think), then we ALSO start a refresh in a separate thread.
Not sure what else we can do to improve on that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mkistler - I think the only other thing I can think of to use in place of the mutex is a go channel blocking go routine.. Though I remember reading that under hood it still uses a mutex. I'll try to find that text again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern here is ... what happens if the one token request we make "gets lost" -- not unheard of for a network request. Does the client-side time out after a while and simply fail the request ... allowing another to try the request again? If that's the case then this is probably fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whatever happens in the Java core would also happen here, so yeah I guess a timeout would occur. If there is some other more fault-tolerant pattern we should be using in the Node, Java, Go cores (at a minimum) to avoid this, then we'll need to put our thinking caps on and come up with a better design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I was being overly cautious. The timeout will cover us so this should be fine.
399b740
to
fa2f137
Compare
@mkistler @padamstx - I've pushed an amended commit that adds the missing calculation of the refreshTime everytime we get a new token. I've also moved back the property As far as the concerns over using locks for network calls, it looks like by default we set a client side timeout of 30 seconds. I've added two new tests to simulate a client side timeout, Note: the client timeout doesn't cover the time spent sending a request rather the time spent reading/receiving. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! 👍
I left some comments/suggestions on stylistic changes that I trust you will handle as appropriate.
core/cp4d_authenticator.go
Outdated
@@ -63,6 +64,9 @@ type CloudPakForDataAuthenticator struct { | |||
tokenData *cp4dTokenData | |||
} | |||
|
|||
var cp4dMuOne sync.Mutex | |||
var cp4dMuTwo sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to use meaningful names for these mutexes -- to be clear on what they lock/control.
I recommend:
cp4dMuOne
->cp4dRequestTokenMutex
cp4dMuTwo
->cp4dNeedsRefreshMutex
core/iam_authenticator.go
Outdated
@@ -77,10 +78,12 @@ type IamAuthenticator struct { | |||
tokenData *iamTokenData | |||
} | |||
|
|||
var iamMuOne sync.Mutex | |||
var iamMuTwo sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too ... please use meaningful names for these mutexes.
} | ||
} | ||
|
||
// return an error if the access token is not valid or was not fetched | ||
if authenticator.tokenData == nil || authenticator.tokenData.AccessToken == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's curious that here you check authenticator.tokenData.AccessToken == ""
but above you check !authenticator.tokenData.isTokenValid()
. Why the difference? Can't we make them both the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an oversight on my end. I'll go ahead and add that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So @mkistler I went ahead and tried to make this change, though it's important to noteisTokenValid
also checks to see if the token is expired. I think it's a good extra check to have, though the issue is that for cp4d's unit tests, the expiration time is calculated using some jwt token, and not by a mock value from actual token response. The token used in those tests is expired, so all the tests using it would fail.. Getting the unit tests to work would require either a new token that would eventually also expire, changing the GetCurrentTime
method to accept a mock clock, or changing how we set and calculate the expired time for a token. I just went ahead and reverted this change.
core/iam_authenticator.go
Outdated
return err | ||
} | ||
|
||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't the above sequence of 6 lines be expressed simply as:
return authenticator.getTokenData()
fa2f137
to
2c4d670
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed a few things with Jorge on slack. LGTM!
## [3.3.1](v3.3.0...v3.3.1) (2020-04-30) ### Bug Fixes * Pace requests to token server for new auth tokens ([#55](#55)) ([578399b](578399b))
🎉 This PR is included in version 3.3.1 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
This PR updates the design for token refresh by decoupling the refresh logic from active requests to be consistent with the other cores.
ref: https://github.ibm.com/arf/planning-sdk-squad/issues/1550