imdsclient: better retries with tokio retry and timeout #1841
d26b2bb to 942cb17
942cb17 to 1b48686
1b48686 to 36e86d5
36e86d5 to b14eab7
In this latest push, tokio retry and timeout are used not only for the token fetch but for fetching IMDS data as well. This introduces an RwLock so that the token can be refreshed within the timeout/retry logic. In a second commit I've also moved the token fetch out of `ImdsClient::new()` so that it no longer requires an await, but we can move that to a separate PR if folks are concerned it is out of scope.
b14eab7 to 739e999
sources/imdsclient/src/lib.rs (outdated)
    }

-   async fn new_impl(imds_base_uri: String) -> Result<Self> {
+   fn new_impl(imds_base_uri: String) -> Result<Self> {
We're not allowing the user to override `imds_base_uri` anywhere, so this method probably doesn't need to take it as an argument.
`new_impl` exists only for unit testing, so that we can point the client at the server that httptest creates.
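For readers unfamiliar with this pattern, here is a minimal sketch of a public constructor that delegates to a private one taking the base URI, which lets tests aim the client at a local mock server. `Client`, `DEFAULT_BASE_URI`, and `base_uri()` are hypothetical names used only for illustration, and the sketch returns `Self` directly rather than the `Result<Self>` shown in the diff.

```rust
/// Hypothetical stand-in for the real client; only the constructor
/// pattern is of interest here.
pub struct Client {
    base_uri: String,
}

impl Client {
    /// The link-local address at which EC2 serves instance metadata.
    const DEFAULT_BASE_URI: &'static str = "http://169.254.169.254";

    /// Public constructor: callers cannot override the base URI.
    pub fn new() -> Self {
        Self::new_impl(Self::DEFAULT_BASE_URI.to_string())
    }

    /// Private constructor: unit tests in the same module can pass the
    /// address of a mock HTTP server instead of the real endpoint.
    fn new_impl(base_uri: String) -> Self {
        Self { base_uri }
    }

    /// Accessor added for this sketch so the chosen URI can be inspected.
    pub fn base_uri(&self) -> &str {
        &self.base_uri
    }
}
```

Keeping `new_impl` private means the override is available to tests without becoming part of the public API.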
sources/imdsclient/src/lib.rs (outdated)
pub enum Error {
    #[snafu(display("Response '{}' from '{}': {}", get_status_code(source), uri, source))]
    BadResponse { uri: String, source: reqwest::Error },

    #[snafu(display("Fetched IMDSv2 session token"))]
Should this say something about the token being empty? Or am I reading this wrong?
This was meant to indicate that the token was being fetched because it was empty, but this error isn't used anymore, so I've removed it in this latest push.
sources/imdsclient/src/lib.rs (outdated)
@@ -422,6 +474,9 @@ mod error {
    #[snafu(display("Response was not UTF-8: {}", source))]
    NonUtf8Response { source: std::string::FromUtf8Error },

    #[snafu(display("IMDSv2 session token was refreshed."))]
This message is a little confusing to me. The error is `UnauthorizedTokenRefreshed`, but the message says the token was refreshed. Maybe I'm missing some context here.
My original line of thinking was that the token was refreshed because the query returned a 401, but I went ahead and renamed this to `TokenRefreshed`.
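To make the 401 case concrete, here is a small std-only sketch of how one attempt's HTTP status might be turned into an error that tells a retry loop to try again with a fresh token. `FetchError`, `check_status`, and the plain `Option<String>` token are assumptions for illustration; only the name `TokenRefreshed` comes from this discussion, and the sketch drops the cached token rather than fetching a new one itself.

```rust
/// Hypothetical error type; `TokenRefreshed` mirrors the variant name
/// discussed above, the rest is invented for the sketch.
#[derive(Debug, PartialEq)]
pub enum FetchError {
    /// The request was rejected with a 401, so the cached token was dropped.
    TokenRefreshed,
    /// Any other non-success status code.
    BadResponse(u16),
}

/// Classify the status of a single attempt. On a 401 the cached token is
/// cleared so that the next attempt has to obtain a new one.
pub fn check_status(status: u16, token: &mut Option<String>) -> Result<(), FetchError> {
    match status {
        200..=299 => Ok(()),
        401 => {
            *token = None;
            Err(FetchError::TokenRefreshed)
        }
        other => Err(FetchError::BadResponse(other)),
    }
}
```

Returning an error for the 401 case, even though the situation is recoverable, is what lets a generic retry wrapper treat it like any other failed attempt.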
    #[snafu(display("Failed to fetch IMDSv2 session token"))]
    FailedFetchToken,

    #[snafu(display("Failed to read token within ImdsClient"))]
Nit: Since these are "internal" implementation detail errors, let's make the message a bit clearer. Something like "Failed to read token from RwLock..."? (I'm not sold on that wording in particular)
To mitigate things like closed connections, IMDSv2 token and data fetches will continue to retry until success or until the timeout has been reached. The default timeout is 300s to match the ifup timeout in wicked, but can be set manually using `.with_timeout` when creating a new ImdsClient.
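As a rough illustration of the retry-until-timeout behavior this commit message describes, the following is a synchronous, std-only sketch. It is not the PR's implementation, which relies on tokio's retry and timeout utilities; `retry_until`, `RetryError`, and the fixed delay between attempts are assumptions made for the example.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical error returned once the overall timeout is exhausted.
#[derive(Debug, PartialEq)]
pub enum RetryError<E> {
    TimedOut { attempts: u32, last_error: E },
}

/// Call `op` repeatedly until it succeeds or `timeout` has elapsed,
/// sleeping for `delay` between attempts.
pub fn retry_until<T, E>(
    timeout: Duration,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, RetryError<E>> {
    let deadline = Instant::now() + timeout;
    let mut attempts = 0;
    loop {
        attempts += 1;
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => {
                // Give up if sleeping again would take us past the deadline.
                if Instant::now() + delay > deadline {
                    return Err(RetryError::TimedOut {
                        attempts,
                        last_error: err,
                    });
                }
                sleep(delay);
            }
        }
    }
}
```

With a 300s timeout, a transient failure such as a closed connection costs a few extra attempts instead of failing the whole fetch, while a persistently unreachable endpoint still produces an error eventually.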
739e999 to c6bc712
This moves the token fetch to just before the IMDS data fetch so that building a new ImdsClient no longer needs an await, nor does it return a result. `refresh_token` has been replaced by `clear_token` and `write_token`.
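The shape of this change can be sketched with a token cache that starts empty and is filled just before the first data fetch, so construction needs neither an await nor a `Result`. This is a std-only sketch under stated assumptions: `TokenCache`, `TokenError`, `read_token`, and `token_or_fetch` are hypothetical, and only the names `clear_token` and `write_token` come from the commit message. The real client is async and may use a different lock type.

```rust
use std::sync::RwLock;

/// Hypothetical error for a poisoned lock.
#[derive(Debug, PartialEq)]
pub enum TokenError {
    LockPoisoned,
}

/// Holds the session token behind a lock; `None` means "not fetched yet".
pub struct TokenCache {
    token: RwLock<Option<String>>,
}

impl TokenCache {
    /// Construction is infallible and synchronous: no token is fetched here.
    pub fn new() -> Self {
        TokenCache { token: RwLock::new(None) }
    }

    /// Return a copy of the cached token, if any.
    pub fn read_token(&self) -> Result<Option<String>, TokenError> {
        let guard = self.token.read().map_err(|_| TokenError::LockPoisoned)?;
        Ok((*guard).clone())
    }

    /// Store a freshly fetched token.
    pub fn write_token(&self, token: String) -> Result<(), TokenError> {
        let mut guard = self.token.write().map_err(|_| TokenError::LockPoisoned)?;
        *guard = Some(token);
        Ok(())
    }

    /// Forget the cached token, for example after it was rejected.
    pub fn clear_token(&self) -> Result<(), TokenError> {
        let mut guard = self.token.write().map_err(|_| TokenError::LockPoisoned)?;
        *guard = None;
        Ok(())
    }

    /// Use the cached token, or call `fetch` and cache its result. The read
    /// lock is released before the write lock is taken.
    pub fn token_or_fetch(&self, fetch: impl FnOnce() -> String) -> Result<String, TokenError> {
        if let Some(token) = self.read_token()? {
            return Ok(token);
        }
        let token = fetch();
        self.write_token(token.clone())?;
        Ok(token)
    }
}
```

Splitting the old refresh step into a clear and a write lets the fetch itself happen outside the lock, inside whatever retry logic wraps the request.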
c6bc712 to 5e5200e
Description of changes:
To mitigate things like closed connections, IMDSv2 token and data fetches will continue to retry until success or until the timeout has been reached. The default timeout is 300s to match the ifup timeout in wicked, but can be set manually using `.with_timeout` when creating a new ImdsClient.

Also moves the token fetch to just before the IMDS data fetch so that building a new ImdsClient no longer needs an await. `refresh_token` has been replaced by `clear_token` and `write_token`.

Testing done:
In addition to the unit testing...
`aws-k8s-1.21` AMI and launched instance.
`host-containers.admin.user-data` contained a base64-encoded block.
`/.bottlerocket/host-containers/admin/user-data` contained JSON.
`sudo sheltie` to verify root shell was still available.
`pluto` with its sub-commands to verify functionality.

Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.