-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increasing http timeout for ec2metadata to 30 seconds #311
Increasing http timeout for ec2metadata to 30 seconds #311
Conversation
Signed-off-by: Matt Schurenko <matt.schurenko@gmail.com>
}, | ||
}, | ||
) | ||
config.WithCredentials(creds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't do this as-is because it overrides the default credentials loading order that we get from https://github.com/aws/aws-sdk-go/blob/fb9d53b0db7e801eb0d4fa021f5860794d845da3/aws/session/session.go#L416-L469. But I do think we can come up with a solution. We can either copy the aws session logic and modify it as needed, or (if we're ok changing the timeout for all ec2 calls), I think you could just set config.HTTPClient
and it should trickle down to the ec2 role provider's metadata client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're ok with a 30 second timeout across the board then adding config.WithHTTPClient
to getSession
seems to work:
func getSession(config *aws.Config) (*session.Session, error) {
config.WithHTTPClient(&http.Client{Timeout: 30 * time.Second})
sess, err := session.NewSession(config)
if err != nil {
return nil, errors.WithStack(err)
}
if _, err := sess.Config.Credentials.Get(); err != nil {
return nil, errors.WithStack(err)
}
return sess, nil
}
@mschurenko thanks for your PR! We're wrapping up the v0.7.0 release later this week, so we can look to get this in after then. Please see my comment about a behavioral change - let's discuss here the best way to get you what you need. |
@ncdc thanks for the quick response. It would be great to get this into the next release if possible. Please see my reply to your review. |
@ncdc as it turns out the problem was not solved by increasing the http.Client timeout (it was just a coincidence that it worked during my testing). The problem appears to be a race condition with ark and kube2iam where ark fetches the credentials before kube2iam has discovered the pod. If I create an init container that sleeps for a few seconds then the problem goes away. I'm not sure if we should try and fix this in ark so that it expires the credential cache if it gets an AccessDenied? It is a bit odd behaviour as ark is getting a valid access key from kube2iam, it is just the wrong access key. Also let me know if I should just close this PR and open an issue. Increasing the http.Client timeout does work but, for us, it was not the actual cause of the problem. |
@mschurenko since this isn't solving the problem, how about we close this PR and move it to an issue? Thanks! |
Closing PR and will open an issue |
Problem
I would like to have AWS IAM credentials fetched from the EC2 metadata service rather than provided by way of a file or environment variables.
See: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials
In order to achieve this we are using kube2iam. The problem we have observed is that the default timeout to ec2metadata in the golang sdk is quite short. (I believe it's 1 second).
This causes issues where ark fails to retrieve access keys from kube2iam with the following error in the ark controller log:
This is a known issue with kube2iam.
The solution appears to be to increase the timeout in the client SDK. The Python SDK (boto) supports a
AWS_METADATA_SERVICE_TIMEOUT
environment variable which could be used to solve the problem. Unfortunately, it seems that the golang SDK provides no such mechanism.I know this is not an issue specifically with ark but since many people use kube2iam for temporary IAM credentials it is likely they will encounter this same issue.
Solution
This PR increases the ec2metadata client http timeout to 30 seconds which solves the problem.
If there is a better/more simple way to increase the timeout than what I've done in the PR that would be great.
I had to override the entire IAM credential provider chain just to change the defaults for the
EC2RoleProvider
.Also, setting the timeout to 30 seconds was pretty arbitrary. I'm not sure why it would be an issue to set it this high (or even higher) as I have it set to the last provider in the credential chain.
Signed-off-by: Matt Schurenko matt.schurenko@gmail.com