Get credentials from role has become slow #2972
Thanks for reaching out to us about this @alexd765. The behavior you describe sounds like it is taking a long time for the EC2 metadata service on your instance(s) to respond to a request for instance profile credentials. Do you see this behavior only on specific instance types/sizes? (If so, which ones?) Or does this occur consistently regardless of the instance type/size being used? Do you see a similar delay when attempting to reach the instance metadata service in a manner that does not involve the SDK (e.g. with curl)?
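(For reference, a minimal sketch of such an SDK-free check in Go, timing a plain GET against the instance profile credentials path of the metadata endpoint, IMDSv1 style with no session token. This is an illustrative diagnostic, not code from the thread.)

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	// Time a direct request to the instance metadata service,
	// bypassing the SDK entirely.
	start := time.Now()
	resp, err := http.Get("http://169.254.169.254/latest/meta-data/iam/security-credentials/")
	if err != nil {
		fmt.Println("metadata request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("roles: %s (took %s)\n", body, time.Since(start))
}
```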
I see it on multiple different EC2 instances. We are in eu-west-1.
Interestingly, it is only getting the credentials from the role that is slow. Getting credentials from environment variables is instant.
We also saw similar behavior; we noticed issues on certain operations.
The above curl is also super fast for us (only tested on t2.small).
Thanks for raising this issue, @alexd765, @nlundbo. Is your application running in a Kubernetes pod, Docker container, or using any IP forwarding/proxy? If so, you may be impacted by a recent security change to EC2 instances, where the Instance Metadata Service limits the number of hops a request can make. Increasing the hop limit on the instance should take care of long timeout issues on EC2 for customers who use any IP forwarding/proxying. The EC2 ModifyInstanceMetadataOptions operation can be used to update the hop limit needed for your application's use case. See https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html for more information on EC2's IMDS update.
Yes, my use case was code inside a Docker container. I tested changing the number of hops - and it resolved the issue. I read somewhere that it should not affect deployments in ECS - I have not verified this yet.
I'm also seeing this issue (I believe it's the same) and I've been able to pinpoint the SDK version where it starts. Before v1.25.38 credential retrieval is fast,
while on v1.25.38 and later it is slow.
I run this app without specifying AWS credentials via environment variables or a shared profile.
@pmalekn - They introduced the new EC2 metadata feature in v1.25.38 - and a backwards compatibility issue by setting the default hop limit to 1 - which means the replies get dropped while transiting the Docker bridge network. You have to increase the hop limit yourself, or it just hits lengthy timeouts & retries before eventually falling back to the old method.
@tyrken Ok... so you're probably referring to 7814a7f#diff-68798fbeff74277f6e5c5ab6f1cf92c6R70454-R70460. Adding a snippet since the file is too big to show when linked:
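(Roughly, the linked change adds an IMDSv2 session-token exchange before metadata reads. The sketch below illustrates that flow using the documented IMDSv2 endpoint and headers; it is not the SDK's actual generated code.)

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Step 1: request an IMDSv2 session token with a PUT. With a hop
	// limit of 1, this response can be dropped on the Docker bridge,
	// and the SDK only falls back to IMDSv1 after timeouts and retries.
	req, err := http.NewRequest("PUT", "http://169.254.169.254/latest/api/token", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("X-aws-ec2-metadata-token-ttl-seconds", "21600")
	tokenResp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal("token request failed: ", err)
	}
	defer tokenResp.Body.Close()
	token, _ := io.ReadAll(tokenResp.Body)

	// Step 2: read metadata with the token attached.
	credReq, err := http.NewRequest("GET",
		"http://169.254.169.254/latest/meta-data/iam/security-credentials/", nil)
	if err != nil {
		log.Fatal(err)
	}
	credReq.Header.Set("X-aws-ec2-metadata-token", string(token))
	credResp, err := http.DefaultClient.Do(credReq)
	if err != nil {
		log.Fatal("metadata request failed: ", err)
	}
	defer credResp.Body.Close()

	body, _ := io.ReadAll(credResp.Body)
	fmt.Printf("roles: %s\n", body)
}
```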
It then makes sense that it cannot reach the endpoint if I'm calling it from inside a Docker container (because there's one more hop through the Docker bridge, right?), or is my reasoning flawed? How can I change this "globally" then? Or do I have to do it on a per-request basis (that doesn't sound feasible)?
No global option I know of, though I'm hoping AWS might see sense & implement something like #2980. In the meantime you can only stay on an old version of aws-sdk-go or update all EC2 instances to increase the hop limit, e.g. with:
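(The exact command from that comment wasn't captured here. As a hedged sketch, the same change can be made from Go with the SDK's ec2.ModifyInstanceMetadataOptions call; the instance ID below is a placeholder.)

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	sess := session.Must(session.NewSession())
	svc := ec2.New(sess)

	// Raise the IMDSv2 hop limit so responses survive the extra hop
	// through the Docker bridge. "i-0123456789abcdef0" is a placeholder.
	_, err := svc.ModifyInstanceMetadataOptions(&ec2.ModifyInstanceMetadataOptionsInput{
		InstanceId:              aws.String("i-0123456789abcdef0"),
		HttpPutResponseHopLimit: aws.Int64(2),
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("hop limit updated")
}
```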
Our problem is the same as described above.
I really hope there will be a better solution in the future.
…etadata client (#3066)
The PR addresses the issues related to the EC2Metadata client having long timeouts when it fails to obtain an EC2Metadata token while making a request to IMDS. The PR reduces the timeout to 1 sec and the maximum number of retries to 2 for the EC2Metadata client. This helps reduce the long timeouts faced by customers. Fixes #2972
Release v1.27.2 (2020-01-07)
===
### Service Client Updates
* `service/AWSMigrationHub`: Updates service API, documentation, and paginators
* `service/codebuild`: Updates service API and documentation
  * Add encryption key override to StartBuild API in AWS CodeBuild.
* `service/xray`: Updates service documentation
  * Documentation updates for xray

### SDK Enhancements
* `aws`: Add configuration option to enable the SDK to unmarshal API response header maps to normalized lower case map keys. ([#3033](#3033))
  * Setting `aws.Config.LowerCaseHeaderMaps` to `true` will result in S3's X-Amz-Meta prefixed headers being unmarshaled to lower case Metadata member map keys.

### SDK Bugs
* `aws/ec2metadata`: Reduces request timeout for EC2Metadata client along with maximum number of retries ([#3066](#3066))
  * Reduces latency while fetching response from EC2Metadata client running in a container to around 3 seconds
  * Fixes [#2972](#2972)
@alexd765's solution worked for me - thanks Alex! You can implement his solution in terraform like so:

```hcl
resource "aws_instance" "your_ec2_instance_name" {
  params_go_here = var.blablabla

  metadata_options {
    # So docker can access ec2 metadata
    # see https://github.com/aws/aws-sdk-go/issues/2972
    http_put_response_hop_limit = 2
  }
}
```

Also note that it's not just go that runs into this problem - even just using curl as described in https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html results in a long delay.
Ran into the same problem using the AWS Go SDK in an Elastic Beanstalk (EB) Docker environment, but didn't see a way to set the hop limit there. Fixed (or rather sidestepped) it by setting AWS user credentials and the default region as environment variables - hope it helps others in the same situation.
We got the same problem and the workaround works.
Previously we were re-using our shared HTTP client, which has a rather high timeout (120 seconds) that causes the HTTP client to wait around for a long time. This is generally intentional (since it includes the time spent downloading a request body), but is a bad idea when running into EC2's IMDSv2 service, which has a network-hop-based limit. If that hop limit is exceeded, the requests just go to nowhere, causing the client to wait for a multiple of 120 seconds (~10 minutes were observed). This instead uses a special client for the EC2 instance metadata service that has a much lower timeout (1 second, like in the AWS SDK itself), to avoid the problem. See also aws/aws-sdk-go#2972
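(For anyone doing something similar with aws-sdk-go v1, a minimal sketch of a dedicated short-timeout metadata client. The 1-second timeout mirrors the value mentioned above; this is an illustration, not that project's actual code.)

```go
package main

import (
	"fmt"
	"net/http"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/ec2metadata"
	"github.com/aws/aws-sdk-go/aws/session"
)

func main() {
	sess := session.Must(session.NewSession())

	// Dedicated client for IMDS with a short timeout, so a dropped
	// IMDSv2 response fails fast instead of hanging for minutes.
	meta := ec2metadata.New(sess, &aws.Config{
		HTTPClient: &http.Client{Timeout: 1 * time.Second},
	})

	region, err := meta.Region()
	if err != nil {
		fmt.Println("metadata lookup failed:", err)
		return
	}
	fmt.Println("region:", region)
}
```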
Hi, I am wondering, does this same issue exist for aws-sdk-go-v2?
Can we remove the stale lifecycle on this one please @skmcgrail? I'm presuming it's still an issue, and the reason people aren't commenting on it is because of the workaround. A lot of people have run into this problem and it would be good to have explicit confirmation that it is fixed when it gets fixed.
Please fill out the sections below to help us address your issue.
Version of AWS SDK for Go?
v1.25.41
Version of Go (`go version`)?
go version go1.11.1 linux/amd64
What issue did you see?
Getting credentials from an EC2 role takes 20 seconds.
We have had this issue consistently on multiple different EC2 instances since a couple of days ago.
To me it looks like an issue with the service, but I am not sure.
Steps to reproduce
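(The original reproduction snippet isn't included above. As a hedged sketch, this is the kind of code that exhibits the delay, assuming it runs on an affected EC2 instance, e.g. inside a Docker container with the default hop limit of 1, and uses the default credential chain.)

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/aws/aws-sdk-go/aws/session"
)

func main() {
	sess := session.Must(session.NewSession())

	// Force credential resolution and time it. On an affected instance
	// this takes tens of seconds instead of completing almost instantly.
	start := time.Now()
	creds, err := sess.Config.Credentials.Get()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("provider %s resolved in %s\n", creds.ProviderName, time.Since(start))
}
```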