Error when retrieving credentials from iam-role: Credential refresh failed, response did not contain: access_key, secret_key, token, expiry_time #1617
Comments
It looks like you're sourcing credentials from the EC2 instance metadata service and the request to fetch them failed. By default we don't retry those requests, but you can add retries with the AWS_METADATA_SERVICE_NUM_ATTEMPTS environment variable.
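For reference, a minimal sketch of that workaround, assuming the variables are set before the first boto3 client is created (the values here are illustrative, not recommendations):

```python
import os
import boto3

# Illustrative values: retry the instance metadata request up to 5 times
# and allow 10 seconds per attempt. botocore reads these environment
# variables when it resolves credentials, so they only need to be set
# (or exported in the container environment) before the first client exists.
os.environ["AWS_METADATA_SERVICE_NUM_ATTEMPTS"] = "5"
os.environ["AWS_METADATA_SERVICE_TIMEOUT"] = "10"

s3 = boto3.client("s3")  # credential resolution now retries on metadata failures
```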
@JordonPhillips Thank you for the response. So would this be a matter of adding those environment variables to the container where this is happening?
In my case I get the same error, but according to the kube2iam logs the request for creds does not fail.
@JordonPhillips What if we are sourcing creds from a k8s pod which runs on a k8s worker node (EC2 instance), so not directly on an EC2 instance? Do we set AWS_METADATA_SERVICE_NUM_ATTEMPTS as an env var on the pod? Is it still legit then? Thanks!
I'm also using kube2iam to have a pod assume an IAM role and seeing this error sporadically. It sometimes happens at the start of the container, but we've also seen it happen after the containers have been running for a while. Any suggestions on workarounds? We've already set AWS_METADATA_SERVICE_NUM_ATTEMPTS.
@shshe What does the botocore debug log level say? Also, do you use celery?
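A minimal sketch of enabling that logging with boto3's built-in helper (the "botocore.credentials" logger scopes the output to credential resolution):

```python
import logging
import boto3

# Stream botocore's credential-resolution logs to stderr so failed
# instance-metadata fetches are visible alongside the application error.
boto3.set_stream_logger("botocore.credentials", logging.DEBUG)

s3 = boto3.client("s3")  # subsequent credential lookups are now logged
```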
Hi @TattiQ, unfortunately we didn't have the DEBUG level turned on. I think our issue lies in kube2iam introducing latency when querying the EC2 metadata URI; there is a related kube2iam issue about exactly that. We're currently trying the workaround of setting AWS_METADATA_SERVICE_NUM_ATTEMPTS and AWS_METADATA_SERVICE_TIMEOUT.

Edit: Yes, we do use celery and saw this in our celery app. But we've also seen this issue crop up in a Kubernetes job that used multiprocessing + boto.
Hey guys, did you figure out this issue by any chance?
Following up on this issue. The solution provided by @JordonPhillips above should fix it. Is anyone still getting the error? If yes, please open a new issue; I would be happy to help.
This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.
We're having this issue specifically with K8s as well. Did setting the environment variables work?
Yes, setting the environment variables (specifically the retry attempts) seems to have mostly resolved the issue for us. We are also in an EKS K8s environment.
Hi, I was facing this issue running Python in a pod in an EKS cluster, and it seems at first glance the retries/timeout solution worked. Did anyone figure out a reason why these requests fail? I've seen pods restart hundreds of times because of this, and I'm curious if there is something in the EKS setup that can be used to mitigate it.
Bump on this, I am also seeing this issue with kube2iam/EKS.
Yep, still seeing this one year later.
Just noticed this too. Hundreds of restarts in one night when it never happened in the last 6 months. No configuration changes or anything.
This is also happening for us. (python ---> kube2iam ---> AWS)
Seems to be the error returned while handling boto/boto3#1751. The workaround when we hit this issue was to re-attach the instance metadata.
I ran into this issue while working in a Jupyter notebook on an EC2 instance. When it first started happening this month, all I had to do was rerun the code and it would work again on the second or third try. However, additional attempts stopped working for me this week. After struggling with a few different options which didn't work in my case, I finally decided to upgrade my Python (from 3.6 to 3.7) and dask (from 2021.11.01 to 2022.2.0), and that fixed the issue for me completely.
We are seeing a strange issue relating to boto3 and botocore. The following error is being thrown sporadically when we try to read from S3 or utilize an SQS client:

Error when retrieving credentials from iam-role: Credential refresh failed, response did not contain: access_key, secret_key, token, expiry_time

It appears that the credentials are not correctly getting refreshed via the assumed IAM role. This is a Python application running inside of a Docker container within EKS. An example piece of code is below.
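A minimal sketch of the kind of calls described above (bucket, key, and queue URL are placeholders, not the values from the original report):

```python
import boto3

# Placeholder resource names for illustration only.
BUCKET = "example-bucket"
KEY = "example/object.json"
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"

# Reading from S3 with credentials sourced from the assumed IAM role.
s3 = boto3.client("s3")
body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()

# Polling an SQS queue with the same credential chain.
sqs = boto3.client("sqs")
messages = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1)
```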
Does anybody have any ideas why this is happening and whether or not this is a known issue with boto?