Prefetch credentials and refresh periodically to better emulate the metadata service. #31

marshallbrekka opened this issue Nov 8, 2016 · 10 comments


@marshallbrekka

As it currently stands, credentials are only requested from STS on an as-needed basis.
Unfortunately, this adds significant latency (I've seen > 100ms) compared to the real metadata service (< 1ms).

This can cause issues for clients (like boto) that use short timeouts when probing the metadata service as part of their credential provider chain: a slow response makes the client conclude that no credentials are available.

This can be solved with three components

  • Fetch credentials as soon as the pod is added
  • A goroutine per pod to refresh the credentials before they expire
  • Stop the goroutine when the pod is deleted

I've submitted a PR #30 that implements this solution, but I'm happy to discuss other solutions, or modifications to the existing PR.
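
For concreteness, here's a rough Go sketch of that lifecycle. It is not the literal code from PR #30; the `manager` type, the `fetch` callback, and `refreshInterval` are placeholder names for illustration.

```go
package prefetch

import (
	"context"
	"sync"
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
)

// Credentials stands in for whatever sts:AssumeRole returns.
type Credentials struct {
	AccessKeyID, SecretAccessKey, Token string
	Expiration                          time.Time
}

// manager keeps one refresh goroutine per pod. All names are illustrative.
type manager struct {
	mu              sync.Mutex
	creds           map[types.UID]Credentials
	cancels         map[types.UID]context.CancelFunc
	refreshInterval time.Duration
	fetch           func(ctx context.Context, pod *v1.Pod) (Credentials, error)
}

// onPodAdded fetches credentials immediately and starts a background refresher.
func (m *manager) onPodAdded(pod *v1.Pod) {
	ctx, cancel := context.WithCancel(context.Background())
	m.mu.Lock()
	m.cancels[pod.UID] = cancel
	m.mu.Unlock()

	go func() {
		m.refresh(ctx, pod) // fetch up front so the first metadata request hits the cache
		ticker := time.NewTicker(m.refreshInterval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done(): // pod deleted: stop refreshing
				return
			case <-ticker.C:
				m.refresh(ctx, pod)
			}
		}
	}()
}

func (m *manager) refresh(ctx context.Context, pod *v1.Pod) {
	if c, err := m.fetch(ctx, pod); err == nil {
		m.mu.Lock()
		m.creds[pod.UID] = c
		m.mu.Unlock()
	}
}

// onPodDeleted stops the refresher and drops the cached credentials.
func (m *manager) onPodDeleted(pod *v1.Pod) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cancel, ok := m.cancels[pod.UID]; ok {
		cancel()
		delete(m.cancels, pod.UID)
		delete(m.creds, pod.UID)
	}
}
```

The metadata handler would then serve /latest/meta-data/iam/security-credentials/... straight out of the cached map, so the hot path never waits on STS.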

@jtblin (Owner) commented Nov 27, 2016

As it currently stands, credentials are only requested from STS on an as-needed basis.

This is not entirely correct: it will hit STS at most once every 15 minutes, since the credentials are cached in between. This seemed like an acceptable tradeoff.

This can be solved with three components

  • Fetch credentials as soon as the pod is added
  • A goroutine per pod to refresh the credentials before they expire
  • Stop the goroutine when the pod is deleted

I don't think this solution will scale to thousands of nodes and potentially hundreds of thousands of pods, because:

  1. it would be hitting the AWS API limits pretty hard and could have knock-on effects on the entire AWS account. This is a show stopper IMO.
  2. there could be hundreds of thousands of goroutines running for kube2iam, consuming a lot of resources even if goroutines are cheap.

I would rather increase the caching duration to 60 minutes and/or make it configurable.
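
For reference, making the duration configurable could be as small as a flag plus a TTL check. The `--iam-role-session-ttl` flag name below is purely illustrative, not an existing kube2iam option.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// Hypothetical flag; today the cache duration is effectively fixed at 15 minutes.
var sessionTTL = flag.Duration("iam-role-session-ttl", 60*time.Minute,
	"how long assumed-role credentials are cached before STS is called again")

// cachedCreds is a stand-in for a cache entry holding an sts:AssumeRole result.
type cachedCreds struct {
	fetchedAt time.Time
	// ...access key, secret, session token...
}

// stale reports whether the entry is older than the configured TTL.
func (c cachedCreds) stale() bool {
	return time.Since(c.fetchedAt) > *sessionTTL
}

func main() {
	flag.Parse()
	entry := cachedCreds{fetchedAt: time.Now().Add(-20 * time.Minute)}
	fmt.Printf("ttl=%s stale=%v\n", *sessionTTL, entry.stale())
}
```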

@marshallbrekka (Author)

I think there is some misunderstanding about the proposed solution, as well as the issue(s) it was meant to solve, so I'll try to explain it better below.

Issues to Resolve

1. Clients fail to get credentials on startup

This was the primary issue: if a client fails to find credentials from the metadata service within its default timeout (for some this is 1s, and I have seen STS take longer than that on occasion), it assumes the metadata service is not available and moves on down the credential provider chain.

This can essentially cause a Pod to fail on startup and never recover, since clients cache the credential provider decision.
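
To make that failure mode concrete, the probe an SDK runs looks roughly like this simplified illustration (not any SDK's actual code): a short-timeout HTTP call that treats a slow kube2iam→STS round trip the same as having no metadata service at all.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Many SDKs probe the metadata endpoint with roughly a 1-second timeout.
	// If kube2iam has to call sts:AssumeRole synchronously and that takes
	// longer, the probe fails and the SDK falls through to the next provider.
	client := &http.Client{Timeout: 1 * time.Second}
	resp, err := client.Get("http://169.254.169.254/latest/meta-data/iam/security-credentials/")
	if err != nil {
		fmt.Println("metadata service treated as unavailable:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("metadata service answered:", resp.Status)
}
```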

As it currently stands, credentials are only requested from STS on an as-needed basis.

This is not entirely correct: it will hit STS at most once every 15 minutes, since the credentials are cached in between. This seemed like an acceptable tradeoff.

In light of the issue described above, I don't think this tradeoff is acceptable.

2. Performance Penalties

This is much less important than the first issue, since it doesn't necessarily cause an outright failure, but for latency-sensitive services, adding an extra ~100ms to a request is highly undesirable.

Performance Concerns

I don't think this solution will scale to thousands of nodes and potentially hundreds of thousands of pods, because:

  1. it would be hitting the AWS API limits pretty hard and could have knock-on effects on the entire AWS account. This is a show stopper IMO.
  2. there could be hundreds of thousands of goroutines running for kube2iam, consuming a lot of resources even if goroutines are cheap.

I think both of these points can be addressed with a few ideas.

If you limit the pods that kube2iam is monitoring to the ones running on the same worker, then the number of goroutines only scales with the maximum number of pods that can run on a single worker (this is what the implementation in the PR does).

This also means that you only make one AWS STS API call per pod per cache duration.
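
Scoping the watch to a single node is cheap to express with a field selector. A client-go sketch under that assumption (kube2iam's actual informer wiring may differ):

```go
package prefetch

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
)

// newNodeScopedPodInformer watches only the pods scheduled on nodeName, so the
// number of refresh goroutines is bounded by the per-node pod limit rather than
// the cluster size. Sketch only.
func newNodeScopedPodInformer(clientset kubernetes.Interface, nodeName string) informers.SharedInformerFactory {
	return informers.NewSharedInformerFactoryWithOptions(
		clientset,
		30*time.Second,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = fields.OneTermEqualSelector("spec.nodeName", nodeName).String()
		}),
	)
}
```

With the usual kubelet default of 110 pods per node, that caps both the goroutine count and the per-node STS call rate, independent of cluster size.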

If you wanted to further limit the number of API calls, you could share credentials among all pods on a given worker that use the same IAM role; if your pods were actively using their AWS credentials, that would be exactly the same as the status quo. However, it reduces your ability to trace API calls back to a specific pod for audit purposes, which I think is undesirable.

Summary

In general, I don't think the performance of this implementation should be any worse than the status quo in its worst case, i.e. pods consistently using their credentials.

Thoughts?

@emmanuel commented Jan 20, 2017

We are experiencing timeouts in pods being served by kube2iam. The current approach of blocking on the first request causes this latency. Since it is triggering timeouts, and our goal for kube2iam is transparency, the current approach is unworkable from my perspective. I'm not sure the proposed solution is ideal, but it seems like a good trade-off to me. Namely, the combination of:

  1. preemptive fetch (kube2iam calls sts:AssumeRole as a pod is placed on a node)
  2. limiting the cache to the pods on the current node (and only those with the kube2iam annotation)
  3. preemptive refresh as late in the expiry window as reasonable (e.g., a few minutes before expiry)

Together, those seem like they would provide a reasonable balance between observed credential-fetch latency, AWS API limits, and kube2iam memory/CPU load.
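
For point 3, the refresh time can be derived from the expiry the STS response already carries. A small sketch; the 5-minute margin is an arbitrary example, not a kube2iam setting:

```go
package main

import "time"

// nextRefresh returns how long to wait before refreshing credentials that
// expire at expiry, keeping a safety margin so no request ever sees
// credentials that are about to lapse.
func nextRefresh(expiry time.Time, margin time.Duration) time.Duration {
	wait := time.Until(expiry) - margin
	if wait < 0 {
		return 0 // already inside the margin: refresh immediately
	}
	return wait
}

func main() {
	expiry := time.Now().Add(60 * time.Minute) // e.g. a 1h sts:AssumeRole session
	timer := time.NewTimer(nextRefresh(expiry, 5*time.Minute))
	<-timer.C
	// ...call sts:AssumeRole again here...
}
```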

That said, I'm likely missing something important ;). What am I missing?

@schlomo (Contributor) commented Jun 9, 2017

IMHO kube2iam must guarantee that it returns valid IAM credentials within 1 second, as this is the default timeout of most AWS SDKs. I built something similar for the data center (https://github.com/ImmobilienScout24/afp-alppaca), which was engineered along the same lines.

@SharpEdgeMarshall (Collaborator) commented Nov 9, 2017

I just finished setting up kube2iam, and pods using boto3 restarted 3 times before getting credentials. Is there any workaround or news about this issue?

UPDATE: I found a temporary fix for boto: there are two env vars, AWS_METADATA_SERVICE_TIMEOUT and AWS_METADATA_SERVICE_NUM_ATTEMPTS.
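
For anyone else hitting this with boto/botocore, the workaround can be set per container in the pod spec; the values below are just an example to tune for your environment:

```yaml
# Make botocore wait longer and retry when probing the metadata service,
# instead of giving up after a single 1-second attempt (the default).
env:
  - name: AWS_METADATA_SERVICE_TIMEOUT
    value: "5"   # seconds per attempt
  - name: AWS_METADATA_SERVICE_NUM_ATTEMPTS
    value: "5"   # number of attempts
```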

@gwkunze commented Apr 23, 2018

Ran into this problem as well; honestly, I can't imagine anyone NOT running into this issue when using kube2iam for anything serious. Would a PR that implements the suggestions made by @marshallbrekka be acceptable?

@jrnt30 (Collaborator) commented Aug 17, 2018

Yes, a PR implementing these features would be welcome. There are a few other nuances with how this is done now with the informer, which may change the implementation a bit, but something that meets the following would work (see the rough wiring sketch after this list):

  • Constrains the scope to the node in question
  • Grabs credentials proactively (still per pod)
  • Refreshes the credentials behind the scenes
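
Something like the following wiring would tick those boxes. It assumes a node-scoped informer factory and the per-pod lifecycle sketched earlier in this thread; it is not the current kube2iam informer code.

```go
package prefetch

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/tools/cache"
)

// podLifecycle is whatever owns the per-pod prefetch/refresh goroutines
// (e.g. the manager type from the earlier sketch).
type podLifecycle interface {
	onPodAdded(pod *v1.Pod)
	onPodDeleted(pod *v1.Pod)
}

// wireHandlers connects a node-scoped informer to the credential lifecycle:
// fetch proactively on add, stop the background refresh on delete.
func wireHandlers(factory informers.SharedInformerFactory, mgr podLifecycle) {
	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			if pod, ok := obj.(*v1.Pod); ok {
				mgr.onPodAdded(pod) // grab credentials proactively (still per pod)
			}
		},
		DeleteFunc: func(obj interface{}) {
			if pod, ok := obj.(*v1.Pod); ok {
				mgr.onPodDeleted(pod) // refresh stops behind the scenes
			}
		},
	})
}
```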

@chrisghill

Any update on this? It's been over a year and I'm running into this issue now too. I'm going to try the ENV variables SharpEdgeMarshall mentioned, but I would prefer a more permanent fix.

@gabrielvcbessa commented Dec 26, 2019

@chrisghill Also facing this issue. Even with the ENV variables set up, it is not reliable enough.

@chrisghill

@gabrielvcbessa what language are you using? I think those ENV variables only work with boto3 (Python). The PHP library doesn't support them (at least the version we use doesn't).
