Don't hang forever if there is no role defined for a pod #28

mikkeloscar · 2016-11-03T12:51:14Z

Currently a call to metadata from a pod without a role will hang forever because it retries to get the role from the IP indefinitely: https://github.com/jtblin/kube2iam/blob/master/cmd/server.go#L61-L66.

If the retry really is needed I think it would be better to limit it to X number of retries and then respond. Hanging the connection forever is not nice for the calling application.

I know this could be solved by setting a default role, but it should also work when no default role is defined IMO.

I wouldn't mind making a PR fixing this, but I would like your (@jtblin) opinion before I start, if you don't mind.

Do we really need to retry, or would it be OK to respond with a 404? That would obviously give a false positive if kube2iam is just too slow to recognize the pod annotation.
If we need to retry, can we limit it to a finite number of retries?

mikkeloscar · 2016-11-03T12:55:02Z

It just occurred to me that it could also be solved by simply asking the API server if the pod has a role annotation in case it's not already in the roleByIP table. What do you think about that?

jtblin · 2016-11-20T20:12:03Z

> Currently a call to metadata from a pod without a role will hang forever because it retries to get the role from the IP indefinitely: https://github.com/jtblin/kube2iam/blob/master/cmd/server.go#L61-L66.

It won't hang forever, but the defaults of the backoff algorithm are pretty long (15 minutes): https://godoc.org/github.com/cenkalti/backoff#pkg-examples

> If the retry really is needed I think it would be better to limit it to X number of retries and then respond. Hanging the connection forever is not nice for the calling application.

The retry is there to avoid a race condition in case we haven't got the role data yet. I agree that the defaults are too long.

> It just occurred to me that it could also be solved by simply asking the API server if the pod has a role annotation in case it's not already in the roleByIP table. What do you think about that?

I would prefer that we change the values for the retry to a few seconds max rather than adding an API call. It's easy to do by configuring the backoff returned by backoff.NewExponentialBackOff().
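To illustrate the bounded retry discussed above, here is a minimal sketch using only the Go standard library. It is not the cenkalti/backoff API and not kube2iam's actual code: `roleStore`, `getRoleWithBackoff`, `errNoRole`, and all durations are hypothetical names and values chosen for the example. The idea is that the lookup retries with a growing, capped interval and gives up once a maximum elapsed time has passed, so the caller can answer (for example with a 404) instead of holding the connection open.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errNoRole is returned once the retry budget is exhausted. An HTTP handler
// could translate it into a 404 response. (Hypothetical name.)
var errNoRole = errors.New("no role found for IP")

// roleStore is a hypothetical stand-in for the roleByIP table.
type roleStore struct {
	mu    sync.RWMutex
	roles map[string]string
}

func (s *roleStore) get(ip string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	role, ok := s.roles[ip]
	return role, ok
}

func (s *roleStore) set(ip, role string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.roles[ip] = role
}

// getRoleWithBackoff polls the store, doubling the wait between attempts
// (capped at maxInterval) until the role appears or maxElapsed has passed.
func getRoleWithBackoff(s *roleStore, ip string, initial, maxInterval, maxElapsed time.Duration) (string, error) {
	if initial <= 0 {
		initial = time.Millisecond // avoid a busy loop on a zero interval
	}
	deadline := time.Now().Add(maxElapsed)
	wait := initial
	for {
		if role, ok := s.get(ip); ok {
			return role, nil
		}
		remaining := time.Until(deadline)
		if remaining <= 0 {
			return "", errNoRole
		}
		sleep := wait
		if sleep > remaining {
			sleep = remaining // never sleep past the deadline
		}
		time.Sleep(sleep)
		wait *= 2
		if wait > maxInterval {
			wait = maxInterval
		}
	}
}

func main() {
	store := &roleStore{roles: map[string]string{}}

	// Simulate the race: the role data arrives shortly after the request.
	go func() {
		time.Sleep(30 * time.Millisecond)
		store.set("10.0.0.1", "role-a")
	}()

	role, err := getRoleWithBackoff(store, "10.0.0.1", 10*time.Millisecond, 50*time.Millisecond, 2*time.Second)
	fmt.Println(role, err)

	// A pod without a role: the call returns an error after about 100ms.
	_, err = getRoleWithBackoff(store, "10.0.0.2", 10*time.Millisecond, 50*time.Millisecond, 100*time.Millisecond)
	fmt.Println(err)
}
```

With a max elapsed time of a few seconds, the race condition is still covered for pods whose role data arrives late, while a pod with no role gets a prompt error.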

mikkeloscar · 2016-11-27T20:36:04Z

Hey, thanks for the feedback. Did you check any of the PRs (#30, #32)? It seems like they try to solve related issues in different ways. If you plan to merge any of those then I'll wait with a fix for this. Otherwise I'll make a PR just reducing the backoff limit like you suggest.

jtblin · 2016-11-27T21:48:03Z

I don't think #30 is an acceptable solution; I entered my comments on the issue. I am not sure which problem #32 is trying to solve.

This makes the max interval and max elapsed time configurable for the exponential backoff used when getting a role based on source IP. The defaults are still the same, i.e. 1 minute for MaxInterval and 15 minutes for MaxElapsedTime. Fix #28

* Use sane max interval & max elapsed time defaults
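As a rough sketch of what making these two values configurable could look like, the following uses Go's standard `flag` package. The flag names, the `backoffConfig` type, and `parseBackoffFlags` are assumptions made for this example and are not taken from the PR. The defaults mirror the 1 minute and 15 minutes mentioned in the commit message.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// backoffConfig holds the two tunables named in the commit message.
// (Hypothetical type for illustration.)
type backoffConfig struct {
	MaxInterval    time.Duration
	MaxElapsedTime time.Duration
}

// parseBackoffFlags reads the backoff settings from command-line arguments.
// The flag names are assumptions, not confirmed by the thread.
func parseBackoffFlags(args []string) (backoffConfig, error) {
	var cfg backoffConfig
	fs := flag.NewFlagSet("kube2iam", flag.ContinueOnError)
	fs.DurationVar(&cfg.MaxInterval, "backoff-max-interval", time.Minute,
		"max interval between retries when looking up a role by IP")
	fs.DurationVar(&cfg.MaxElapsedTime, "backoff-max-elapsed-time", 15*time.Minute,
		"max total time to keep retrying before giving up")
	if err := fs.Parse(args); err != nil {
		return backoffConfig{}, err
	}
	if cfg.MaxInterval <= 0 || cfg.MaxElapsedTime <= 0 {
		return backoffConfig{}, fmt.Errorf("backoff durations must be positive")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseBackoffFlags([]string{
		"--backoff-max-interval=1s",
		"--backoff-max-elapsed-time=2s",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.MaxInterval, cfg.MaxElapsedTime)
}
```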

mikkeloscar mentioned this issue Nov 29, 2016

Make backoff MaxInterval and MaxElapsedTime configurable #33

Merged

jtblin closed this as completed in #33 Dec 4, 2016
