Retry the image pull once after 5 seconds #792
Signed-off-by: Jose R. Gonzalez <jose@flutes.dev>
check workload preflight-green
Sorry, the job failure was not related to the PR content. It was caused by an operator-sdk rebuild that made the IBM operator bundle format deprecated. I removed this test from the list of mandatory tests.
check workload preflight-green
Factor: 1.0,
Jitter: 0.1,
Steps: 2,
}))
Hi @komish, looks good to me. You changed the default "0s -> 1s -> 4s" strategy to "0s -> 5s", which could improve the situation during longer outages.
Maybe we could even keep the exponential strategy if the retry interval is large enough, something like a "0s -> 5s -> 15s" strategy:
remote.Backoff{
Duration: 5 * time.Second,
Factor: 2.0,
Jitter: 0.1,
Steps: 3,
}
For the moment, I'll merge with 0s --> 5s, and if this proves not useful enough, we may consider a 3-step exponential. Keep us posted.
@tkrishtop: changing LGTM is restricted to collaborators
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: acornett21, bcrochet, komish, tkrishtop
This PR should give crane the ability to retry an image pull once, after 5 seconds, when a pull fails.
The one thing this may be missing is a RetryPredicate. I'm unsure if we need to specify one, but the default RetryPredicate captures a few error cases.
CC @tkrishtop to see if something like this may help with failure cases you've observed.
fixes #785