Retry pending deployments longer before failing them #13550

Merged
merged 1 commit into from
Mar 28, 2017
0xmichalis
Contributor

@mfojtik ptal

@0xmichalis
Contributor Author

@mfojtik didn't we see this flake somewhere? A deployment that failed because of a missing deployer, even though the deployer actually existed?

@0xmichalis
Contributor Author

[test]

@mfojtik
Contributor

mfojtik commented Mar 27, 2017

@Kargakis yes, there was one. good catch!

@@ -20,6 +20,8 @@ import (
"github.com/openshift/origin/pkg/util"
)

const maxRetryCount = 10
Contributor

Please add a comment here explaining what this retry count is.

@mfojtik
Contributor

mfojtik commented Mar 27, 2017

this LGTM

@openshift-bot
Contributor

Evaluated for origin test up to df79f89

@0xmichalis
Contributor Author

[merge]

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/476/) (Base Commit: 959be87)

@0xmichalis
Contributor Author

#11114 [merge]

@0xmichalis
Contributor Author

#12072 [merge]

@openshift-bot
Contributor

Evaluated for origin merge up to df79f89

@openshift-bot
Contributor

openshift-bot commented Mar 28, 2017

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/212/) (Base Commit: 959be87) (Image: devenv-rhel7_6104)

// The first requeue is after 5ms and subsequent requeues grow exponentially.
// This effectively can extend up to 5^10ms which caps to 1000s:
//
// 5ms, 25ms, 125ms, 625ms, 3s, 16s, 78s, 390s, 1000s, 1000s
Contributor Author

@bparees @csrwng you should be aware of what you are signing up for with 60 retries ^
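To make the schedule in the code comment concrete, here is a minimal sketch that recomputes it. The 5ms base, 1000s cap, and `maxRetryCount` come from the diff; `requeueDelay` and `totalWait` are hypothetical helpers written for this illustration, not code from the PR. The growth factor is a parameter because the factor of five is taken from the comment as written; the stock exponential failure limiter in the Kubernetes workqueue package is, as far as I know, a doubling one, so the comment is worth checking against the limiter actually in use.

```go
package main

import (
	"fmt"
	"time"
)

// Values taken from the diff and its comment. The growth factor is passed in
// separately because it is an assumption, not something verified here.
const (
	baseDelay     = 5 * time.Millisecond
	maxDelay      = 1000 * time.Second
	maxRetryCount = 10
)

// requeueDelay returns the wait before the requeue that follows the given
// number of earlier failures. The delay is multiplied by factor per failure
// and capped at maxDelay. Hypothetical helper for illustration.
func requeueDelay(failures, factor int) time.Duration {
	d := baseDelay
	for i := 0; i < failures; i++ {
		d *= time.Duration(factor)
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

// totalWait sums the delays of the first `retries` requeues.
func totalWait(retries, factor int) time.Duration {
	var total time.Duration
	for i := 0; i < retries; i++ {
		total += requeueDelay(i, factor)
	}
	return total
}

func main() {
	for i := 0; i < maxRetryCount; i++ {
		fmt.Printf("requeue %2d: %v\n", i+1, requeueDelay(i, 5))
	}
	fmt.Println("total wait, factor 5:", totalWait(maxRetryCount, 5))
	fmt.Println("total wait, factor 2:", totalWait(maxRetryCount, 2))
}
```

With a factor of five the ten delays add up to roughly 41 minutes, and once the cap is reached every further retry adds another 1000s, which is the cost being flagged for 60 retries.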

Contributor

thx @Kargakis What determines the rate of backoff for each requeue? Is it configurable?

Contributor Author

Not configurable currently. See here.

Contributor Author

I guess you can construct your own rate limiter, though.
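As a rough idea of what a custom rate limiter could look like, the sketch below keeps a per-item failure count and grows the delay by a configurable factor. It is self-contained and does not import the workqueue package. The `When`, `Forget`, and `NumRequeues` methods are modeled on that package's `RateLimiter` interface, but `factorRateLimiter`, `newFactorRateLimiter`, `demoBase`, and `demoMax` are hypothetical names, and wiring such a type into a real controller queue is not shown.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Example parameters for the demo only; not values from the PR.
const (
	demoBase = time.Second
	demoMax  = 30 * time.Second
)

// factorRateLimiter is a hypothetical per-item limiter. Each failure of an
// item multiplies its delay by factor (assumed >= 1), up to max.
type factorRateLimiter struct {
	mu       sync.Mutex
	failures map[interface{}]int
	base     time.Duration
	max      time.Duration
	factor   int
}

func newFactorRateLimiter(base, max time.Duration, factor int) *factorRateLimiter {
	return &factorRateLimiter{
		failures: map[interface{}]int{},
		base:     base,
		max:      max,
		factor:   factor,
	}
}

// When records one more failure for item and returns how long to wait
// before retrying it.
func (r *factorRateLimiter) When(item interface{}) time.Duration {
	r.mu.Lock()
	defer r.mu.Unlock()

	n := r.failures[item]
	r.failures[item] = n + 1

	d := r.base
	if d >= r.max {
		return r.max
	}
	for i := 0; i < n; i++ {
		d *= time.Duration(r.factor)
		if d >= r.max {
			return r.max
		}
	}
	return d
}

// NumRequeues reports how many failures have been recorded for item.
func (r *factorRateLimiter) NumRequeues(item interface{}) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.failures[item]
}

// Forget clears the failure history for item, for example after a success.
func (r *factorRateLimiter) Forget(item interface{}) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.failures, item)
}

func main() {
	rl := newFactorRateLimiter(demoBase, demoMax, 2)
	key := "test/deployment-1" // hypothetical queue key
	for rl.NumRequeues(key) < 7 {
		fmt.Println(rl.When(key))
	}
	rl.Forget(key)
}
```

A controller would typically compare `NumRequeues` against its retry limit (here, `maxRetryCount`) before requeueing, and call `Forget` once the item syncs successfully.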

Contributor

thx

4 participants