Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconcile loops using RequeueAfter never reset their failure count #392

Closed
Tehsmash opened this issue Apr 11, 2019 · 5 comments · Fixed by #396
Closed

Reconcile loops using RequeueAfter never reset their failure count #392

Tehsmash opened this issue Apr 11, 2019 · 5 comments · Fixed by #396
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@Tehsmash
Copy link

Controllers calling reconcile loops that stabilise by calling return result.Result{RequeueAfter: resyncPeriod}, nil to periodically re-run reconciliation never call c.Queue.Forget to clear the workqueue failure count for a particular request.

As the Failure count is never cleared this count builds up over the lifetime of the CR and quickly results in the back-off timeout reaching its maximum limit for any error even if the reconcile loop is mostly stable. This is not ideal as the maximum limit for the timeout is ~16mins which slows down the reconciliation loop considerably.

We've experimented with using the resyncPeriod set on the manager as a workaround however from other tickets this doesn't seem to be recommended and does have side effects such as causing conflicts when calling client.Update as the object we were operating on is now out of date.

@DirectXMan12
Copy link
Contributor

/kind feature
/priority important-soon

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Apr 11, 2019
@DirectXMan12
Copy link
Contributor

the current behavior is highly non-intuitive, so we might want to just straight-up change it with a fallback for the current behavior in a new field

cc @droot

@DirectXMan12
Copy link
Contributor

(I'd almost mark it as a bug)

@droot
Copy link
Contributor

droot commented Apr 11, 2019

I agree "requeueAfter" should not result in incrementing the failureCount. This should be marked as bug.

Thanks @Tehsmash for the detailed description, it helped in understanding the issue.

@droot droot added kind/bug Categorizes issue or PR as related to a bug. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Apr 11, 2019
alaypatel07 added a commit to alaypatel07/controller-runtime that referenced this issue Apr 12, 2019
@alaypatel07
Copy link
Contributor

@droot @DirectXMan12 The controller can check if err is nil and issue a c.Queue.Forget(obj) here. That would solve the problem. Thoughts?

Working on the PR if no one is onto it. Thanks

alaypatel07 added a commit to alaypatel07/controller-runtime that referenced this issue Apr 12, 2019
alaypatel07 added a commit to alaypatel07/controller-runtime that referenced this issue Apr 12, 2019
alaypatel07 added a commit to alaypatel07/controller-runtime that referenced this issue Apr 12, 2019
vincepri pushed a commit to vincepri/controller-runtime that referenced this issue Jul 24, 2019
DirectXMan12 pushed a commit that referenced this issue Jan 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants