Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignorable Errors #377

Closed
DirectXMan12 opened this issue Mar 27, 2019 · 11 comments
Closed

Ignorable Errors #377

DirectXMan12 opened this issue Mar 27, 2019 · 11 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@DirectXMan12
Copy link
Contributor

We have a couple classes of errors that should not trigger requeues (not-found, already-exists), as they're not errors that will be solved by requeuing (not-found will remain not-found until a cache updates, at which point you should get a requeue from the cache update, for instance, and already-exists is the same way).

The quickest, most-likely-to-not-break-users way to solve this is to have helpers for ignorable errors, along the lines of

func IgnoreNotFound(err error) error {
    if apierrors.IsNotFound(err) {
        return nil
    }
    return err
}

An easier/possibly better solution is to apply that logic on the results of reconciler by default, and allow an option to force overriding this in Result, but this could potentially break people in the case of non-watched/incorrectly-watched resources.

At any rate, the helpers are a good first step -- I find myself writing them a lot by hand.

/kind feature
/priority important-longterm.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 27, 2019
@DirectXMan12 DirectXMan12 added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Mar 27, 2019
@adracus
Copy link
Contributor

adracus commented May 14, 2019

Related: Comment in #357 + Fix in #428

@adohe-zz
Copy link

/close

@k8s-ci-robot
Copy link
Contributor

@adohe: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@DirectXMan12
Copy link
Contributor Author

Not quite actually fixed -- the helper helps, but I'd like to figure out something automatic

@DirectXMan12 DirectXMan12 reopened this May 31, 2019
@adohe-zz
Copy link

adohe-zz commented Jun 1, 2019

sure I will give a try to find something automatic.

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 30, 2019
@DirectXMan12
Copy link
Contributor Author

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 3, 2019
@negz
Copy link
Contributor

negz commented Oct 7, 2019

https://godoc.org/github.com/crossplaneio/crossplane-runtime/pkg/resource#Ignore

By way of an experience report, the pattern proposed by this issue is exactly what we do in Crossplane and it has been working well for us.

the helper helps, but I'd like to figure out something automatic

While I agree that you pretty much never want to requeue a NotFound error pertaining to the resource under reconciliation I'm a little worried about the existing requeue logic becoming "cleverer" to handle such situations. We've found the existing logic to be one of the biggest foot-guns in controller-runtime, which has lead to most of our reconcilers having large comment blocks at every return so that uninitiated readers can understand our intent around the reconcile.Result, error tuple we're returning in each scenario.

My suspicion is that having reconciler authors explicitly ignore such errors by convention will lead to reconciler logic that is more readable and easier to reason about than if controller-runtime tried to solve this magically.

@DirectXMan12
Copy link
Contributor Author

I'm a little worried about the existing requeue logic becoming "cleverer" to handle such situations

Yeah, that's my hesitance too, but I've seen a lot of people not realizing that requeuing on not found errors might cause them to see repeated backoff, and it's super-common boiler plate to write if IgnoreNotFound(err) == nil { ... }.

Still, the real experience report is super useful. Thanks 👍

@vincepri
Copy link
Member

Closing in favor of #617

/close

@k8s-ci-robot
Copy link
Contributor

@vincepri: Closing this issue.

In response to this:

Closing in favor of #617

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet
Development

No branches or pull requests

7 participants