Reconciler panics should not crash the manager #797
Comments
Yeah, this would be great to have :)
Something like apimachinery's
We can revisit the milestone if the design doesn't have breaking changes. Folks might be relying on panics to detect failures today. /priority important-soon
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle frozen
/assign
@varshaprasad96 Hi, are you still pursuing this? Mind if I take a stab at it :) ?
@FillZpp sure, please feel free to create a PR for this.
/assign @FillZpp
Currently, an unhandled panic in a reconciler is not recovered and will likely crash the manager binary. This is a problem because a panic might be triggered by a single resource in an unexpected state, so one bad resource can prevent all other resources from being processed. Because Kubernetes is likely to restart the manager pod after a crash, the manager can also end up DoSing the Kubernetes API server as it continually restarts.
In my project, I wrote this utility function:
Every time I pass a reconciler to Complete, I wrap it with this. It ensures that any panics raised by the reconciler are converted to normal errors.