Reconciler stops responding after some minutes when using the builder.OnlyMetadata watch option #1789
Comments
I've created two CLI tools that hopefully demonstrate the problem more clearly.

Using client-go
This one works and periodically re-establishes the watch.

Using controller-runtime
This one stops responding after ~7 minutes.
I'm using `builder.OnlyMetadata` in a controller that watches Secrets, so as to avoid caching all the Secret data in memory.

The controller works initially and my `Reconcile` function is invoked every time I add / delete / update Secrets in my cluster.

But after some minutes (~8-9 minutes) the controller stops responding to changes in Secrets.
I've increased the logging verbosity so that I can see client-go messages and I notice that:
For watches without the `OnlyMetadata` option
There are periodic log messages such as:
suggesting that the watch is regularly being re-established.
For watches with the `OnlyMetadata` option
There are no regular "Watch close" messages.
And at about the time the other resource watches have been closed and re-opened, the controller stops responding to changes to those resources with the `OnlyMetadata` (`PartialObjectMetadata`) setting.

And it may be unrelated, but I notice a slight difference in the initial log messages when I use `OnlyMetadata`:
That last GET request includes both a `timeout` and a `timeoutSeconds` query string parameter, unlike non-only-metadata GET requests, which only have the `timeoutSeconds` query string parameter.

Recreating the problem
I've created a repo with a very simple kubebuilder project and controller which watches and logs changes to Secrets:
There's an envtest-based test which recreates the failure within 10 minutes.
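
For anyone comparing the request URLs in the verbose client-go logs while reproducing this, here is a minimal sketch of how the `timeout` / `timeoutSeconds` difference mentioned above could be checked programmatically. It only uses the Go standard library. The helper name `timeoutParams` and both URLs (host, path and parameter values) are hypothetical placeholders for illustration, not lines copied from my logs.

```go
package main

import (
	"fmt"
	"net/url"
)

// timeoutParams reports which of the two timeout-related query string
// parameters ("timeout" and "timeoutSeconds") are present in rawURL.
// Hypothetical helper, written only for this illustration.
func timeoutParams(rawURL string) ([]string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	query := u.Query()
	var found []string
	for _, name := range []string{"timeout", "timeoutSeconds"} {
		if _, ok := query[name]; ok {
			found = append(found, name)
		}
	}
	return found, nil
}

func main() {
	// Placeholder URLs: the host, path and values are assumptions,
	// only the parameter names come from the issue description.
	examples := map[string]string{
		"only-metadata watch": "https://apiserver.invalid/api/v1/secrets?watch=true&timeout=10s&timeoutSeconds=300",
		"regular watch":       "https://apiserver.invalid/api/v1/secrets?watch=true&timeoutSeconds=300",
	}
	for label, raw := range examples {
		params, err := timeoutParams(raw)
		if err != nil {
			fmt.Println(label, "parse error:", err)
			continue
		}
		fmt.Println(label, "has timeout parameters:", params)
	}
}
```

Feeding it the GET request URLs from the logs should make it easy to confirm whether only the metadata-only watches carry both parameters. Whether that difference is actually related to the stalled watch is still an open question.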
/bug