Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deletion of resources shouldn't fail if the webhook server isn't available #56

Closed
jpkrohling opened this issue Sep 22, 2020 · 5 comments · Fixed by #313
Closed

Deletion of resources shouldn't fail if the webhook server isn't available #56

jpkrohling opened this issue Sep 22, 2020 · 5 comments · Fixed by #313
Labels
bug Something isn't working hacktoberfest

Comments

@jpkrohling
Copy link
Member

When the operator isn't up and running, removing resources will fail with:

$ kubectl delete otelcol opentelemetrycollector-sample
Error from server (InternalError): Internal error occurred: failed calling webhook "vopentelemetrycollector.kb.io": Post "https://opentelemetry-operator-webhook-service.opentelemetry-operator-system.svc:443/validate-opentelemetry-io-v1alpha1-opentelemetrycollector?timeout=30s": dial tcp 10.96.165.160:443: connect: connection refused

$ kubectl get otelcols
NAME                            MODE         VERSION   AGE
opentelemetrycollector-sample   deployment   0.10.0    3h39m
@jpkrohling jpkrohling added bug Something isn't working hacktoberfest labels Sep 22, 2020
@jpkrohling jpkrohling changed the title Delete of resources shouldn't fail if the webhook server isn't available Deletion of resources shouldn't fail if the webhook server isn't available Sep 22, 2020
@gramidt
Copy link
Member

gramidt commented Jan 23, 2021

@jpkrohling - One potential thought is to split the "validating-webhook-configuration" into two validating webhooks. The second one would be for the "DELETE" and would have failurePolicy: Ignore. This would allow deletion of the resource, but would leave the cleanup of the underlying resources to the user, including removing finalizers when applicable. Thoughts?

@jpkrohling
Copy link
Member Author

This would allow deletion of the resource, but would leave the cleanup of the underlying resources to the user, including removing finalizers when applicable

Not sure we do anything at all in our code. When a CR is deleted, all the resources that are owned by this CR are deleted automatically. In our reconciliation function, we'll see the delete event but won't find the CR, in which case we'll just return. So, I think your idea would work nicely.

if err := r.Get(ctx, req.NamespacedName, &instance); err != nil {
if !apierrors.IsNotFound(err) {
log.Error(err, "unable to fetch OpenTelemetryCollector")
}
// we'll ignore not-found errors, since they can't be fixed by an immediate
// requeue (we'll need to wait for a new notification), and we can get them
// on deleted requests.
return ctrl.Result{}, client.IgnoreNotFound(err)
}

@gramidt
Copy link
Member

gramidt commented Jan 25, 2021

Great! This will be straightforward to implement then, @jpkrohling. This is now in my backlog to tackle.

@VineethReddy02
Copy link
Contributor

@jpkrohling I can work on this!

@jpkrohling
Copy link
Member Author

Go ahead! Can you please double-check the kubebuilder docs to see if they have a recommendation about this? What we have right now is following what was bootstrapped by kubebuilder when I first bootstrapped this operator, and things might have changed since then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working hacktoberfest
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants