CRD is hanging while deleting with "foregroundDeletion" policy #1755

Closed
kamolhasan opened this issue Jul 28, 2020 · 18 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/external upstream bugs

Comments

@kamolhasan

What happened:

I'm running a CRD controller. When the CRD is deployed, the controller creates a set of Kubernetes objects (Services, a StatefulSet, a Role, a RoleBinding, etc.). The operator also sets an ownerReference (pointing to the CRD) with ownerReference.blockOwnerDeletion=true on each of those objects.
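
For readers unfamiliar with the setup described above, here is a minimal sketch of what attaching such an ownerReference looks like, using plain Python dicts in place of parsed manifests. The owner's `apiVersion`, `kind`, `name`, and `uid` below are hypothetical placeholders, not values from this issue; only the field names follow the Kubernetes `OwnerReference` schema.

```python
def owner_reference(owner: dict, block_owner_deletion: bool = True) -> dict:
    """Build an ownerReference entry that points at `owner` (a parsed manifest)."""
    return {
        "apiVersion": owner["apiVersion"],
        "kind": owner["kind"],
        "name": owner["metadata"]["name"],
        "uid": owner["metadata"]["uid"],
        "controller": True,
        # With foreground deletion, the owner is not removed until
        # dependents carrying this flag are gone.
        "blockOwnerDeletion": block_owner_deletion,
    }


def set_owner(dependent: dict, owner: dict) -> dict:
    """Attach the owner to `dependent`, without adding a duplicate entry."""
    refs = dependent.setdefault("metadata", {}).setdefault("ownerReferences", [])
    ref = owner_reference(owner)
    if not any(existing.get("uid") == ref["uid"] for existing in refs):
        refs.append(ref)
    return dependent


# Hypothetical owner object (assumption: these values are illustrative only).
owner = {
    "apiVersion": "example.com/v1",
    "kind": "Demo",
    "metadata": {"name": "topology-es", "uid": "00000000-0000-0000-0000-000000000001"},
}
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "topology-es-master", "namespace": "demo"},
}

set_owner(service, owner)
print(service["metadata"]["ownerReferences"])
```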

Now, when I delete the CRD with the foregroundDeletion policy, the CRD hangs. I checked the dependent objects: the deletionTimestamp and the finalizer are set, but somehow the garbage collector isn't cleaning them up.

- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2020-07-28T04:22:13Z"
    deletionGracePeriodSeconds: 0
    deletionTimestamp: "2020-07-28T04:23:21Z"
    finalizers:
    - foregroundDeletion
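
The state shown in this snippet can be checked mechanically. The sketch below assumes the object has already been fetched and parsed into a dict (for example from `kubectl get -o json` output); the helper name `awaiting_foreground_deletion` is made up for illustration. It only reports that an object is marked for deletion and still holds the `foregroundDeletion` finalizer; a single snapshot cannot tell whether the garbage collector is stuck or merely still working.

```python
FOREGROUND_FINALIZER = "foregroundDeletion"


def awaiting_foreground_deletion(obj: dict) -> bool:
    """True if `obj` has a deletionTimestamp and still carries the
    foregroundDeletion finalizer."""
    meta = obj.get("metadata") or {}
    finalizers = meta.get("finalizers") or []
    return bool(meta.get("deletionTimestamp")) and FOREGROUND_FINALIZER in finalizers


# The Service metadata reported in this issue.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "creationTimestamp": "2020-07-28T04:22:13Z",
        "deletionGracePeriodSeconds": 0,
        "deletionTimestamp": "2020-07-28T04:23:21Z",
        "finalizers": ["foregroundDeletion"],
    },
}

print(awaiting_foreground_deletion(service))
```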

What you expected to happen:

The created services, statefulset, role, rolebinding etc. will be deleted first, and once all those are deleted by the garbage collector the CRD is removed.

Anything else we need to know?:

Also, when I described the services, I saw warnings like the ones below:

Warning  ClusterIPNotAllocated  90s (x3 over 19m)  ipallocator-repair-controller  Cluster IP 10.102.62.89 is not allocated; repairing
Warning  FailedToUpdateEndpointSlices  18m   endpoint-slice-controller  Error updating Endpoint Slices for Service demo/topology-es-master: Error updating topology-es-master-xwwdt EndpointSlice for Service demo/topology-es-master: endpointslices.discovery.k8s.io "topology-es-master-xwwdt" not found

Environment:

$ kind version
kind v0.8.1 go1.14.2 linux/amd64

$ kubectl version --short
Client Version: v1.18.3
Server Version: v1.18.2

@kamolhasan kamolhasan added the kind/bug Categorizes issue or PR as related to a bug. label Jul 28, 2020
@BenTheElder
Member

this appears to be kubernetes/kubernetes#87603

@BenTheElder BenTheElder added the kind/external upstream bugs label Jul 28, 2020
@BenTheElder
Member

we're not doing anything special for clusterIP allocation or garbage collector settings, so I'm pretty sure this is purely an upstream kubernetes bug you've found.

@kamolhasan
Author

@BenTheElder Any idea how to work around this? I can delete with backgroundDeletion policy, but I also need to make sure that the dependent objects are removed.

@BenTheElder
Member

I don't think there's a good workaround, there's a fix in progress upstream. I've commented there.
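
This is not a workaround proposed in this thread, but the "make sure the dependent objects are removed" part of the question can be approximated on the client side. The sketch below assumes the owner's UID was recorded before deleting it with the Background policy, and that the caller lists candidate objects itself (the listing step is left out). The helper names are hypothetical; `propagationPolicy` is the real `DeleteOptions` field.

```python
def background_delete_options() -> dict:
    """DeleteOptions body requesting background propagation."""
    return {"apiVersion": "v1", "kind": "DeleteOptions", "propagationPolicy": "Background"}


def remaining_dependents(objects: list, owner_uid: str) -> list:
    """Return the objects that still reference `owner_uid` in ownerReferences."""
    remaining = []
    for obj in objects:
        refs = (obj.get("metadata") or {}).get("ownerReferences") or []
        if any(ref.get("uid") == owner_uid for ref in refs):
            remaining.append(obj)
    return remaining


# Illustrative snapshot of a namespace after the owner was deleted
# (assumption: names and UIDs are placeholders).
snapshot = [
    {"kind": "Service", "metadata": {"name": "a", "ownerReferences": [{"uid": "uid-1"}]}},
    {"kind": "Service", "metadata": {"name": "b", "ownerReferences": [{"uid": "uid-2"}]}},
    {"kind": "ConfigMap", "metadata": {"name": "c"}},
]

left = remaining_dependents(snapshot, "uid-1")
print([obj["metadata"]["name"] for obj in left])
```

A caller would repeat the listing and the check until the result is empty or a timeout expires, since background garbage collection is asynchronous.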

@ctron

ctron commented Jul 28, 2020

I am seeing the same problem with our own CRDs (not Service!). Everything works in Minikube, CRC, and OpenShift. But in Kind the finalizer foregroundDeletion never gets deleted and the resource is stuck.

@BenTheElder
Member

@ctron I need a little more information than that, is this across the same kubernetes version?
kind's Kubernetes behaviors are generally upstream kubernetes source + kubeadm for configuration (we deviate as little as possible from defaults)

@ctron

ctron commented Jul 29, 2020

It seems to work on Minikube (1.17.3), OpenShift (1.18.3), but fails on Kind (0.8.1 -> kindest/node:v1.18.2).

Let me know if you need more information.

@neolit123
Member

both kind and minikube use kubeadm under the hood, so i'm curious what the difference is here.

please try a matching minikube version (k8s = v1.18.2):
https://github.com/kubernetes/minikube/releases/tag/v1.10.1

Bump Default Kubernetes version v1.18.2 and update newest

@ctron

ctron commented Jul 29, 2020

@neolit123 Unfortunately that isn't possible due to: kubernetes/minikube#8414

@neolit123
Member

i'd appreciate if this is reproduced with a raw kubeadm setup too.

@ctron

ctron commented Jul 29, 2020

It looks like I can select the Kubernetes version with Minikube using --kubernetes-version=v1.18.2 … I will try that.

@ctron

ctron commented Jul 29, 2020

So I can confirm that using Minikube with 1.18.2 shows the same problem. Looks like this is a regression in Kubernetes.

@BenTheElder
Member

BenTheElder commented Jul 29, 2020

you might try --image=kindest/node:v1.17.5@sha256:ab3f9e6ec5ad8840eeb1f76c89bb7948c77bbf76bcebe1a8b59790b8ae9a283a for kind v0.8.1 in the meantime.

https://github.com/kubernetes-sigs/kind/releases/tag/v0.8.0#New-Features
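
As a sketch of how the suggested image could be pinned in a script, the snippet below builds the `kind create cluster` command line with the `--image` value quoted above. The cluster name and the helper function are assumptions for illustration; nothing is executed unless the caller passes the list to `subprocess.run` on a machine with kind installed.

```python
import shlex

# Image reference exactly as given in the comment above.
IMAGE = (
    "kindest/node:v1.17.5"
    "@sha256:ab3f9e6ec5ad8840eeb1f76c89bb7948c77bbf76bcebe1a8b59790b8ae9a283a"
)


def kind_create_cluster_argv(image: str, name: str = "kind") -> list:
    """Return the argv for creating a kind cluster from a specific node image."""
    return ["kind", "create", "cluster", "--name", name, "--image", image]


argv = kind_create_cluster_argv(IMAGE)
# Print a copy-pasteable shell command; run it with subprocess.run(argv, check=True).
print(shlex.join(argv))
```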

@ctron

ctron commented Jul 29, 2020

Just tested with Kubernetes 1.18.6, same issue

@BenTheElder
Member

Since this is reproduced in minikube, and the original in kubernetes/kubernetes#87603, I'm going to close this in the KIND tracker.
If a fix is identified upstream and a release is cut, we'll be sure to provide a pre-built image with it.
In the meantime we do provide other pre-built images, and it's possible (though currently a bit of work) to build your own images at fairly arbitrary versions.

@ctron

ctron commented Aug 3, 2020

Btw … switching back to 1.17.x with Kind works as well.

@BenTheElder
Member

Excellent. If you can identify the kubernetes bug please file an issue with the kubernetes/kubernetes tracker so we can get it fixed upstream.

@BenTheElder
Member

Or kubernetes/kubeadm if it turns out to be some kubeadm setting.


4 participants