
[0.1] Deleting Cluster does not delete #985

Closed
liztio opened this issue Jun 6, 2019 · 27 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@liztio
Contributor

liztio commented Jun 6, 2019

/kind bug

What steps did you take and what happened:

  1. Create a cluster in the cluster API
  2. Add a Machine or MachineDeployment
  3. Wait for all objects to become available
  4. kubectl delete the cluster
  5. The cluster gets a deletionTimestamp, but nothing else gets deleted

What did you expect to happen:

The Machines, then the MachineDeployments, should be marked for deletion, and the Cluster should be deleted once their resources have been completely removed.

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Cluster-api version: master
  • Minikube/KIND version: 0.1.0-alpha
  • Kubernetes version: (use kubectl version): 1.13
  • OS (e.g. from /etc/os-release): Ubuntu Bionic
@k8s-ci-robot added the kind/bug label on Jun 6, 2019
@liztio
Contributor Author

liztio commented Jun 6, 2019

/assign
/cc @ncdc

@timothysc added this to the v1alpha2 milestone on Jun 21, 2019
@timothysc added the priority/important-soon label on Jun 21, 2019
@liztio
Contributor Author

liztio commented Jun 26, 2019

Just finished an extended debugging session with @ncdc. Some background:

There are three deletion techniques the kubernetes apiserver supports: orphan, foreground, and background. By default, kubectl delete does a background deletion.

During a background deletion, the parent object is deleted first and its dependents are then garbage collected. In a foreground deletion, objects are deleted bottom-up on the dependency graph: leaf objects with no dependents of their own go first, and the owner is removed last.

The latter is what is required for Cluster API, because usually a cluster can't be cleaned up until it's empty (for example, CAPA can't delete the VPC and subnets associated with a cluster unless the machines inside it have already been removed).

Unfortunately, there's currently no way to trigger a foreground deletion from kubectl. It will always default to background, and even if the foregroundDeletion finalizer is added to the object, deleting it from kubectl will cause the finalizer to be removed. (See kubernetes/kubectl#672)

For now, the best workaround is to manually delete clusters using curl:

curl -i -X DELETE http://localhost:8001/apis/cluster.k8s.io/v1alpha1/namespaces/default/clusters/sweetie \
 -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}' \
 -H "Content-Type: application/json

@AlainRoy

@liztio: If you delete bottom-up, does that mean that you delete the Machines before the MachineSets? Might the Machines potentially be recreated by the MachineSets?

Apologies if this is a stupid question--I don't know all the details of how cluster deletion works.

@Lokicity

Thanks for the explanation. I am running into a similar issue when I delete a machine deployment using this client:

	*discovery.DiscoveryClient
	clusterV1alpha1 *clusterv1alpha1.ClusterV1alpha1Client

If I delete the machine deployment, neither the machine deployment object nor the machineset object is deleted. However, I can see that my cloud provider VM is being deleted and recreated constantly. I am not entirely sure whether that is related to my case. I will add the "ForegroundDeletion" policy when deleting the machine deployment. My understanding is that this should trigger the deletion of the machine set first, then delete the machine deployment.

@Lokicity

Lokicity commented Jun 27, 2019

Hi Liz, I am not using kubectl; I am using the ClusterV1alpha1Client to delete the machine deployment. Even with DeletePropagationForeground set, the machine deployment is not deleted properly. It seems to run into a race condition: there is contention between the machineset and the machine deployment, with one trying to remove the VM and the other trying to create it. After 10-20 minutes, all nodes are eventually deleted. This looks like a bug in CAPA and I can provide more detail if needed.

@jimmidyson
Member

Shouldn't we be using finalizers here, with the controller triggering deletion of dependent objects? Coupled with ownerReference.blockOwnerDeletion=true on dependent objects (with ownership set up), we should have bottom-up deletion, no?
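
For illustration, a minimal sketch of that wiring in Go, assuming the v1alpha1 API types of the era; the package path and helper name are assumptions, not code from this repository:

    package example

    import (
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        clusterv1 "sigs.k8s.io/cluster-api/pkg/apis/cluster/v1alpha1"
    )

    // setClusterOwner records the Cluster as an owner of the Machine and asks the
    // garbage collector not to remove the owner while this dependent still exists.
    // Note: blockOwnerDeletion only takes effect when the owner is deleted with the
    // foreground propagation policy.
    func setClusterOwner(machine *clusterv1.Machine, cluster *clusterv1.Cluster) {
        block := true
        machine.OwnerReferences = append(machine.OwnerReferences, metav1.OwnerReference{
            APIVersion:         clusterv1.SchemeGroupVersion.String(),
            Kind:               "Cluster",
            Name:               cluster.Name,
            UID:                cluster.UID,
            BlockOwnerDeletion: &block,
        })
    }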

@jimmidyson
Member

The machine controller shouldn't be trying to create machines for machinesets that are being deleted (deletionTimestamp is set).

@ncdc
Contributor

ncdc commented Jun 27, 2019

@Lokicity please open a new issue for your machine deployment issue - thanks!

@ncdc
Contributor

ncdc commented Jun 27, 2019

FYI, once we break apart the Cluster and move providerSpec into its own custom resource, background deletion will work and this won't be an issue any more:

  1. kubectl delete cluster - cluster is removed from etcd because we don't need any finalizers on it
  2. Garbage collection deletes all objects that had the cluster as an owner
    1. MachineDeployments
    2. MachineSets
    3. Machines
    4. Provider machine infra custom resources
    5. Provider cluster infra custom resources (this is where the CAPA VPC, security groups, etc will be handled)

@jimmidyson
Member

That means that Cluster will be removed before resources for that cluster are actually released. Is that desirable functionality? I'd prefer to see that my Cluster exists until all resources are cleaned up. If we don't do this, then I can't list clusters and see that my cluster is in deleting (or whatever) phase.

@detiber
Member

detiber commented Jun 27, 2019

@Lokicity Passing in metav1.DeleteOptions with a PropagationPolicy of metav1.DeletePropagationForeground to the Delete method should resolve the issue when using ClusterV1alpha1Client. This is what is done in clusterctl.
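
For reference, a minimal sketch of such a call, assuming the generated v1alpha1 clientset and the pre-context Delete signature of client-go at the time; the import path and function name are assumptions:

    package example

    import (
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        clusterv1alpha1 "sigs.k8s.io/cluster-api/pkg/client/clientset_generated/clientset/typed/cluster/v1alpha1"
    )

    // deleteClusterForeground asks the apiserver to delete the Cluster with the
    // foreground propagation policy, so the Cluster object only disappears after
    // its blocking dependents have been garbage collected.
    func deleteClusterForeground(c clusterv1alpha1.ClusterV1alpha1Interface, namespace, name string) error {
        policy := metav1.DeletePropagationForeground
        return c.Clusters(namespace).Delete(name, &metav1.DeleteOptions{
            PropagationPolicy: &policy,
        })
    }

The same DeleteOptions apply when deleting a MachineDeployment, which is the case described a few comments up.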

@detiber
Member

detiber commented Jun 27, 2019

@jimmidyson I would also expect the Cluster object to stick around until the dependent resources are removed, I'm just not sure that we can enforce that with kubectl today since there is no way to enforce foreground propagation there.

@jimmidyson
Member

Like I said above, finalizers and controller-managed deletion would do it, analogous to the namespace finalizer.

@liztio
Contributor Author

liztio commented Jun 27, 2019

@jimmidyson I wanted to avoid completely reinventing the wheel w.r.t. finalizers. The GC already has the behaviour we want: delete all dependents before deleting the owner object. Why should we have to do that ourselves?

@liztio
Contributor Author

liztio commented Jun 27, 2019

@AlainRoy the machines are deleted before the machinesets, but the machinesets are marked as being deleted (via the deletionTimestamp). That way they know not to create replacement Machines.
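
Concretely, the guard that makes this work is a check along these lines (a sketch only; the function name is illustrative and the types are the v1alpha1 API of the time):

    package example

    import (
        clusterv1 "sigs.k8s.io/cluster-api/pkg/apis/cluster/v1alpha1"
    )

    // shouldCreateReplacements is the kind of check a MachineSet reconciler needs:
    // once the MachineSet itself carries a deletionTimestamp it must stop creating
    // replacement Machines, otherwise it races whatever is deleting them.
    func shouldCreateReplacements(ms *clusterv1.MachineSet) bool {
        return ms.ObjectMeta.DeletionTimestamp.IsZero()
    }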

@jimmidyson
Member

@liztio Can you force that behaviour? I don't think you can (but would love to know if it is possible!), and as you've said above, the default is to do a background deletion. As far as I'm aware, if we want to consistently only delete clusters after all their dependents have been deleted, that has to happen through finalizers.

@liztio
Contributor Author

liztio commented Jun 27, 2019

I'm grumpy right now that the apiserver removes the foregroundDeletion finalizer I added. There has to be a better way than "reimplement the GC from scratch." I've seen the GC code; it's really complicated, there are a lot of edge cases, and I don't trust us to get it right on the first try.

@ncdc
Contributor

ncdc commented Jun 27, 2019

We first need to decide if we want the cluster to stick around "Terminating" while its various resources are being cleaned up. If no, this problem should go away with alpha2. If yes, then we probably need to have conversations with the apimachinery & kubectl folks about how to handle this... or we could add logic to the MachineDeployment, MachineSet, and Machine controllers to watch Clusters, and if a Cluster has a nonzero deletion timestamp, issue deletes for the MD/MS/Machines.

@detiber
Member

detiber commented Jun 27, 2019

@ncdc I'm inclined to say yes: the Cluster and other parent objects should stick around until their child resources/objects are cleaned up.

@jimmidyson
Member

I agree with @detiber.

@ncdc
Contributor

ncdc commented Jun 28, 2019

Ok, then I think we need to deal with the fact that kubectl defaults to background deletion and doesn't allow you to specify foreground, which results in the apiserver/garbage collection ignoring the foreground deletion finalizer.

Unfortunately I think the only way to guarantee that we don't end up with a cluster stuck deleting is to implement the dependent deletion logic that we get for free with foreground deletion. I'd say this is necessary because even if kubectl supported foreground deletion, kubectl isn't the only way to issue a delete request, and someone somewhere would likely try to delete a cluster in the background and run into this issue again.
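
A rough sketch of that controller-managed cascade, assuming the generated v1alpha1 clientset; the helper name is illustrative, the Machine-to-Cluster association is elided, and the real implementation later landed in #1180 and may differ in detail:

    package example

    import (
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        clusterv1alpha1 "sigs.k8s.io/cluster-api/pkg/client/clientset_generated/clientset/typed/cluster/v1alpha1"
    )

    // dependentsDeleted issues deletes for any Machines that are not already being
    // deleted and reports whether everything is gone. A Cluster controller would
    // call this while the Cluster has a deletionTimestamp and remove its own
    // finalizer only once this returns true.
    func dependentsDeleted(c clusterv1alpha1.ClusterV1alpha1Interface, namespace string) (bool, error) {
        machines, err := c.Machines(namespace).List(metav1.ListOptions{})
        if err != nil {
            return false, err
        }
        for i := range machines.Items {
            m := &machines.Items[i]
            if m.DeletionTimestamp.IsZero() {
                // Deletion is asynchronous: the Machine controller's finalizer tears
                // down the backing infrastructure before the object disappears.
                if err := c.Machines(namespace).Delete(m.Name, &metav1.DeleteOptions{}); err != nil {
                    return false, err
                }
            }
        }
        return len(machines.Items) == 0, nil
    }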

@dlipovetsky
Contributor

dlipovetsky commented Jul 17, 2019

I talked with @ncdc about this in #cluster-api slack yesterday (thanks for answering all my questions!), and I wanted to add a summary of that here:

For a cascading delete, the apiserver decides which strategy to use by looking for the strategy finalizer (e.g. foregroundDeletion) on the parent object. The client can override the strategy; kubectl always selects the background strategy. When the background strategy is chosen, the foregroundDeletion finalizer is removed from the parent object, and if no other finalizers exist, the parent object is immediately deleted.

As a kubectl user, it seems like the important question is "Am I willing to wait for cascading deletes?" If Yes, then use foreground strategy. Otherwise, use background strategy. This is not a question kubectl asks today. Note that the "Am I willing to wait for deletes?" question is already asked by the --wait flag.

However, the choice of strategy impacts correctness of the deletes for CAPA. If you use the background strategy to delete the cluster, the delete will never succeed because of AWS dependency issues that require machine objects to be deleted prior to the cluster object.

If the wrong choice can lead to incorrect behavior, the user experience will be terrible. So either CAPA needs to work around upstream cascading delete behavior (see the comment just above), or that behavior needs to change.

@ncdc changed the title from "Deleting Cluster does not delete" to "[0.1] Deleting Cluster does not delete" on Jul 24, 2019
@ncdc
Contributor

ncdc commented Jul 24, 2019

I think #1180 is sufficient to close this, and #1190 will make it stronger.
/close

@k8s-ci-robot
Contributor

@ncdc: Closing this issue.

In response to this:

I think #1180 is sufficient to close this, and #1190 will make it stronger.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dlipovetsky
Contributor

@ncdc If I'm reading #1180 correctly, it implements controller-side cascading delete. Would we use server-side cascading delete if we didn't have the strategy issue? If so, LMK if it's worth filing an issue to revisit this in the future and reach out to sig-apimachinery in the meantime.

@ncdc
Contributor

ncdc commented Jul 25, 2019

@dlipovetsky yes, it does implement server-side cascading delete. I assume we would revert this if we didn't have the strategy issue. I'm not sure about an issue - we need correct behavior, 100% of the time, and we can't rely on users (or tooling) correctly setting foreground deletion. Do you think there's an alternative approach?

@cadedaniel

I'm commenting to add an anecdote: I am implementing a controller/CRD for something unrelated and came across this same issue. I was hoping for a server-side implementation of #1180; it seems to be the better option from a framework point of view. @dlipovetsky, if you create an issue against sig-apimachinery, can you please post the link here?
