
quick create-delete leaves orphaned objects #551

Closed
sdudoladov opened this issue Apr 29, 2019 · 5 comments · Fixed by #654

sdudoladov commented Apr 29, 2019

Orphaned pods/endpoints are left in the cluster if an ADD event is followed by a DELETE event for the same cluster within a short period of time.

The error manifests as:

time="2019-04-29T12:17:52Z" level=info msg="\"ADD\" event has been queued" cluster-name=default/acid-minimal-cluster pkg=controller worker=0
...
time="2019-04-29T12:17:53Z" level=info msg="waiting for the cluster being ready" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2019-04-29T12:17:56Z" level=debug msg="Waiting for 2 pods to become ready" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2019-04-29T12:18:00Z" level=info msg="\"DELETE\" event has been queued" cluster-name=default/acid-minimal-cluster pkg=controller worker=0
...
time="2019-04-29T12:18:29Z" level=info msg="statefulset \"default/acid-minimal-cluster\" has been deleted" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2019-04-29T12:18:29Z" level=debug msg="deleting pods" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2019-04-29T12:18:29Z" level=debug msg="no pods to delete" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
...
time="2019-04-29T12:18:29Z" level=debug msg="deleting PVCs" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2019-04-29T12:18:29Z" level=debug msg="no PVCs to delete" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
...
time="2019-04-29T12:18:32Z" level=debug msg="removing leftover Patroni objects (endpoints or configmaps)" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2019-04-29T12:18:32Z" level=warning msg="could not remove leftover patroni objects; could not fetch Patroni Endpoint \"/\": an empty namespace may not be set when a resource name is provided" cluster-name=default/acid-minimal-cluster pkg=cluster worker=0
time="2019-04-29T12:18:32Z" level=info msg="cluster has been deleted" cluster-name=default/acid-minimal-cluster pkg=controller worker=0

Reproducible both with kind and an actual k8s cluster.

This issue also prevents creating a new cluster with the same name afterwards.
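
The warning near the end of the log ("could not fetch Patroni Endpoint \"/\"") hints that the delete path runs before the cluster's namespace and name are populated. Below is a minimal sketch of the kind of guard that would avoid the empty-namespace call; the Cluster struct and method name are illustrative, this is not necessarily the actual fix from #654, and the client-go signatures are the context-free ones current at the time of this issue:

```go
package cluster

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// Cluster is a minimal stand-in for the operator's cluster struct.
type Cluster struct {
	Namespace string
	Name      string
}

// getPatroniEndpoint sketches a guard against the error seen in the log:
// calling Endpoints("").Get(name, ...) with an empty namespace fails with
// "an empty namespace may not be set when a resource name is provided".
func (c *Cluster) getPatroniEndpoint(client kubernetes.Interface) (*corev1.Endpoints, error) {
	if c.Namespace == "" || c.Name == "" {
		return nil, fmt.Errorf("cluster namespace/name not yet populated, skipping Patroni endpoint lookup")
	}
	return client.CoreV1().Endpoints(c.Namespace).Get(c.Name, metav1.GetOptions{})
}
```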


FxKu commented May 23, 2019

Aside from finalizers #450, ownerReference #498 might also help.
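
As a rough illustration of the ownerReference idea: child objects (services, endpoints, secrets, ...) could carry a reference to the postgresql custom resource, so that Kubernetes garbage collection removes them when the CR disappears, even if the operator never processes the DELETE event. The helper below and the group/version values are assumptions for the sketch, not the operator's code:

```go
package resources

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// withOwnerReference attaches an ownerReference pointing at the postgresql
// custom resource to a child Service, so garbage collection deletes the
// Service together with the CR.
func withOwnerReference(svc *corev1.Service, pgName string, pgUID types.UID) {
	controller := true
	svc.OwnerReferences = append(svc.OwnerReferences, metav1.OwnerReference{
		APIVersion: "acid.zalan.do/v1", // assumed group/version of the CR
		Kind:       "postgresql",
		Name:       pgName,
		UID:        pgUID,
		Controller: &controller,
	})
}
```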


Jan-M commented May 23, 2019

Both options need to be investigated with care; the only real delete we care about is the delete of the "postgresql" object. Other objects (e.g. the statefulset) can be deleted, and no interruption or impact is expected.
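
For the finalizer option (#450), the rough mechanics would be: the operator adds a finalizer to the postgresql object, the API server then only marks the object with a deletionTimestamp on delete, and the object stays visible until the operator has finished cleaning up its child resources and removes the finalizer. A minimal sketch, with a hypothetical finalizer name:

```go
package finalizers

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// postgresFinalizer is a hypothetical finalizer name for the sketch.
const postgresFinalizer = "postgres-operator.acid.zalan.do/cleanup"

// addFinalizer registers the finalizer on the CR's metadata if missing.
func addFinalizer(meta *metav1.ObjectMeta) {
	for _, f := range meta.Finalizers {
		if f == postgresFinalizer {
			return // already present
		}
	}
	meta.Finalizers = append(meta.Finalizers, postgresFinalizer)
}

// removeFinalizer is called after cleanup succeeds, letting the API server
// finally remove the postgresql object.
func removeFinalizer(meta *metav1.ObjectMeta) {
	kept := meta.Finalizers[:0]
	for _, f := range meta.Finalizers {
		if f != postgresFinalizer {
			kept = append(kept, f)
		}
	}
	meta.Finalizers = kept
}
```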


davisford commented Aug 2, 2019

> Both options need to be investigated with care; the only real delete we care about is the delete of the "postgresql" object. Other objects (e.g. the statefulset) can be deleted, and no interruption or impact is expected.

@Jan-M this is what I'm seeing. I have Terraform scripts that build and tear down the whole cluster, but the secondary read replica we spawn never gets removed even though the operator and the postgresql object are removed.

$ kc get postgresqls.acid.zalan.do -A
No resources found.
$ kc get pods -A
NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE
default       foo-cluster-1                   1/1     Running   0          19h
kube-system   coredns-5c98db65d4-fm6c5           1/1     Running   0          19h
kube-system   coredns-5c98db65d4-lptdx           1/1     Running   0          19h
kube-system   etcd-minikube                      1/1     Running   0          19h
kube-system   kube-addon-manager-minikube        1/1     Running   0          19h
kube-system   kube-apiserver-minikube            1/1     Running   0          19h
kube-system   kube-controller-manager-minikube   1/1     Running   0          19h
kube-system   kube-proxy-xdg4c                   1/1     Running   0          19h
kube-system   kube-scheduler-minikube            1/1     Running   0          19h
kube-system   storage-provisioner                1/1     Running   0          19h

The foo-cluster-1 pod is left behind; it is a PG db node that was part of the original cluster.

Other resources are also left behind: secrets, services, endpoints, persistent volumes + claims.

What is the best way to handle this?

FYI, the logs for that orphaned PG cluster pod just show Patroni spewing this error over and over:

2019-08-02 14:40:12,125 ERROR: watch
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 428, in watch
    _request_timeout=(1, timeout + 1)):
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/watch/watch.py", line 115, in stream
    resp = func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 50, in wrapper
    return getattr(self._api, func)(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/apis/core_v1_api.py", line 12528, in list_namespaced_endpoints
    (data) = self.list_namespaced_endpoints_with_http_info(namespace, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/apis/core_v1_api.py", line 12630, in list_namespaced_endpoints_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 335, in call_api
    _preload_content, _request_timeout)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 148, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 371, in request
    headers=headers)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/rest.py", line 250, in GET
    query_params=query_params)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/rest.py", line 240, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (401)
Reason: Unauthorized
HTTP response headers: HTTPHeaderDict({'Content-Type': 'application/json', 'Date': 'Fri, 02 Aug 2019 14:40:12 GMT', 'Content-Length': '129'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}\n

It looks like the service account secret it had been using was deleted, which causes it to go into a continuous failed state:

Events:
  Type     Reason       Age                 From               Message
  ----     ------       ----                ----               -------
  Warning  FailedMount  65s (x20 over 25m)  kubelet, minikube  MountVolume.SetUp failed for volume "postgres-operator-token-5v5x2" : secret "postgres-operator-token-5v5x2" not found

EDIT -- I understand why the PV/PVC aren't deleted; that's undesirable for a StatefulSet. Also noted while reading the k8s docs on StatefulSets:

> StatefulSets do not provide any guarantees on the termination of pods when a StatefulSet is deleted. To achieve ordered and graceful termination of the pods in the StatefulSet, it is possible to scale the StatefulSet down to 0 prior to deletion.

This may be why the -1 cluster pod is orphaned. Might it be possible for the operator to scale the set down to 0 prior to deletion?
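
For illustration, a scale-then-delete sequence with client-go might look roughly like the sketch below. The function name and polling loop are assumptions, using the context-free client-go signatures of that era; this is not the operator's implementation:

```go
package cleanup

import (
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// scaleDownThenDelete scales the StatefulSet to zero replicas, waits for its
// pods to drain, and only then deletes the StatefulSet itself.
func scaleDownThenDelete(client kubernetes.Interface, namespace, name string) error {
	sts, err := client.AppsV1().StatefulSets(namespace).Get(name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	zero := int32(0)
	sts.Spec.Replicas = &zero
	if _, err := client.AppsV1().StatefulSets(namespace).Update(sts); err != nil {
		return err
	}
	// Poll until the controller reports no remaining replicas (a real
	// implementation would watch events instead of polling).
	for i := 0; i < 60; i++ {
		sts, err = client.AppsV1().StatefulSets(namespace).Get(name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if sts.Status.Replicas == 0 {
			return client.AppsV1().StatefulSets(namespace).Delete(name, &metav1.DeleteOptions{})
		}
		time.Sleep(2 * time.Second)
	}
	return fmt.Errorf("timed out waiting for %s/%s to scale down", namespace, name)
}
```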

@davisford

@Jan-M looking at the code, it appears the operator just attempts to delete the StatefulSet, as opposed to the recommended approach of scaling it down to zero prior to deletion.


Jan-M commented Aug 5, 2019

But it is followed up by deletePods, if I see this correctly.

But you are right, we will look into the relevant part of the docs on how to delete a StatefulSet. I did not know that the delete may not delete the pods.

Maybe I got mixed up here, where it is mentioned that kubectl scales down too:
https://kubernetes.io/docs/tasks/run-application/delete-stateful-set/
