Orphaned pods fail to get cleaned up #38498

Closed
sebbonnet opened this issue Dec 9, 2016 · 59 comments
Labels
area/os/coreos kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage.

Comments

@sebbonnet

sebbonnet commented Dec 9, 2016

Kubernetes version

Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.6", GitCommit:"e569a27d02001e343cb68086bc06d47804f62af6", GitTreeState:"clean", BuildDate:"2016-11-12T05:16:27Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Ubuntu 16.04.1 LTS
  • Kernel (e.g. uname -a): Linux 4.4.0-53-generic #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

What happened:
syslogs are getting spammed every 2 seconds with these kubelet errors:

Dec  9 13:14:02 ip-10-50-242-179 start-kubelet.sh[31129]: E1209 13:14:02.300355   31129 kubelet_volumes.go:159] Orphaned pod "ff614192-bcc4-11e6-a20e-0a591a8e83d7" found, but error open /var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7/volumes: no such file or directory occured during reading volume dir from disk
Dec  9 13:14:02 ip-10-50-242-179 start-kubelet.sh[31129]: E1209 13:14:02.300373   31129 kubelet_getters.go:249] Could not read directory /var/lib/kubelet/pods/ff769116-bcf4-11e6-a20e-0a591a8e83d7/volumes: open /var/lib/kubelet/pods/ff769116-bcf4-11e6-a20e-0a591a8e83d7/volumes: no such file or directory

We get the above 2 log entries for each of the non-running pods (2150) every 2 seconds, so our logs grow into the gigabytes pretty quickly.
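For reference, a rough way to count how many distinct pod UIDs the kubelet is complaining about (a sketch; the syslog path is an assumption and may differ per distro):

grep -o 'Orphaned pod "[0-9a-f-]*"' /var/log/syslog | sort -u | wc -l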

There are 2160 pods in /var/lib/kubelet/pods/

~# ls /var/lib/kubelet/pods/ | wc -l
2160

But only 10 are running and attached to volumes

~# df -h | grep kubelet
/dev/xvdf                  256G  232M  256G   1% /var/lib/kubelet
tmpfs                      7.4G  8.0K  7.4G   1% /var/lib/kubelet/pods/5b884f1f-bbcd-11e6-a20e-0a591a8e83d7/volumes/kubernetes.io~secret/secrets
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/5b884f1f-bbcd-11e6-a20e-0a591a8e83d7/volumes/kubernetes.io~secret/default-token-lfu24
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/15302286-bbaa-11e6-a20e-0a591a8e83d7/volumes/kubernetes.io~secret/default-token-m0h9s
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/b0395433-a546-11e6-9670-0a591a8e83d7/volumes/kubernetes.io~secret/default-token-n79fe
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/1198c11a-bd25-11e6-a20e-0a591a8e83d7/volumes/kubernetes.io~secret/default-token-np531
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/473d7d51-bd25-11e6-a20e-0a591a8e83d7/volumes/kubernetes.io~secret/default-token-smuz3
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/e17b1a95-bd36-11e6-a20e-0a591a8e83d7/volumes/kubernetes.io~secret/default-token-1xs9g
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/2a36441b-bd57-11e6-a20e-0a591a8e83d7/volumes/kubernetes.io~secret/default-token-qbw68
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/cf6c04f4-bd64-11e6-a20e-0a591a8e83d7/volumes/kubernetes.io~secret/default-token-n79fe
tmpfs                      7.4G  8.0K  7.4G   1% /var/lib/kubelet/pods/24130c15-bdf5-11e6-98c0-0615e1fbbfc7/volumes/kubernetes.io~secret/secrets
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/24130c15-bdf5-11e6-98c0-0615e1fbbfc7/volumes/kubernetes.io~secret/default-token-9ksrm
tmpfs                      7.4G   12K  7.4G   1% /var/lib/kubelet/pods/a271290c-bdf6-11e6-98c0-0615e1fbbfc7/volumes/kubernetes.io~secret/default-token-n79fe
@sebbonnet
Author

I've attached about a day's worth (about 1 GB) of kubelet logs taken from one of our instances.
These were tarred, bzip2-compressed, and split into 3 chunks. The .zip extension is simply to get past GitHub's filename validation rules:
kubelet.log.tbz2.part-aa.zip
kubelet.log.tbz2.part-ab.zip
kubelet.log.tbz2.part-ac.zip

@gnufied
Member

gnufied commented Dec 12, 2016

For the non-running pods, are the pod directories completely empty? I.e. is /var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7 empty, or are there files or directories in it?
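A quick way to check this across all pod directories is something like the following (a minimal sketch, assuming the default /var/lib/kubelet root):

# list pod directories that have no volumes/ subdirectory at all
for d in /var/lib/kubelet/pods/*/; do
  [ -d "${d}volumes" ] || echo "missing volumes dir: ${d}"
done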

@calebamiles calebamiles added kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Dec 12, 2016
@calebamiles
Contributor

Could someone from @kubernetes/sig-node help triage this issue further? Thanks!

@yujuhong yujuhong added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Dec 12, 2016
@gnufied
Member

gnufied commented Dec 12, 2016

I can look into this if nobody is on it.

@dchen1107
Member

cc/ @dashpole since he is working on disk management improvements (#16563). @gnufied please feel free to send us the PR to fix the issue.

@sebbonnet
Author

@gnufied all non-running pods have only one zero-byte file:

~# find /var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7
/var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7
/var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7/containers
/var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7/containers/entitlements-manager
/var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7/containers/entitlements-manager/bcd8c82d

~# ls -l /var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7/containers/entitlements-manager/bcd8c82d
-rw-r--r-- 1 root root 0 Dec  7 21:35 /var/lib/kubelet/pods/ff614192-bcc4-11e6-a20e-0a591a8e83d7/containers/entitlements-manager/bcd8c82d

@gnufied
Member

gnufied commented Dec 13, 2016

@sebbonnet did you upgrade your k8s version from an older release? There should be a volumes directory inside each pod directory, so I am thinking this might have been missed in the upgrade path.

But AFAICT versions starting from 1.3 did create the volumes directory, so were those non-running pods created by an even older version?

@jsravn
Contributor

jsravn commented Dec 13, 2016

@gnufied These aren't old pods, afaik this particular cluster was on 1.3 for a long time (since around 1.3 release) and then upgraded to 1.4. The dates of the pods w/ missing volumes are recent, although it seems to happen in clumps:

drwxr-x---    3 root root   24 Dec  7 21:35 fb257507-bcc4-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 06:39 fb3e8e96-bd10-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 18:57 fb451bbc-bcae-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 20:09 fbe37d82-bcb8-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 22:25 fbfe1aae-bccb-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 18:14 fc002cd2-bca8-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 16:27 fc05019c-bc99-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 18:57 fc05910c-bcae-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 01:59 fc2d1b2d-bce9-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 06:39 fcaf6784-bd10-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 03:18 fd53dfe9-bcf4-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 21:35 fd78cb18-bcc4-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 06:03 fdae7c3f-bd0b-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 20:09 fdf278b5-bcb8-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 16:27 fe2e7092-bc99-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 18:14 fe587537-bca8-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 22:25 fe615976-bccb-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 06:03 fe63ba90-bd0b-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 01:59 fe6f7b44-bce9-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 23:43 fe8fe92e-bcd6-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 06:39 fed13706-bd10-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 18:57 fedf0059-bcae-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 23:43 ff4a4486-bcd6-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 21:35 ff614192-bcc4-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 03:18 ff769116-bcf4-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  8 06:03 ffb20ab8-bd0b-11e6-a20e-0a591a8e83d7
drwxr-x---    3 root root   24 Dec  7 22:18 ffbfff64-bcca-11e6-a20e-0a591a8e83d7

And none since then. Other nodes have different dates:

drwxr-x---  3 root root   24 Nov 28 09:01 326df813-b549-11e6-a74f-0a591a8e83d7
drwxr-x---  3 root root   24 Nov 28 09:01 33fa748c-b549-11e6-a74f-0a591a8e83d7
drwxr-x---  3 root root   24 Nov 28 09:01 354bb718-b549-11e6-a74f-0a591a8e83d7
drwxr-x---  3 root root   24 Nov 28 09:01 36d6dc84-b549-11e6-a74f-0a591a8e83d7

Since we only saw this in 1.4.6, I thought it may be related to the changes in #33203 - the offending log line comes from that change. Looking briefly, it seems wrong to me that it requires a volume directory to be present - what about partial recovery situations?

@Random-Liu
Member

Looking briefly, it seems wrong to me that it requires a volume directory to be present - what about partial recovery situations?

Agreed. Although it's weird that the volumes directory is not present, it should not block cleanup.

/cc @saad-ali @jingxu97

@Random-Liu
Member

Random-Liu commented Dec 16, 2016

@sebbonnet
Kubernetes 1.3.5 was released on Aug 11th: https://github.com/kubernetes/kubernetes/releases/tag/v1.3.5
The PR that added the volumes directory was cherry-picked on Aug 18th: #30824
Kubernetes 1.3.6 was released on Aug 26th: https://github.com/kubernetes/kubernetes/releases/tag/v1.3.6

If your previous Kubernetes version was older than 1.3.6, you will probably hit this problem after upgrading.

Anyway, we should not block cleanup because of the absence of the volumes directory. @jingxu97

@jingxu97
Contributor

I checked a little bit and found that the "volumes" directory under the pod directory has been in use for a long time; at least release 1.2 also used it. I am not sure why the "volumes" directories are all deleted in this case.

But I agree cleanup should not be blocked by this. I will work on a fix soon.

@Random-Liu
Member

Random-Liu commented Dec 16, 2016

I checked a little bit and found that the "volumes" directory under the pod directory has been in use for a long time; at least release 1.2 also used it.

Yeah, you are right. The PR I pasted above only moved the names around. And thanks for the fix! :)

@sebbonnet
Author

@Random-Liu @jingxu97 thanks for looking into this. For the record our previous version was 1.3.8

@kabakaev

I experience the same issue when a k8s node is rebooted. After the restart, the kubelet is unable to clean up the crashed Docker containers. The k8s version is 1.4.

As a quick-and-dirty workaround, container cleanup will continue if the "volumes" directories are created manually:
tail -n 100 /var/log/kubernetes/kubelet.log | grep "found, but error open /var/lib/kubelet/pods" | perl -pe 's#^.*(/var/lib/kubelet/pods/.+/volumes).+$#$1#'|sort -u|xargs mkdir
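The same workaround as a more readable pipeline (a sketch; the kubelet log path and the default /var/lib/kubelet root are assumptions that may differ per setup):

# recreate the missing volumes directories that the kubelet is complaining about
tail -n 100 /var/log/kubernetes/kubelet.log \
  | grep "found, but error open /var/lib/kubelet/pods" \
  | grep -o '/var/lib/kubelet/pods/[^/]*/volumes' \
  | sort -u \
  | xargs -r mkdir -p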

k8s-github-robot pushed a commit that referenced this issue Dec 28, 2016
Automatic merge from submit-queue (batch tested with PRs 38909, 39213)

Add path exist check in getPodVolumePathListFromDisk

Add the path exist check in the function. If the path does not exist,
return empty list and nil error.

fix issue #38498
@yujuhong
Contributor

yujuhong commented Jan 3, 2017

@jingxu97 should your fix be cherry-picked to 1.4?

@dhawal55
Contributor

dhawal55 commented Jan 5, 2017

In our case, we have a new cluster (no upgrade) using v1.5.1, and we still see a lot of these errors:

Jan 05 04:41:15 ip-x-x-x-x kubelet-wrapper[2331]: E0105 04:41:15.266077    2331 kubelet_volumes.go:110] Orphaned pod "01cb497c-d221-11e6-98cc-02c67b45e563" found, but error <nil> occured during reading volume dir from disk
Jan 05 04:41:15 ip-x-x-x-x kubelet-wrapper[2331]: E0105 04:41:15.266166    2331 kubelet_volumes.go:110] Orphaned pod "022b28b0-d2d2-11e6-9038-0288d7377d03" found, but error <nil> occured during reading volume dir from disk

I'm able to reproduce it as follows:

  • create a pvc
  • create a pod that refers to the pvc
  • delete the pod

We are running the kubelet as an rkt (v1.20.0) container on CoreOS:

Environment="RKT_RUN_ARGS=--uuid-file-save=/var/run/kubelet-pod.uuid \
  --volume dns,kind=host,source=/etc/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \
  --volume rkt,kind=host,source=/opt/bin/host-rkt \
  --mount volume=rkt,target=/usr/bin/rkt \
  --volume var-lib-rkt,kind=host,source=/var/lib/rkt \
  --mount volume=var-lib-rkt,target=/var/lib/rkt \
  --volume stage,kind=host,source=/tmp \
  --mount volume=stage,target=/tmp \
  --volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log"
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/run/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --rkt-path=/usr/bin/rkt \
  --rkt-stage1-image=coreos.com/rkt/stage1-coreos \
  --kubeconfig=/etc/kubernetes/kubeconfig \
  --require-kubeconfig \
  --register-node=true \
  --allow-privileged=true \
  --config=/etc/kubernetes/manifests \
  --cluster-dns=25.0.0.10 \
  --cluster-domain=cluster.local \
  --cloud-provider=aws \
  --cadvisor-port=4194 \
  --image-gc-high-threshold=80 \
  --image-gc-low-threshold=70 \
  --kube-reserved=cpu=200m,memory=500Mi \
  --system-reserved=cpu=150m,memory=250Mi
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/run/kubelet-pod.uuid
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target

@jingxu97
Contributor

jingxu97 commented Jan 5, 2017

@dhawal55, could you please confirm how long you have been seeing this message? It is normal to see it for a little while. It would also be very helpful if you could share your kubelet log so we can take a look. Thanks!

@SleepyBrett

SleepyBrett commented Jan 5, 2017

@jingxu97 I'm a coworker of @dhawal55. We're seeing about 400k of these messages every hour in a 9-node cluster. They seem to repeat forever because the directories never get cleaned up.

Our exact message is:

E0105 22:06:53.020094    2315 kubelet_volumes.go:110] Orphaned pod "03af2851-d0ac-11e6-98cc-02c67b45e563" found, but error <nil> occured during reading volume dir from disk

The pod UID varies (there are many), but the message for a specific UID gets re-reported over and over.

(screenshot: error count over time, 2017-01-05 2:16 PM)

Each bar is 3 hours...

@jingxu97
Contributor

jingxu97 commented Jan 5, 2017

@SleepyBrett @dhawal55, we tried the same steps you mentioned above, but could not reproduce the error. Could you share your kubelet log so that we can check why the volume directories are not getting cleaned up? Thanks!

@SleepyBrett

SleepyBrett commented Jan 5, 2017

I'm trying to gather the logs now, but FYI, after logging into the machine I saw this:

Failed Units: 42
  var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2d312e2db9.mount
  var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2daa2e2d22.mount
  var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2db82e2d30.mount
  var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2dc12e2d49.mount
  var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2dcf2e2d47.mount
  var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2dda2e2d52.mount
  var-lib-kubelet-pods-0137d42f\x2dd393\x2d11e6\x2dae49\x2d02575f5c41f7-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2de5aaae97\x2dd38d\x2d11e6\x2dae49\x2d02575f5c41f7.mount
  var-lib-kubelet-pods-024ab4d9\x2dd373\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-07b65c30\x2dd399\x2d11e6\x2dae49\x2d02575f5c41f7-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2de59d2808\x2dd38d\x2d11e6\x2dae49\x2d02575f5c41f7.mount
  var-lib-kubelet-pods-11bc149c\x2dd357\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-1832020d\x2dd394\x2d11e6\x2dae49\x2d02575f5c41f7-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d614fdad9\x2dd38c\x2d11e6\x2dae49\x2d02575f5c41f7.mount
  var-lib-kubelet-pods-212d6a28\x2dd33b\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-23ecb8ce\x2dd38e\x2d11e6\x2dae49\x2d02575f5c41f7-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2de59d2808\x2dd38d\x2d11e6\x2dae49\x2d02575f5c41f7.mount
  var-lib-kubelet-pods-2f509e87\x2dd37e\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-3ec1f209\x2dd362\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-4e333c7f\x2dd346\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-5c5690d9\x2dd389\x2d11e6\x2dbcc3\x2d0288d7377d03-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-5da48138\x2dd32a\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-6bfd9cac\x2dd36d\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-746099a1\x2dd399\x2d11e6\x2dae49\x2d02575f5c41f7-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d35417245\x2dd399\x2d11e6\x2dae49\x2d02575f5c41f7.mount
  var-lib-kubelet-pods-7b391804\x2dd351\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-7c93f28c\x2dd390\x2d11e6\x2dae49\x2d02575f5c41f7-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d7c926248\x2dd390\x2d11e6\x2dae49\x2d02575f5c41f7.mount
  var-lib-kubelet-pods-895e126f\x2dd394\x2d11e6\x2dbcc3\x2d0288d7377d03-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-8aaa786c\x2dd335\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-912dcd7a\x2dd390\x2d11e6\x2dae49\x2d02575f5c41f7-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2de59d2808\x2dd38d\x2d11e6\x2dae49\x2d02575f5c41f7.mount
  var-lib-kubelet-pods-98cdb61d\x2dd378\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-9a1c07b4\x2dd319\x2d11e6\x2d98cc\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-a83f0b24\x2dd35c\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-b7b0558c\x2dd340\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-c4a706ab\x2dd396\x2d11e6\x2dae49\x2d02575f5c41f7-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2de59d2808\x2dd38d\x2d11e6\x2dae49\x2d02575f5c41f7.mount
  var-lib-kubelet-pods-c5d3884d\x2dd383\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-c721737a\x2dd324\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-d544dd02\x2dd367\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-e4b629d7\x2dd34b\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-f2d9ca2a\x2dd38e\x2d11e6\x2dbcc3\x2d0288d7377d03-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-kubelet-pods-f4277caf\x2dd32f\x2d11e6\x2da64e\x2d02c67b45e563-volumes-kubernetes.io\x7eaws\x2debs-pvc\x2d6937252b\x2dd315\x2d11e6\x2d98cc\x2d02c67b45e563.mount
  var-lib-rkt-pods-run-772ff08e\x2d3567\x2d4cab\x2db9d0\x2d6f599245431f-stage1-rootfs-opt-stage2-hyperkube-rootfs-var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2d312e2db9.mount
  var-lib-rkt-pods-run-772ff08e\x2d3567\x2d4cab\x2db9d0\x2d6f599245431f-stage1-rootfs-opt-stage2-hyperkube-rootfs-var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2daa2e2d22.mount
  var-lib-rkt-pods-run-772ff08e\x2d3567\x2d4cab\x2db9d0\x2d6f599245431f-stage1-rootfs-opt-stage2-hyperkube-rootfs-var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2db82e2d30.mount
  var-lib-rkt-pods-run-772ff08e\x2d3567\x2d4cab\x2db9d0\x2d6f599245431f-stage1-rootfs-opt-stage2-hyperkube-rootfs-var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2dc12e2d49.mount
  var-lib-rkt-pods-run-772ff08e\x2d3567\x2d4cab\x2db9d0\x2d6f599245431f-stage1-rootfs-opt-stage2-hyperkube-rootfs-var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2dcf2e2d47.mount
  var-lib-rkt-pods-run-772ff08e\x2d3567\x2d4cab\x2db9d0\x2d6f599245431f-stage1-rootfs-opt-stage2-hyperkube-rootfs-var-lib-kubelet-plugins-kubernetes.io-aws\x2debs-mounts-aws-us\x2dwest\x2d2a-vol\x2dda2e2d52.mount

@jingxu97 we consider the logs a bit sensitive; can I pass them to you on the Kubernetes Slack server?

@gnufied
Member

gnufied commented Jan 5, 2017

@SleepyBrett And just to double-check: is the /var/lib/kubelet/pods/<pod-id>/volumes directory non-existent for you as well, as it was for the original reporter? Or if it exists, is it empty or does it contain some files and folders?

For example, for an orphaned pod here is how it looks for me:

root@xaos:/var/lib/kubelet/pods/53ee3cc8-b021-11e6-8ecf-4ccc6a88c120# tree .
├── containers
│   └── nginx-with-vol
│       ├── 71a32652
│       └── 773bb46a
├── etc-hosts
├── plugins
└── volumes
    └── kubernetes.io~glusterfs
        └── pvc-cb87fdaf-b01a-11e6-8ecf-4ccc6a88c120

And this consistently produces the same warning you are seeing. But the original reporter had a case where the volumes directory was not present at all. Is that the case for you too?

@jingxu97 I think it may be worth changing the log level of Orphaned pod %q found, but error %v occured during reading volume dir from disk because:

a. The error is misleading in the case where the volume directory can be read but is non-empty, and hence we are not deleting it (to protect against data loss, etc.)

b. It seems this happens quite a bit and I am not sure printing it as an error is helping. :(

@SleepyBrett

SleepyBrett commented Jan 6, 2017

@gnufied

E0106 00:11:45.531653    1717 kubelet_volumes.go:110] Orphaned pod "3ec1f209-d362-11e6-a64e-02c67b45e563" found, but error <nil> occured during reading volume dir from disk
ip-172-24-243-43 pods # find ./3ec1f209-d362-11e6-a64e-02c67b45e563
./3ec1f209-d362-11e6-a64e-02c67b45e563
./3ec1f209-d362-11e6-a64e-02c67b45e563/volumes
./3ec1f209-d362-11e6-a64e-02c67b45e563/volumes/kubernetes.io~aws-ebs
./3ec1f209-d362-11e6-a64e-02c67b45e563/volumes/kubernetes.io~aws-ebs/pvc-6937252b-d315-11e6-98cc-02c67b45e563

@gnufied
Member

gnufied commented Jan 6, 2017

Okay, so orphaned pods may not get cleaned up properly for a variety of reasons; I think we have at least one ticket to handle such cases.

But what you are seeing is intended behavior. I have opened a simple PR that splits the message into two: #39503

@dhawal55
Contributor

Sorry, I was not active for a while.

@gnufied Can you link to the issue for dealing with orphaned pods? We are seeing AWS nodes misbehave where they fail to attach any volume and I'm trying to figure out if it could be related to these orphaned pods and the failed units on the node.

@chrigl

chrigl commented Mar 1, 2017

@jingxu97 Yes, I waited for some time. At some point I even left the orphans overnight, and they were still there the next morning.

@rootfs You are right. When I run the kubelet without rkt, the problem does not exist.
I think this needs to be followed up in coreos/bugs#1831.

Thank you guys for your input!

For the record, the "un-containerized" test:

# mkdir -p /opt/bin && cp /proc/$(pgrep kubelet)/root/hyperkube /opt/bin/
# systemctl stop kubelet.service
# systemctl cat kubelet-no-rkt.service
# /etc/systemd/system/kubelet-no-rkt.service
[Service]
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
EnvironmentFile=/etc/environment
EnvironmentFile=/etc/environment-kubelet
EnvironmentFile=/etc/environment-os-servername
ExecStart=/opt/bin/hyperkube kubelet \
  --require-kubeconfig \
  --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
  --network-plugin-dir=/etc/kubernetes/cni/net.d \
  --register-node=true \
  --allow-privileged=true \
  --pod-manifest-path=/etc/kubernetes/manifests \
  --hostname-override=${OS_SERVER_NAME} \
  --tls-cert-file=/etc/kubernetes/ssl/worker.pem \
  --tls-private-key-file=/etc/kubernetes/ssl/worker-key.pem \
  --cluster_dns=10.3.0.53 \
  --cluster_domain=cluster.local

Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target

# systemctl start kubelet-no-rkt.service
# ps -ef | grep kubelet
root      2355     1  7 09:14 ?        00:00:01 /opt/bin/hyperkube kubelet --require-kubeconfig --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml --network-plugin-dir=/etc/kubernetes/cni/net.d --register-node=true --allow-privileged=true --pod-manifest-path=/etc/kubernetes/manifests --hostname-override=kubi2-kube-worker-re3egr3ck7tl-0-zysdhjsaglf51 --tls-cert-file=/etc/kubernetes/ssl/worker.pem --tls-private-key-file=/etc/kubernetes/ssl/worker-key.pem --cluster_dns=10.3.0.53 --cluster_domain=cluster.local

# docker ps
CONTAINER ID        IMAGE                                                COMMAND                  CREATED             STATUS              PORTS               NAMES
e526aaa5eb1c        nginx:latest                                         "nginx -g 'daemon off"   22 seconds ago      Up 22 seconds                           k8s_nginx-test.4e20122b_nginx-test-3245317857-hl2v6_default_cbabdb5c-fe5f-11e6-a0e1-fa163ed1b02a_b9af961c
1c60bbb2bad9        gcr.io/google_containers/pause-amd64:3.0             "/pause"                 24 seconds ago      Up 24 seconds                           k8s_POD.d8dbe16c_nginx-test-3245317857-hl2v6_default_cbabdb5c-fe5f-11e6-a0e1-fa163ed1b02a_e81cdc03

# mount | grep cbabdb5c-fe5f-11e6-a0e1-fa1
tmpfs on /var/lib/kubelet/pods/cbabdb5c-fe5f-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/rkt/pods/run/dfede8ae-a2e4-411f-a2cc-729020b60b5c/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/cbabdb5c-fe5f-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)

# systemctl stop kubelet-no-rkt.service
# sleep 10
# systemctl start kubelet-no-rkt.service
# mount | grep cbabdb5c-fe5f-11e6-a0e1-fa1
tmpfs on /var/lib/kubelet/pods/cbabdb5c-fe5f-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/rkt/pods/run/dfede8ae-a2e4-411f-a2cc-729020b60b5c/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/cbabdb5c-fe5f-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)

chris@home $kubectl delete pod/nginx-test-3245317857-hl2v6
pod "nginx-test-3245317857-hl2v6" deleted

# mount | grep cbabdb5c-fe5f-11e6-a0e1-fa1 || echo "Nothing"
Nothing

While it still exists when using rkt.

# systemctl stop kubelet-no-rkt.service
# ps -ef | grep kubelet
root     12914  6882  0 09:27 pts/0    00:00:00 grep --colour=auto kubelet
# systemctl start kubelet.service
# ps -ef | grep kubelet
root     13021     1 99 09:27 ?        00:00:02 /usr/bin/rkt run --uuid-file-save=/var/lib/coreos/kubelet-wrapper.uuid --trust-keys-from-https --volume etc-kubernetes,kind=host,source=/etc/kubernetes,readOnly=false --volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true --volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume var-lib-docker,kind=host,source=/var/lib/docker,readOnly=false --volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,readOnly=false --volume os-release,kind=host,source=/usr/lib/os-release,readOnly=true --volume run,kind=host,source=/run,readOnly=false --mount volume=etc-kubernetes,target=/etc/kubernetes --mount volume=etc-ssl-certs,target=/etc/ssl/certs --mount volume=usr-share-certs,target=/usr/share/ca-certificates --mount volume=var-lib-docker,target=/var/lib/docker --mount volume=var-lib-kubelet,target=/var/lib/kubelet --mount volume=os-release,target=/etc/os-release --mount volume=run,target=/run --stage1-from-dir=stage1-fly.aci quay.io/coreos/hyperkube:v1.5.2_coreos.2 --exec=/kubelet -- --require-kubeconfig --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml --network-plugin-dir=/etc/kubernetes/cni/net.d --register-node=true --allow-privileged=true --pod-manifest-path=/etc/kubernetes/manifests --hostname-override=kubi2-kube-worker-re3egr3ck7tl-0-zysdhjsaglf51 --tls-cert-file=/etc/kubernetes/ssl/worker.pem --tls-private-key-file=/etc/kubernetes/ssl/worker-key.pem --cluster_dns=10.3.0.53 --cluster_domain=cluster.local
root     13066  6882  0 09:27 pts/0    00:00:00 grep --colour=auto kubelet


# systemctl cat kubelet.service
# /etc/systemd/system/kubelet.service
[Service]
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
EnvironmentFile=/etc/environment
EnvironmentFile=/etc/environment-kubelet
EnvironmentFile=/etc/environment-os-servername
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/lib/coreos/kubelet-wrapper.uuid"
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/lib/coreos/kubelet-wrapper.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --require-kubeconfig \
  --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
  --network-plugin-dir=/etc/kubernetes/cni/net.d \
  --register-node=true \
  --allow-privileged=true \
  --pod-manifest-path=/etc/kubernetes/manifests \
  --hostname-override=${OS_SERVER_NAME} \
  --tls-cert-file=/etc/kubernetes/ssl/worker.pem \
  --tls-private-key-file=/etc/kubernetes/ssl/worker-key.pem \
  --cluster_dns=10.3.0.53 \
  --cluster_domain=cluster.local
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/lib/coreos/kubelet-wrapper.uuid

Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target

# docker ps
CONTAINER ID        IMAGE                                                COMMAND                  CREATED             STATUS              PORTS               NAMES
ac7e9a5725fa        nginx:latest                                         "nginx -g 'daemon off"   1 seconds ago       Up 1 seconds                            k8s_nginx-test.4e20122b_nginx-test-3245317857-36hd9_default_8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a_c3e78ffd
8fe71b21e537        gcr.io/google_containers/pause-amd64:3.0             "/pause"                 3 seconds ago       Up 2 seconds                            k8s_POD.d8dbe16c_nginx-test-3245317857-36hd9_default_8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a_92e3d2da

# mount | grep 8b5a54aa-fe61-11e6-a0e1-
tmpfs on /var/lib/rkt/pods/run/a3ce979c-cbed-44a4-b8f3-4c736fdbda3a/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/kubelet/pods/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)

# mount | grep 8b5a54aa-fe61-11e6-a0e1-
tmpfs on /var/lib/kubelet/pods/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/rkt/pods/run/f3d0b790-9ab1-463a-9e25-930d7dc50949/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/kubelet/pods/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)

chris@home $ kubectl delete pod/nginx-test-3245317857-36hd9
pod "nginx-test-3245317857-36hd9" deleted

# journalctl -u kubelet -f
Mar 01 09:31:36 kubi2-kube-worker-re3egr3ck7tl-0-zysdhjsaglf51.novalocal kubelet-wrapper[15818]: E0301 09:31:36.857857   15818 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/secret/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a-default-token-rv8md\" (\"8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a\")" failed. No retries permitted until 2017-03-01 09:31:40.857818576 +0000 UTC (durationBeforeRetry 4s). Error: UnmountVolume.TearDown failed for volume "kubernetes.io/secret/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a-default-token-rv8md" (volume.spec.Name: "default-token-rv8md") pod "8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a" (UID: "8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a") with: rename /var/lib/kubelet/pods/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md /var/lib/kubelet/pods/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/wrapped_default-token-rv8md.deleting~760049944: device or resource busy

# mount | grep 8b5a54aa-fe61-11e6-a0e1-
tmpfs on /var/lib/kubelet/pods/8b5a54aa-fe61-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)

@lucab
Copy link
Contributor

lucab commented Mar 1, 2017

# systemctl stop kubelet.service

# mount | grep cbabdb5c-fe5f-11e6-a0e1-fa1
tmpfs on /var/lib/kubelet/pods/cbabdb5c-fe5f-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/rkt/pods/run/dfede8ae-a2e4-411f-a2cc-729020b60b5c/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/cbabdb5c-fe5f-11e6-a0e1-fa163ed1b02a/volumes/kubernetes.io~secret/default-token-rv8md type tmpfs (rw,relatime,seclabel)

Several things are wrong here:

  1. There is an alive/non-GC'd kubelet rkt pod even though your log shows a stop command. It would be interesting to double-check whether it really exited and what the logs say.
  2. The kubelet is leaking a mount, which it should be cleaning up before exiting.

@lucab
Contributor

lucab commented Mar 1, 2017

@rootfs how is the kubelet run under docker in openshift? In particular, what are the options on the /var/lib/kubelet volume?

My current understanding/guess is that the following steps are occurring here:

  1. kubelet performs a mount
  2. mount is back-propagated to host
  3. kubelet exits without cleaning this mount

At this point the host environment is tainted as there is a leaked mount-point which confuses the kubelet.

@chrigl

chrigl commented Mar 1, 2017

@lucab The rkt mount remains because the rkt pod was not deleted, only stopped. There were no rkt processes running.

However, rkt seems not to clean up on stop, only on rm. See below.

According to https://github.com/coreos/coreos-kubernetes/blob/master/Documentation/kubelet-wrapper.md, the cleanup should happen on the next start, not on stop.

# rkt list
UUID            APP             IMAGE NAME                                      STATE   CREATED         STARTED         NETWORKS
7fea9388        flannel         quay.io/coreos/flannel:v0.6.2                   running 2 hours ago     2 hours ago
88583037        flannel         quay.io/coreos/flannel:v0.6.2                   exited  2 hours ago     2 hours ago
df41213d        hyperkube       quay.io/coreos/hyperkube:v1.5.3_coreos.0        running 1 hour ago      1 hour ago
# mount | grep df41213d | wc -l
40
# systemctl stop kubelet.service
# rkt list
UUID            APP             IMAGE NAME                                      STATE   CREATED         STARTED         NETWORKS
7fea9388        flannel         quay.io/coreos/flannel:v0.6.2                   running 2 hours ago     2 hours ago
88583037        flannel         quay.io/coreos/flannel:v0.6.2                   exited  2 hours ago     2 hours ago
df41213d        hyperkube       quay.io/coreos/hyperkube:v1.5.3_coreos.0        exited  2 hours ago     2 hours ago
# ps -ef | grep kubelet
root     17029 15006  0 12:31 pts/0    00:00:00 grep --colour=auto kubelet
# ps -ef | grep hyperkube
root      1806  1791  1 10:31 ?        00:01:52 /hyperkube proxy --master=https://10.0.0.20 --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml --proxy-mode=iptables
root     17101 15006  0 12:31 pts/0    00:00:00 grep --colour=auto hyperkube
# mount | grep df41213d | wc -l
40
# rkt rm df41213d
"df41213d-f07f-4b60-af8e-859a096bdb9a"
# mount | grep df41213d | wc -l
0

Where, when "the node mount" is unmounted by the no-rkt-kubelet, the rtk-mount goes away as well. Because of this, the no-rkt-kubelet works, even with one staled rkt-mount.

# mount | grep 39733284-fe7c-11e6-
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/rkt/pods/run/e702024b-8f4e-4cbc-9c69-1f17dc1f3194/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
# umount /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5
# mount | grep 39733284-fe7c-11e6-
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)

Anyway. One more example:

# systemctl start kubelet.service
# docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS               NAMES
7bd6148ab3bf        nginx:latest                               "nginx -g 'daemon off"   11 seconds ago      Up 10 seconds                           k8s_nginx-test.a0d511bb_nginx-test-3245317857-gwrxg_default_39733284-fe7c-11e6-8c85-fa163e8dfc03_5587593e

# mount | grep 39733284-fe7c-11e6-
tmpfs on /var/lib/rkt/pods/run/aec0c8f1-6ad4-408e-829a-6ba8cdcd199d/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)

# systemctl restart kubelet

chris@home $ kubectl run --image=nginx:latest --replicas=5 nginx-test

# mount | grep 39733284-fe7c-11e6-
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
# mount | grep 39733284-fe7c-11e6-
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/rkt/pods/run/467215ed-975d-4d84-a46c-6ccc172569d5/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)

# systemctl restart kubelet
# mount | grep 39733284-fe7c-11e6-
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/rkt/pods/run/3a666f31-d349-4438-a0f3-89410f439905/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)
tmpfs on /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 type tmpfs (rw,relatime,seclabel)

 # rkt list
UUID            APP             IMAGE NAME                                      STATE   CREATED         STARTED         NETWORKS
3a666f31        hyperkube       quay.io/coreos/hyperkube:v1.5.3_coreos.0        running 2 minutes ago   2 minutes ago
7fea9388        flannel         quay.io/coreos/flannel:v0.6.2                   running 2 hours ago     2 hours ago
88583037        flannel         quay.io/coreos/flannel:v0.6.2                   exited  2 hours ago     2 hours ago

And one more round, this time watching the kubelet's own mounts via pgrep. As you can see, the rkt pod only ever knows about at most one mount, whereas the number of mounts on the node keeps growing.

When the rkt-based hyperkube comes up, /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 is not present in its local mounts, so it mounts it again, which results in yet another mount on the node.

# sleep 3 && systemctl restart kubelet &  while true; do sleep 1 && pgrep kubelet && grep "39733284-fe7c-11e6-8c85-fa163e8dfc03" /proc/$(pgrep kubelet)/mounts; done
26330
tmpfs /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 tmpfs rw,seclabel,relatime 0 0
26330
tmpfs /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 tmpfs rw,seclabel,relatime 0 0
26330
tmpfs /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 tmpfs rw,seclabel,relatime 0 0
31749
31749
31749
31749
31749
31749
31749
tmpfs /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 tmpfs rw,seclabel,relatime 0 0
31749
tmpfs /var/lib/kubelet/pods/39733284-fe7c-11e6-8c85-fa163e8dfc03/volumes/kubernetes.io~secret/default-token-35lx5 tmpfs rw,seclabel,relatime 0 0
31749

@rootfs
Contributor

rootfs commented Mar 1, 2017

@lucab kubelet container uses ns_mounter

@rootfs
Contributor

rootfs commented Mar 1, 2017

cc @lpabon @philips for CoreOS related questions.

@calebamiles calebamiles modified the milestone: v1.6 Mar 8, 2017
@dchen1107
Member

cc/ @matchstick Can you triage this one for sig-storage? Is the bug targeted for the 1.6 release?

@jingxu97
Contributor

I think this long thread contains at least two different issues:

  1. CleanupOrphanedPod fails if there is no "volumes" directory under the pod directory. This is fixed by PR "Add path exist check in getPodVolumePathListFromDisk" #38909, which has also been cherry-picked to 1.4.
  2. When running the kubelet in rkt fly, a mountpoint may not be correctly recognized as tmpfs, so the unmount is not triggered. This needs some help from CoreOS.

@dchen1107
Member

@calebamiles I am reassigning this one to you. Please triage it to see if it is still targeted for 1.6. Thanks!

@pwittrock pwittrock removed this from the v1.6 milestone Mar 11, 2017
@victorgp

We are having the same issue, and I found that the volumes directories of the problematic pods contain tons of directories:

(I'm running version 1.5.4)

Example:

# find /var/lib/kubelet/pods/bbbe338d-dde8-11e6-8992-00163e492e7c/volumes/kubernetes.io~secret | wc -l
43616

And counting all of them, in a /var/lib/kubelet with only 39 pods, this is the total number of files I get:

# find /var/lib/kubelet/pods | wc -l
680436

That is why the kubelet cannot get the volumes and shows this message:

Orphaned pod "b936774d-1016-11e7-bf2c-00163eda98cc" found, but error <nil> occured during reading volume dir from disk

It is probably timing out due to the huge number of volume directories.
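A quick way to spot which pod directories have accumulated the most entries (a minimal sketch, assuming the default kubelet root):

# rank pod directories by the number of entries under their volumes/ tree
for d in /var/lib/kubelet/pods/*/; do
  printf '%s %s\n' "$(find "${d}volumes" 2>/dev/null | wc -l)" "$d"
done | sort -rn | head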

@gnufied
Member

gnufied commented Mar 27, 2017

@victorgp Can you confirm for us whether the directories that are not getting cleaned up are tmpfs mount points or just plain directories?

@lucab
Contributor

lucab commented Mar 27, 2017

I've just realized this may just be a missing recursive=true flag in the rkt volume options. However, I didn't check the original invocation, and it may be better to double-check that it has no unwanted side effects from the kubelet's point of view.

@chrigl

chrigl commented Mar 28, 2017

@lucab I can confirm this. With

--- /usr/lib/coreos/kubelet-wrapper     2017-03-14 20:40:23.000000000 +0000
+++ kubelet-wrapper     2017-03-28 08:20:39.000000000 +0000
@@ -66,7 +66,7 @@
        --volume etc-ssl-certs,kind=host,source=/etc/ssl/certs,readOnly=true \
        --volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true \
        --volume var-lib-docker,kind=host,source=/var/lib/docker,readOnly=false \
-       --volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,readOnly=false \
+       --volume var-lib-kubelet,kind=host,source=/var/lib/kubelet,readOnly=false,recursive=true \
        --volume os-release,kind=host,source=/usr/lib/os-release,readOnly=true \
        --volume run,kind=host,source=/run,readOnly=false \
        --mount volume=etc-kubernetes,target=/etc/kubernetes \

the mounts do not pile up, and thus deletion of pods works even after a restart of the kubelet, and no orphans are "created".

Of course, I am not someone who is able to test for side-effects.

@lucab
Contributor

lucab commented Mar 29, 2017

@jingxu97 I think you can close this, as the first half of it should be fixed by #38909 and the other half is captured at coreos/bugs#1831 (comment).

lucab added a commit to lucab/coreos-overlay that referenced this issue Apr 5, 2017
… mount

So far `/var/lib/kubelet` was mounted as an implicit non-recursive mount.
This changes the wrapper to an explicit recursive mount.

As shown in kubernetes/kubernetes#38498 (comment),
current non-recursive behavior seems to confuse the kubelet which
is incapable of cleaning up resources for orphaned pods, as the
existing mountpoints for them are not available inside kubelet
chroot.
With `recursive=true`, those mounts are made available in the
chroot and can be unmounted on the host-side from kubelet chroot
via shared back-propagation.

Fixes coreos/bugs#1831
@jingxu97
Contributor

Closing the issue; please reopen it if the issue is not resolved in your case.

euank pushed a commit to euank/coreos-overlay that referenced this issue May 30, 2017
euank pushed a commit to euank/coreos-overlay that referenced this issue May 30, 2017
ChrisMcKenzie pushed a commit to ChrisMcKenzie/coreos-overlay that referenced this issue Dec 9, 2017