This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

rkt-gc.service fail to GC some containers - "device or resource busy #1825

Closed
ghost opened this issue Feb 23, 2017 · 1 comment

Comments

@ghost

ghost commented Feb 23, 2017

Issue Report

Bug

Container Linux Version

NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1298.3.0
VERSION_ID=1298.3.0
BUILD_ID=2017-02-02-0148
PRETTY_NAME="Container Linux by CoreOS 1298.3.0 (Ladybug)"

(same behaviour in Stable)

Environment

AWS

Expected Behavior

rkt-gc.service doesn't fail.

Actual Behavior

Container Linux by CoreOS beta (1298.3.0)
Failed Units: 1
  rkt-gc.service
Feb 22 06:39:36 ip-10-150-4-32.ap-southeast-2.compute.internal systemd[1]: Started Garbage Collection for rkt.
Feb 22 18:39:51 ip-10-150-4-32.ap-southeast-2.compute.internal systemd[1]: Starting Garbage Collection for rkt...
Feb 22 18:39:51 ip-10-150-4-32.ap-southeast-2.compute.internal rkt[24166]: Garbage collecting pod "0b02c2b2-ca99-49c0-8db5-fb3a84795f9b"
Feb 22 18:39:52 ip-10-150-4-32.ap-southeast-2.compute.internal rkt[24166]: Garbage collecting pod "13b47696-efba-45a3-a637-5d9e2c7d0e44"
Feb 22 18:39:52 ip-10-150-4-32.ap-southeast-2.compute.internal rkt[24166]: Garbage collecting pod "2ca7e110-f051-42e1-9000-07aba3129a22"
Feb 22 18:39:53 ip-10-150-4-32.ap-southeast-2.compute.internal rkt[24166]: Garbage collecting pod "2d49e671-5ea5-43c6-9447-bbe0982a732b"
Feb 22 18:39:53 ip-10-150-4-32.ap-southeast-2.compute.internal rkt[24166]: Garbage collecting pod "4c3fe5d3-5c0f-4f7f-b50c-36197bb90bb3"
Feb 22 18:39:53 ip-10-150-4-32.ap-southeast-2.compute.internal rkt[24166]: Garbage collecting pod "5442c7ed-6b70-4323-a259-2936aab8a153"
Feb 22 18:39:54 ip-10-150-4-32.ap-southeast-2.compute.internal rkt[24166]: gc: unable to remove pod "5442c7ed-6b70-4323-a259-2936aab8a153": remove /var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs: device or resource busy
Feb 22 18:39:54 ip-10-150-4-32.ap-southeast-2.compute.internal systemd[1]: rkt-gc.service: Main process exited, code=exited, status=254/n/a
Feb 22 18:39:54 ip-10-150-4-32.ap-southeast-2.compute.internal systemd[1]: Failed to start Garbage Collection for rkt.
Feb 22 18:39:54 ip-10-150-4-32.ap-southeast-2.compute.internal systemd[1]: rkt-gc.service: Unit entered failed state.

The pod that fails to be removed is always the flannel #2 container, which is run as part of the flannel-docker-opts.service systemd unit.

Once the GC has failed once, it fails on every subsequent run and no longer deletes other rkt containers.
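
To pull the failing pod UUIDs out of the journal rather than reading it by eye, a small filter along these lines may help. It is only a sketch: the function name `failed_gc_pods` is made up for illustration, and it assumes the error line keeps the `gc: unable to remove pod "<uuid>"` shape shown in the log above.

```shell
#!/bin/sh
# failed_gc_pods (hypothetical helper): read log lines on stdin and print the
# UUID of every pod that rkt gc reported it was unable to remove, one per line,
# with duplicates collapsed.
failed_gc_pods() {
    # Keep only lines matching the error shape and print the quoted UUID.
    sed -n 's/.*gc: unable to remove pod "\([^"]*\)".*/\1/p' | sort -u
}
```

On an affected host this could be fed from the unit's journal, for example `journalctl -u rkt-gc.service --no-pager | failed_gc_pods`.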

Reproduction Steps

  1. Use flanneld.service
  2. Let the garbage collector kick off

Other Information

Once the garbage collector has tried to delete the pod, rkt list also returns an error.

core@ip-10-150-4-32 ~ $ rkt list
UUID		APP		IMAGE NAME					STATE		CREATED		STARTED		NETWORKS
455cacd4	flannel		quay.io/coreos/flannel:v0.6.2			running		1 day ago	1 day ago
6c403147	hyperkube	quay.io/coreos/hyperkube:v1.5.2_coreos.0	running		20 hours ago	20 hours ago
bf7e9a0b	hyperkube	quay.io/coreos/hyperkube:v1.5.2_coreos.0	exited garbage	1 day ago	1 day ago
e1e8f1b9	awscli		quay.io/coreos/awscli:master			exited garbage	1 day ago	1 day ago
ffd57c73	awscli		quay.io/coreos/awscli:master			exited garbage	1 day ago	1 day ago
list: 1 error(s) encountered when listing pods:
list: ----------------------------------------
list: Unable to read pod 5442c7ed-6b70-4323-a259-2936aab8a153 manifest:
  error reading pod manifest
list: ----------------------------------------
list: misc:
list:   rkt's appc version: 0.8.9
list:

I believe some mounts are not cleaned up properly:

$ mount | grep "/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs"
overlay on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_lxc_file_t:s0:c14,c535",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-072e9dc20265bde06c74957e931a488559585920671657bf93b62850f2539bda/rootfs,upperdir=/var/lib/rkt/pods/run/5442c7ed-6b70-4323-a259-2936aab8a153/overlay/deps-sha512-072e9dc20265bde06c74957e931a488559585920671657bf93b62850f2539bda/upper,workdir=/var/lib/rkt/pods/run/5442c7ed-6b70-4323-a259-2936aab8a153/overlay/deps-sha512-072e9dc20265bde06c74957e931a488559585920671657bf93b62850f2539bda/work)
overlay on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs type overlay (rw,relatime,context="system_u:object_r:svirt_lxc_file_t:s0:c14,c535",lowerdir=/var/lib/rkt/cas/tree/deps-sha512-694c67901159a6549a0b4ff9fd5c183375125ab3312e882ec155c645e3d23991/rootfs,upperdir=/var/lib/rkt/pods/run/5442c7ed-6b70-4323-a259-2936aab8a153/overlay/deps-sha512-694c67901159a6549a0b4ff9fd5c183375125ab3312e882ec155c645e3d23991/upper/flannel,workdir=/var/lib/rkt/pods/run/5442c7ed-6b70-4323-a259-2936aab8a153/overlay/deps-sha512-694c67901159a6549a0b4ff9fd5c183375125ab3312e882ec155c645e3d23991/work/flannel)
devtmpfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/dev type devtmpfs (rw,nosuid,seclabel,size=8201580k,nr_inodes=2050395,mode=755)
tmpfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
devpts on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
hugetlbfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/dev/hugepages type hugetlbfs (rw,relatime,seclabel)
mqueue on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/dev/mqueue type mqueue (rw,relatime,seclabel)
proc on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/proc type proc (rw,nosuid,nodev,noexec,relatime)
xenfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/proc/xen type xenfs (rw,relatime)
systemd-1 on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=35,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=13414)
sysfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
securityfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pstore on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime,seclabel)
selinuxfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/fs/selinux type selinuxfs (rw,relatime)
debugfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/sys/kernel/debug type debugfs (rw,relatime,seclabel)
tmpfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/tmp type tmpfs (rw,relatime,seclabel)
/dev/mapper/usr on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/usr/share/ca-certificates type ext4 (ro,relatime,seclabel,block_validity,delalloc,barrier,user_xattr,acl)
/dev/xvda9 on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/etc/hosts type ext4 (rw,relatime,seclabel,data=ordered)
tmpfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/etc/resolv.conf type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
/dev/xvda9 on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/etc/ssl/etcd type ext4 (rw,relatime,seclabel,data=ordered)
tmpfs on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/run/flannel type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
/dev/mapper/usr on /var/lib/rkt/pods/exited-garbage/bf7e9a0b-e09e-42d6-991a-b1ad5613501e/stage1/rootfs/opt/stage2/hyperkube/rootfs/var/lib/rkt/pods/exited-garbage/5442c7ed-6b70-4323-a259-2936aab8a153/stage1/rootfs/opt/stage2/flannel/rootfs/etc/ssl/certs type ext4 (ro,relatime,seclabel,block_validity,delalloc,barrier,
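
A rough way to list the leftover mounts for one pod, deepest path first, might look like the following. This is a sketch under stated assumptions, not something taken from rkt itself: `stale_mounts` is a made-up name, and the input is assumed to be in `/proc/mounts` format, where the mount point is the second whitespace-separated field.

```shell
#!/bin/sh
# stale_mounts UUID (hypothetical helper): read a mount table in /proc/mounts
# format on stdin and print every mount point whose path contains the given
# pod UUID. Output is in reverse byte order, so a nested mount is always
# printed before the mount it sits on.
stale_mounts() {
    uuid="$1"
    # Field 2 of /proc/mounts is the mount point; index() is a plain
    # substring match, so the UUID is not treated as a regular expression.
    awk -v uuid="$uuid" 'index($2, uuid) { print $2 }' | LC_ALL=C sort -r
}
```

For the pod above this would be run as `stale_mounts 5442c7ed-6b70-4323-a259-2936aab8a153 < /proc/mounts`. Each printed path could then be passed to `umount` as root before re-running `rkt gc`, but whether unmounting them is safe on a given host is an assumption that needs checking first.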
@ghost ghost changed the title rkt-gc.service fail to GC some containers rkt-gc.service fail to GC some containers - "device or resource busy Feb 23, 2017
@euank
Contributor

euank commented Apr 12, 2017

This should be fixed by rkt/rkt#3486

That's in rkt v1.22.0 and later (so in the current alpha and beta, but not stable).

I think there were also improvements to how flanneld containers were cleaned up.

Please re-open if updating to rkt >= 1.22.0 doesn't fix this.
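
To check whether a host already meets the rkt >= 1.22.0 threshold, a version comparison along these lines may be useful. It is a sketch only: `version_at_least` is a made-up helper, and it relies on `sort -V` (version sort), which is assumed to be available as it is in GNU coreutils.

```shell
#!/bin/sh
# version_at_least HAVE WANT (hypothetical helper): succeed when version
# string HAVE is greater than or equal to WANT under version ordering.
version_at_least() {
    # sort -V orders 1.9.0 before 1.22.0, unlike a plain string sort.
    # If the smaller of the two versions is WANT, then HAVE >= WANT.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

With the installed version in hand this would be called as, for example, `version_at_least 1.25.0 1.22.0 && echo "fix should be included"`. How the version string is obtained (for instance by parsing the output of `rkt version`) is left out here, since the exact output format is an assumption.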

@euank euank closed this as completed Apr 12, 2017