runC depends on devices cgroup to find mountpoints #798

Open
davidlt opened this issue May 2, 2016 · 6 comments

davidlt commented May 2, 2016

While testing rootless containers (not yet merged -- #774) on Scientific Linux CERN SLC release 6.7 (Carbon) (same as CentOS/RHEL) I hit an issue:

$ runc --root $PWD/tmp start test_container
mountpoint for devices not found

Kernel:

Linux YYY 2.6.32-573.22.1.el6.x86_64 #1 SMP Wed Mar 23 17:13:03 CET 2016 x86_64 x86_64 x86_64 GNU/Linux

The same setup worked out of the box on Fedora 24.

From @cyphar

It looks like you're missing the devices cgroup. While this would 
ordinarily be a show-stopper for regular containers (for security 
reasons), we don't need the devices cgroup with rootless 
containers! Unfortunately, the cgroup code uses the devices 
cgroup as the "mandatory cgroup" to do path lookup checks to 
figure out where the cgroup mountpoint is.

I also ran ./contrib/check-config.sh from Docker:

warning: /proc/config.gz does not exist, searching other paths for kernel config ...
info: reading kernel config from /boot/config-2.6.32-573.22.1.el6.x86_64 ...

Generally Necessary:
- cgroup hierarchy: single mountpoint! [/cgroup/plus]
   (see https://github.com/tianon/cgroupfs-mount)
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_DEVPTS_MULTIPLE_INSTANCES: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: missing
- CONFIG_KEYS: enabled
- CONFIG_MACVLAN: enabled (as module)
- CONFIG_VETH: enabled (as module)
- CONFIG_BRIDGE: enabled (as module)
- CONFIG_BRIDGE_NETFILTER: enabled
- CONFIG_NF_NAT_IPV4: missing
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: missing
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled

Optional Features:
- CONFIG_USER_NS: enabled
- CONFIG_SECCOMP: missing
- CONFIG_CGROUP_PIDS: missing
- CONFIG_MEMCG_KMEM: missing
- CONFIG_MEMCG_SWAP: missing
- CONFIG_MEMCG_SWAP_ENABLED: missing
- CONFIG_RESOURCE_COUNTERS: enabled
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_IOSCHED_CFQ: enabled
- CONFIG_CFQ_GROUP_IOSCHED: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: missing
- CONFIG_NET_CLS_CGROUP: enabled
- CONFIG_NETPRIO_CGROUP: enabled
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: enabled
- CONFIG_EXT3_FS: enabled (as module)
- CONFIG_EXT3_FS_XATTR: enabled
- CONFIG_EXT3_FS_POSIX_ACL: enabled
- CONFIG_EXT3_FS_SECURITY: enabled
- CONFIG_EXT4_FS: enabled (as module)
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
 - "overlay":
   - CONFIG_VXLAN: enabled (as module)
- Storage Drivers:
 - "aufs":
   - CONFIG_AUFS_FS: missing
 - "btrfs":
   - CONFIG_BTRFS_FS: enabled (as module)
 - "devicemapper":
   - CONFIG_BLK_DEV_DM: enabled (as module)
   - CONFIG_DM_THIN_PROVISIONING: enabled (as module)
 - "overlay":
   - CONFIG_OVERLAY_FS: missing
 - "zfs":
   - /dev/zfs: missing
   - zfs command: missing
   - zpool command: missing
@thaJeztah
Member

Kernel 2.6 is pretty old and (at least for Docker) no longer supported, so I'm not sure if runC still supports this.

@cyphar
Member

cyphar commented May 2, 2016

@thaJeztah While technically true, this kernel appears to have user namespaces enabled. So while normal runC probably won't work on such a kernel, it should be entirely possible to run rootless containers on such a setup. I asked @davidlt to open an issue because there appear to be several bugs in runC (reproducible on supported kernels) that have been exacerbated by his setup:

  • It appears as though runC can't handle single mountpoint cgroup hierarchies properly. This is something we should fix (and in fact we could probably use this fix to implement cgroupv2 handling because the semantics are nearly the same).
  • We shouldn't be using the devices cgroup as the "mandatory cgroup to check against" (especially in the rootless container setup).

The key part of the check config output is this:

Generally Necessary:
- cgroup hierarchy: single mountpoint! [/cgroup/plus]
   (see https://github.com/tianon/cgroupfs-mount)
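
The lookup failure described in the two points above can be illustrated with a minimal sketch. It is written in Python for brevity (runC itself is Go) and is not runC's actual code: the function name find_cgroup_mountpoint is hypothetical, and the sample line is the mountinfo entry reported for this system. Asking for the hierarchy that carries devices returns nothing on a single-mountpoint setup, while any subsystem that is actually attached resolves fine.

```python
from typing import Optional


def find_cgroup_mountpoint(mountinfo: str, subsystem: str) -> Optional[str]:
    """Return the mountpoint of the cgroup v1 hierarchy that has `subsystem`
    attached, or None if no mounted hierarchy carries it.

    Hypothetical helper for illustration only, not a runC function.
    """
    for line in mountinfo.splitlines():
        # Fields before " - " describe the mount; fields after it describe
        # the filesystem (type, source, super options).
        left, sep, right = line.partition(" - ")
        fs_fields = right.split()
        if not sep or len(fs_fields) < 3 or fs_fields[0] != "cgroup":
            continue
        # Super options such as "rw,blkio,memory" list the attached subsystems.
        if subsystem in fs_fields[2].split(","):
            return left.split()[4]  # fifth field is the mount point
    return None


# The single cgroup mount reported for the affected system.
SAMPLE = "29 21 0:18 / /cgroup/plus rw,relatime - cgroup cgroup rw,blkio,memory,cpuacct,cpu"

print(find_cgroup_mountpoint(SAMPLE, "devices"))  # None
print(find_cgroup_mountpoint(SAMPLE, "memory"))   # /cgroup/plus
```

A lookup keyed on one fixed subsystem fails as soon as that subsystem is not attached anywhere, even though the cgroup root itself is easy to find.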

cyphar self-assigned this May 2, 2016
@cyphar
Member

cyphar commented May 20, 2016

Here's some output from the system in question:

% uname -a
Linux lxplus074.cern.ch 2.6.32-573.22.1.el6.x86_64 #1 SMP Wed Mar 23 17:13:03 CET 2016 x86_64 x86_64 x86_64 GNU/Linux
% ./runc --root $PWD/tmp start test_container
mountpoint for devices not found
% cat /etc/redhat-release
Scientific Linux CERN SLC release 6.7 (Carbon)
% cat /proc/self/mountinfo | grep cgroup
29 21 0:18 / /cgroup/plus rw,relatime - cgroup cgroup rw,blkio,memory,cpuacct,cpu
% cat /proc/self/cgroup
1:blkio,memory,cpuacct,cpu:/users/davidlt
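
One possible way to avoid hard-coding devices is to start from whatever /proc/self/cgroup lists and match each subsystem against the mounted hierarchies. The following is a rough Python sketch under that assumption, not runC's implementation (which is Go). The name own_cgroup_dirs is hypothetical, and the sketch ignores bind mounts whose root field is not "/".

```python
import posixpath
from typing import Dict


def own_cgroup_dirs(mountinfo: str, self_cgroup: str) -> Dict[str, str]:
    """Map each subsystem in /proc/self/cgroup to the directory of the
    current process's cgroup. Subsystems with no mounted hierarchy are omitted.

    Hypothetical helper for illustration only.
    """
    # Collect (mountpoint, super options) for every cgroup v1 mount.
    mounts = []
    for line in mountinfo.splitlines():
        left, sep, right = line.partition(" - ")
        fs = right.split()
        if sep and len(fs) >= 3 and fs[0] == "cgroup":
            mounts.append((left.split()[4], set(fs[2].split(","))))

    dirs: Dict[str, str] = {}
    for line in self_cgroup.splitlines():
        # Line format: hierarchy-ID:comma-separated-subsystems:path
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        _, subsystems, path = parts
        for subsystem in filter(None, subsystems.split(",")):
            for mountpoint, options in mounts:
                if subsystem in options:
                    # normpath collapses the doubled slash and any trailing one.
                    dirs[subsystem] = posixpath.normpath(mountpoint + "/" + path)
                    break
    return dirs


# Values copied from the output above.
MOUNTINFO = "29 21 0:18 / /cgroup/plus rw,relatime - cgroup cgroup rw,blkio,memory,cpuacct,cpu"
SELF_CGROUP = "1:blkio,memory,cpuacct,cpu:/users/davidlt"

dirs = own_cgroup_dirs(MOUNTINFO, SELF_CGROUP)
print(dirs.get("devices"))  # None
print(dirs["memory"])       # /cgroup/plus/users/davidlt
```

With this approach a missing devices hierarchy is simply an absent key rather than a fatal error, and the caller can decide which subsystems are mandatory for its use case.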

@jpetazzo
Contributor

It looks like DevicesGroup.Set has a check to make it a no-op when running in a user namespace (https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/fs/devices.go#L28) but DevicesGroup.Apply doesn't (https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/fs/devices.go#L18). Is that an oversight, a work in progress, or am I just being confused?

@hqhq
Contributor

hqhq commented Nov 19, 2016

@jpetazzo I think it's intended. We shouldn't change devices cgroup settings inside a user namespace because the kernel won't allow it, but we should still join the cgroup for which the admin has already set a whitelist or blacklist that the container is expected to follow.
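
The split between joining and configuring can be modelled roughly as follows. This is a Python sketch of the behaviour discussed here, not the code in devices.go. The class DevicesGroupSketch and its attributes are hypothetical, it records actions in memory instead of writing cgroup files, and the user namespace check is a heuristic based on the contents of /proc/self/uid_map (the initial namespace maps the full UID range onto itself in a single entry).

```python
from typing import List, Tuple


def running_in_userns(uid_map: str) -> bool:
    """Heuristic: treat anything other than the single identity mapping
    "0 0 4294967295" as a non-initial user namespace."""
    entries = [line.split() for line in uid_map.splitlines() if line.strip()]
    return entries != [["0", "0", "4294967295"]]


class DevicesGroupSketch:
    """Hypothetical in-memory model of the Apply/Set split."""

    def __init__(self, in_userns: bool) -> None:
        self.in_userns = in_userns
        self.joined: List[Tuple[str, int]] = []
        self.rules_written: List[str] = []

    def apply(self, cgroup_dir: str, pid: int) -> None:
        # Joining always happens, so rules set by the admin still apply.
        self.joined.append((cgroup_dir, pid))

    def set(self, rules: List[str]) -> None:
        # Inside a user namespace the kernel rejects device rule changes,
        # so configuring becomes a no-op.
        if self.in_userns:
            return
        self.rules_written.extend(rules)


group = DevicesGroupSketch(in_userns=running_in_userns("0 1000 1\n"))
group.apply("/cgroup/devices/app", 1234)
group.set(["a *:* rwm"])
print(group.joined)         # [('/cgroup/devices/app', 1234)]
print(group.rules_written)  # []
```

The open question in this issue is what apply should do when there is no devices hierarchy to join at all, which this sketch does not address.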

@Nabarun

Nabarun commented Jan 12, 2017

I think I have a similar issue. @hqhq as per your suggestion, I am creating a new group. Can you tell me whether I need to add any limits to the newly created group?

[ appcontainer]$ cat /proc/self/mountinfo | grep cgroup
25 20 0:18 / /cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
26 20 0:19 / /cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
27 20 0:20 / /cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
28 20 0:21 / /cgroup/memory rw,relatime - cgroup cgroup rw,memory
29 20 0:22 / /cgroup/devices rw,relatime - cgroup cgroup rw,devices
30 20 0:23 / /cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
31 20 0:24 / /cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
32 20 0:25 / /cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
[ appcontainer]$ cat /proc/self/cgroup
8:blkio:/
7:net_cls:/
6:freezer:/
5:devices:/
4:memory:/
3:cpuacct:/
2:cpu:/
1:cpuset:/

stefanberger pushed a commit to stefanberger/runc that referenced this issue Sep 8, 2017
…rence

bundle.md: specify root reference the directory