Inconsistent container behavior with shiftfs #8490

Closed
1 of 6 tasks
techtonik opened this issue Feb 20, 2021 · 16 comments
Labels
Incomplete Waiting on more information from reporter

Comments

@techtonik
Contributor

Required information

  • Distribution: Ubuntu
  • Distribution version: 20.10
  • The output of "lxc info" or if that fails:
    driver: lxc | qemu
    driver_version: 4.0.6 | 5.2.0
    firewall: nftables
    kernel_features:
      netnsid_getifaddrs: "true"
      seccomp_listener: "true"
      seccomp_listener_continue: "true"
      shiftfs: "true"
      uevent_injection: "true"
      unpriv_fscaps: "true"
    kernel_version: 5.8.0-43-generic
    lxc_features:
      cgroup2: "true"
      devpts_fd: "true"
      mount_injection_file: "true"
      network_gateway_device_route: "true"
      network_ipvlan: "true"
      network_l2proxy: "true"
      network_phys_macvlan_mtu: "true"
      network_veth_router: "true"
      pidfd: "true"
      seccomp_allow_deny_syntax: "true"
      seccomp_notify: "true"
      seccomp_proxy_send_notify_fd: "true"
    os_name: Ubuntu
    os_version: "20.10"
    project: default
    server: lxd
    server_clustered: false
    server_version: "4.11"
    storage: zfs
    storage_version: 0.8.4-1ubuntu11.1

Issue description

LXD behaves inconsistently when launching two containers, repodraw and u3. I would expect the operations below to be idempotent and independent of the container name, but they are not. Graphviz fails in the first container and works as expected in the second.

$ lxc rm -f repodraw && lxc launch images:ubuntu/20.10 repodraw && lxc exec repodraw -- apt-get -y -qq install graphviz && lxc exec repodraw -- dot -v
...
dot - graphviz version 2.43.0 (0)
There is no layout engine support for "dot"
Perhaps "dot -c" needs to be run (with installer's privileges) to register the plugins?
$ lxc rm -f u3 && lxc launch images:ubuntu/20.10 u3 && lxc exec u3 -- apt-get -y -qq install graphviz && lxc exec u3 -- dot -v
...
dot - graphviz version 2.43.0 (0)
libdir = "/usr/lib/x86_64-linux-gnu/graphviz"
Activated plugin library: libgvplugin_dot_layout.so.6
Using layout: dot:dot_layout
...

I guess this was caused by setting shift=true at some point - https://discuss.linuxcontainers.org/t/secure-and-user-friendly-mounts-for-unprivileged/10284/3?u=techtonik - but lxc rm should clean it up, right?

Information to attach

  • Any relevant kernel output (dmesg)
  • Container log (lxc info NAME --show-log)
Log:

lxc repodraw 20210220003526.613 WARN     cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1129 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.monitor.repodraw"
lxc repodraw 20210220003526.615 WARN     cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1129 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.payload.repodraw"
lxc repodraw 20210220003526.623 WARN     cgfsng - cgroups/cgfsng.c:fchowmodat:1550 - No such file or directory - Failed to fchownat(17, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
Log:

lxc u3 20210220002258.141 WARN     cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1129 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.monitor.u3"
lxc u3 20210220002258.143 WARN     cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1129 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.payload.u3"
lxc u3 20210220002258.149 WARN     cgfsng - cgroups/cgfsng.c:fchowmodat:1550 - No such file or directory - Failed to fchownat(17, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
  • Container configuration (lxc config show NAME --expanded)
  • Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
  • Output of the client with --debug
  • Output of the daemon with --debug (alternatively output of lxc monitor while reproducing the issue)
@stgraber stgraber added the Incomplete Waiting on more information from reporter label Feb 20, 2021
@stgraber
Contributor

I'm unable to reproduce your issue, all containers behave like your second one here regardless of name.
This is extremely unlikely to be a LXD issue and I don't see the relation with shift=true given the two containers you're creating here do not have any such attached device in the first place.
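
One way to double-check this from the reporter's side is to look for a device carrying shift=true in the expanded configuration. The following is only a minimal sketch: the helper name `has_shift_device` is made up for illustration, and it assumes that `lxc config show NAME --expanded` prints device keys as indented YAML lines such as `shift: "true"`.

```shell
# Hypothetical helper: reads `lxc config show NAME --expanded` output on stdin
# and succeeds if any device sets shift to true.
# Assumption: the key shows up as an indented YAML line like `shift: "true"`.
has_shift_device() {
    grep -Eq '^[[:space:]]+shift:[[:space:]]*"?true"?[[:space:]]*$'
}

# Usage against a real container (requires lxc):
#   lxc config show repodraw --expanded | has_shift_device && echo "shifted disk attached"
```

If the helper reports nothing for both containers, the difference between them has to come from somewhere other than an attached shifted disk.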

@techtonik
Contributor Author

So how to debug what's going on?

@techtonik
Contributor Author

I compared lxc monitor logs, and there are no differences, except minor message order.

@techtonik
Contributor Author

I ran strace dot -v inside the container and could not make sense of the output. The error probably occurs during installation of the packages.

Restarting the daemon didn't help. The error only went away after I disabled shiftfs and restarted the LXD daemon.

sudo snap set lxd shiftfs.enable=false
sudo systemctl reload snap.lxd.daemon

Most likely the filesystem driver remembered that at some point I had attached a device with lxc config device add "$NAME" "$NAME-shared" disk source="$PWD" path="/root/$NAME" shift=true.
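
To confirm that the toggle actually took effect after the reload, the shiftfs entry that `lxc info` lists under kernel_features (as in the report at the top) can be read back. This is a rough sketch rather than an official check: `kernel_feature` is a hypothetical helper name, and it assumes the `key: "value"` line layout shown in that output.

```shell
# Hypothetical helper: reads `lxc info` output on stdin and prints the value
# reported for one key (for example shiftfs), with the quotes stripped.
# Assumption: each feature is on its own line in the form `key: "value"`.
kernel_feature() {
    awk -v key="$1:" '$1 == key { gsub(/"/, "", $2); print $2; exit }'
}

# Usage on the host (requires lxc):
#   lxc info | kernel_feature shiftfs
```

Running it before and after `snap set lxd shiftfs.enable=false` plus the reload should show whether LXD still reports shiftfs as active.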

@techtonik
Contributor Author

When I re-enable shiftfs and reload the daemon, dot -v starts to fail again. This only happens with this specific container; other containers are not affected.

The command I use for testing:

NAME=repodraw; (lxc rm -f $NAME && lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)

That doesn't make sense, unless the filesystem driver preserves some state related to this container.

@techtonik
Contributor Author

It is pretty insane, but with shiftfs enabled the bug depends on the length of the container name. It only shows up when the name is 7 or 8 characters long.

These work ok:

NAME=ii; (lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)
NAME=i23; (lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)
NAME=i23456; (lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)

These fail:

NAME=i234567; (lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)
NAME=i2345678; (lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)
NAME=iiiiiiii; (lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)

But these again work ok:

NAME=i23456789; (lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)
NAME=i234567890; (lxc launch images:ubuntu/20.10 $NAME && lxc exec $NAME -- apt-get -y -qq install graphviz && lxc exec $NAME -- dot -v)

That's repeatable.
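
The manual runs above can be folded into one sweep over name lengths. This is only a sketch of the same procedure: `make_name`, `probe` and the `RUN_PROBE` switch are names invented here, and the failure check simply greps `dot -v` output for the "no layout engine support" message quoted in the issue description.

```shell
#!/bin/sh
# Build a name of exactly $1 characters in the same pattern as the manual
# tests above: "i" followed by the digits 2, 3, ... 9, 0, 1, ...
make_name() {
    want=$1
    out=i
    digit=2
    while [ "${#out}" -lt "$want" ]; do
        out="$out$((digit % 10))"
        digit=$((digit + 1))
    done
    printf '%s\n' "$out"
}

# Launch a container, install graphviz, and report whether `dot -v` complains
# about missing layout engine support. Requires a working lxc client.
probe() {
    name=$1
    lxc launch images:ubuntu/20.10 "$name" >/dev/null || return 1
    if ! lxc exec "$name" -- apt-get -y -qq install graphviz >/dev/null 2>&1; then
        echo "len=${#name} name=$name install failed"
        lxc rm -f "$name" >/dev/null
        return 1
    fi
    if lxc exec "$name" -- dot -v </dev/null 2>&1 | grep -q 'no layout engine support'; then
        echo "len=${#name} name=$name FAIL"
    else
        echo "len=${#name} name=$name ok"
    fi
    lxc rm -f "$name" >/dev/null
}

# RUN_PROBE is a hypothetical opt-in switch so that sourcing this file has no
# side effects; set RUN_PROBE=1 to actually create and delete containers.
if [ "${RUN_PROBE:-0}" = 1 ]; then
    for len in 2 3 4 5 6 7 8 9 10; do
        probe "$(make_name "$len")"
    done
fi
```

Saved as a script (the file name is up to you) and run with RUN_PROBE=1, it prints one line per length, which makes it easy to compare a run with shiftfs enabled against a run with it disabled.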

@techtonik techtonik changed the title Inconsistent container behavior Inconsistent container behavior with shiftfs Feb 21, 2021
@techtonik
Contributor Author

It also seems that with enabled shiftfs guest doesn't immediately detect changes on host filesystem.

@brauner
Contributor

brauner commented Feb 25, 2021

> It also seems that with enabled shiftfs guest doesn't immediately detect changes on host filesystem.

If you change the filesystem directly from the host, then shiftfs can't guarantee in all scenarios that the updates are picked up immediately; it basically has to keep two caches in sync. If things don't go terribly wrong after this merge window, then shiftfs will slowly be phased out in favor of an upstream solution we wrote that has been merged.

@techtonik
Contributor Author

@brauner thanks for the clarification. It is unfortunate that shiftfs didn't work for me. For my development purposes I don't really need performance, and a secure mount of a network filesystem without caching would solve all my problems. I did that successfully with the 9p2000 filesystem, but it still requires a FUSE driver installed inside the guest. If LXD could extend 9p filesystem support from VMs to containers, then I would not need shiftfs.

@stgraber
Contributor

Well, 9pfs in VMs is also terrible for performance, that's why the world is switching to virtiofs instead which is significantly faster but also even more tied to how VMs work.

@brauner I still owe you a review on the liblxc side for the shifted mounts but were you planning on also adding direct LXD support or are we going to have to wait for a new LXC release that brings in that support before we can use it?

@brauner
Contributor

brauner commented Apr 16, 2021

> Well, 9pfs in VMs is also terrible for performance, that's why the world is switching to virtiofs instead which is significantly faster but also even more tied to how VMs work.
>
> @brauner I still owe you a review on the liblxc side for the shifted mounts but were you planning on also adding direct LXD support or are we going to have to wait for a new LXC release that brings in that support before we can use it?

Direct LXD support as in LXD setting up idmapped mounts for hotplugging into containers?
This obviously will not work for virtiofs right now but the virtiofs developers want this to be a thing.

@stgraber
Contributor

@brauner direct LXD support as in detecting availability of the new kernel feature on startup and using it everywhere we use shiftfs today.

@brauner
Contributor

brauner commented Apr 16, 2021

Oh, yeah. I think that should be mostly doable. It would need to be in the generic part of the storage code, with a test for whether the filesystem supports that feature.

@techtonik
Contributor Author

There is still no explanation of the bug with the container name.

@stgraber
Contributor

stgraber commented Apr 8, 2022

Closing as this looks like a shiftfs issue and shiftfs isn't seeing development at this point.

We're pushing for as many filesystems to move over to idmapped mounts as possible with currently ext4, xfs, vfat, f2fs and btrfs all supporting it. ZFS is tracked at openzfs/zfs#12923 and we have an experimental patchset for cephfs as well as ongoing work on overlayfs.

@stgraber stgraber closed this as completed Apr 8, 2022
@stgraber
Contributor

stgraber commented Apr 8, 2022

Ubuntu 22.04 ships with both shiftfs and idmapped mounts. LXD prefers the latter whenever available, while shiftfs still has to be enabled explicitly through the snap config, at which point it takes care of the few filesystems without native idmap support.
