Requires running under rootful podman #98

Open
cgwalters opened this issue Jan 10, 2024 · 16 comments

Comments

@cgwalters
Contributor

A paper cut we hit today is that Podman Desktop defaults to rootless, and bootc-image-builder (bib) doesn't work with that because we need loopback devices. The core problem is that we need to write Linux filesystems, and the important ones like XFS and ext4 really want to be written only by code from the Linux kernel.
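
To make the loopback requirement concrete, here is a minimal sketch of the steps a disk-image builder typically needs from the host kernel. It is not taken from bootc-image-builder: the function name `write_fs_via_loop`, the 1G image size, and the choice of ext4 are assumptions for illustration only. Under rootless podman, the `losetup` or `mount` step is where a flow like this would be expected to fail.

```shell
#!/usr/bin/env bash
# Illustration only: hypothetical helper, not bootc-image-builder code.
set -euo pipefail

# Create an image file, format it through a loop device, and copy a
# directory tree into it using the host kernel's filesystem driver.
write_fs_via_loop() {
    local image="$1" srcdir="$2" loopdev mnt

    # Sparse file that will hold the filesystem (size is an assumption).
    truncate -s 1G "${image}"

    # Attaching a loop device requires real root on the host kernel.
    loopdev="$(losetup --find --show "${image}")"

    # The kernel-side filesystem is created on the block device.
    mkfs.ext4 -q "${loopdev}"

    # Mounting a block-device filesystem is the privileged step.
    mnt="$(mktemp -d)"
    mount "${loopdev}" "${mnt}"
    cp -a "${srcdir}/." "${mnt}/"

    # Tear everything down again.
    umount "${mnt}"
    rmdir "${mnt}"
    losetup --detach "${loopdev}"
}

# Only do real work when both arguments are supplied.
if [ "$#" -eq 2 ]; then
    write_fs_via_loop "$1" "$2"
fi
```

Invoked as `sudo ./write-fs.sh disk.img ./rootfs` (script name hypothetical), this relies entirely on the host kernel, which is the "privileged" option described in the comment above.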

Running the Linux kernel means either reusing the host kernel (privileged) or running a VM. But in the podman machine case we're already in a VM, which gets us into nested virt. On Mac at least, that's going to involve full emulation, which usually mostly works but isn't considered a production scenario and definitely hits weird random bugs.

Since we're already running this container with --privileged, my inclination is to quietly take advantage of the fact that podman machine uses FCOS today, where the core user has passwordless sudo enabled, and use that to re-execute ourselves with real root privileges. Yes, this would not really be "rootless", but I personally don't care about that, and I don't think users generally would either.

@supakeen
Member

We're also investigating whether we can do at least some of the filesystem work with libguestfs.

@cgwalters
Contributor Author

libguestfs is just a way to run VMs, so the nested virt concerns above apply.

cgwalters changed the title from "Running with Rootless Podman" to "Requires running under rootful podman" on Jan 10, 2024
cgwalters pinned this issue on Jan 10, 2024

@achilleas-k
Member

Right, so I was just reading about the internals and yeah libguestfs uses qemu to boot a kernel and sets up an "appliance" to talk to it. :|

@cgwalters
Contributor Author

A third option (beyond host kernel and virt) is https://github.com/lkl/linux, which is relatively new; its cptofs tool specifically targets this problem, but I really don't think it's worth trying to scope this in right now.

@ondrejbudai
Member

libguestfs doesn't require KVM (see https://libguestfs.org/guestfs-faq.1.html). I guess it just falls back to emulation if there's no KVM. The question is how fast it is.
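
As a rough way to tell which path a given environment would take, one can check whether `/dev/kvm` is a usable character device. This is only an approximation of what QEMU needs, not libguestfs's own detection logic, and the function name `kvm_status` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: guess whether hardware virtualization (KVM) is usable here.
# This approximates the precondition for KVM; it is not libguestfs code.

kvm_status() {
    # The device path is a parameter so the check can be exercised in tests.
    local dev="${1:-/dev/kvm}"
    if [ -c "${dev}" ] && [ -r "${dev}" ] && [ -w "${dev}" ]; then
        echo "kvm"
    else
        echo "emulation"
    fi
}

echo "A VM started here would likely use: $(kvm_status)"
```

To measure how slow the emulated path is on a machine that does have KVM, libguestfs documents a `force_tcg` backend setting (`LIBGUESTFS_BACKEND_SETTINGS=force_tcg`), which may be a convenient way to benchmark without a Mac.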

@achilleas-k
Member

Mounting directly uses FUSE and performs pretty poorly, but using the shell (guestfish) is supposedly quite good. We can benchmark, of course.

FTR, this works on rootless podman machine on macOS:
test.sh

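#!/usr/bin/env bash

set -euo pipefail

# Create a 100 MiB sparse file to act as the disk image.
fname="${1}"
truncate -s 100M "${fname}"

# Format the image file directly; no loop device is needed for this step.
mkfs.ext4 "${fname}"

# Mount the image inside the libguestfs appliance (a small VM) and copy a file in.
guestfish --rw -a "${fname}" << EOF
run
list-filesystems
mount /dev/sda /
copy-in test.sh /
cat /test.sh
quit
EOF

echo "DONE"
rm "${fname}"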

Containerfile

FROM fedora:39

RUN dnf -y install libguestfs

ENV LIBGUESTFS_BACKEND=direct

COPY test.sh /test.sh
ENTRYPOINT ["/test.sh"]

@cgwalters
Contributor Author

Note that https://github.com/cgwalters/osbuildbootc/ doesn't use libguestfs, but it does use the underlying tool (supermin) to construct a VM root filesystem out of the container rootfs and works unprivileged today.

Honestly I think that code and approach there is much simpler than the "higher level" libguestfs approach because we have the ability to drive things at a low level.

So if we go down this path I think it'd make sense to look at merging that code.

(The other thing osbuildbootc does is defer all the heavy lifting to bootc install to-disk, which is #18.)

@cgwalters
Contributor Author

> the underlying tool (supermin) to construct a VM root filesystem out of the container rootfs

That said, what would make much more sense in modern times is to use virtiofs as the root filesystem instead. It probably wouldn't be too hard; I just haven't dug into it.

@cgwalters
Contributor Author

> Honestly I think that code and approach there is much simpler than the "higher level" libguestfs approach because we have the ability to drive things at a low level.

For example, forcing indirection through libguestfs's high-level APIs reintroduces the same problems that osbuild creates today and that motivate ostreedev/ostree#3094: what we're doing often requires quite low-level filesystem and block device operations. libguestfs is just high-level sugar for executing arbitrary code in a transient VM, and we can construct a transient VM without it.

@ondrejbudai
Member

I'm worried that doing the whole build under supermin might be extremely slow if KVM is not there. Whereas if we just offload the final copying part, it might be fine. I know that @achilleas-k is working on some benchmarks.

@ondrejbudai
Member

Also, full QEMU emulation isn't supported on RHEL. I wonder if guestfs has an exception....

@cgwalters
Contributor Author

libguestfs doesn't have an exception; its main use case is simply being run from Linux hosts.

@vrothberg
Contributor

I am currently catching up on containers/podman-desktop-extension-bootc#93. What's the current status of this issue? The root requirement can be documented (as pointed out in containers/podman-desktop-extension-bootc#93) but I want to have a better understanding.

@cgwalters
Contributor Author

I doubt we're going to do anything major here soon. I think we should just document switching the machine to --rootful, or initializing it with --rootful.
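
For reference, the documented workaround could look roughly like the sketch below. The `run` and `make_rootful` helpers and the `DRY_RUN` variable are hypothetical conveniences; only the `podman machine` subcommands and the `--rootful` flag are real podman CLI. The script defaults to printing the commands rather than executing them, and it stops the machine first on the assumption that the setting should be changed while the machine is not running.

```shell
#!/usr/bin/env bash
# Sketch: switch an existing podman machine to rootful mode.
# DRY_RUN defaults to 1, so by default this only prints the commands.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

make_rootful() {
    run podman machine stop
    run podman machine set --rootful
    run podman machine start
}

# For a brand-new machine, `podman machine init --rootful` avoids the switch.
make_rootful
```

Running it with `DRY_RUN=0` applies the change for real; after that, the image builder container can be started with `--privileged` as before.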

@ondrejbudai
Member

ondrejbudai commented Mar 11, 2024

I don't think we have any way to fix it. bootc-image-builder is meant to run in environments (Mac) without KVM support. libguestfs is utterly slow without KVM. mkfs.xfs protofiles don't work well with the bootc install model (unless bootc gets support for them).

EDIT: Just to clarify, the issue is that we need to mount the disk file so we can write files into it. That can be done only by root in the top-level user namespace; root in a rootless container simply cannot do it.
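
One way a tool could detect this situation up front is to compare its UID mapping against the identity mapping that the top-level user namespace has on Linux. The sketch below is an assumption about how such a check might look, not what bootc-image-builder actually does; `in_initial_userns` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch: detect "root, but only inside a user namespace" (rootless podman).
# Hypothetical helper; not taken from bootc-image-builder.

in_initial_userns() {
    # The map file is a parameter so the parser can be tested on sample data.
    local map_file="${1:-/proc/self/uid_map}"
    local inside outside count

    [ -r "${map_file}" ] || return 1
    read -r inside outside count < "${map_file}" || return 1

    # The initial namespace maps the full 32-bit UID range onto itself.
    [ "${inside}" = "0" ] && [ "${outside}" = "0" ] && [ "${count}" = "4294967295" ]
}

if [ "$(id -u)" -eq 0 ] && in_initial_userns; then
    echo "real root: loop devices and mounts should be usable"
else
    echo "not real root (possibly rootless): mounting the disk image is expected to fail" >&2
fi
```

In a rootless container the first line of `/proc/self/uid_map` typically maps UID 0 to the unprivileged host user with a count of 1, so the check fails even though `id -u` reports 0.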

@cgwalters
Copy link
Contributor Author

> mkfs.xfs protofiles don't work well with the bootc install model (unless bootc gets support for it).

Right. To elaborate on that slightly, it would create wildly distinct mechanisms for "day 1" versus "day 2". It's not impossible, but it would be extremely hard to maintain over time.
