
bootc install to-disk --via-loopback fails when pointing to file mounted over virtfs #485

Closed
ckyrouac opened this issue Apr 24, 2024 · 5 comments · Fixed by #487
Labels: area/install (Issues related to `bootc install`), area/osintegration (Relates to an external OS/distro base image), bug (Something isn't working)

@ckyrouac (Contributor)

I discovered this when using podman-bootc-cli on Fedora 40. The following bootc command fails because the `sgdisk` invocation that creates the partitions fails silently; the actual error only surfaces later, when bootc tries to list the partitions it expected sgdisk to create.

This command:

podman -c podman-machine-default-root run --pid=host --network=host --privileged --security-opt label=type:unconfined_t -v /var/lib/containers:/var/lib/containers -v /home/chris:/output -v /dev:/dev quay.io/centos-bootc/centos-bootc:stream9 bootc install to-disk --via-loopback --generic-image --skip-fetch-check /output/test.raw

results in this bootc error:

ERROR Installing to disk: Creating rootfs: Failed to find children after partitioning

These are the errors in the journal:

Apr 24 16:35:19 localhost.localdomain kernel: ------------[ cut here ]------------
Apr 24 16:35:19 localhost.localdomain kernel: WARNING: CPU: 3 PID: 2827 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x175/0x250 [netfs]
Apr 24 16:35:19 localhost.localdomain kernel: Modules linked in: loop 9p netfs rfkill overlay ptp_kvm sunrpc intel_rapl_msr intel_rapl_common binfmt_misc intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_telemetry pmt_class kvm_intel ppdev kvm s>
Apr 24 16:35:19 localhost.localdomain kernel: CPU: 3 PID: 2827 Comm: kworker/u16:1 Not tainted 6.8.4-200.fc39.x86_64 #1
Apr 24 16:35:19 localhost.localdomain kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
Apr 24 16:35:19 localhost.localdomain kernel: Workqueue: loop0 loop_workfn [loop]
Apr 24 16:35:19 localhost.localdomain kernel: RIP: 0010:netfs_extract_user_iter+0x175/0x250 [netfs]
Apr 24 16:35:19 localhost.localdomain kernel: Code: c6 29 fb 31 ff 89 5a f8 4c 39 d9 75 c3 4d 85 c9 0f 84 c2 00 00 00 45 39 f2 0f 83 b9 00 00 00 4d 89 cd 44 89 d3 e9 30 ff ff ff <0f> 0b 48 c7 c3 fb ff ff ff 48 8b 44 24 28 65 48 2b 04 25 28 00 00
Apr 24 16:35:19 localhost.localdomain kernel: RSP: 0018:ffffb49282c2bc60 EFLAGS: 00010202
Apr 24 16:35:19 localhost.localdomain kernel: RAX: 0000000000000000 RBX: ffff991ba6cf0e00 RCX: 0000000000000000
Apr 24 16:35:19 localhost.localdomain kernel: RDX: ffff991ba6cf0e78 RSI: 0000000000005000 RDI: ffffb49282c2bd18
Apr 24 16:35:19 localhost.localdomain kernel: RBP: ffff991b6f6b35b0 R08: ffffffffc0ee1a80 R09: 0000000000000000
Apr 24 16:35:19 localhost.localdomain kernel: R10: ffffb49282c2bc80 R11: 0000000000000001 R12: ffff991b526f02e8
Apr 24 16:35:19 localhost.localdomain kernel: R13: ffffffffc0f55b40 R14: ffffb49282c2bd18 R15: ffff991ba6cf0e00
Apr 24 16:35:19 localhost.localdomain kernel: FS:  0000000000000000(0000) GS:ffff991bbdcc0000(0000) knlGS:0000000000000000
Apr 24 16:35:19 localhost.localdomain kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 24 16:35:19 localhost.localdomain kernel: CR2: 00007f01d8001018 CR3: 0000000006930006 CR4: 0000000000770ef0
Apr 24 16:35:19 localhost.localdomain kernel: PKRU: 55555554
Apr 24 16:35:19 localhost.localdomain kernel: Call Trace:
Apr 24 16:35:19 localhost.localdomain kernel:  <TASK>
Apr 24 16:35:19 localhost.localdomain kernel:  ? netfs_extract_user_iter+0x175/0x250 [netfs]
Apr 24 16:35:19 localhost.localdomain kernel:  ? __warn+0x81/0x130
Apr 24 16:35:19 localhost.localdomain kernel:  ? netfs_extract_user_iter+0x175/0x250 [netfs]
Apr 24 16:35:19 localhost.localdomain kernel:  ? report_bug+0x171/0x1a0
Apr 24 16:35:19 localhost.localdomain kernel:  ? handle_bug+0x3c/0x80
Apr 24 16:35:19 localhost.localdomain kernel:  ? exc_invalid_op+0x17/0x70
Apr 24 16:35:19 localhost.localdomain kernel:  ? asm_exc_invalid_op+0x1a/0x20
Apr 24 16:35:19 localhost.localdomain kernel:  ? __pfx_lo_rw_aio_complete+0x10/0x10 [loop]
Apr 24 16:35:19 localhost.localdomain kernel:  ? netfs_extract_user_iter+0x175/0x250 [netfs]
Apr 24 16:35:19 localhost.localdomain kernel:  ? _raw_spin_unlock+0xe/0x30
Apr 24 16:35:19 localhost.localdomain kernel:  ? __pfx_lo_rw_aio_complete+0x10/0x10 [loop]
Apr 24 16:35:19 localhost.localdomain kernel:  netfs_unbuffered_write_iter+0x14a/0x3c0 [netfs]
Apr 24 16:35:19 localhost.localdomain kernel:  lo_rw_aio.isra.0+0x29a/0x2b0 [loop]
Apr 24 16:35:19 localhost.localdomain kernel:  loop_process_work+0xb4/0x950 [loop]
Apr 24 16:35:19 localhost.localdomain kernel:  ? finish_task_switch.isra.0+0x94/0x2f0
Apr 24 16:35:19 localhost.localdomain kernel:  ? __schedule+0x3f4/0x1530
Apr 24 16:35:19 localhost.localdomain kernel:  process_one_work+0x171/0x340
Apr 24 16:35:19 localhost.localdomain kernel:  worker_thread+0x27b/0x3a0
Apr 24 16:35:19 localhost.localdomain kernel:  ? __pfx_worker_thread+0x10/0x10
Apr 24 16:35:19 localhost.localdomain kernel:  kthread+0xe5/0x120
Apr 24 16:35:19 localhost.localdomain kernel:  ? __pfx_kthread+0x10/0x10
Apr 24 16:35:19 localhost.localdomain kernel:  ret_from_fork+0x31/0x50
Apr 24 16:35:19 localhost.localdomain kernel:  ? __pfx_kthread+0x10/0x10
Apr 24 16:35:19 localhost.localdomain kernel:  ret_from_fork_asm+0x1b/0x30
Apr 24 16:35:19 localhost.localdomain kernel:  </TASK>
Apr 24 16:35:19 localhost.localdomain kernel: ---[ end trace 0000000000000000 ]---
Apr 24 16:35:19 localhost.localdomain kernel: I/O error, dev loop0, sector 20971480 op 0x1:(WRITE) flags 0x0 phys_seg 5 prio class 0
Apr 24 16:35:19 localhost.localdomain kernel: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x0 phys_seg 5 prio class 0
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 2621435, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 0, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 1, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 2621436, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 2, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 2621437, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 3, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 4, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 2621438, lost async page write
Apr 24 16:35:19 localhost.localdomain kernel: Buffer I/O error on dev loop0, logical block 2621439, lost async page write
Apr 24 16:35:20 localhost.localdomain kernel: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Apr 24 16:35:20 localhost.localdomain kernel:  loop0: p1 p2 p3 p4
Apr 24 16:35:20 localhost.localdomain kernel: I/O error, dev loop0, sector 20971480 op 0x1:(WRITE) flags 0x800 phys_seg 5 prio class 0
Apr 24 16:35:20 localhost.localdomain kernel: I/O error, dev loop0, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 5 prio class 0
Apr 24 16:35:20 localhost.localdomain systemd-homed[1753]: block device /sys/devices/virtual/block/loop0/loop0p1 has been removed.

The problem only happens when using a file on the virtfs/9p filesystem mounted by podman machine (i.e. the home directory). Running sgdisk directly against a file on the virtfs filesystem works correctly; it only fails when pointed at a loopback device that is backed by a file on the virtfs filesystem. I'm not sure what is unique to the virtfs mount that causes this.
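Since the failure only reproduces when the backing file lives on the 9p share, one quick sanity check is to print the filesystem type behind the target path before attaching it to a loop device. This is a sketch, not part of bootc; the path is an example, and GNU `stat -f -c %T` reports 9p mounts as `v9fs`:

```shell
# Print the filesystem type backing a path; "v9fs" indicates the 9p/virtfs
# mount that triggers this failure. The default path here is only an example.
path="${1:-/tmp}"
fstype=$(stat -f -c %T "$path")
echo "$path is on: $fstype"
```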

@cgwalters cgwalters added bug Something isn't working area/install Issues related to `bootc install` area/osintegration Relates to an external OS/distro base image labels Apr 25, 2024
@cgwalters (Collaborator)

Only briefly looking at this: the kernel stack includes netfs_unbuffered_write_iter. Skimming the git log there turns up e.g.
torvalds/linux@153a996
which is about direct I/O, which we recently switched to in 61de519.
I am not sure, but maybe before that relatively recent change, direct I/O would silently fall back to buffered I/O?

One unfortunate variable here too is 9p vs. virtiofs. We should really, IMO, try to switch podman-machine over to virtiofs on Linux, since it's just better tested and supported than 9p.

@ckyrouac (Contributor, issue author)

Ah, interesting. Switching to losetup --direct-io=off makes this work. This means podman-bootc-cli is currently broken on the latest kernel in Fedora 39/40. I'm not sure what the best short-term solution is, since I imagine switching podman-machine to virtiofs will take a while. A few options I can think of:

  • hack something up in podman-bootc-cli to build the bootc disk image in /tmp or somewhere other than the 9p mount
  • set --direct-io=off in bootc globally
  • add a new --direct-io option to bootc install

None of them are ideal; wdyt?
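The second and third options can be combined: default direct I/O to off and let an environment variable re-enable it. A minimal sketch of that toggle follows; the BOOTC_DIRECT_IO name matches the variable the eventual fix uses, the image path is illustrative, and the losetup invocation is only echoed rather than executed, since actually attaching a loop device requires root:

```shell
# Default direct I/O to off to avoid the 9p failure; BOOTC_DIRECT_IO=on
# re-enables it for A/B testing. The image path is illustrative, and the
# losetup command is printed rather than run (attaching needs root).
img="${1:-/output/test.raw}"
dio="${BOOTC_DIRECT_IO:-off}"
echo losetup --find --show --direct-io="$dio" "$img"
```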

@cgwalters (Collaborator)

> set --direct-io=off in bootc globally

Yeah, let's do this (although while we're here, at least add an environment variable to make it configurable so we can easily A/B test).

This would be a short term workaround but I think the right next step here is to try building a kernel with that patch series reverted. Or see whether using virtiofs fixes it.

> hack something up in podman-bootc-cli to build the bootc disk image in /tmp or somewhere other than the 9p mount

That'd quickly run into RAM limitations for nontrivial images, plus we'd need to do a full physical copy back out to the host...

ckyrouac added a commit to ckyrouac/bootc that referenced this issue Apr 25, 2024
When using a loopback device pointing to a file on a 9p filesystem (e.g.
the home mount of a podman machine), using direct-io=on causes the
partitioning via sgdisk to fail. This is a temporary fix until podman
machine switches away from 9p or another fix is found. Direct IO can be
enabled via the BOOTC_DIRECT_IO=on environment variable.

Fixes containers#485

Signed-off-by: Chris Kyrouac <ckyrouac@redhat.com>
@ckyrouac ckyrouac self-assigned this Apr 25, 2024
@ckyrouac (Contributor, issue author)

Mounting a filesystem in libvirt via the virtiofs driver works. This is the relevant libvirt domain XML:

<filesystem type="mount" accessmode="passthrough">
  <driver type="virtiofs" queue="1024"/>
  <binary path="/usr/libexec/virtiofsd"/>
  <source dir="/home/chris"/>
  <target dir="/home/chris"/>
  <alias name="fs0"/>
  <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</filesystem>
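After adding that device and restarting the guest, one way to confirm the share really comes in over virtiofs is to read its entry from /proc/mounts; this is a sketch, and /home/chris matches the `<target dir>` in the XML above:

```shell
# Print the filesystem type of the shared directory from /proc/mounts;
# "virtiofs" confirms the new driver is in use, while "9p" would mean the
# old transport. /home/chris matches the <target dir> in the XML above.
awk -v path=/home/chris '$2 == path { print $3 }' /proc/mounts
```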
