Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gf-platformid: use qemu-img create instead of reflinks #935

Merged
merged 3 commits into from
Nov 21, 2019

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented Nov 20, 2019

I spent a few hours yesterday trying to figure out why my Azure images
were failing to boot in the cloud with:

error: ../../grub-core/fs/fshelp.c:257:file `/grub2/i386-pc/normal.mod' not found.

What it came down to was whether or not we were doing a reflink copy in
gf-platformid. (My root filesystem is configured with reflink support,
so --reflink=auto would use reflinks.)

I seem to be hitting a kernel regression of some kind where reflinks +
interactions with guestfish cause the grub2/ directory of the boot
partition to become corrupted:

$ guestmount -a tmp/azure.vhd -m /dev/sda1 tmp/mount && ll /srv/rhcos/tmp/mount
ls: cannot access 'grub2': Bad message
total 20
drwxrwxr-x. 2 root root  1024 Nov 19 11:55 efi
d?????????? ? ?    ?        ?            ? grub2
-rw-rw-r--. 1 root root     0 Nov 19 11:56 ignition.firstboot
lrwxrwxrwx. 1 root root     8 Nov 19 11:56 loader -> loader.1
drwxr-xr-x. 3 root root  1024 Nov 19 11:56 loader.1
drwx------. 2 root root 12288 Nov 19 11:55 lost+found
drwxrwxr-x. 3 root root  1024 Nov 19 11:56 ostree

In fact, reproducing the cloud failure is easy to do locally too,
because qemu-kvm will happily boot VHDs, so one can just do
cosa run -d /path/to/vhd.

I haven't filed a bug report for this yet because I've been trying to
get a more minimal reproducer and bisecting kernel versions without much
success (for one, it doesn't seem to reproduce in VMs).

The previous patch disabled reflinks globally. Though this makes
gf-platformid much slower because it has to copy the full qcow2.
Another more lightweight method here is to just create a new qcow2 and
use the source as its backing file. (Not to mention it also makes it
much faster for all the other devs/CI pipelines which don't run with
reflinks on).

@cgwalters
Copy link
Member

I probably would have made a copyfile wrapper that we use instead of cp:

#!/bin/sh
exec cp -a --reflink="${REFLINK:-auto}" "$@"

But OK as is.

This must have something to do with the pattern of writes that the guestfish qemu process issues to the backing file. Or maybe it's around doing a write-read-write cycle, if something is getting corrupted on the read.

@cgwalters
Copy link
Member

One tangentially related topic: We are likely to have a requirement for RHCOS at some point to ship updated bootimages inside containers. It'd be nicer if we did something like have just the qemu.qcow2, and do the "platformid stamping" on the client end. However, doing the stamping client side would be way more reliable if we didn't require libguestfs. We could e.g. store the platform ID in the partition table or something like that; reading/writing partition tables can (I believe) easily be done in userspace without running a kernel like libguestfs does.

Instead of repeating `cp -a --reflink=auto` everywhere, let's just ship
a handy helper script to make this less verbose.

Prep for next patch.
@miabbott
Copy link
Member

CI shows that you'll need to source cmdlib.sh in create_disk.sh so you can use REFLINK

@jlebon
Copy link
Member Author

jlebon commented Nov 20, 2019

OK, I added a cp-reflink helper! ⬆️

I ended up just shipping it in /usr/bin to make it easier to use across bash and python. I could restrict this to just /usr/lib/coreos-assembler if folks prefer.

Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

There seems to be a recent (likely kernel) regression with reflinks
where the image spit out by `gf-platformid` has its `/boot` partition
subsequently corrupted.

So as a safety measure, just disable all reflinks for now.

See the next patch for more details.
I spent a few hours yesterday trying to figure out why my Azure images
were failing to boot in the cloud with:

    error: ../../grub-core/fs/fshelp.c:257:file `/grub2/i386-pc/normal.mod' not found.

What it came down to was whether or not we were doing a reflink copy in
`gf-platformid`. (My root filesystem is configured with reflink support,
so `--reflink=auto` would use reflinks.)

I seem to be hitting a kernel regression of some kind where reflinks +
interactions with guestfish cause the `grub2/` directory of the boot
partition to become corrupted:

```
$ guestmount -a tmp/azure.vhd -m /dev/sda1 tmp/mount && ll /srv/rhcos/tmp/mount
ls: cannot access 'grub2': Bad message
total 20
drwxrwxr-x. 2 root root  1024 Nov 19 11:55 efi
d?????????? ? ?    ?        ?            ? grub2
-rw-rw-r--. 1 root root     0 Nov 19 11:56 ignition.firstboot
lrwxrwxrwx. 1 root root     8 Nov 19 11:56 loader -> loader.1
drwxr-xr-x. 3 root root  1024 Nov 19 11:56 loader.1
drwx------. 2 root root 12288 Nov 19 11:55 lost+found
drwxrwxr-x. 3 root root  1024 Nov 19 11:56 ostree
```

In fact, reproducing the cloud failure is easy to do locally too,
because qemu-kvm will happily boot VHDs, so one can just do
`cosa run -d /path/to/vhd`.

I haven't filed a bug report for this yet because I've been trying to
get a more minimal reproducer and bisecting kernel versions without much
success (for one, it doesn't seem to reproduce in VMs).

The previous patch disabled reflinks globally. Though this makes
`gf-platformid` much slower because it has to copy the full qcow2.
Another more lightweight method here is to just create a new qcow2 and
use the source as its backing file. (Not to mention it also makes it
much faster for all the other devs/CI pipelines which *don't* run with
reflinks on).
@cgwalters
Copy link
Member

See also #936

@jlebon jlebon merged commit e80b853 into coreos:master Nov 21, 2019
@jlebon jlebon deleted the pr/platformid-backing branch November 21, 2019 14:10
@cgwalters cgwalters mentioned this pull request Nov 21, 2019
@jlebon
Copy link
Member Author

jlebon commented Dec 13, 2019

This approach (specifically the last commit) does not work, see #999.

jcajka pushed a commit to jcajka/coreos-assembler that referenced this pull request Mar 24, 2020
kola/tests/ignition: add test for versioned s3 objects
jcajka pushed a commit to jcajka/coreos-assembler that referenced this pull request Mar 24, 2020
The original PR was mistakenly merged without a MinVersion in coreos#935.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants