Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ignition-ostree: make sure we don't mount /sysroot before transposefs #2526

Merged
merged 1 commit into from
Aug 1, 2023

Conversation

jmarrero
Copy link
Member

@dustymabe
Copy link
Member

Without more details in the commit message or comments here I would classify this as a workaround. i.e. we know that when multipath is enabled the autosave-xfs stuff will fail, but we don't know why.

Does the problem that autosave-xfs solve still happen on multipath? Should it work? Is there an underlying bug we need to investigate and figure out? Do we want it to work in the future? Should we add a test for it when it does work?

@jmarrero
Copy link
Member Author

Oh yeah this is a workaround, I am trying a possible real fix too. Just opened this in case we needed to provide them an urgent workaround.

With this workaround:
Does the problem that autosave-xfs solve still happen on multipath? Yes
Should it work? Yes
Is there an underlying bug we need to investigate and figure out? Yes (hopefully I can fix it before having to push the workaround)
Do we want it to work in the future? Yes
Should we add a test for it when it does work? Yes

@dustymabe
Copy link
Member

OK. Thanks for the context!

@jmarrero
Copy link
Member Author

Now this, ends up in the same place as the reported issue.

Displaying logs from failed units: ignition-ostree-transposefs-autosave-xfs.service
Jul 28 20:47:07 systemd[1]: ignition-ostree-transposefs-autosave-xfs.service: Main process exited, code=exited, status=32/n/a
Jul 28 20:47:07 ignition-ostree-transposefs[1986]: ++ realpath /dev/disk/by-label/dm-mpath-root
Jul 28 20:47:07 systemd[1]: ignition-ostree-transposefs-autosave-xfs.service: Failed with result 'exit-code'.
Jul 28 20:47:07 ignition-ostree-transposefs[1866]: + echo 'Mounting /dev/disk/by-label/dm-mpath-root ro (/dev/dm-4) to /var/tmp/mnt'
Jul 28 20:47:07 ignition-ostree-transposefs[1866]: Mounting /dev/disk/by-label/dm-mpath-root ro (/dev/dm-4) to /var/tmp/mnt
Jul 28 20:47:07 ignition-ostree-transposefs[1866]: + mkdir -p /var/tmp/mnt
Jul 28 20:47:07 ignition-ostree-transposefs[1866]: + mount -o ro /dev/disk/by-label/dm-mpath-root /var/tmp/mnt
Jul 28 20:47:07 systemd[1]: Failed to start Ignition OSTree: Autosave XFS Rootfs Partition.
Jul 28 20:47:07 ignition-ostree-transposefs[1988]: mount: /var/tmp/mnt: /dev/mapper/mpatha4 already mounted on /sysroot.
Jul 28 20:47:07 systemd[1]: ignition-ostree-transposefs-autosave-xfs.service: Triggering OnFailure= dependencies.

I continue debugging.

@jmarrero
Copy link
Member Author

jmarrero commented Aug 1, 2023

This boots correctly and does not seem to cause any errors, multipath devices are still being used when I check df -h

When enabling multipath the ignition-ostree-mount-firstboot-sysroot
service loses to the systemd's generator, which causes /sysroot to
be mounted before we finish transposing the fs. This change makes
sure we wait to mount the /sysroot until we are done.
This was reported via: https://issues.redhat.com/browse/OCPBUGS-16157
@jmarrero jmarrero changed the title ignition-ostree-transposefs: don't run if multipath is enabled ignition-ostree: make sure we don't mount /sysroot before transposefs Aug 1, 2023
@jmarrero jmarrero marked this pull request as ready for review August 1, 2023 17:01
@jmarrero
Copy link
Member Author

jmarrero commented Aug 1, 2023

Tested these 4 variants and all boot OK.

cosa run -c --qemu-size=120G --qemu-memory=6096

cosa run -c --qemu-size=20G --qemu-memory=6096

cosa run -c --qemu-size=120G --qemu-memory=6096 --qemu-multipath=true --kargs='rd.multipath=default root=/dev/disk/by-label/dm-mpath-root rw'

cosa run -c --qemu-size=20G --qemu-memory=6096 --qemu-multipath=true --kargs='rd.multipath=default root=/dev/disk/by-label/dm-mpath-root rw'

@cgwalters
Copy link
Member

[2023-08-01T17:20:28.219Z] --- PASS: multipath.day1 (55.06s)

etc 🎉

What we could do as a followup is a new kola test which expands to a big size with mpath but this LGTM

@cgwalters cgwalters merged commit d0c7956 into coreos:testing-devel Aug 1, 2023
1 check passed
@jmarrero
Copy link
Member Author

jmarrero commented Aug 1, 2023

Adding test issue so I don't forget: #2533. I will get to this ASAP.

Comment on lines 7 to +8
After=ignition-ostree-growfs.service
Before=initrd-root-fs.target
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK now that I look I think I'm starting to understand this better. Before this PR, we only had Before=ignition-ostree-transposefs-restore.service on this service, which is itself just Before=ignition-ostree-mount-firstboot-sysroot.service.

Yet that service is just a no-op if we have a root kernel argument.

So there were no constraints and we just end up racing.

So the multipath bits exposed this core race condition because it is done via a root= karg.

Comment on lines +12 to +13
Before=initrd-root-fs.target
Before=sysroot.mount
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 100% fine, but I'd note that normally sysroot.mount is itself Before=initrd-root-fs.target, so we could just be Before=sysroot.mount too.

Or maybe the simple way to say it is: anywhere we have Before=ignition-ostree-mount-firstboot-sysroot.service we also need Before=sysroot.mount.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants