prepare-root: Set up sysroot readonly in initramfs #2187
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cgwalters The full list of commands accepted by this bot can be found here. The pull request process is described here
This currently fails in FCOS since
Force-pushed from c266f77 to d7b6666
What is the status of this issue? I'm running into what looks very much like fedora-coreos issue #746 on stable FCOS releases 34.20210529.3.0 and 34.20210821.3.0, but it works on FCOS testing release 34.20210904.2.0. Is it possible this has been merged to testing but not yet to stable?
In the `multipath.partition` test, we mount a multipath device on `/var/lib/containers` and rely on systemd to create the mountpoint. But because we weren't deterministically ordered against `ostree-remount.service`, in rare cases we can run at approximately the same time, and systemd will try to create the mountpoint during the short window when `/var` is read-only.

So then the test failed like this:

```
systemd[1]: Found device /dev/disk/by-label/dm-mpath-containers.
systemd[1]: var-lib-containers.mount: Failed to check directory /var/lib/containers: No such file or directory
systemd[1]: Mounting Mount /var/lib/containers...
mount[983]: mount: /var/lib/containers: mount point does not exist.
systemd[1]: var-lib-containers.mount: Mount process exited, code=exited, status=32/n/a
systemd[1]: var-lib-containers.mount: Failed with result 'exit-code'.
systemd[1]: Failed to mount Mount /var/lib/containers.
```

It took me a while to figure out this was the issue, because systemd ignores errors from `mkdir`:

https://github.com/systemd/systemd/blob/3a18c0e5f2e4d8d46f3fd11cd0e421f52e727b0d/src/core/mount.c#L1023

An `EROFS` here would've made it much more obvious.

Anyway, let's just add an `After=` in the mount unit to fix this. An alternative would've been to create the directory from Ignition (which it would've done automatically for us if we used it to set up the filesystem and mount unit, but we can't do this for non-root multipath, because $reasons).

Long-term fix for this is ostreedev/ostree#2187.
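To illustrate why a surfaced `EROFS` would have shortened the debugging, here is a minimal sketch of a mountpoint-creation helper that reports `mkdir` failures instead of ignoring them. The function name `ensure_mountpoint` is hypothetical and this is not systemd's code; it only shows the general pattern under the assumption that the caller wants a negative errno value back.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not part of systemd or ostree): create a mountpoint
 * directory and report why creation failed rather than dropping the error.
 * Returns 0 if the directory was created or already exists as a directory,
 * otherwise a negative errno value. */
int
ensure_mountpoint (const char *path)
{
  if (mkdir (path, 0755) == 0)
    return 0;

  int saved = errno; /* save before any call that might overwrite errno */

  if (saved == EEXIST)
    {
      struct stat st;
      if (stat (path, &st) == 0 && S_ISDIR (st.st_mode))
        return 0;
      return -ENOTDIR;
    }

  /* An EROFS here points directly at a read-only parent filesystem. */
  fprintf (stderr, "mkdir(%s) failed: %s%s\n", path, strerror (saved),
           saved == EROFS ? " (parent filesystem is read-only)" : "");
  return -saved;
}
```

With a helper like this, the race described above would show up in the log as a read-only filesystem error at the `mkdir` step, rather than only as the later "mount point does not exist" message.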
Interesting. It's not in any Fedora CoreOS release, so something else must be at play. The latest stable release from this week (34.20210904.3.0) is a promotion of 34.20210904.2.0, so I'd give that a try. If it still doesn't work, please file an issue on the FCOS tracker.
Force-pushed from d7b6666 to 571d33a
For reference, I think this was failing because there was the following leftover code:
I've force-pushed the branch behind this PR to rebase on current That plus coreos/fedora-coreos-config#1270 brought me to a booting image.
Force-pushed from c57b48f to 9c0fbd2
Thanks so much for working on this!
Force-pushed from d32dc2d to d813818
This should be ready for review now.
Looking quite good, I just have a few minor nits. Thanks again for working on this!
src/switchroot/ostree-prepare-root.c (outdated)
  /* Link to the deployment's /var */
  if (mount_var)
    {
      if (snprintf (srcpath, sizeof(srcpath), "%s/../../var", deploy_path) < 0)
I notice that the path is different here; the previous code used a relative path:
mount ("../../var", "var", NULL, MS_BIND, NULL)
Does that have any effect on the visible result in e.g. `findmnt`?
Just curious if you changed this intentionally.
It was semi-intentional, as I was trying to make it work initially by using absolute paths for my own sanity/readability.
It doesn't have any visible side-effects, as the kernel performs path normalization on mount anyway.
In order to reduce the scope of changes in this PR, I've kept using relative paths everywhere.
Let's ensure things are right from the start in the initramfs; this closes off various race conditions. Followup to ostreedev@3564225 Closes: ostreedev#2115
Force-pushed from d813818 to c553b5c
Thanks so much for doing that! In the end, I am fairly hopeful this change will sail through regression-free, but if we discover a problem, at least now it's easy to revert just this last bit while keeping all the cleanups, etc. I just now discovered that because I'm listed as the original author, I can't approve this PR, which is clearly a gap in the system. Anyways, from my PoV feel free to approve the PR!
Nice and clean. Thanks a lot for working on this!
Let's ensure things are right from the start in the initramfs;
this closes off various race conditions. Followup to
3564225
Closes: #2115