Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prepare-root: Set up sysroot readonly in initramfs #2187

Merged
merged 1 commit into from
Nov 4, 2021

Conversation

cgwalters
Copy link
Member

Let's ensure things are right from the start in the initramfs;
this closes off various race conditions. Followup to
3564225

Closes: #2115

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cgwalters
Copy link
Member Author

This currently fails in FCOS since igniton-files can't run useradd; still debugging, the mounts all seem right though:

`-/sysroot                            /dev/vda4[/ostree/deploy/fedora-coreos/deploy/24baddf67040f815aa56043b20001aa94c80ca10a7f4b9441e8640562c968318.0]
                                                 xfs     rw,relatime,attr2,inode
  |-/sysroot/sysroot                  /dev/vda4  xfs     ro,relatime,attr2,inode
  |-/sysroot/etc                      /dev/vda4[/ostree/deploy/fedora-coreos/deploy/24baddf67040f815aa56043b20001aa94c80ca10a7f4b9441e8640562c968318.0/etc]
  |                                              xfs     rw,relatime,attr2,inode
  `-/sysroot/usr                      /dev/vda4[/ostree/deploy/fedora-coreos/deploy/24baddf67040f815aa56043b20001aa94c80ca10a7f4b9441e8640562c968318.0/usr]
                                                 xfs     ro,relatime,attr2,inode

@smuda
Copy link

smuda commented Sep 22, 2021

What is the status of this issue?

I'm running into what looks very much like fedora-coreos issue #746 on stable FCOS releases 34.20210529.3.0 and 34.20210821.3.0 but works on FCOS Testing release 34.20210904.2.0. Is it possible this has been merged to testing but not yet stable?

jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Sep 22, 2021
In the `multipath.partition` test, we mount a multipath device on
`/var/lib/containers`, and rely on systemd to create the mountpoint. But
because we we weren't deterministically ordered against
`ostree-remount.service`, in rare cases we can run approximately at the
same time and systemd will try to create the mountpoint during that
short window of time when `/var` is read-only. So then the test failed
like this:

```
systemd[1]: Found device /dev/disk/by-label/dm-mpath-containers.
systemd[1]: var-lib-containers.mount: Failed to check directory /var/lib/containers: No such file or directory
systemd[1]: Mounting Mount /var/lib/containers...
mount[983]: mount: /var/lib/containers: mount point does not exist.
systemd[1]: var-lib-containers.mount: Mount process exited, code=exited, status=32/n/a
systemd[1]: var-lib-containers.mount: Failed with result 'exit-code'.
systemd[1]: Failed to mount Mount /var/lib/containers.
```

It took me a while to figure out this was the issue, because systemd
ignores errors from `mkdir`:

https://github.com/systemd/systemd/blob/3a18c0e5f2e4d8d46f3fd11cd0e421f52e727b0d/src/core/mount.c#L1023

An `EROFS` here would've made it much more obvious.

Anyway, let's just add an `After=` in the mount unit to fix this. An
alternative would've been to create the directory from Ignition (which
it would've done automatically for us if we used it to set up the
filesystem and mount unit, but we can't do this for non-root multipath,
because $reasons).

Long-term fix for this is ostreedev/ostree#2187.
jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Sep 22, 2021
In the `multipath.partition` test, we mount a multipath device on
`/var/lib/containers`, and rely on systemd to create the mountpoint. But
because we we weren't deterministically ordered against
`ostree-remount.service`, in rare cases we can run approximately at the
same time and systemd will try to create the mountpoint during that
short window of time when `/var` is read-only. So then the test failed
like this:

```
systemd[1]: Found device /dev/disk/by-label/dm-mpath-containers.
systemd[1]: var-lib-containers.mount: Failed to check directory /var/lib/containers: No such file or directory
systemd[1]: Mounting Mount /var/lib/containers...
mount[983]: mount: /var/lib/containers: mount point does not exist.
systemd[1]: var-lib-containers.mount: Mount process exited, code=exited, status=32/n/a
systemd[1]: var-lib-containers.mount: Failed with result 'exit-code'.
systemd[1]: Failed to mount Mount /var/lib/containers.
```

It took me a while to figure out this was the issue, because systemd
ignores errors from `mkdir`:

https://github.com/systemd/systemd/blob/3a18c0e5f2e4d8d46f3fd11cd0e421f52e727b0d/src/core/mount.c#L1023

An `EROFS` here would've made it much more obvious.

Anyway, let's just add an `After=` in the mount unit to fix this. An
alternative would've been to create the directory from Ignition (which
it would've done automatically for us if we used it to set up the
filesystem and mount unit, but we can't do this for non-root multipath,
because $reasons).

Long-term fix for this is ostreedev/ostree#2187.
@jlebon
Copy link
Member

jlebon commented Sep 22, 2021

What is the status of this issue?

I'm running into what looks very much like fedora-coreos issue #746 on stable FCOS releases 34.20210529.3.0 and 34.20210821.3.0 but works on FCOS Testing release 34.20210904.2.0. Is it possible this has been merged to testing but not yet stable?

Interesting. It's not in any Fedora CoreOS release, so something else must be at play. The latest stable release from this week (34.20210904.3.0) is a promotion of 34.20210904.2.0, so I'd give that a try. If it still doesn't work, please file an issue on the FCOS tracker.

jlebon added a commit to coreos/coreos-assembler that referenced this pull request Sep 22, 2021
In the `multipath.partition` test, we mount a multipath device on
`/var/lib/containers`, and rely on systemd to create the mountpoint. But
because we we weren't deterministically ordered against
`ostree-remount.service`, in rare cases we can run approximately at the
same time and systemd will try to create the mountpoint during that
short window of time when `/var` is read-only. So then the test failed
like this:

```
systemd[1]: Found device /dev/disk/by-label/dm-mpath-containers.
systemd[1]: var-lib-containers.mount: Failed to check directory /var/lib/containers: No such file or directory
systemd[1]: Mounting Mount /var/lib/containers...
mount[983]: mount: /var/lib/containers: mount point does not exist.
systemd[1]: var-lib-containers.mount: Mount process exited, code=exited, status=32/n/a
systemd[1]: var-lib-containers.mount: Failed with result 'exit-code'.
systemd[1]: Failed to mount Mount /var/lib/containers.
```

It took me a while to figure out this was the issue, because systemd
ignores errors from `mkdir`:

https://github.com/systemd/systemd/blob/3a18c0e5f2e4d8d46f3fd11cd0e421f52e727b0d/src/core/mount.c#L1023

An `EROFS` here would've made it much more obvious.

Anyway, let's just add an `After=` in the mount unit to fix this. An
alternative would've been to create the directory from Ignition (which
it would've done automatically for us if we used it to set up the
filesystem and mount unit, but we can't do this for non-root multipath,
because $reasons).

Long-term fix for this is ostreedev/ostree#2187.
@lucab
Copy link
Member

lucab commented Oct 12, 2021

This currently fails in FCOS since igniton-files can't run useradd; still debugging

For reference, I think this was failing because there was the following leftover code:

  /* Default to true, but in the systemd case, default to false because it's handled by
   * ostree-system-generator. */
  bool mount_var = true;
#ifdef HAVE_SYSTEMD_AND_LIBMOUNT
  mount_var = false;
#endif

I've force-pushed the branch behind this PR to rebase on current main and to actually perform the RW bind-mount of var.

That plus coreos/fedora-coreos-config#1270 brought me to a booting image.
Initial boot seems fine, but still need to do further manual poking.

Copy link
Member Author

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for working on this!

src/switchroot/ostree-prepare-root.c Outdated Show resolved Hide resolved
src/switchroot/ostree-prepare-root.c Show resolved Hide resolved
src/switchroot/ostree-prepare-root.c Outdated Show resolved Hide resolved
src/switchroot/ostree-prepare-root.c Show resolved Hide resolved
src/switchroot/ostree-prepare-root.c Show resolved Hide resolved
@lucab lucab force-pushed the sysroot-ro-initramfs branch 3 times, most recently from d32dc2d to d813818 Compare October 25, 2021 07:14
@lucab lucab changed the title WIP: prepare-root: Set up sysroot readonly in initramfs prepare-root: Set up sysroot readonly in initramfs Oct 25, 2021
@lucab
Copy link
Member

lucab commented Oct 25, 2021

This should be ready for review now.

Copy link
Member Author

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking quite good, I just have a few minor nits. Thanks again for working on this!

src/switchroot/ostree-prepare-root.c Outdated Show resolved Hide resolved
src/switchroot/ostree-prepare-root.c Show resolved Hide resolved
src/switchroot/ostree-prepare-root.c Show resolved Hide resolved
src/switchroot/ostree-prepare-root.c Outdated Show resolved Hide resolved
src/switchroot/ostree-prepare-root.c Outdated Show resolved Hide resolved
/* Link to the deployment's /var */
if (mount_var)
{
if (snprintf (srcpath, sizeof(srcpath), "%s/../../var", deploy_path) < 0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that the path is different here; the previous code used a relative path:
mount ("../../var", "var", NULL, MS_BIND, NULL)
does that have any effect on the visible result in e.g. findmnt?

Just curious if you changed this intentionally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was semi-intentional, as I was trying to make it work initially by using absolute paths for my own sanity/readability.
It doesn't have any visible side-effects, as the kernel performs path normalization on mount anyway.
In order to reduce the scope of changes in this PR, I've kept using relative paths everywhere.

Let's ensure things are right from the start in the initramfs;
this closes off various race conditions.  Followup to
ostreedev@3564225

Closes: ostreedev#2115
@lucab
Copy link
Member

lucab commented Nov 4, 2021

I've drained all the drive-by changes into separate PRs (all already merged: #2471, #2472, #2473, #2475), and adopted a few other suggestions from previous review.
Fully rebased on main now and ready to go.

@cgwalters
Copy link
Member Author

I've drained all the drive-by changes into separate PRs (all already merged: #2471, #2472, #2473, #2475), and adopted a few other suggestions from previous review.

Thanks so much for doing that! In the end, I am fairly hopeful this change will sail through regression free, but if we discover a problem at least now it's easy to revert just this last bit while keeping all the cleanups, etc.


So, I just now discovered because I'm listed as the original author, I can't approve this PR. Which is clearly a gap in the system. Anyways, from my PoV feel free to approve the PR!

Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice and clean. Thanks a lot for working on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable sysroot.readonly at initrd time
5 participants