Ensure that efivarfs is mounted in the container #302

Merged
merged 1 commit into containers:main on Feb 12, 2024

Conversation

@bcrochet (Contributor) commented Feb 6, 2024

Especially on ARM, which utilizes UEFI for booting in most cases, it is important that /sys/firmware/efi/efivars be mounted and populated; otherwise bootc will fail to complete a to-filesystem installation.

This patch attempts a mount as long as the host's efivars directory has an entry, signifying that the system is at least capable of UEFI.

Note that it is sufficient to simply attempt to mount efivars. If it's already mounted elsewhere, the attempt triggers the mount to be made at the /sys location.

Fixes #291
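
For illustration, the shape of the check is roughly the following (a minimal sketch using std::process::Command rather than the project's own helpers; the function name is illustrative):

use std::process::Command;

// Sketch: if the host's efivars directory exists and is non-empty, attempt
// the mount. Per the description above, the attempt itself is what matters;
// the review below revisits whether the mount error should be ignored.
fn mount_efivars_if_present() -> std::io::Result<()> {
    let efivars = "/proc/1/root/sys/firmware/efi/efivars";
    if !std::path::Path::new(efivars).try_exists()? {
        // No efivars on the host: not a UEFI system, nothing to do.
        return Ok(());
    }
    if std::fs::read_dir(efivars)?.next().is_none() {
        // Present but unpopulated.
        return Ok(());
    }
    let _ = Command::new("mount")
        .args(["-t", "efivarfs", "efivarfs", "/sys/firmware/efi/efivars"])
        .status()?;
    Ok(())
}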


openshift-ci bot commented Feb 6, 2024

Hi @bcrochet. Thanks for your PR.

I'm waiting for a containers member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@github-actions github-actions bot added the area/install label (Issues related to `bootc install`) on Feb 6, 2024
tracing::debug!("Setting up sys mounts");
// First of all, does efivars even exist in the host? If not, we are
// not dealing with an EFI system
let _ = rustix::fs::lstat("/proc/1/root/sys/firmware/efi/efivars")?;
Collaborator:

This will cause us to hard fail though if it doesn't exist, which is not what we want.

Contributor Author:

Right. Forgot to test that case on a BIOS system.

let _ = rustix::fs::lstat("/proc/1/root/sys/firmware/efi/efivars")?;

// Now, let's find out if it's populated
let od = rustix::fs::open(
Collaborator:

So far in this codebase we typically use std::fs unless there's a reason to use rustix.

Contributor Author:

Understood.

// Now, let's find out if it's populated
let od = rustix::fs::open(
"/proc/1/root/sys/firmware/efi/efivars",
rustix::fs::OFlags::DIRECTORY,
Collaborator:

In particular, this is missing OFlags::CLOEXEC, which is generally a good idea. While this codebase isn't security sensitive in general, see e.g. https://github.com/opencontainers/runc/releases/tag/v1.1.12, which was a bad CVE in runc caused by leaking fds and handling them incorrectly.
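
For illustration, the same open with CLOEXEC added might look roughly like this (a sketch assuming the rustix open(path, flags, mode) API used above; the wrapper function name is illustrative):

use rustix::fs::{open, Mode, OFlags};

// Sketch: add CLOEXEC so the directory fd is not inherited across exec.
fn open_host_efivars() -> rustix::io::Result<rustix::fd::OwnedFd> {
    open(
        "/proc/1/root/sys/firmware/efi/efivars",
        OFlags::DIRECTORY | OFlags::CLOEXEC | OFlags::RDONLY,
        Mode::empty(),
    )
}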

Comment on lines 942 to 943
let dir_populated = rustix::fs::Dir::read_from(od)?.next().is_some();
if dir_populated {
Collaborator:

Combining the above I think something just like:

let efivars = "/proc/1/root/sys/firmware/efi/efivars";
if !Path::new(efivars).try_exists()? {
    return Ok(());
}
if std::fs::read_dir(efivars)?.next().is_none() {
    return Ok(());
}
...rest of code

or so?

Collaborator:

Another option is https://docs.rs/cap-std-ext/latest/cap_std_ext/dirext/trait.CapStdExtDirExt.html#tymethod.open_dir_optional which we use in a few places in this project, and is intended for this use case.
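
For illustration, that helper could be used roughly like this (a sketch; the Dir handle for the host's /sys and the function name are illustrative):

use anyhow::Result;
use cap_std::fs::Dir;
use cap_std_ext::dirext::CapStdExtDirExt;

// Sketch: open_dir_optional returns Ok(None) when the directory is absent,
// which covers the "not an EFI system" case without a hard failure.
fn host_efivars_populated(host_sys: &Dir) -> Result<bool> {
    let efivars = match host_sys.open_dir_optional("firmware/efi/efivars")? {
        Some(d) => d,
        None => return Ok(false),
    };
    Ok(efivars.entries()?.next().is_some())
}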

// This looks to be a workaround for a problem elsewhere. Not sure if
// it's in podman, but just attempting to mount this is enough to trigger
// it being populated even when there is an existing mount.
// Given that, we will just try to mount and ignore the result.
Collaborator:

I don't quite understand the rationale behind ignoring the error. Is it erroring out now?

Contributor Author:

I will double-check tomorrow. But what I think I was seeing today was that when bootc install to-filesystem --replace=alongside /target was run, on the first invocation /sys/firmware/efi/efivars was unpopulated, but {/proc/1/root,/target}/sys/firmware/efi/efivars were populated, and the mount command failed, saying "efivarfs already mounted at /target/sys/firmware/efi/efivars".

But once that was run, and failed out, "magically" /sys/firmware/efi/efivars was now populated, and the bootc run would complete without error.

I'm totally game with working on this further to make it more resilient. I also think this is some kind of bug elsewhere. I'm just not sure where to point the finger.

So, my rationale here was that it didn't matter if the mount command itself succeeded, but just that it was run.

Collaborator:

FWIW I did verify that adding -v /sys:/sys:rslave to our podman run invocation gets us the efivars mount (as expected) alongside others that we also currently mount manually like selinuxfs.

and the mount command fails, saying "efivarfs already mounted at /target/sys/firmware/efi/efivars".

That's weird indeed. In a quick test I can't reproduce that with a simple

$ podman run --rm -ti --privileged --pid=host quay.io/fedora/fedora:39
# mount -t efivarfs efivars /sys/firmware/efi/efivars

Probably worth trying to dig into what's different about what we're doing here.

Contributor Author:

The proper direction may then be to just do a doc change for this while we investigate the origin of the error.

Is it perhaps ARM specific?

Contributor Author:

Also, I think the other key is that -v /:/target is used. As efivarfs will be mounted in /target.

And of course I'm now unable to reproduce this scenario. 🙄

I'll update this patch with all of the suggestions and stop ignoring this error, and we can go from there.

Collaborator:

Also, I think the other key is that -v /:/target is used. As efivarfs will be mounted in /target.

Ahh yes, tricky. In the case that we have the full target rootfs, we could in our container simply do e.g. mount --rbind /target/sys /sys or so.

Actually, for this "alongside" case it would definitely help if we didn't need to pass in -v /:/target at all and instead got the root filesystem by jumping out of the mount namespace.
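
For illustration, the --rbind variant mentioned above could be as small as the following (a sketch; the function name is illustrative and error handling is minimal):

use std::process::Command;

// Sketch: recursively bind the target rootfs's /sys over the container's /sys,
// which would bring efivarfs (and friends like selinuxfs) along with it.
fn rbind_target_sys() -> anyhow::Result<()> {
    let status = Command::new("mount")
        .args(["--rbind", "/target/sys", "/sys"])
        .status()?;
    anyhow::ensure!(status.success(), "mount --rbind /target/sys /sys failed");
    Ok(())
}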

@cgwalters (Collaborator):

/ok-to-test

Comment on lines +884 to +887
let inspect = crate::mount::inspect_filesystem(path);
if inspect.is_ok() {
Collaborator:

This is OK as is, but is_ok() is a pattern to try to avoid as a general rule because it "swallows" all forms of errors (if e.g. the findmnt binary failed to be executed, not just if the target isn't a mountpoint).

Unfortunately detecting if something is a mountpoint is surprisingly tricky (we use a helper ostree_ext::mountutil::is_mountpoint in other places here). It looks like we could extend our findmnt wrapper to support passing -T/--target, which would always succeed (in general), and then we can check if the filesystem type is efivarfs.
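
For illustration, the -T/--target idea looks roughly like this when calling findmnt directly (a sketch; the project would extend its existing findmnt wrapper instead, and the function name is illustrative):

use std::process::Command;

// Sketch: findmnt -T reports the mount containing `path` even when `path`
// itself is not a mountpoint, so we can simply compare the filesystem type.
fn is_efivarfs_mounted_at(path: &str) -> anyhow::Result<bool> {
    let out = Command::new("findmnt")
        .args(["-n", "-o", "FSTYPE", "-T", path])
        .output()?;
    anyhow::ensure!(out.status.success(), "findmnt failed");
    Ok(String::from_utf8_lossy(&out.stdout).trim() == "efivarfs")
}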

}

// This means the host has this mounted, so we should mount it too
tracing::debug!("mounting efivarfs");
Collaborator:

Very minor nit, I doubt we need this debug invocation because Task::new_and_run will already show something like this with the tracing::debug!("exec: {cmd:?}");

Comment on lines +863 to +866
#[context("Ensuring sys mounts")]
pub(crate) fn setup_sys_mounts() -> Result<()> {
Collaborator:

This makes sense as a name, though let's add some commentary here, something like:

/// By default, podman/docker etc. when passed `--privileged` mount `/sys` as read-only, but non-recursively.  We selectively grab sub-filesystems that we need.

I think your naming is right though because it'd make sense to also handle selinuxfs exactly this same way. (Don't need to do it in this PR though)
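
Putting the suggested doc comment together with the signature under review gives roughly the following (a sketch; the body is elided):

/// By default, podman/docker etc. when passed `--privileged` mount `/sys` as
/// read-only, but non-recursively. We selectively grab sub-filesystems that
/// we need, e.g. efivarfs (and potentially selinuxfs handled the same way).
#[context("Ensuring sys mounts")]
pub(crate) fn setup_sys_mounts() -> Result<()> {
    // ... mount efivarfs here when the host has it populated ...
    Ok(())
}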

@cgwalters (Collaborator):

/ok-to-test

@cgwalters (Collaborator) left a comment:

This is OK as is, we can do other fixes as followups. Does this solve the problem in your testing?

@bcrochet (Contributor Author):

This is OK as is, we can do other fixes as followups. Does this solve the problem in your testing?

So far, yes.

I'll do the cleanups suggested.

Especially on ARM, which utilizes UEFI for booting in most cases, it is
important that /sys/firmware/efi/efivars be mounted and populated;
otherwise bootc will fail to complete a to-filesystem installation.

This patch attempts a mount as long as the host's efivars directory has
an entry, signifying that the system is at least capable of UEFI.

Note that it is sufficient to just attempt to mount efivars. If it's
already mounted elsewhere, it triggers the mount to be made at the /sys
location.

Fixes containers#291

Signed-off-by: Brad P. Crochet <brad@redhat.com>
@cgwalters cgwalters merged commit e4c2971 into containers:main Feb 12, 2024
11 checks passed
bcrochet added a commit to bcrochet/bootc that referenced this pull request Feb 14, 2024
As a follow-on to containers#302, we want to also mount the selinuxfs special
filesystem if the host also has that filesystem mounted.

Related containers#303

Signed-off-by: Brad P. Crochet <brad@redhat.com>
Labels
area/install (Issues related to `bootc install`), ok-to-test
Development

Successfully merging this pull request may close these issues.

Run bootc install failed in AWS EC2 ARM instance with error "Failed to invoke efibootmgr"
2 participants