
Design tooling to ship bootloader updates #510

Closed
cgwalters opened this issue Jun 1, 2020 · 18 comments
Labels
jira (for syncing to jira), kind/enhancement

Comments

@cgwalters
Member

cgwalters commented Jun 1, 2020

Moving this from ostreedev/ostree#1873

We've already hit problems in Silverblue with people having absolutely ancient GRUB2 UEFI binaries, see fwupd/fwupd#2084 (comment)

For FCOS today the bootloader is written at disk image build time: https://github.com/coreos/coreos-assembler/blob/40c6d44497056b6af308ad7c7c9298a0ead3e975/src/create_disk.sh#L318
And then there is no mechanism we ship to update it, whether automatically or manually.

One thing that is nice is that since we shipped the aleph version marker we can at least reliably identify the versions of those binaries. (It would also be useful to have tooling that checksums them and attempts to identify e.g. the RPM package version they came from.)
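As a rough sketch of what that identification tooling could look like (everything below is hypothetical: it shells out to sha256sum and matches against a hand-maintained digest-to-package table rather than any real database):

    use std::collections::HashMap;
    use std::process::Command;

    // Hypothetical digest -> "package this binary shipped in" table; in practice
    // this would be generated from the RPMs themselves, not hardcoded.
    fn known_digests() -> HashMap<&'static str, &'static str> {
        let mut m = HashMap::new();
        m.insert(
            "0000000000000000000000000000000000000000000000000000000000000000",
            "grub2-efi-x64-2.04-12.fc32 (placeholder)",
        );
        m
    }

    // Checksum a file by shelling out to sha256sum, to keep this sketch free of
    // extra crates.
    fn sha256_of(path: &str) -> Option<String> {
        let out = Command::new("sha256sum").arg(path).output().ok()?;
        if !out.status.success() {
            return None;
        }
        let text = String::from_utf8(out.stdout).ok()?;
        Some(text.split_whitespace().next()?.to_string())
    }

    fn main() {
        let table = known_digests();
        for path in std::env::args().skip(1) {
            match sha256_of(&path) {
                Some(digest) => match table.get(digest.as_str()) {
                    Some(pkg) => println!("{}: {}", path, pkg),
                    None => println!("{}: unknown digest {}", path, digest),
                },
                None => eprintln!("{}: failed to checksum", path),
            }
        }
    }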

I think my strawman is something like this:

  • Write a new (theoretically distribution independent-ish) tool that takes ownership of this problem space, let's call it bootupd because we have lots of imagination
  • Write it in Rust, use e.g. https://github.com/coreos/afterburn/ as an example stub project
  • bootupd can be passed a filesystem tree containing expected binaries (for UEFI) and replaces the bits in create_disk.sh to write them to disk to start, and includes metadata about them, and also knows how to wrap grub2-install as needed for the MBR
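As a rough illustration of that last bullet, here is a minimal sketch of the "take a filesystem tree and write it into the ESP" step (the paths are assumptions, and a real implementation needs much more: metadata, MBR handling via grub2-install, error recovery):

    use std::fs;
    use std::io;
    use std::path::Path;

    // Recursively copy a tree of EFI binaries from `src` (a payload shipped in
    // the OS content) into `dst` (a mounted ESP). Returns the number of files
    // copied. Purely illustrative.
    fn copy_tree(src: &Path, dst: &Path) -> io::Result<u64> {
        let mut copied = 0;
        fs::create_dir_all(dst)?;
        for entry in fs::read_dir(src)? {
            let entry = entry?;
            let target = dst.join(entry.file_name());
            if entry.file_type()?.is_dir() {
                copied += copy_tree(&entry.path(), &target)?;
            } else {
                fs::copy(entry.path(), &target)?;
                copied += 1;
            }
        }
        Ok(copied)
    }

    fn main() -> io::Result<()> {
        // Both paths are assumptions for the sketch.
        let payload = Path::new("/usr/lib/bootupd/updates/EFI");
        let esp = Path::new("/boot/efi/EFI");
        let n = copy_tree(payload, esp)?;
        // Record what was written so a later update can compare versions.
        fs::write(esp.join("bootupd-state.txt"), format!("files={}\n", n))?;
        Ok(())
    }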

Now, an important thing that needs to be discussed here is at what cadence we apply bootloader updates. It might be OK to simply apply them whenever the distribution ships them, but we might also want to make that optional, because (without significant engineering effort) it is likely to be a "don't turn off your computer right now" event.

It might be very useful for bootupd to define its own little "upgrade graph" model: at least the ability to only apply updates if a bootloader is too old, rather than on every update.

And client-side this should be configurable (and perhaps off by default).
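A toy sketch of that "only if too old" ratchet check, with made-up version strings and a made-up floor (this does not reflect any actual bootupd policy):

    // A toy "ratchet": apply the pending bootloader payload only when the
    // installed version is older than some required floor, not on every OS
    // update. The version parsing here is deliberately crude.
    fn version_fields(v: &str) -> Vec<u32> {
        v.split(|c: char| !c.is_ascii_digit())
            .filter(|s| !s.is_empty())
            .map(|s| s.parse().unwrap_or(0))
            .collect()
    }

    fn needs_update(installed: &str, floor: &str) -> bool {
        // Vec<u32> compares lexicographically, which is good enough for a sketch.
        version_fields(installed) < version_fields(floor)
    }

    fn main() {
        // Hypothetical version strings.
        let installed = "grub2-efi-x64-2.02-81.fc30";
        let floor = "grub2-efi-x64-2.04-31.fc33";
        if needs_update(installed, floor) {
            println!("bootloader too old; scheduling an update");
        } else {
            println!("bootloader new enough; skipping");
        }
    }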

Probably the simplest thing to start with is for bootupd to take as input a filesystem tree, a lot like /usr/lib/ostree-boot, rather than (like fwupd and dnf/rpm-ostree) defining its own mechanism for retrieving content over HTTP etc.

For FCOS/RHCOS we can then e.g. choose to pin the bootloaders separately from the main ostree content, or ship them as they come in.

@ashcrow
Member

ashcrow commented Jun 1, 2020

I think my strawman is something like this:

  • Write a new (theoretically distribution independent-ish) tool that takes ownership of this problem space, let's call it bootupd because we have lots of imagination
  • Write it in Rust, use e.g. https://github.com/coreos/afterburn/ as an example stub project
  • bootupd can be passed a filesystem tree containing expected binaries (for UEFI) and replaces the bits in create_disk.sh to write them to disk to start, and includes metadata about them, and also knows how to wrap grub2-install as needed for the MBR

This sounds good to me!!

Probably the simplest thing to start with is for bootupd to take as input a filesystem tree, a lot like /usr/lib/ostree-boot, rather than (like fwupd and dnf/rpm-ostree) defining its own mechanism for retrieving content over HTTP etc.

Agreed. If there ends up being a benefit from defining other delivery mechanisms it could be added later.

We're going to have to do some work here if we want to ship these kinds of updates. My only question is about reuse versus owning this end-to-end. Would integrating fwupd at any level end up adding more work to accomplish this? Put another way: if we provided a firmware update via the proper RHCOS/FCOS delivery mechanism and then used the expected metadata format pointing to the local location (I didn't check whether file:// is valid, but there are workarounds), does that help? Or should we directly own the process and code start-to-finish, as that would be a better solution for CoreOS-style systems?

@cgwalters
Member Author

It definitely makes sense to touch base with fwupd. Maybe they'd agree to own this space, and it would make some sense; it gets interesting then because fwupd would be updating itself.

I believe ChromeOS ships fwupd, but they seem to use different tooling for updating the bootloader:
https://www.chromium.org/chromium-os/firmware-porting-guide/2-concepts

Another huge sub-thread in this is that we may need to try to convert traditional RPM (and other) systems over to using this tool too, otherwise it will pile onto the delta carried here.

@ashcrow
Member

ashcrow commented Jun 1, 2020

Got it. Sounds good. I think starting with the minimal set of functionality, i.e.:

  1. The update content is delivered to the system in $WAY and is dropped in $LOCATION
  2. bootupd applies the content from $LOCATION based on a simple graph
  3. Reboot occurs

is small enough not to write us into a corner as we continue to explore this idea, while still being helpful to those who would like this functionality and could help us test/verify 😄

@cgwalters
Member Author

cgwalters commented Jun 1, 2020

It's super tempting to do something just to update UEFI, because that boils down to copying files. But the problem with that is it won't help us when, at some point, we want to use a feature in our grub.cfg that isn't supported by e.g. the GRUB in the MBR, and GRUB-in-MBR is used for e.g. OpenShift 4.1 bootimages in big cases like most public clouds. That of course circles into openshift/enhancements#201, and that's one thread here: we can say anyone who wants bootloader updates needs to use new bootimages (reprovisioning existing nodes in place).

@dustymabe
Member

It definitely makes sense to touch base with fwupd. Maybe they'd agree to own this space, and it would make some sense; it gets interesting then because fwupd would be updating itself.

👍 - I'm not sure whether it makes sense to the maintainers of LVFS/fwupd, but having one tool handle it could make a lot of sense. It's already made a lot of progress at being cross-distro and universally accepted. Maybe the distros could plug into a universal model for where to ship the files via package management, and then fwupd applies them, or something.

@cmurf

cmurf commented Jun 2, 2020

It's a good opportunity to break with two unfortunate behaviors: the nested mount at /boot/efi, and the persistent mounting of it. The latter is hard to justify security-wise; there's no reason for it to hang around mounted all the time. Whatever is responsible for updating things related to the bootloader should be able to mount it, modify it, and unmount it. As to where to mount it, maybe somewhere in /run. I think this should be the behavior out of the gate in the 1.0 version.
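A sketch of that transient-mount workflow, assuming a by-partlabel device path and a mountpoint under /run (both are assumptions, and error handling is minimal):

    use std::fs;
    use std::process::Command;

    // Run an external command and turn a non-zero exit into an io::Error.
    fn run(cmd: &str, args: &[&str]) -> std::io::Result<()> {
        let status = Command::new(cmd).args(args).status()?;
        if !status.success() {
            return Err(std::io::Error::new(
                std::io::ErrorKind::Other,
                format!("{} {:?} exited with {}", cmd, args, status),
            ));
        }
        Ok(())
    }

    fn main() -> std::io::Result<()> {
        // Both of these are assumptions for the sketch.
        let esp_device = "/dev/disk/by-partlabel/EFI-SYSTEM";
        let mountpoint = "/run/bootupd/esp";
        fs::create_dir_all(mountpoint)?;
        run("mount", &[esp_device, mountpoint])?;
        // ... copy the updated bootloader files into `mountpoint` here ...
        let update_result: std::io::Result<()> = Ok(());
        // Unmount regardless of whether the update itself succeeded, so the ESP
        // never stays mounted longer than the update takes.
        let _ = run("umount", &[mountpoint]);
        update_result
    }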

For 2.0 I think it should be multiple-device aware, and capable of properly syncing the drives, e.g. the RAID1 use case, where both drives need bootloaders and updates.

I'm uncertain about resolving the differences between the sd-boot and Fedora blscfg.mod BLS paradigms; neither is accepted in upstream GRUB, which is also unfortunate. This messiness has made for extra work for the RH/Fedora boot folks.

One thing that is going to get a bit better in the F33 GRUB is no longer depending on grubenv to store variables like kernel arguments. They'll go in the BLS snippets directly, which is more like the sd-boot/upstream BLS.

@martinezjavier

  • Write a new (theoretically distribution independent-ish) tool that takes ownership of this problem space, let's call it bootupd because we have lots of imagination
  • Write it in Rust, use e.g. https://github.com/coreos/afterburn/ as an example stub project

We discussed the same idea with @vathpela and @gicmo in the past, and even said that Rust was likely the saner option, so our goals are very much aligned.

Another huge sub-thread in this is that we may need to try to convert traditional RPM (and other) systems over to using this tool too, otherwise it will pile onto the delta carried here.

I think that makes sense for traditional Fedora as well, since it's needed for legacy BIOS x86_64 as you said, but also for ppc64le, which has a similar setup. We are currently not updating the bootloader for those platforms in non-ostree-based variants either.

Even for EFI it would be good to decouple the package installation from the ESP update, for example (as @cmurf said) to only mount the ESP when it needs to be updated. It would also allow other future improvements, like an A/B update mechanism for the bootloader, since files wouldn't be copied directly to the ESP as part of a package update, overwriting the existing EFI binaries.

Maybe it can also be used for s390x, since the bootmap has to be updated on each kernel update by calling the zipl tool. It's true that this is not updating the bootloader per se, but it's still an action that needs to be taken in order to update the bootloader configuration.

Currently that is done by the ostree zipl backend, but that doesn't feel completely right. This tool might take care of it and allow a sysroot for s390x to just be configured with bootloader=none instead of requiring bootloader=zipl.

@martinezjavier

I'm uncertain about resolving the differences between the sd-boot and Fedora blscfg.mod BLS

I don't think this tool should care about BLS at all, since ostree handles that; it should only care about the things that need to be updated that are not part of the ostree deployment transaction.

paradigms; neither is accepted in upstream GRUB, which is also unfortunate. This messiness has made for extra work for the RH/Fedora boot folks.

Yes, it's unfortunate that upstream GRUB doesn't like the BLS. The last time we discussed this, the maintainers said that maybe we could define a BLSv2 that better aligns with GRUB's configuration and features. I still don't think upstreaming the blscfg module is a lost battle, but that isn't really relevant to this discussion in my opinion.

One thing that is going to get a bit better in the F33 GRUB is no longer depending on grubenv to store variables like kernel arguments. They'll go in the BLS snippets directly, which is more like the sd-boot/upstream BLS anyway.

That's correct; using the grubenv to store the cmdline caused more harm than good. But keep in mind that this was only used for traditional Fedora, since ostree manages its own BLS snippets, so only the blscfg module from the GRUB package is needed there.

@cgwalters
Member Author

Maybe can also be used for s390x since the bootmap has to be updated on each kernel update by calling the zipl tool. ... Currently that is done by the ostree zipl backend but that doesn't feel completely right. This tool might take care of that and allow a sysroot for s390x to just be configured with bootloader=none instead of requiring bootloader=zipl.

If it involves changing the kernel, that is ostree's (or traditional rpm/yum's) domain. So I would strongly prefer the status quo of bootloader=zipl over teaching ostree how to execute bootupd, since that would tie them together when they should be independent.

@martinezjavier

So I would strongly prefer the status quo of bootloader=zipl over teaching ostree how to execute bootupd, since that would tie them together when they should be independent.

Right, I guess bootloader=none is only suitable for the case when just writing BLS snippets is enough and no other action is needed for the bootloader to parse the new configuration.

@dsd

dsd commented Jun 3, 2020

For FCOS today the bootloader is written at disk image build time:
Write a new (theoretically distribution independent-ish) tool that takes ownership of this problem space

It sounds like you're defining "problem space" here to mean that the image builder writes the bootloader and then there's no way of updating that part later.

There is an alternative, wider view: the problem space is that the image builder makes a series of decisions and takes actions that affect what is written within the installation (beyond the deployment of the ostree), and those aspects may need to be updated later. The bootloader is one such element within that wider problem space.

This is an approach I have been exploring here: Updating the OS data that ostree doesn't manage

Maybe this is not such a pressing issue on your side at the moment, given that (as discussed there) the reprovision-on-every-boot approach works well for cloud setups, and the Silverblue image build is currently not much more than deploying an ostree and installing a bootloader anyway. But I feel inclined to mention it, given that it's largely a Silverblue (non-cloud) case that has put this topic on the table again, plus our past experience at Endless may have relevance for Silverblue going forward. We originally started by treating the problem as a bootloader-update thing, but then we enjoyed an uptick in userbase and accumulated a bunch of product history along the way, and now we firmly find ourselves with a larger problem to solve.

Those discussion points aside, there is probably no fundamental incompatibility between a separate bootupd being envisioned here, and the type of wider solution I'm considering for Endless.

However, one lesson learned along this journey is that you should work hard to avoid having a separate codepath for installation vs. update; that is a recipe for failure. So when building images, the bootloader installation should be done by a call into this new solution. Don't end up with image-builder code that installs the bootloader being maintained separately from the bootupd solution that puts the updates in place later.

From that angle, if bootupd means "boot updater" then that's an imperfection in the name, since it would also be used for installing a bootloader on a blank disk; it's not only the update case. (Or does it mean "bootup daemon"?) Also, if it does handle both installation-on-blank and updating, then that's a bit of a conceptual difference from fwupd.

In terms of this being a distro-agnostic, generic thing: in addition to supporting GRUB EFI & MBR, it would be great if the underlying design allowed it to be cleanly extended to also support systemd-boot, Raspberry Pi (i.e. blobs on a special non-ESP FAT partition), and other ARM solutions with good mainline kernel/bootloader support, like Rockchip and Amlogic, that need special blobs written onto specific sectors at the start of the disk.
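For that "blobs at fixed sectors" class of platforms the core operation is just a positioned raw write; a sketch under assumed device paths and offsets (nothing here is a real platform requirement):

    use std::fs::OpenOptions;
    use std::io;
    use std::os::unix::fs::FileExt;

    // Write a bootloader blob at a fixed byte offset on a block device, roughly
    // the operation Rockchip/Amlogic-style platforms need. Illustrative only.
    fn write_blob_at(device: &str, offset: u64, blob: &[u8]) -> io::Result<()> {
        let dev = OpenOptions::new().write(true).open(device)?;
        dev.write_all_at(blob, offset)?;
        dev.sync_all()?;
        Ok(())
    }

    fn main() -> io::Result<()> {
        // Hypothetical: a u-boot image placed at sector 64 (64 * 512 bytes).
        let blob = std::fs::read("/usr/lib/bootupd/updates/u-boot.img")?;
        write_blob_at("/dev/mmcblk0", 64 * 512, &blob)
    }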

@cgwalters
Member Author

Thanks for replying dsd! I agree with all of your points generally, and I want to reply to one:

From that angle, if bootupd means "boot updater" then that's an imperfection in the name, since it would also be used for installing a bootloader on a blank disk; it's not only the update case. (Or does it mean "bootup daemon"?) Also, if it does handle both installation-on-blank and updating, then that's a bit of a conceptual difference from fwupd.

See this comment in the original issue:

bootupd can be passed a filesystem tree containing expected binaries (for UEFI) and replaces the bits in create_disk.sh to write them to disk to start, and includes metadata about them, and also knows how to wrap grub2-install as needed for the MBR

So yes we're totally in sync that bootupd would need to be responsible for wrapping the installation as well.

cgwalters added a commit to cgwalters/rpm-ostree that referenced this issue Jun 20, 2020
@cgwalters
Member Author

Some initial work on this in https://github.com/coreos/bootupd/

@jlebon
Member

jlebon commented Aug 5, 2020

We discussed the Boot Hole vulnerability today in the community meeting, and it came up that OSTree doesn't update the bootloader because it can't do so atomically on FAT. I had a chat with @vathpela afterwards who said it might actually be possible. Pasting logs:

<jlebon> re. FAT, did you mean that it is actually possible to have e.g. an atomic rename(2) ?
<pjones> atomic rename(2) is hard
<pjones> actually it might not be that hard.
<pjones> I'll have to look.
<pjones> atomic /copy/ definitely should be doable.
<jlebon> yeah, though i guess it doesn't help if it's multiple files that need to be updated
<pjones> so the thing you do get 100% atomically write(fd, buf, 512) after an lseek(fd, offset_aligned_to_512,
         SEEK_SET)
<pjones> which means if nothing else we can make /boot/efi/EFI/fedora/a/ and /boot/efi/EFI/fedora/b/ and
         literally a text file /boot/efi/EFI/fedora/lng
<pjones> and lng (for "last known good") say "a" or "b" in it.
<pjones> or if not lng, "current"
<pjones> (I forget which model you need for this)
<jlebon> hmm, how would this work with the EFI firmware though?
<pjones> or even .. you have something like serialized version numbers right?
<pjones> so the trick there is that when we install, we create both of those directories
<jlebon> like, how would it know to look at `current`, then find the files at `/boot/efi/EFI/fedora/$current`
<pjones> and we create boot entries for both of them, and put them both in the boot order
<pjones> updates to BootOrder *should* be atomic so long as the number of entries in the order doesn't change,
         but there might be some work in linux and userland we need to do in order to make that true
<pjones> and then we make something in the early startup check BootCurrent to see which one we booted
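A sketch of the 512-byte pointer-file flip pjones describes above (paths and semantics are assumptions; whether the write is truly atomic end-to-end depends on the FAT driver and firmware behavior, as the log notes):

    use std::fs::OpenOptions;
    use std::io;
    use std::os::unix::fs::FileExt;

    // Flip a pre-created, 512-byte /boot/efi/EFI/fedora/lng file between "a" and
    // "b" with a single sector-aligned write at offset 0. The file is assumed to
    // already exist at that size, so the write never changes the file's length.
    fn set_lng(path: &str, which: &str) -> io::Result<()> {
        let mut buf = [b' '; 512];
        buf[..which.len()].copy_from_slice(which.as_bytes());
        buf[which.len()] = b'\n';
        let f = OpenOptions::new().write(true).open(path)?;
        f.write_all_at(&buf, 0)?;
        f.sync_all()?;
        Ok(())
    }

    fn main() -> io::Result<()> {
        set_lng("/boot/efi/EFI/fedora/lng", "b")
    }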

@cmurf

cmurf commented Aug 5, 2020

Is rename(2) atomic on FAT? - on linux-fsdevel@, Oct 2019

I think the fact that this is atomic at the VFS level but not at the FAT level is an acceptable risk. Also, on common consumer SSD and NVMe, concurrent writes end up on the same erase block, so a rename is in effect atomic. It's just that no one will guarantee it, because in some strange case the writes might end up spanning two EBs.

@cmurf

cmurf commented Aug 5, 2020

My suggestion is: copy the new bootloader in under a temp name, sync, rename, and unmount the volume. There's no good reason to keep this thing mounted persistently all the time anyway.
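That sequence is small enough to sketch directly (the paths are assumptions, and the mount/unmount around it is omitted since it's shown earlier in the thread):

    use std::fs::{self, File};
    use std::io;

    // Write the new binary under a temporary name on the same filesystem, sync
    // it to disk, then rename it over the old one.
    fn replace_file(new_contents: &[u8], dest: &str) -> io::Result<()> {
        let tmp = format!("{}.new", dest);
        fs::write(&tmp, new_contents)?;
        // Make sure the data is on disk before the rename makes it "current".
        File::open(&tmp)?.sync_all()?;
        fs::rename(&tmp, dest)?;
        Ok(())
    }

    fn main() -> io::Result<()> {
        // Paths are assumptions for the sketch.
        let payload = fs::read("/usr/lib/bootupd/updates/EFI/fedora/grubx64.efi")?;
        replace_file(&payload, "/boot/efi/EFI/fedora/grubx64.efi")
    }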

Anaconda clean installs to completely empty media have resulted in a 600MiB EFI system partition since, I think, Fedora 32. I have a laptop with a roughly one-year-old clean install of Windows 10 Pro, done with Microsoft-produced media and their installer (not the OEM's), and it has a 100MiB ESP. Ample free space is needed on the ESP for firmware update payloads, regardless of what platform does the update.

cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Sep 2, 2020
See https://github.com/coreos/bootupd
and coreos/fedora-coreos-tracker#510

Basically in order to handle *updates*, bootupd should also
take care of installation.  For example this also generates a JSON
file with the versions.
openshift-merge-robot pushed a commit to coreos/coreos-assembler that referenced this issue Sep 21, 2020
See https://github.com/coreos/bootupd
and coreos/fedora-coreos-tracker#510

Basically in order to handle *updates*, bootupd also
takes care of installation so that it knows the original
version.

In order to sanely "ratchet" this change, only use bootupd if
we find the ostree deployment is using it.
@dustymabe
Member

bootupd is now in FCOS. Any next steps for this ticket?

@cgwalters
Member Author

It's still in preview but yeah, I think we can close this as "MVP done".

For anyone curious with current stable the steps are:

$ systemctl enable bootupd.socket
$ env BOOTUPD_ACCEPT_PREVIEW=1 bootupctl update

The preview requirement was dropped in 0.2.0.

(But for FCOS... Fedora still hasn't shipped a shim update for Boot Hole, so there's not a really strong reason to update your bootloader today.)
