Making Cincinnati updates work with ostree containers #1263

Open · jlebon opened this issue Jul 21, 2022 · 29 comments
Labels: area/bootable-containers (Related to the bootable containers effort.)

@jlebon (Member) commented Jul 21, 2022

This is split out of #1219 to discuss specifically how we'll make system updates work on hosts using CoreOS layering. Quoting:

  • Updates via Zincati (update barriers; update graphs)
    • Zincati/Cincinnati offer us a "safe" path to traverse when deploying
      updates to systems. When following a container image in a registry the
      user is following whatever is latest. Work still needs to be done to
      get back the added value from Zincati, into the CoreOS Layering workflow.

Let's discuss ideas on how to address this gap.

@jlebon (Member Author) commented Jul 21, 2022

(Originally posted by @cgwalters in #1219 (comment))

But regarding zincati specifically:

half-baked strawman: embed barriers in the container image

We encode "epochs"/barriers in tags like this:

  • quay.io/coreos-assembler/fcos:stable-v0
  • quay.io/coreos-assembler/fcos:stable-v1

We embed metadata in the container images that says that quay.io/coreos-assembler/fcos:stable-v1 is the successor to quay.io/coreos-assembler/fcos:stable-v0.
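
A minimal sketch of what embedding that successor metadata could look like at build time; the label name here is invented for illustration and is not an existing convention:

```sh
# Hypothetical: record that stable-v1 succeeds stable-v0 via an image label.
# The label name is invented for illustration; no such convention exists today.
podman build \
  --label "org.coreos.update.successor-of=quay.io/coreos-assembler/fcos:stable-v0" \
  -t quay.io/coreos-assembler/fcos:stable-v1 .
```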

This is related to ostreedev/ostree#874

@jlebon (Member Author) commented Jul 22, 2022

Another half-baked proposal:

Tag-based updates

Zincati itself doesn't have much OSTree knowledge. Its main output is telling rpm-ostree which version to deploy. This model could carry over in the CoreOS layering world with the right semantics in place:

  1. We change the FCOS pipeline to also tag released images with the version. E.g. we'd have both the moving tag quay.io/fedora/fedora-coreos:stable and also the fixed tag quay.io/fedora/fedora-coreos:36.20220703.3.1.
  2. We require whatever tool users have to rebuild their layered images to retain the same base tag. E.g. quay.io/jlebon/my-layered-fedora-coreos:36.20220703.3.1.
  3. We keep Zincati configured to follow the canonical update graph, but in CoreOS layering mode it notices that the host refspec is a container image and so knows to pass the version tag to rpm-ostree rather than the OSTree checksum. I.e. instead of
    `rpm-ostree deploy --lock-finalization --skip-branch-check revision=${checksum}`
    
    it would run e.g.:
    `rpm-ostree deploy --lock-finalization tag=${version}`
    
    If the corresponding container image hasn't been built yet, it will happily block there and periodically retry until it succeeds. (See the sketch after this list.)
  4. We teach rpm-ostree deploy to handle this.
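
A rough sketch of steps 1 and 3 in shell form; note that the `tag=` argument to `rpm-ostree deploy` is exactly what step 4 proposes to add, so it is illustrative only:

```sh
# Step 1 (pipeline side): publish a fixed version tag alongside the moving
# "stable" tag; copying within the same registry reuses the existing blobs.
skopeo copy \
  docker://quay.io/fedora/fedora-coreos:stable \
  docker://quay.io/fedora/fedora-coreos:36.20220703.3.1

# Step 3 (client side): Zincati would ask rpm-ostree to deploy the version
# tag instead of an OSTree checksum. The tag= syntax is the proposal itself
# (step 4), not an existing rpm-ostree interface.
rpm-ostree deploy --lock-finalization tag=36.20220703.3.1
```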

Rolling out out-of-cycle changes

When users modify their layered content, they might not want to wait until the next FCOS release to roll it out. What I don't want is Zincati learning to speak to container registries. Instead, we can have it periodically ask rpm-ostree to check if updates to that tag are available.

There's a similarity here with client-side RPM layering from repos: new RPM versions won't actually be updated until the next release or if one explicitly does rpm-ostree upgrade. One could imagine in the future a Zincati knob to also force it to periodically ask rpm-ostree to check for layered RPM updates and e.g. rollout if a CVE is fixed.
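
A sketch of what that periodic check could look like under the hood, assuming rpm-ostree's existing CLI is the mechanism Zincati drives:

```sh
# Check whether the followed source (container tag or ostree remote) has
# something newer, without downloading or deploying it.
rpm-ostree upgrade --check

# Apply it when ready; with a container-image origin this pulls the new image.
rpm-ostree upgrade
```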

@jlebon (Member Author) commented Jul 22, 2022

Rolling out out-of-cycle changes

I've updated this section of the proposal based on discussions OOB with @lucab.

@dustymabe (Member) commented:

What I don't want is Zincati learning to speak to container registries.

I'm not so sure. I don't feel like this is super heavyweight IMO. I guess it gets more complicated if there is authentication involved.

@cgwalters (Member) commented:

[jlebon] What I don't want is Zincati learning to speak to container registries.
[dusty] I'm not so sure. I don't feel like this is super heavyweight IMO. I guess it gets more complicated if there is authentication involved.

What's cool is: I specifically split out the Rust bindings for driving skopeo (containers-image-proxy-rs) into a dependency of ostree to support use cases exactly like this! (That project is also already being used by at least one project not published on crates.io.)

(As of recently, there is apparently also https://github.com/confidential-containers/image-rs/blob/main/docs/design.md, which explicitly cites the proxy code.)

@cgwalters (Member) commented:

All this said: storing things in registries today that aren't actually runnable containers is awkward. But OCI Artifacts is coming to make that better.

@jlebon (Member Author) commented Jul 25, 2022

What I don't want is Zincati learning to speak to container registries.

I'm not so sure. I don't feel like this is super heavyweight IMO. I guess it gets more complicated if there is authentication involved.

The reason I said this was that I didn't want yet another thing pulling in a container stack, but containers-image-proxy-rs is a good counterpoint. Even then, though, rpm-ostree obviously needs to know how to do this anyway, so it seems cleaner to me to have Zincati use an rpm-ostree API to check for updates instead of checking directly itself. It also allows abstracting over the delivery method (containers vs. ostree + RPMs).

cgwalters changed the title from "Making updates work with CoreOS layering" to "Making updates work with ostree containers" on Oct 2, 2022
@cgwalters (Member) commented:

There was some out of band discussion on this as it relates to Fedora IoT and the idea of supporting Cincinnati there too; the more I think about this the more I feel like it makes sense to entirely fold the core functionality of zincati into rpm-ostree at some point.

(The whole thing we did with "update drivers" is really complex, and while I think it's still logically something we want for the general complex case, it'd be a lot more obvious from a UX point of view to have e.g. rpm-ostree upgrade do and/or output something more useful)

cgwalters added the area/bootable-containers label on Oct 3, 2022
cgwalters added a commit to cgwalters/zincati that referenced this issue Nov 2, 2022
This is part of coreos/fedora-coreos-tracker#1263

If we're booted into a container image, then instead of looking
for the special `fedora-coreos.stream` ostree commit metadata,
we can do the much more obvious and natural thing of looking at the
container image tag.
@cgwalters (Member) commented:

Some work on this in coreos/zincati#878

cgwalters changed the title from "Making updates work with ostree containers" to "Making Cincinnati updates work with ostree containers" on Nov 18, 2022
cgwalters added a commit to cgwalters/zincati that referenced this issue Nov 18, 2022
cgwalters added a commit to cgwalters/zincati that referenced this issue Nov 18, 2022
cgwalters added a commit to cgwalters/zincati that referenced this issue Nov 19, 2022
This is part of coreos/fedora-coreos-tracker#1263

We don't yet have an official stance on how zincati and custom
container images interact.  Today, zincati just crash loops.
This changes things so that we gracefully exit if we detect
the booted system is using a container image origin.

(The code here isn't quite as clean as it could be; calling
 `std::process::exit()` in the middle of the call chain isn't
  elegant, but doing better would require plumbing an
  `Option<T>` through many layers)
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Nov 19, 2022
cgwalters added a commit to cgwalters/zincati that referenced this issue Nov 21, 2022
This is part of coreos/fedora-coreos-tracker#1263

We don't yet have an official stance on how zincati and custom
container images interact.  Today, zincati just crash loops.
This changes things so that we exit (but still with an error) if we detect
the booted system is using a container image origin.

One nicer thing here is that the unit status is also updated, e.g.
`systemctl status zincati` will show:

`Status: "Automatic updates disabled; booted into container image ..."`
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jan 11, 2023
cgwalters self-assigned this on Jan 11, 2023
jlebon pushed a commit to coreos/coreos-assembler that referenced this issue Jan 11, 2023
@cgwalters (Member) commented:

I think this currently blocks on #1367

@dustymabe (Member) commented:

If we stick with the full Zincati/Cincinnati graph setup, I think we really have to provide good docs and possibly tooling for users.

I think this possibly depends on how we implement things on our end. Definitely something to bring up in the discussion.

@bgilbert (Contributor) commented:

Re things the graph gives us, I'd split the second one and add one more:

2a. Scheduled rollouts over a defined time interval.
2b. The ability to pause or terminate a rollout.
3. Deadend releases, from which nodes can't upgrade further. We've used this exactly once, and I'm not aware of a pressing use case for it in the initial design. But e.g. we've had conversations in the past about forcing nodes older than X to reprovision from scratch.

Re dropping the barrier releases, I'm not so concerned about the runtime cost of carrying upgrade code (it's usually a shell script, with a unit that can check for a stamp file), but I do think regressions are a concern. We'd need to get better at carrying tests for upgrading from a specified older release. Key rotation seems harder to solve though.

@cgwalters (Member) commented Jan 20, 2023

2a/2b can also be done by having the registry server itself do this, right? (Now, an interesting topic here is whether we'd want to somehow apply the same policies to clients fetching it as a container image via podman/docker/kube, rather than booting it directly.)
In this implementation model, perhaps we have e.g. registry.osupdates.fedoraproject.org: a "smart" server that redirects (e.g. HTTP 307) to quay.io for blobs.
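
A sketch of what the client's side of that could look like; the host name is from the paragraph above, while the endpoint behavior and the wariness header are hypothetical:

```sh
# Hypothetical: ask the update-aware front-end for the manifest of the
# "stable" tag; depending on rollout state it could answer directly or
# redirect (e.g. HTTP 307) to quay.io. The X-Rollout-Wariness header is
# invented for illustration.
curl -sI \
  -H "Accept: application/vnd.oci.image.manifest.v1+json" \
  -H "X-Rollout-Wariness: 0.25" \
  https://registry.osupdates.fedoraproject.org/v2/fedora/fedora-coreos/manifests/stable
```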

Deadend releases (item 3) are something doable via the container image metadata itself, no? There's actually an ostree EOL metadata key already... but I think we should standardize a metadata key for this and have it emit an error.

@jlebon (Member Author) commented Jan 20, 2023

In this implementation model, perhaps we have e.g. registry.osupdates.fedoraproject.org: a "smart" server that redirects (e.g. HTTP 307) to quay.io for blobs.

The redirect approach sounds interesting. If we want to keep the current wariness stuff (which is how the phasing happens), we'd have to somehow have the request include a stable UUID, e.g. as an HTTP header.

Deadend releases (item 3) are something doable via the container image metadata itself, no? There's actually an ostree EOL metadata key already... but I think we should standardize a metadata key for this and have it emit an error.

The ostree EOL metadata key addresses the case where you know at build time that it's going to be the last commit in the stream. Deadend releases are usually understood to be deadends after the fact.

Unless you mean including information about past deadend releases in the metadata of new images we push on that stream, until the deadend releases go EOL. It's a bit of a hack, but nice in its simplicity (and anyway, deadend releases should be quite rare, so I don't expect that metadata to grow out of control). I guess another approach is keeping it as a separate OCI artifact.
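
A sketch of the "carry past deadends in new image metadata" variant; the label name and the `DEADEND_VERSIONS` variable are invented for illustration:

```sh
# Hypothetical label; DEADEND_VERSIONS would be a comma-separated list of
# releases known (after the fact) to be dead ends on this stream, dropped
# from the label once those releases go EOL.
podman build \
  --label "org.fedoraproject.fcos.deadend-releases=${DEADEND_VERSIONS}" \
  -t quay.io/fedora/fedora-coreos:36.20220703.3.1 .
```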

@cgwalters (Member) commented:

The redirect approach sounds interesting. If we want to keep the current wariness stuff (which is how the phasing happens), we'd have to somehow have the request include a stable UUID, e.g. as an HTTP header.

Yes, this would require a bit of work in the containers/image (and proxy) stack to request adding a header, but that seems straightforward.

Unless you mean including information about past deadend releases in the metadata of new images we push on that stream, until the deadend releases go EOL.

Right, we can push a new image that just changes the manifest, but leaves all the blobs the same. (Right now, the bootc stack actually will reboot in the metadata-only case, but we can optimize that...)

@jlebon (Member Author) commented Jan 20, 2023

Unless you mean including information about past deadend releases in the metadata of new images we push on that stream, until the deadend releases go EOL.

Right, we can push a new image that just changes the manifest, but leaves all the blobs the same. (Right now, the bootc stack actually will reboot in the metadata-only case, but we can optimize that...)

That's not quite what I meant. :) I mean putting information about old deadends into new images and carrying it for a while. Changing manifests we've already pushed for the same version doesn't feel right. (Edit: well... I don't know, maybe it's called for given that deadends are exceptional events. But it feels funny changing the digest of an existing image.)

@bgilbert (Contributor) commented:

The redirection server could be generic over derived images too: rollout.updates.fedoraproject.org/quay.io/fedora/fedora-coreos could be our base image, and rollout.updates.fedoraproject.org/quay.io/bgilbert/derived could apply rollout logic to a derived image. Perhaps the rollout start/duration could be driven by labels, allowing custom derives to pick their own rollout schedules? Pausing/restarting rollouts would be a manifest change (...which of course would change the hash). Registries could choose to implement this functionality natively, avoiding the redirection server, and we could ship a redirection server container for use with private registries.
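
A sketch of how such rollout labels might be consumed; the label names are hypothetical:

```sh
# Hypothetical: a redirection server (or a registry implementing this
# natively) reads rollout-schedule labels from the derived image to decide
# when to start exposing the new manifest to a given client.
skopeo inspect docker://quay.io/bgilbert/derived:latest \
  | jq '.Labels | {start: ."org.example.rollout.start", duration: ."org.example.rollout.duration"}'
```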

...also this reinvents orchestration a bit, which doesn't feel great. In that sense, Zincati/Cincinnati is closer to the existing k8s model, where there's an external orchestrator deciding what to pull. The label approach may be easier to deploy though.

If we want to keep the current wariness stuff (which is how the phasing happens), we'd have to somehow have the request include a stable UUID, e.g. as an HTTP header.

I thought I remembered that at one point the FCOS Cincinnati handshake was changed to avoid sending a client UUID, since we wanted to avoid uniquely identifying the client. Instead, either the conditional update would be encoded into the graph so that the wariness could be applied client-side, or the client would pick a wariness for each rollout and send it to the server. Looking at the code, apparently that never happened in the default case, and we are indeed sending a UUID to the server.

Perhaps we should avoid perpetuating that design, though, and the client should send a wariness value instead of a UUID. While a sufficiently fine-grained wariness would uniquely identify the client for the period between successive updates, it wouldn't be a long-term identifier.

@lucab (Contributor) commented Jan 21, 2023

The reference for the existing protocol is https://coreos.github.io/zincati/development/cincinnati/protocol/#graph-api, while Zincati configuration details are at https://coreos.github.io/zincati/usage/agent-identity/#identity-configuration.

On the two points above about UUIDs and wariness:

  • The client can optionally send a wariness value. If it is present in the request, the server obeys it; otherwise, the server computes one for the request. By default Zincati doesn't assign a value, but it will send one if configured by the user.
  • The client can optionally send a UUID. The protocol does not require it, but it helps in computing a wariness value that is sticky to a client (instead of per-request). Zincati sends an app-specific ID by default; it can be overridden through configuration and is also used for reboot-lock management. So it is possible to identify a client (by design), serve it a custom graph, and cross-link with reboot management, but it shouldn't be possible to recover the machine-id from it.
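
For reference, the client-side wariness knob mentioned above is just a Zincati config drop-in; this mirrors the documented configuration (the file name is arbitrary):

```sh
# Raise this node's rollout wariness (0.0 = eager, 1.0 = very conservative);
# Zincati sends the value to the Cincinnati server with its requests.
sudo tee /etc/zincati/config.d/51-rollout-wariness.toml <<'EOF'
[identity]
rollout_wariness = 0.5
EOF
sudo systemctl restart zincati.service
```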

cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Feb 14, 2023
Fedora CoreOS is not yet using containers by default for updates;
xref coreos/fedora-coreos-tracker#1263
etc.

Consequently, when one boots a FCOS system and wants
to rebase to a custom image, one ends up downloading the entire image,
including the parts of FCOS that are already present.

This changes things so that when we generate disk images by default,
we write the *layer refs* of the component parts - but we
delete the "merged" container image ref.

The semantics here will be:

- Only a tiny amount of additional data used by default;
  the layer refs are just metadata, the bulk of the data still
  lives in regular file content.
- When a FCOS system auto-updates via its default ostree-commit
  mechanism, the unused layer refs will be garbage collected.
- But, as noted above when rebasing to a container image instead,
  if the target container image reuses some of those layers (as
  we expect when rebasing FCOS to a FCOS-derived container) then
  we don't need to redownload them - we only download what the user
  provided.

Hence, this significantly improves rebasing to container images,
with basically no downsides.

The alternative code path to actually deploy *as a container*
remains off by default.  When that is enabled, `rpm-ostree upgrade`
fetches a container by default, which is a distinct thing.
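
For context, the rebase path this commit optimizes looks roughly like this on a booted FCOS host (the derived image name is the hypothetical one from earlier in the thread):

```sh
# Rebase the booted host to a derived container image; with the layer refs
# kept by this change, only the user-provided layers need to be downloaded.
sudo rpm-ostree rebase ostree-unverified-registry:quay.io/jlebon/my-layered-fedora-coreos:36.20220703.3.1
sudo systemctl reboot
```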
cgwalters added a commit to coreos/coreos-assembler that referenced this issue Mar 8, 2023
@cgwalters (Member) commented May 2, 2023

I re-read this thread and I still find myself somewhat unconvinced that we need to require an update graph instead of just fetching from a container tag. It definitely has value, but comes with a lot of operational complexity as a cost.

Of all of the things discussed here, I think we need to make a decision on whether or not we keep requiring barriers in the future. My vote is no: we just carry the upgrade code for longer, say a year (two Fedora majors).

@cgwalters (Member) commented:

Moving the thread re bootloaders from #1485 (comment)

To be clear, what I was just trying to say is I found it slightly confusing to have to dig through git history to find the failing systemd unit; if we'd tried to do something without barriers, then the unit would have probably had a "stamp file" approach of e.g. ConditionPathExists=/var/lib/bootloader-updated-once.stamp etc. (Or, better driven natively into bootupd like bootupctl update --if-older-than)
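
A minimal sketch of that stamp-file pattern, with a hypothetical unit name and stamp path:

```sh
# Hypothetical one-shot unit: run a bootloader update once, then never again.
# Note the condition is negated so the unit only runs until the stamp exists.
sudo tee /etc/systemd/system/bootloader-update-once.service <<'EOF'
[Unit]
Description=One-time bootloader update
ConditionPathExists=!/var/lib/bootloader-updated-once.stamp

[Service]
Type=oneshot
ExecStart=/usr/bin/bootupctl update
ExecStart=/usr/bin/touch /var/lib/bootloader-updated-once.stamp

[Install]
WantedBy=multi-user.target
EOF
```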

I would agree that it seems hard to solve this particular problem without barriers.


That said, there is a whole other new thing we could add here, which is a mechanism to pull and execute a container image before trying to apply an OS update.

That would align with what we're doing effectively in OCP with the MCO, and be an extremely flexible and powerful escape hatch. Basically zincati (or maybe rpm-ostree) would do e.g. podman run --privileged --rm quay.io/fedora/fedora-coreos-preupgrade:latest before trying to actually apply an update.

@dustymabe (Member) commented:

Moving the thread re bootloaders from #1485 (comment)

To be clear, what I was just trying to say is I found it slightly confusing to have to dig through git history to find the failing systemd unit; if we'd tried to do something without barriers, then the unit would have probably had a "stamp file" approach of e.g. ConditionPathExists=/var/lib/bootloader-updated-once.stamp etc. (Or, better driven natively into bootupd like bootupctl update --if-older-than)

Right. We sometimes use the stamp file approach and don't super aggressively remove the unit until the next barrier is created (i.e. when we put in migration code we don't always have to do a barrier at the same time like we had to here).

I would agree that it seems hard to solve this particular problem without barriers.

Thanks. This problem was super tricky and we were lucky we had barriers and bootupd to help us get out of the position we were in. We needed them both.

That said, there is a whole other new thing we could add here, which is a mechanism to pull and execute a container image before trying to apply an OS update.

That would align with what we're doing effectively in OCP with the MCO, and be an extremely flexible and powerful escape hatch. Basically zincati (or maybe rpm-ostree) would do e.g. podman run --privileged --rm quay.io/fedora/fedora-coreos-preupgrade:latest before trying to actually apply an update.

I've thought about this problem before and brought it up. In my mind it would just be extra code shipped in the OSTree commit we build today that would know how to handle/do migrations depending on different factors: i.e. the OSTree commit gets downloaded (from a container registry or an OSTree repo, it doesn't matter), and special migration code gets extracted from it and run before the upgrade is applied. This code could choose to block the upgrade or allow it to continue, etc.
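
A sketch of that shape, with invented paths and variable names just to make the idea concrete; nothing like this exists today:

```sh
# Hypothetical sketch: after fetching the new commit but before deploying it,
# check out migration scripts shipped inside the commit and run them.
# The path, directory layout, and NEW_COMMIT variable are all invented.
ostree checkout --repo=/ostree/repo \
  --subpath=/usr/libexec/fcos-migrations \
  "${NEW_COMMIT}" /run/fcos-migrations
for script in /run/fcos-migrations/*; do
    "${script}" || exit 1   # a non-zero exit would block the upgrade
done
```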

I don't really see why this would need to be a separate container image (just seems like more work IMO).

@cgwalters (Member) commented:

I don't really see why this would need to be a separate container image (just seems like more work IMO).

Pre-upgrade logic could have different dependencies than the OS, for one. And we wouldn't pay the storage cost of carrying them on disk after completion.

@dustymabe (Member) commented:

Pre-upgrade logic could have different dependencies than the OS, for one.

Right, in which case we could just have the migration code call podman? Though that is one thing to think about too. At least for FCOS users they can choose between docker or podman and just running the container engine usually will "initialize" some things, which may be undesirable.

@cgwalters (Member) commented:

OK, moving this to ostreedev/ostree#2855 and I think with that, we can stop requiring barriers in many cases. That could use some further analysis, but going through a few of them (e.g. the recent bootloader one, or the iptables one) I am pretty sure it'd be viable.

@jlebon (Member Author) commented May 4, 2023

I'm really not a fan of pre-upgrade code execution, but I agree it's a way out of this. Between hooks and barriers, I'd indeed prefer the former. If we do this, I'd like to see tight policies and maintenance around what we do in there. (E.g. we can say we drop workarounds there after X months.)

I'd agree with @dustymabe re. container runtimes. I think all our migration code so far has consisted of not-too-complex scripts. It's not ideal, but it also has minimal impact on the node state, and dependency concerns are much less of an issue.

@cgwalters (Member) commented:

My inclination BTW is to try to drop the rollouts and wariness etc. and keep it super simple - the client tracks a container image tag, fetched once a day by default.

Anyone who wants to do anything more than that (replicating something like the current "wariness") can mirror the OS update containers to their own registry and update it on their own timeframe and schedule. Which is what they already need to know how to do for application containers! Because there's no zincati/cincinnati for podman/kubelet (or for dnf/RPMs, for that matter).
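
A sketch of that mirroring workflow, using a placeholder private registry name:

```sh
# Mirror the published OS image into a registry you control...
skopeo copy \
  docker://quay.io/fedora/fedora-coreos:stable \
  docker://registry.example.com/os/fedora-coreos:stable

# ...point hosts at the mirror, and "roll out" by re-running the copy on
# whatever schedule (or with whatever gating) you like.
sudo rpm-ostree rebase ostree-unverified-registry:registry.example.com/os/fedora-coreos:stable
```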
