Develop Fedora CoreOS layering user stories #1219

jlebon · 2022-06-06T18:56:50Z

In this ticket, let's come up with the various use cases that CoreOS layering enables for Fedora CoreOS. This will allow us to have more targeted discussions about each of them and evaluate (1) whether they're worth pursuing, (2) what the UX would look like, and (3) how it would be implemented.

Proposed documentation for use cases/explanation:

CoreOS Layering Use Cases

The technology we are referring to as "CoreOS Layering" allows users/projects
to easily build derivative works that build on top of Fedora CoreOS. The build
experience re-uses the container build workflow (i.e. Dockerfile or Containerfile),
which is pervasive across the industry. The output of this build is a container
image that can be hosted in a registry. Existing systems can be rebased to this
container image and follow updates from there.
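
The workflow above can be sketched as a small helper that renders a derivative Containerfile and the command a client would run to follow the result. This is a minimal sketch, not official tooling: the function names and the registry path `registry.example.com/myorg/fcos-derived` are hypothetical, and the exact rebase invocation may differ between rpm-ostree versions.

```python
def render_containerfile(base_image: str, packages: list[str]) -> str:
    """Return Containerfile text that layers `packages` onto `base_image`."""
    lines = [f"FROM {base_image}"]
    if packages:
        # Layering happens at build time; `ostree container commit`
        # finalizes the layer, as done in the coreos/layering-examples repo.
        lines.append(
            "RUN rpm-ostree install " + " ".join(packages)
            + " && ostree container commit"
        )
    return "\n".join(lines) + "\n"


def rebase_command(image: str) -> str:
    """Return the command a client would run to follow the derived image."""
    return f"rpm-ostree rebase ostree-unverified-registry:{image}"


if __name__ == "__main__":
    # Base image name as published for Fedora CoreOS; the target is hypothetical.
    print(render_containerfile(
        "quay.io/fedora/fedora-coreos:stable", ["NetworkManager-wifi"]
    ))
    print(rebase_command("registry.example.com/myorg/fcos-derived:latest"))
```

The rendered text would be built with any container build tool and pushed to a registry; existing systems are then pointed at that image once and receive later rebuilds as updates.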

This is a significant change to the tooling around building new OSTree commits
based on CoreOS. This is a net new set of features/tooling. It doesn't change
any current tooling users may use and does not require users to make any changes
to their current setup.

Here are some current use cases that are made possible or easier with CoreOS
Layering:

As an End User

Currently Fedora CoreOS allows users to modify the system either through Ignition,
by writing to any read/write directories, or via RPM package layering on the client
side. These approaches have served us well, but some use cases can be improved.

First, delivering configuration/files/software via Ignition works well, but can get
heavy and also requires the user to re-provision if the configuration ever changes;
Ignition is only supported for the first boot of a machine. Secondly, client side
package layering can take a lot of time to run on first boot, makes upgrades less
reliable, and isn't tightly controlled (software in package repositories changes often).

It may be more desirable to deliver all of the changes as a derivative layer built as
a container image, and delivered on first boot. We think this will be attractive in
the following uses:

Configuration
- CoreOS Layering allows for configuration changes to be delivered as an update
  - With Ignition you must re-deploy the machine or build a bespoke method
    for delivering configuration updates (e.g. some tool using SSH).

Detailed Scenario: A user does a bare metal install of 10 systems in a datacenter.
The user later discovers they should have deployed the systems with bonded networking
set up. With CoreOS Layering this change can be made to the Dockerfile definition,
rebuilt, and delivered as an update. Without CoreOS Layering the recommended way
would be to re-install/re-provision the machines, which would represent a significant
waste of time for this user.

Unpackaged Software
- CoreOS Layering allows software to be built and layered in one operation
  - Re-using container build technology we're able to do multi-stage builds
    - This allows us to detect and match target software versions at build time
- CoreOS Layering allows more invasive operations than client side package layering
  - Users are allowed to write files to directories that are read-only client side
    - e.g. binaries can be written into /usr/bin/ instead of /usr/local/bin
  - coreos-layering allows the software to be built and layered in one operation (multi-stage build)

Detailed Scenario: One example here that is compelling would be the case of third
party kernel modules. A user can do a new multi-stage container build based on a
recently delivered Fedora CoreOS base container. This multi-stage build can detect
the delivered kernel in the base container, build the kernel module from source,
and copy the results into the target container image. This committed container
can now be pushed to a registry and clients can target that image for updates.

Packaged Software
- CoreOS Layering allows the package layering to happen server side
  - With client side package layering
    - Every client has to do it separately
    - Each client now has to pull metadata from package repositories
      - This is heavyweight and happens at runtime
      - Changes in the package repo might cause upgrades to fail
  - With CoreOS Layering
    - No risk of package repo issues client side
    - Derivative commit can be tested before being delivered to clients

Detailed Scenario: To illustrate this use case further we can walk through a client
side package layer scenario. Machine A and machine B exist and are following the
stable stream. Both have the NetworkManager-wifi package layered. A new
stable update is released on Monday. Machine A updates on Monday, early in the
rollout window; the client pulls in the new update and pulls NetworkManager-wifi
from the package repo, making a new client side commit. On Tuesday Machine B
attempts to update. At this point a few things could go wrong:

1. The package repo is unavailable. In this case the client side package layering
   operation will fail. Machine B stays on the old commit and keeps retrying.
2. The package layering is successful, but pulls in a different version of
   NetworkManager-wifi than Machine A.

In both of these cases Machine A and Machine B, which are expected to be running
more or less the exact same software, have diverged. Further, in the first case, if the
package repository never comes back (e.g. when using a repo from a third party), that
machine is stuck forever.
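
The divergence described above can be reproduced with a toy model. This is only an illustrative sketch: the `Machine` class, the helper functions, and the version strings `1.0-1` and `1.0-2` are made up for the example and do not reflect real package versions or rpm-ostree internals.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    layered: dict = field(default_factory=dict)  # package -> installed version


def client_side_update(machine, repo, packages):
    """Client side layering: resolve packages from the repo as it exists
    at the moment this particular machine updates."""
    if repo is None:
        # Repo outage: the update fails and the machine stays where it was.
        raise RuntimeError(f"{machine.name}: package repo unavailable")
    machine.layered = {pkg: repo[pkg] for pkg in packages}


def image_update(machine, image_packages):
    """Server side layering: take whatever was baked into the image."""
    machine.layered = dict(image_packages)


# Hypothetical repo contents on two consecutive days.
REPO_MONDAY = {"NetworkManager-wifi": "1.0-1"}
REPO_TUESDAY = {"NetworkManager-wifi": "1.0-2"}
PACKAGES = ["NetworkManager-wifi"]


def client_side_scenario():
    a, b = Machine("A"), Machine("B")
    client_side_update(a, REPO_MONDAY, PACKAGES)   # Machine A updates Monday
    client_side_update(b, REPO_TUESDAY, PACKAGES)  # Machine B updates Tuesday
    return a, b


def image_scenario():
    # The derived image is built (and can be tested) once, on Monday.
    image = {pkg: REPO_MONDAY[pkg] for pkg in PACKAGES}
    a, b = Machine("A"), Machine("B")
    image_update(a, image)  # Monday
    image_update(b, image)  # Tuesday, same image contents
    return a, b
```

In the client side scenario the two machines end up with different versions of the layered package, while in the image scenario both carry exactly what was built and tested server side.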

As a Layered Project

Fedora CoreOS provides a nice stable base for other projects to build on top of;
however, not every decision Fedora CoreOS makes is right for layered projects.
Currently the layered project will need to either decide to encode every change into
an Ignition configuration that runs on boot of every instance or rebuild a brand new
OSTree completely from scratch.

CoreOS Layering offers the opportunity for layered projects to easily make tweaks
to Fedora CoreOS. Some examples of layered projects as of July 2022 that take
different approaches:

Podman machine
- Uses Fedora CoreOS with a heavy Ignition config to customize instances on boot
OKD
- Rebuilds Fedora CoreOS from scratch with additions for OKD

With CoreOS Layering these projects can provide a more polished solution for end users:

Fewer changes get applied client side
- Decreased opportunity for provisioning issues
The commit that was built server side can get tested in CI
- Validation of the built (derived) commit can happen in CI

Potential Drawbacks of CoreOS Layering

The CoreOS Layering technology is still under active development. There are
currently some workflows that haven't been fully fleshed out. Here is a summary:

Build tooling/infra for CoreOS Layering containers
- Any user creating derivative containers of Fedora CoreOS will need to
  continue to monitor and build new derivative containers when new Fedora
  CoreOS updates are released. These will need to be hosted in a registry
  that their machines can then pull container images from.
Updates via Zincati (update barriers; update graphs)
- Zincati/Cincinnati offer us a "safe" path to traverse when deploying
  updates to systems. When following a container image in a registry the
  user is following whatever is latest. Work still needs to be done to
  get back the added value from Zincati in the CoreOS Layering workflow.
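
The difference between following an update graph and following a tag can be shown with a toy model. This is a sketch under assumptions, not Zincati's or Cincinnati's actual algorithm: releases are plain integers, `edges` is a hypothetical graph, and release 2 plays the role of an update barrier that older releases must pass through.

```python
def graph_path(edges, start):
    """Walk the update graph hop by hop, taking the newest allowed target."""
    path = [start]
    current = start
    while True:
        # Only consider forward edges so the walk always terminates.
        targets = [t for t in edges.get(current, []) if t > current]
        if not targets:
            return path
        current = max(targets)
        path.append(current)


def tag_path(latest, start):
    """Following a registry tag jumps straight to whatever is latest."""
    return [start] if start == latest else [start, latest]


# Hypothetical graph: release 1 may only update to release 2 (a barrier);
# from release 2 onward, updating directly to release 4 is allowed.
EDGES = {1: [2], 2: [3, 4], 3: [4]}

if __name__ == "__main__":
    print("graph:", graph_path(EDGES, 1))
    print("tag:  ", tag_path(4, 1))
```

With the graph, a machine on release 1 is forced through release 2 before reaching release 4. A machine following the tag skips release 2 entirely, which is the safety property that still needs to be recovered for layered images.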


dustymabe · 2022-07-20T16:25:42Z

@jlebon and I got together to try to flesh out the CoreOS Layering use cases.

I have updated the description of this issue (2022-07-20) with some proposed use cases where the value of CoreOS layering is illustrated. I have also described current limitations of CoreOS Layering that we'll be working to address in the coming months. Please take a look and let us know how this could be improved and if any corrections need to be made.

cgwalters · 2022-07-20T18:20:28Z

Looked through this; seems sane. Thanks so much for writing this up!

The techology we are referring to a "CoreOS Layering"

That said, https://fedoraproject.org/wiki/Changes/OstreeNativeContainer currently proposes "ostree native container". About 90% of all the stuff written here applies outside of Fedora CoreOS too. Something to keep in mind.

cgwalters · 2023-05-02T19:30:53Z

I think chunks of this are now covered by the existence of https://github.com/coreos/layering-examples
right?

If it's about trying to just explain the benefits and drawbacks, we could just move that into coreos/fedora-coreos-docs#540 ?

cgwalters · 2023-05-03T17:08:12Z

I think what this issue is trying to get at really is the baseline question of when should users:

- Take a pre-built "golden image" and just configure it (workstation and FCOS) today
- Derive from and own OS updates (layering)

I think for 90% of this there's really nothing FCOS specific about this in the end (this is why the term "coreos layering" is misleading as a technology descriptor). IOW all of these tradeoffs are things that also apply to other rpm-ostree based systems. Particularly relevant to, but not limited to desktop ones.

So my instinct here is to:

- take these concerns and just add it upstream to the rpm-ostree docs
- Link from coreos/fedora-coreos-docs#540 ("Add a doc for container provisioning and updates") to the rpm-ostree docs around this
- Close this issue

?

bgilbert · 2023-05-03T17:27:16Z

Fedora CoreOS docs describe lots of things that technically repeat documentation elsewhere. We should aim to be maximally helpful to new users, rather than asking them to assemble the opinions of various upstreams.

But also, FCOS is opinionated in various ways, and "when should you use Ignition vs. layering" seems like an important thing to have an opinion about. It affects not only the advice we give users, but our priorities for the functionality we build, and indeed how we think about the distro as a whole.

cgwalters · 2023-05-04T12:33:04Z

Yes, I have some! I added some bits from this in coreos/fedora-coreos-docs#540 (comment)

But I'd hope you have specific opinions on this too that could be expressed here in the doc directly - would love to make this feel more collaborative.

cgwalters · 2023-05-04T18:59:17Z

I made this point in an OpenShift meeting, but I wanted to write it down here; going back one level, when I was saying this isn't FCOS specific in that it also applies to other rpm-ostree systems - in fact even if rpm-ostree (and coreos etc.) didn't exist, this problem also exists today in RHEL.

The day that RHEL introduced Image Builder, suddenly there were two ways to set up that PostgreSQL server in Azure: start from a stock cloud image (maybe using cloud-init/ansible/whatever and yum install postgresql), or build golden disk images with IB and do updates by instance teardown/spinup. These things have the same very fundamental tradeoffs around systems management that we're introducing here.

I think indeed, it is on us to provide guidance. But I don't think there's any real way to not support these two paths ("configure" vs "build/own").

Now, what I hope actually is this fundamental change in mindset and technology eventually leads us to a place where we have a more "seamless" spectrum into what is e.g. today Fedora Cloud and Fedora Server, instead of having harder barriers.

jlebon added the area/bootable-containers (Related to the bootable containers effort.) and meeting (topics for meetings) labels and removed the meeting (topics for meetings) label Jun 6, 2022


jlebon mentioned this issue Jul 21, 2022

Making Cincinnati updates work with ostree containers #1263

Open

bgilbert mentioned this issue Nov 13, 2022

config,storage: support populating directories from archives coreos/ignition#1498

Open

cgwalters mentioned this issue Dec 14, 2022

Ship layering as an "equal" model in Fedora CoreOS #1363

Open

jlebon mentioned this issue May 2, 2023

Add a doc for container provisioning and updates coreos/fedora-coreos-docs#540

Open

travier added the kind/enhancement label Sep 7, 2023
