Develop Fedora CoreOS layering user stories #1219

Open
jlebon opened this issue Jun 6, 2022 · 8 comments
Labels
area/bootable-containers Related to the bootable containers effort. kind/enhancement

Comments

@jlebon
Member

jlebon commented Jun 6, 2022

In this ticket, let's come up with the various use cases that CoreOS layering enables for Fedora CoreOS. This will allow us to have more targeted discussions about each of them and evaluate (1) whether they're worth pursuing, (2) what the UX would look like, and (3) how it would be implemented.

Proposed documentation for use cases/explanation:

CoreOS Layering Use Cases

The technology we are referring to as "CoreOS Layering" allows users/projects
to easily build derivative works that build on top of Fedora CoreOS. The build
experience re-uses the container build workflow (i.e. Dockerfile or Containerfile),
which is pervasive across the industry. The output of this build is a container
image that can be hosted in a registry. Existing systems can be rebased to this
container image and follow updates from there.

This is a significant change to the tooling around building new OSTree commits
based on CoreOS. This is a net new set of features/tooling. It doesn't change
any current tooling users may use and does not require users to make any changes
to their current setup.

Here are some current use cases that are made possible or easier with CoreOS
Layering:

As an End User

Currently Fedora CoreOS allows users to modify the system either through Ignition,
by writing to any read/write directories, or via RPM package layering on the client
side. These approaches have served us well, but some use cases can be improved.

First, delivering configuration/files/software via Ignition works well, but can get
heavy and also requires the user to re-provision if the configuration ever changes;
Ignition is only supported for the first boot of a machine. Second, client side
package layering can take a lot of time to run on first boot, makes upgrades less
reliable, and isn't tightly controlled (software in package repositories changes often).

It may be more desirable to deliver all of the changes as a derivative layer built as
a container image, and delivered on first boot. We think this will be attractive in
the following uses:

  • Configuration
    • CoreOS Layering allows for configuration changes to be delivered as an update
      • With Ignition you must re-deploy the machine or build a bespoke method
        for delivering configuration updates (e.g. some tool using SSH).

Detailed Scenario: A user does a bare metal install of 10 systems in a datacenter.
The user later discovers they should have deployed the systems with bonded networking
set up. With CoreOS Layering this change can be made to the Dockerfile definition,
rebuilt, and delivered as an update. Without CoreOS Layering the recommended way
would be to re-install/re-provision the machines, which would represent a significant
waste of time for this user.

  • Unpackaged Software
    • CoreOS Layering allows software to be built and layered in one operation
      • Re-using container build technology we're able to do multi-stage builds
        • This allows us to detect and match target software versions at build time
    • CoreOS Layering allows more invasive operations than client side package layering
      • Users are allowed to write files to directories that are read-only client side
        • e.g. binaries can be written into /usr/bin/ instead of /usr/local/bin
      • coreos-layering allows the software to be built and layered in one operation (multi-stage build)

Detailed Scenario: One example here that is compelling would be the case of third
party kernel modules. A user can do a new multi-stage container build based on a
recently delivered Fedora CoreOS base container. This multi-stage build can detect
the delivered kernel in the base container, build the kernel module from source,
and copy the results into the target container image. This committed container
can now be pushed to a registry and clients can target that image for updates.
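
The "detect and match the delivered kernel" step in this scenario can be illustrated with a small check. This is only a sketch, not part of any CoreOS tooling: it assumes the module's vermagic string (as reported by modinfo) starts with the kernel release it was built for, and the function names and version strings are made up for illustration.

```python
def vermagic_release(vermagic: str) -> str:
    """Return the kernel release a module was built for.

    Assumption: the release is the first whitespace-separated token of the
    vermagic string. An empty string yields "".
    """
    tokens = vermagic.split()
    return tokens[0] if tokens else ""


def module_matches_kernel(vermagic: str, kernel_release: str) -> bool:
    """True when the module was built against exactly this kernel release."""
    return vermagic_release(vermagic) == kernel_release


# Hypothetical values: the kernel found in the base container, one module
# built against it, and one built against an older kernel.
base_kernel = "5.18.5-200.fc36.x86_64"
fresh_module = "5.18.5-200.fc36.x86_64 SMP preempt mod_unload"
stale_module = "5.17.12-300.fc36.x86_64 SMP preempt mod_unload"

print(module_matches_kernel(fresh_module, base_kernel))
print(module_matches_kernel(stale_module, base_kernel))
```

A build pipeline could run a check like this before pushing the derived image, so a module built for the wrong kernel never reaches clients.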

  • Packaged Software
    • CoreOS Layering allows the package layering to happen server side
      • With client side package layering
        • Every client has to do it separately
        • Each client now has to pull metadata from package repositories
          • This is heavyweight and happens at runtime
          • Changes in the package repo might cause upgrades to fail
      • With CoreOS Layering
        • No risk of package repo issues client side
        • Derivative commit can be tested before being delivered to clients

Detailed Scenario: To illustrate this use case further we can walk through a client
side package layer scenario. Machine A and machine B exist and are following the
stable stream. Both have the NetworkManager-wifi package layered. A new
stable update is released on Monday. Machine A updates on Monday, early in the
rollout window; the client pulls in the new update and pulls NetworkManager-wifi
from the package repo, making a new client side commit. On Tuesday Machine B
attempts to update. At this point a few things could go wrong:

  1. The package repo is unavailable. In this case the client side package layering
    operation will fail. Machine B stays on the old commit and keeps retrying.
  2. The package layering is successful, but pulls in a different version of
    NetworkManager-wifi than Machine A.

In both of these cases Machine A and Machine B, which are expected to be running more
or less the same software, have diverged. Further, in case 1, if the package
repository never comes back (e.g. a repo from a third party), that machine is stuck
on the old commit forever.
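
The divergence described above can be reproduced with a toy model. This is a sketch for illustration only: the function names and version strings are hypothetical, and it does not call rpm-ostree or any real package repository.

```python
from typing import Optional, Tuple

# (base OS version, layered package version); all version strings are made up.
Deployment = Tuple[str, str]


def client_side_update(current: Deployment, new_base: str,
                       repo_version: Optional[str]) -> Deployment:
    """Each client resolves the layered package itself at update time.

    repo_version is what the package repo offers at that moment, or None
    when the repo is unreachable.
    """
    if repo_version is None:
        return current  # layering fails; stay on the old commit and retry
    return (new_base, repo_version)


def server_side_update(image: Deployment) -> Deployment:
    """Clients take the prebuilt image as is; nothing is resolved client side."""
    return image


old = ("base-1", "wifi-1.0")

# Client side layering: Monday vs. Tuesday, with two possible Tuesday outcomes.
machine_a = client_side_update(old, "base-2", "wifi-1.0")
machine_b_repo_down = client_side_update(old, "base-2", None)
machine_b_new_rpm = client_side_update(old, "base-2", "wifi-1.1")

# Server side layering: the image is built (and can be tested) once.
image = ("base-2", "wifi-1.0")
layered_a = server_side_update(image)
layered_b = server_side_update(image)

print("client side:", machine_a, machine_b_repo_down, machine_b_new_rpm)
print("server side:", layered_a, layered_b)
```

In the client side runs, what Machine B ends up with depends on the state of the repo at the moment it updates. In the server side runs, both machines receive the same tuple regardless of when they update.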

As a Layered Project

Fedora CoreOS provides a nice stable base for other projects to build on top of;
however, not every decision Fedora CoreOS makes is right for layered projects.
Currently a layered project needs to either encode every change into
an Ignition configuration that runs on boot of every instance or rebuild a brand new
OSTree completely from scratch.

CoreOS Layering offers the opportunity for layered projects to easily make tweaks
to Fedora CoreOS. Some examples of layered projects as of July 2022 that take
different approaches:

  • Podman machine
    • Uses Fedora CoreOS with a heavy Ignition config to customize instances on boot
  • OKD
    • Rebuilds Fedora CoreOS from scratch with additions for OKD

With CoreOS Layering these projects can provide a more polished solution for end users:

  • Fewer changes get applied client side
    • Decreased opportunity for provisioning issues
  • The commit that was built server side can get tested in CI
    • Validation of the built (derived) commit can happen in CI

Potential Drawbacks of CoreOS Layering

The CoreOS Layering technology is still under active development. There are
currently some workflows that haven't been fully fleshed out. Here is a summary:

  • Build tooling/infra for CoreOS Layering containers
    • Any user creating derivative containers of Fedora CoreOS will need to
      continue to monitor and build new derivative containers when new Fedora
      CoreOS updates are released. These will need to be hosted in a registry
      that their machines can then pull container images from.
  • Updates via Zincati (update barriers; update graphs)
    • Zincati/Cincinnati offer us a "safe" path to traverse when deploying
      updates to systems. When following a container image in a registry the
      user is following whatever is latest. Work still needs to be done to
      get the added value of Zincati back into the CoreOS Layering workflow.
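
To make the second drawback concrete, here is a toy comparison between walking an update graph with barriers and following whatever a registry tag currently points at. It is a simplified model under stated assumptions, not Zincati's or Cincinnati's implementation: the release history is treated as a plain ordered list, a barrier is a release every machine must pass through, and all names are hypothetical.

```python
from typing import List, Set


def graph_path(releases: List[str], barriers: Set[str], current: str) -> List[str]:
    """Releases to step through, in order, when barriers are honored.

    Every barrier newer than `current` is visited, followed by the latest
    release. Returns [] when `current` is already the latest.
    """
    later = releases[releases.index(current) + 1:]
    if not later:
        return []
    latest = later[-1]
    return [r for r in later if r in barriers or r == latest]


def tag_path(releases: List[str], current: str) -> List[str]:
    """Following a tag: jump straight to whatever is latest."""
    later = releases[releases.index(current) + 1:]
    return later[-1:]


# Hypothetical history where v2 is a barrier release.
releases = ["v1", "v2", "v3", "v4"]
barriers = {"v2"}

print("graph:", graph_path(releases, barriers, "v1"))
print("tag:  ", tag_path(releases, "v1"))
```

A machine on v1 is walked through v2 before reaching v4 in the graph model, while the tag model skips the barrier. That skipped step is the "safe path" value that still needs to be recovered for the layering workflow.
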
@dustymabe
Member

@jlebon and I got together to try to flesh out the CoreOS Layering use cases.

I have updated the description of this issue (2022-07-20) with some proposed use cases where the value of CoreOS layering is illustrated. I have also described current limitations of CoreOS Layering that we'll be working to address in the coming months. Please take a look and let us know how this could be improved and if any corrections need to be made.

@cgwalters
Member

Looked through this; seems sane. Thanks so much for writing this up!

The techology we are referring to a "CoreOS Layering"

That said, https://fedoraproject.org/wiki/Changes/OstreeNativeContainer currently proposes "ostree native container". About 90% of all the stuff written here applies outside of Fedora CoreOS too. Something to keep in mind.

@cgwalters
Member

I think chunks of this are now covered by the existence of https://github.com/coreos/layering-examples, right?

If it's about trying to just explain the benefits and drawbacks, we could just move that into coreos/fedora-coreos-docs#540 ?

@cgwalters
Member

I think what this issue is trying to get at really is the baseline question of when should users:

  • Take a pre-built "golden image" and just configure it (workstation and FCOS) today
  • Derive from and own OS updates (layering)

I think for 90% of this there's really nothing FCOS specific about this in the end (this is why the term "coreos layering" is misleading as a technology descriptor). IOW all of these tradeoffs are things that also apply to other rpm-ostree based systems. Particularly relevant to, but not limited to desktop ones.

So my instinct here is to:

?

@bgilbert
Contributor

bgilbert commented May 3, 2023

Fedora CoreOS docs describe lots of things that technically repeat documentation elsewhere. We should aim to be maximally helpful to new users, rather than asking them to assemble the opinions of various upstreams.

But also, FCOS is opinionated in various ways, and "when should you use Ignition vs. layering" seems like an important thing to have an opinion about. It affects not only the advice we give users, but our priorities for the functionality we build, and indeed how we think about the distro as a whole.

@cgwalters
Member

Yes, I have some! I added some bits from this in coreos/fedora-coreos-docs#540 (comment)

But I'd hope you have specific opinions on this too that could be expressed here in the doc directly - would love to make this feel more collaborative.

@cgwalters
Member

I made this point in an OpenShift meeting, but I wanted to write it down here; going back one level, when I was saying this isn't FCOS specific in that it also applies to other rpm-ostree systems - in fact even if rpm-ostree (and coreos etc.) didn't exist, this problem also exists today in RHEL.

The day that RHEL introduced Image Builder, suddenly there were two ways to set up that PostgreSQL server in Azure: start from a stock cloud image (maybe use cloud-init/ansible/whatever and yum install postgresql), or build golden disk images with IB and do updates by instance teardown/spinup. These things have the same very fundamental tradeoffs around systems management that we're introducing here.

I think indeed, it is on us to provide guidance. But I don't think there's any real way to not support these two paths ("configure" vs "build/own").

Now, what I hope actually is this fundamental change in mindset and technology eventually leads us to a place where we have a more "seamless" spectrum into what is e.g. today Fedora Cloud and Fedora Server, instead of having harder barriers.


5 participants