
OpenShift CoreOS Layering (provisional) #1032

Merged

merged 5 commits into openshift:master on Mar 15, 2022

Conversation

cgwalters
Member

**NOTE: Nothing in this proposal should be viewed as final. It is highly likely that details will change. It is quite possible that larger architectural changes will be made as well.**

Change RHEL CoreOS as shipped in OpenShift to be a "base image" that can
be used in layered container builds and then booted. This will allow
custom 3rd-party agents, delivered via RPMs, to be installed in a container
build. The MCO will roll out and monitor these custom builds the same
way it does the "pristine" CoreOS image today.

This is the OpenShift integration of [ostree native containers](https://fedoraproject.org/wiki/Changes/OstreeNativeContainer) or [CoreOS layering](https://github.com/coreos/enhancements/pull/7) via the MCO.


A month after that, the administrator wants to make a configuration change, and creates a `machineconfig` object targeting the `worker` pool. This triggers a new image build. But the 3rd party yum repository is down, and the image build fails. The operations team gets an alert and resolves the repository connectivity issue. They manually restart the build, which succeeds.

#### Kernel hotfix
Contributor

@cgwalters yes, this user story exactly covers the problem I was asking about on Slack (subject to the warning down below under "Support Procedures" about making sure it's very clear when a customer is doing this)

@cgwalters
Member Author

I'd like to proceed with at least adding a new rhcos entry to the release payload for 4.11 (it wouldn't actually be used at first), and we plan to land some preparatory work for this in the MCO git main/master. So if we can get even tentative initial approval for the broad outlines of this proposal, that'd be great!

Member

@cybertron left a comment

Still working my way through this, but a few initial comments inline from a host networking perspective.

### Non-Goals

- While the base CoreOS layer/ostree-container tools will be usable outside of OpenShift, this enhancement does not cover or propose any in-cluster functionality for exporting the forward-generated image outside of an OpenShift cluster. In other words, it is not intended to be booted (`$ rpm-ostree rebase <image>`) from outside of a cluster.
- This proposal does not cover generating updated "bootimages"; see https://github.com/openshift/enhancements/pull/201
Member

Is this a possible future extension? One of our big use cases for image customization is injecting network configuration that may be needed on initial boot, which IIUC is not currently supported by this proposal.

Member Author

Everything that works today will continue to work. We've invested a ton of time on the CoreOS side into handling all kinds of network configuration, and that all will continue to work. Specifically, static IP addresses can (and probably should) still be configured via providing Ignition on a per-machine/node basis.
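To make the existing mechanism concrete, here is a sketch of how static IP configuration can be carried today as a NetworkManager keyfile in a Butane snippet (which renders to a MachineConfig). This is illustrative only: the interface name, addresses, and Butane version are placeholders, not part of this proposal.

```shell
# Hypothetical Butane config writing a NetworkManager keyfile for a static IP.
# All names and addresses below are placeholders.
cat > 99-worker-static-ip.bu <<'EOF'
variant: openshift
version: 4.10.0
metadata:
  name: 99-worker-static-ip
  labels:
    machineconfiguration.openshift.io/role: worker
storage:
  files:
    - path: /etc/NetworkManager/system-connections/enp1s0.nmconnection
      mode: 0600
      contents:
        inline: |
          [connection]
          id=enp1s0
          type=ethernet
          interface-name=enp1s0

          [ipv4]
          method=manual
          addresses=192.0.2.10/24
          gateway=192.0.2.1
          dns=192.0.2.1
EOF
# Rendered to a MachineConfig with: butane 99-worker-static-ip.bu -o 99-worker-static-ip.yaml
```

Per-machine (rather than per-pool) delivery of this kind of configuration is the subject of the pointer-config discussion that follows.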

##### Per machine state, the pointer config

See [MCO issue 1720 "machine-specific machineconfigs"](https://github.com/openshift/machine-config-operator/issues/1720).
We need to support per machine/per node state like static IP addresses and hostname.
Member

Related to the above, how does static IP configuration work with this? In some environments the static IP will need to be configured before the node can pull any images.


#### Intersection with https://github.com/openshift/enhancements/pull/201

In the future, we may also generate updated "bootimages" from the custom operating system container.
Member

Okay, I think that answers my questions above. The boot image customization would be a future feature (that we would be very interested in).

cgwalters added a commit to cgwalters/machine-config-operator that referenced this pull request Mar 9, 2022
Part of openshift/enhancements#1032

We'll add the new-format image into the payload alongside the old
one until we can complete the transition.

(There may eventually also be a separate `rhel-coreos-extensions` image,
 so this is just the start)

Note this PR is just laying groundwork; the new format container
will not be used by default.
@cgwalters
Member Author

@mrunalp @aravindhp Given this is all provisional, any further concerns before merging this?


## Summary

Change RHEL CoreOS as shipped in OpenShift to be a "base image" that can be used in layered container builds and then booted. This will allow custom 3rd-party agents, delivered via RPMs, to be installed in a container build. The MCO will roll out and monitor these custom builds the same way it does the "pristine" CoreOS image today.


Currently, AFAIK the licensing of CoreOS does not allow reusing it for other purposes (FIXME). Will this change with this proposal?

Member Author

@cgwalters Mar 10, 2022

There are (to the best of my knowledge) no licensing restrictions; "RHEL CoreOS" is simply not a separate product from OpenShift. The OCP and RHEL organizations would just decline to support usage outside of it.

I think this question is really addressed by https://github.com/openshift/enhancements/pull/1032/files#diff-d56db1c5d442467039e4ac4e6d118edd49185320a947906e8fd6973cbbd5f1abR60

But the underlying non-OCP functionality of "boot a container" is in https://fedoraproject.org/wiki/Changes/OstreeNativeContainer and https://github.com/coreos/enhancements/blob/main/os/coreos-layering.md and will ship in rpm-ostree in RHEL. Truly productizing that is a whole separate effort. I pushed d25534f related to that.


example.bank's security team requires a 3rd party security agent to be installed on bare metal machines in their datacenter. The 3rd party agent comes as an RPM today, and requires its own custom configuration. While the 3rd party vendor has support for execution as a privileged daemonset on their roadmap, it is not going to appear soon.

After initial cluster provisioning is complete, the administrators at example.bank supply a configuration that adds a repo file to `/etc/yum.repos.d/agentvendor.repo` and requests installation of a package named `some-3rdparty-security-agent` as part of a container build.
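Such a derived build might look like the following sketch, assuming the layering pattern described in this proposal. The base image pullspec and repo URL are hypothetical placeholders; the repo filename and package name come from the story above.

```shell
# Sketch of the example.bank user story: layer a 3rd-party agent RPM on top
# of a (hypothetical) RHEL CoreOS base image. Pullspec and repo URL are placeholders.
mkdir -p build

# Repo file supplied by the administrators, placed in the build context
cat > build/agentvendor.repo <<'REPO'
[agentvendor]
name=Agent Vendor
baseurl=https://rpm.agentvendor.example.com/el8/
enabled=1
gpgcheck=1
REPO

# Containerfile deriving from the CoreOS base image
cat > build/Containerfile <<'CF'
FROM quay.io/example/rhel-coreos:latest
ADD agentvendor.repo /etc/yum.repos.d/agentvendor.repo
RUN rpm-ostree install some-3rdparty-security-agent && \
    rpm-ostree cleanup -m && \
    ostree container commit
CF

# An admin (or the MCO) would then run something like:
#   podman build -t registry.example.com/custom-coreos:latest build/
```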
Contributor

Is it also possible to do this at initial cluster install, by running `openshift-install create manifests` and then adding a custom manifest with this config, which the MCO would roll out during install?

Member Author

We could clearly use the bootstrap node to perform a build (e.g. `podman build`). I think performance needs will drive us to do this, because otherwise we'd be adding a whole extra phase to rollout: we'd need to wait for the cluster to be up before doing a build, to generate the container image to roll out to all the nodes.

Member Author

I suspect a number of administrators will end up wanting to provide an externally built image, though; that's alluded to above. There's a whole host of tradeoffs with that.

Contributor

Hmm, in that case, where would the builds be pushed, so that they can be picked up by the individual nodes' MCO?

Member Author

@cgwalters Mar 10, 2022

We'd run a registry on the bootstrap node; it could literally be `oc image serve`, or the main registry container in a non-production mode, etc.

Contributor

Right, externally built images are definitely an option, but most of the time we get requests like "I want to add this systemd unit/file via Ignition" or just changing some kargs, for which this will be useful.

Member Author

Yep definitely, all that works via MachineConfig today and we intend to keep it working.

@openshift-ci
Contributor

openshift-ci bot commented Mar 10, 2022

@cgwalters: all tests passed!


@mrunalp
Member

mrunalp commented Mar 10, 2022

/approve

(will keep it open for a couple more days to see if there are more comments from others before we merge)

@openshift-ci
Contributor

openshift-ci bot commented Mar 10, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrunalp


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 10, 2022


1. The `machine-os-content` shipped as part of the release payload will change format to the new "native ostree-container" format, in which the OS content appears as any other OCI/Docker container.


@control-d I think this means we can now ship this image on registry.redhat.io (or the container catalog) and not just on Quay.


I wonder if this does or would solve some of the CVE reporting issues we have with RHCOS today?

Member Author

Yes, a big sub-goal here is that the container image (though it has some "ostree stuff" in there) otherwise looks like e.g. the UBI base image, and should be directly understandable to security scanners like Clair.

There are some details though around how we handle extensions; those would not be visible to scanners.


1. The `machine-os-content` shipped as part of the release payload will change format to the new "native ostree-container" format, in which the OS content appears as any other OCI/Docker container.
(In contrast, today the existing `machine-os-content` has an ostree repository inside a UBI image, which is hard to inspect and cannot be used for derived builds.) For more information, see [ostree-rs-ext](https://github.com/ostreedev/ostree-rs-ext/) and [CoreOS layering](https://github.com/coreos/enhancements/pull/7).
Contributor

Will this move us away from needing RPM-packaged versions of things that go into the image? Or will we still need to create RPMs, then install those RPMs into the image that becomes the machine-os-content image?

My impression is that it won't remove the requirement, but it'd be nice :)

Member Author

We actually today do ship some content in CoreOS that doesn't come from an RPM, such as systemd units and the like: https://github.com/coreos/fedora-coreos-config/tree/testing-devel/overlay.d

Doing that for binaries: we want to enable that for customers. For us (e.g. to stop building the kubelet via RPM) it's definitely possible, but has other ramifications.


#### openshift-install bootstrap node process

A key question here is whether we need the OpenShift build API as part of the bootstrap node or not. One option is to do a `podman build` on the bootstrap node.
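A minimal sketch of what the `podman build` option could look like on the bootstrap node. Everything here is an assumption, not a committed design: the registry address, paths, and the push step are placeholders, and the serving side could be `oc image serve` or the in-cluster registry image in a non-production mode, as discussed below.

```shell
# Hypothetical bootstrap-node build script; paths and the local registry
# address are placeholders.
cat > bootstrap-build.sh <<'EOF'
#!/bin/bash
set -euo pipefail
# Build the derived OS image directly on the bootstrap node with plain podman
podman build -t localhost:5000/mco/custom-coreos:latest /opt/mco/build-context
# Push to a throwaway registry running on the bootstrap node (this could be
# `oc image serve` or the in-cluster registry image in a non-production mode)
podman push --tls-verify=false localhost:5000/mco/custom-coreos:latest
EOF
chmod +x bootstrap-build.sh
```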
Contributor

In the future it may even be a question of whether the OpenShift build API is available, period, as we try to move more components to be "optional" (and as we deprecate the buildv1 API, though our current API removal policies say that buildv1 cannot go away entirely within the v4 major version, so there's less risk of that).

A dependency like this would, I think, make the case that the buildv1 API is never optional, until/unless you do your own builds (run your own pods with podman, manage your own image pushing) without using the buildconfig API/controller. That carries its own challenges (reimplementing a lot of push/pull secrets, proxy configs, CAs, ICSP logic), though you wouldn't need the full set of capabilities that buildv1 provides for this narrow use case.

Member Author

> (reimplementing a lot of push/pull secrets, proxy configs, CAs, ICSP logic)

We already need to be able to pull container images on the bootstrap node with plain podman, and we already do. So that covers CAs and ICSP too.

The push aspect is interesting, but I think the most likely approach is to run a basic in-memory registry (possibly the in-cluster one, just in a non-production mode), and we can easily set things up to allow pushing to that. Or we just use `oc image serve`, etc.


#### Registry availability

If implemented in the obvious way, OS updates would fail if the cluster-internal registry is down.
Contributor

Given the desire to have the cluster-internal registry be optional, it seems like this feature needs to support pushing/pulling the images from arbitrary registries, not just the internal one.

1. The CVO will replace a ConfigMap in the MCO namespace with the OS payload reference, as it does today.
1. The MCO will update an `imagestream` object (e.g. `openshift-machine-config-operator/rhel-coreos`) when this ConfigMap changes.
1. The MCO builds a container image for each `MachineConfigPool`, using the new base image.
1. Each machineconfig pool will also support a `custom-coreos` `BuildConfig` object and imagestream. This build *must* use the `mco-coreos` imagestream as a base. The result of this will be rolled out by the MCO to nodes.
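The steps above could be sketched as a `BuildConfig` along the following lines. This is a hypothetical shape only: the field values, imagestream names, inline Dockerfile, and example package are illustrative, and the actual API surface is explicitly not settled in this proposal.

```shell
# Hypothetical custom-coreos BuildConfig for a worker pool; all names are
# placeholders and the API shape is not committed.
cat > custom-coreos-buildconfig.yaml <<'EOF'
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: custom-coreos
  namespace: openshift-machine-config-operator
spec:
  source:
    type: Dockerfile
    dockerfile: |
      FROM mco-coreos:latest
      RUN rpm-ostree install usbguard && ostree container commit
  strategy:
    type: Docker
    dockerStrategy:
      from:
        kind: ImageStreamTag
        name: mco-coreos:latest
  output:
    to:
      kind: ImageStreamTag
      name: custom-coreos:latest
EOF
```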
Contributor

So if I'm understanding correctly, this is the spot at which a user/admin would define the image customizations they want to apply, by defining a buildconfig that points to a dockerfile containing their additions, more or less?

Picking up @fabiand's question from the mailing list:

> How is it imagined to include different packages from different vendors in the same base image? Let's say we've got a HW vendor providing (proprietary) drivers for their GPU, hardware monitoring tools by the system vendor, and some IDS. Some of this will be provided by an Operator, other elements by a human. How are they expected to create a single new base image?

and @cgwalters answer:

> Hmm; in theory perhaps we could create an interface (like a CRD) that allowed registering "build fragments" (much like MachineConfig) that could all be gathered and merged into a single build.

Is there another option here where we allow multiple user-provided buildconfigs, with some sort of ordering, where only the final build produces the actual CoreOS image that will be consumed by the cluster? I don't know if it's strictly necessary, vs. just aggregating all the customization logic into a single buildconfig/dockerfile, but I wanted to continue the thread.

Member Author

I feel confident that, no matter what we pick, it will have feature gaps 😄

This enhancement deliberately doesn't yet try to call out the details of how we propose the user interface to work, because I think we simply need time to iterate on it, provide a functioning preview, and learn what works before we can draw a roadmap into that area.

Contributor

Totally fair; just trying to get some thoughts in place for when you get to that point.

@mrunalp
Member

mrunalp commented Mar 15, 2022

/lgtm

I think we have enough to merge this provisional enhancement. We can keep iterating and working through what the UI will look like and update the KEP.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 15, 2022
@mrunalp mrunalp merged commit 559dd76 into openshift:master Mar 15, 2022
jkyros pushed a commit to jkyros/machine-config-operator that referenced this pull request Jun 22, 2022
@zhouhao3
Contributor

zhouhao3 commented Aug 3, 2022

@cgwalters Hi, I want to test this feature in OCP. I noticed that one of the main tasks of this proposal is to build a container image of RHEL CoreOS. Has this image been built? I couldn't find it. Thanks!

@cgwalters
Member Author

It is published but only for OCP developers right now at registry.ci.openshift.org/rhcos-devel/rhel-coreos:latest.

Publishing this image officially is tracked by https://url.corp.redhat.com/d4e7ed8
