Design for osImageURL updates - integration with CVO/release payload #183

Closed
cgwalters opened this issue Nov 19, 2018 · 41 comments

Comments

@cgwalters
Member

cgwalters commented Nov 19, 2018

I wanted to elaborate here on the current status of this. We have a PR in #363 which will finally close the loop and inject machine-os-content from the release payload all the way into the MachineConfig objects, which will result in the MCD updating.

The final architecture will be:

A new kernel erratum turns into an RPM, which is converted into ostree and then into an oscontainer. A bit more information on the build system side here. The oscontainer makes it into a new release payload published on quay.io.

At some point the release payload is pulled down by the CVO. The payload includes an osimageurl ConfigMap that references that container (the same thing as the machine-os-content ImageStream). The CVO updates the ConfigMap, which you can see via

oc -n openshift-machine-config-operator get configmap/machine-config-osimageurl

The operator notices the change to the configmap and updates the "controllerconfig" which is an internal CRD that is used as the primary input to the MCC. See oc get -o yaml controllerconfig.

The "template" sub-controller of the MCC then updates machineconfigs/00-master and machineconfigs/00-worker.

The "render" sub-controller of the MCC generates new "rendered" MCs that look like machineconfigs/master-<hash> and machineconfigs/worker-<hash> and updates the MachineConfigPools to target them. For more information on this, see the MCC docs.

On each node the MCD will get the new osimageurl, and if it's different than what's booted, it will pull down the container and rebase to it and reboot. This is also the same as any other config change.
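
The last step above boils down to a comparison on each node. A minimal sketch in Python, purely for illustration: the real MCD is written in Go, and `needs_os_update` plus the `registry.example.com` references are hypothetical names, not the daemon's API. Treating an empty desired value as "nothing to do" is also an assumption.

```python
def needs_os_update(booted_os_image_url: str, desired_os_image_url: str) -> bool:
    """Return True when the node should pull the oscontainer, rebase, and reboot."""
    # Assumption: an empty desired value means the config does not request
    # a specific OS image, so there is nothing to do.
    if not desired_os_image_url:
        return False
    # Any difference between what is booted and what is desired triggers an update.
    return booted_os_image_url != desired_os_image_url


# Hypothetical image references, pinned by digest.
booted = "registry.example.com/rhcos/os@sha256:" + "a" * 64
desired = "registry.example.com/rhcos/os@sha256:" + "b" * 64

print(needs_os_update(booted, desired))
print(needs_os_update(booted, booted))
```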

@jlebon
Member

jlebon commented Nov 19, 2018

For completeness, one alternative is to make it a MachineConfig object instead, which is then used as the base config containing the osImageURL, into which all other (Ignition) configs are merged. I think this maps slightly more closely to what's done today.

That said, using a separate ConfigMap allows us to have richer metadata. E.g. pkg lists & diffs, OSTree commits & versions, etc... The idea being that an "Update available" button can show all that stuff without having to hunt down the osImage.

@ashcrow
Member

ashcrow commented Nov 19, 2018

Remember, this data will need to flow one way or another down into a MachineConfig. If it's easier we could add some more information to MCs themselves.

@jlebon
Member

jlebon commented Nov 19, 2018

Strawman:

apiVersion: v1
kind: ConfigMap
metadata:
  name: rhcos-release-info
data:
  metadata.json: '{"osImageUrl": "registry.svc.ci.openshift.org/rhcos/os-maipo@sha256:4efb36f476405f8e30256733ac900e11b70833dc6a6b54179db60a501fa47124", "ostree-version": "47.115", "ostree-checksum": "b0edcc594e65ccebcc917b0ead76ea6894c0d4d672a63ceee5bdc59976c55bf9", "pkgdiff": [["origin-clients", 2, {"NewPackage": ["origin-clients", "4.0.0-0.alpha.0.610.98ebf23", "x86_64"], "PreviousPackage": ["origin-clients", "4.0.0-0.alpha.0.607.49e9f08", "x86_64"]}], ...], "pkglist": ["GeoIP-1.5.0-13.el7.x86_64", "NetworkManager-1:1.12.0-7.el7_6.x86_64", ...]}'

Basically, we can include a subset of what coreos-assembler already outputs. Or maybe just a link to that meta.json instead?
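
To make the strawman concrete, here is a rough sketch of how a consumer might read it. It assumes the ConfigMap keeps the JSON document under `data["metadata.json"]` as shown above; the `release_info` helper and the shortened example values are hypothetical.

```python
import json

# Hypothetical ConfigMap shaped like the strawman, with shortened values.
configmap = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "rhcos-release-info"},
    "data": {
        "metadata.json": json.dumps({
            "osImageUrl": "registry.example.com/rhcos/os@sha256:" + "0" * 64,
            "ostree-version": "47.115",
            "pkglist": ["GeoIP-1.5.0-13.el7.x86_64"],
        })
    },
}


def release_info(cm: dict) -> dict:
    """Decode the JSON document stored under data["metadata.json"]."""
    return json.loads(cm["data"]["metadata.json"])


info = release_info(configmap)
print(info["osImageUrl"])
print(info["ostree-version"], len(info["pkglist"]))
```

An "Update available" view could then render the version and package list from `info` without pulling the OS image itself.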

@jlebon
Member

jlebon commented Nov 19, 2018

> Remember, this data will need to flow one way or another down into a MachineConfig. If it's easier we could add some more information to MCs themselves.

I think it shouldn't be too hard to adapt the MCC either way so we shouldn't constrain ourselves too much on what's easier.

@ashcrow
Member

ashcrow commented Nov 19, 2018

> I think it shouldn't be too hard to adapt the MCC either way so we shouldn't constrain ourselves too much on what's easier.

Fair. Rephrased: what's doable and consumable within specific constraints 😃

@cgwalters
Member Author

cgwalters commented Nov 19, 2018

> Basically, we can include a subset of what coreos-assembler already outputs.

Do we need anything other than the "registry.svc.ci.openshift.org/rhcos/os-maipo@sha256:4efb36f476405f8e30256733ac900e11b70833dc6a6b54179db60a501fa47124" string?

EDIT: Ah sorry I missed this:

> The idea being that an "Update available" button can show all that stuff without having to hunt down the osImage.

Yeah, though...a tricky part about this is that the pkgdiff is against the previous version, which may not be the one they're updating to...

Mmm. I'm OK stuffing the whole meta.json in there for now, but I suspect we're going to need to add something smarter later.

@jlebon
Member

jlebon commented Nov 20, 2018

> Yeah, though...a tricky part about this is that the pkgdiff is against the previous version, which may not be the one they're updating to...

Yeah, I think to do this correctly, the pkg diff would have to be computed from the pkg lists rather than precomputed. Anyway, those are things that could easily come later. We just have to make sure we leave the door open for it.
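
A rough sketch of computing the diff from two package lists at update time. It assumes entries follow the `name-version-release.arch` shape seen in the strawman's `pkglist`; `split_nvra`, `pkg_diff`, and the `example-pkg` entry are hypothetical.

```python
from typing import Dict, List, Tuple


def split_nvra(pkg: str) -> Tuple[str, str]:
    """Split 'name-version-release.arch' into (name, 'version-release.arch')."""
    # Package names may contain dashes, so split from the right.
    name, version, release_arch = pkg.rsplit("-", 2)
    return name, version + "-" + release_arch


def pkg_diff(old: List[str], new: List[str]) -> Dict[str, list]:
    """Compare the package list of the booted OS with that of the target OS."""
    old_map = dict(split_nvra(p) for p in old)
    new_map = dict(split_nvra(p) for p in new)
    return {
        "added": sorted(n for n in new_map if n not in old_map),
        "removed": sorted(n for n in old_map if n not in new_map),
        "changed": sorted(
            (n, old_map[n], new_map[n])
            for n in new_map
            if n in old_map and old_map[n] != new_map[n]
        ),
    }


booted = [
    "GeoIP-1.5.0-13.el7.x86_64",
    "NetworkManager-1:1.12.0-7.el7_6.x86_64",
    "origin-clients-4.0.0-0.alpha.0.607.49e9f08.x86_64",
]
target = [
    "GeoIP-1.5.0-13.el7.x86_64",
    "origin-clients-4.0.0-0.alpha.0.610.98ebf23.x86_64",
    "example-pkg-1.0-1.el7.x86_64",  # hypothetical package
]
diff = pkg_diff(booted, target)
print(diff)
```

Because the diff is derived from whatever the node is actually running, it stays meaningful even when the node skips intermediate versions.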

> Mmm. I'm OK stuffing the whole meta.json in there for now, but I suspect we're going to need to add something smarter later.

Yeah, that sounds fine to me.

@smarterclayton
Contributor

Be aware you’re going to eventually need to have a distinction between “the config says this is the latest” and “I’m ready to roll that out to the nodes”. Design with that in mind because new kubelets are going to happen ~ 1/week

@smarterclayton
Contributor

Pkgdiff is useless. There is no guarantee it has any relevance. Think package manifest instead of diffs. Our errata and higher level tools will calculate the diff

@jlebon
Member

jlebon commented Nov 21, 2018

> Be aware you’re going to eventually need to have a distinction between “the config says this is the latest” and “I’m ready to roll that out to the nodes”. Design with that in mind because new kubelets are going to happen ~ 1/week

Can you elaborate on that? Do you mean making sure we actually upgrade to the selected release instead of whatever happens to now be the latest at the time we're actually ready to upgrade? I think that should be covered yeah. The metadata includes the full sha256 of the oscontainer.
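
As an illustration of why the full sha256 matters: a digest reference always names the same content, while a tag can move between the time a release is selected and the time it is rolled out. A small sketch of checking that a reference is pinned; the `is_digest_pinned` helper and the `:latest` example are hypothetical.

```python
import re

# Matches name@sha256:<64 lowercase hex digits>.
_DIGEST_REF = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True if the reference names immutable content by sha256 digest."""
    return _DIGEST_REF.match(image_ref) is not None


pinned = (
    "registry.svc.ci.openshift.org/rhcos/os-maipo@sha256:"
    "4efb36f476405f8e30256733ac900e11b70833dc6a6b54179db60a501fa47124"
)
floating = "registry.svc.ci.openshift.org/rhcos/os-maipo:latest"  # hypothetical tag

print(is_digest_pinned(pinned))
print(is_digest_pinned(floating))
```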

> Pkgdiff is useless. There is no guarantee it has any relevance. Think package manifest instead of diffs. Our errata and higher level tools will calculate the diff

Yup, see coreos/coreos-assembler#226.

@smarterclayton
Contributor

We don't decide on version 4.0.6 until the last moment. So the goal of automation is to continuously have a set of 4.0.6 candidates, from which we then pick one and ship it. The "train" mindset, not the "artisanal release payload".

@smarterclayton
Contributor

Expectation is that you will build an image and push it to the openshift origin integration image stream, then reference the component “os” from this operator and have the dummy value substituted.

In OCP, we do something similar where the OS content gets built via whatever process and shows up in the prerelease list, and is processed the same way.

You can build anywhere - we just need you to push/be imported to the right place

@smarterclayton
Contributor

Note the dummy value can be real - but since we need to mirror the content you have to be referenced in the operator list which means you need to be sucked into the right place.

Push vs scheduled import is also possible, but if we do scheduled import the source location has to be appropriately gated like a push would (only changes if you test against latest in an install)

@wking
Member

wking commented Nov 30, 2018

> ... but if we do scheduled import the source location has to be appropriately gated like a push would (only changes if you test against latest in an install)

That breaks the scheduled-import model, doesn't it? How do you know the "latest" test used for the gate hadn't been surpassed by further release-payload work? Or will errors there be caught by post-testing?

@cgwalters
Member Author

My notes from playing around with MCD state so far:

I am still a bit confused as to the current flow of the osImageURL; should oc edit machineconfig/00-worker work to render a new one? Currently I edit the generated one.

Next, currently our oscontainers are uploaded to api.ci under the rhcos/ namespace and they require a separate pull secret, and we don't have a defined way to upload that to nodes. Actually I am a bit confused by pull secrets today, there is openshift/installer#775 which merged but I don't see /root/.docker being created on my master/worker?

For some reason a manual podman login isn't working; I need to trace that.

@ashcrow
Member

ashcrow commented Dec 6, 2018

> I am still a bit confused as to the current flow of the osImageURL; should oc edit machineconfig/00-worker work to render a new one? Currently I edit the generated one.

I've never tried to edit the source and always edited the generated content. Only generated versions should ever be available to the MCD from the MCO.

FWIW the resource version should get updated on edit. The MCO is in charge of updating the annotations, e.g. in pkg/controller/node/node_controller.go:syncMachineConfigPool(..). When machineconfiguration.openshift.io/desiredConfig doesn't match machineconfiguration.openshift.io/currentConfig, the MCD will attempt to take action.
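
The annotation check described above, sketched in Python for illustration only. The annotation keys are the ones named in this comment; `config_update_pending` and the `worker-1111`/`worker-2222` config names are hypothetical, and treating a missing desiredConfig as "nothing to do" is an assumption.

```python
DESIRED = "machineconfiguration.openshift.io/desiredConfig"
CURRENT = "machineconfiguration.openshift.io/currentConfig"


def config_update_pending(annotations: dict) -> bool:
    """Return True when the node's desired config differs from its current one."""
    desired = annotations.get(DESIRED)
    current = annotations.get(CURRENT)
    # Assumption: with no desired config set there is nothing to act on.
    return bool(desired) and desired != current


# Hypothetical node annotations using the worker-<hash> naming scheme.
up_to_date = {DESIRED: "worker-1111", CURRENT: "worker-1111"}
out_of_date = {DESIRED: "worker-2222", CURRENT: "worker-1111"}

print(config_update_pending(up_to_date))
print(config_update_pending(out_of_date))
```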

> Next, currently our oscontainers are uploaded to api.ci under the rhcos/ namespace and they require a separate pull secret, and we don't have a defined way to upload that to nodes. Actually I am a bit confused by pull secrets today, there is openshift/installer#775 which merged but I don't see /root/.docker being created on my master/worker?

That I'm not sure about ☹️

@cgwalters
Member Author

Looked at that PR more carefully and it's only about the bootstrap. The pull secret goes into the main oc get configmap -n kube-system cluster-config-v1 (thanks Kirsten for mentioning that one earlier!).

What I don't understand yet is where that pull secret ends up on the nodes.

@kikisdeliveryservice
Contributor

kikisdeliveryservice commented Dec 6, 2018

@cgwalters Pull secret gets written into the controller config here:
https://github.com/openshift/machine-config-operator/blob/4b28c96a225bbd9711c98c73a6d83f1c4ed9653a/lib/resourcemerge/machineconfig.go

@jlebon
Member

jlebon commented Dec 6, 2018

> I am still a bit confused as to the current flow of the osImageURL; should oc edit machineconfig/00-worker work to render a new one? Currently I edit the generated one.

Hmm, I think you're right that that should trigger a regeneration of a new machineconfig. I'm trying that here, but my MCC now is hitting:

E1206 15:14:13.341864       1 reflector.go:322] github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/informers/factory.go:130: Failed to watch *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?resourceVersion=9818&timeoutSeconds=360&watch=true: dial tcp 172.30.0.1:443: connect: connection refused
E1206 15:14:13.341972       1 reflector.go:322] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to watch *v1.ControllerConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/controllerconfigs?resourceVersion=4441&timeoutSeconds=365&watch=true: dial tcp 172.30.0.1:443: connect: connection refused
E1206 15:14:13.342012       1 reflector.go:322] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to watch *v1.MachineConfigPool: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigpools?resourceVersion=7665&timeoutSeconds=503&watch=true: dial tcp 172.30.0.1:443: connect: connection refused
E1206 15:14:13.342044       1 reflector.go:322] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: Failed to watch *v1.MachineConfig: Get https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1/machineconfigs?resourceVersion=7316&timeoutSeconds=387&watch=true: dial tcp 172.30.0.1:443: connect: connection refused

which is a flavour of some of the errors people were hitting in #199. Will try to dig deeper there.

> Next, currently our oscontainers are uploaded to api.ci under the rhcos/ namespace and they require a separate pull secret, and we don't have a defined way to upload that to nodes. Actually I am a bit confused by pull secrets today, there is openshift/installer#775 which merged but I don't see /root/.docker being created on my master/worker?

I will admit the hacky way I've been testing upgrades so far includes prepending the machine-config-daemon invocation in the daemonset config with oc login ... &&. Fixing that to use secrets would be awesome!

@cgwalters
Member Author

Yep, I missed it somehow, I see it now, it goes to /var/lib/kubelet/config.json. Adding the secret there works.
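
For anyone repeating this, a hedged sketch of merging an extra registry credential into such a file. It assumes the file uses the usual Docker-style `{"auths": {...}}` layout and works on an in-memory dict rather than touching /var/lib/kubelet/config.json; `add_registry_auth`, the user name, and the token are hypothetical.

```python
import base64
import json


def add_registry_auth(config: dict, registry: str, user: str, token: str) -> dict:
    """Return a copy of a Docker-style auth config with one more registry entry."""
    merged = dict(config)
    auths = dict(merged.get("auths", {}))
    # Docker-style configs store base64("user:password") under "auth".
    encoded = base64.b64encode((user + ":" + token).encode()).decode()
    auths[registry] = {"auth": encoded}
    merged["auths"] = auths
    return merged


# Hypothetical existing content and credentials.
existing = {"auths": {"quay.io": {"auth": "ZXhhbXBsZTpleGFtcGxl"}}}
updated = add_registry_auth(
    existing, "registry.svc.ci.openshift.org", "ci-user", "example-token"
)
print(json.dumps(updated, indent=2))
```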

@kikisdeliveryservice
Contributor

@jlebon those timeouts are related to the openshift apiserver: openshift/origin#21612

@aaronlevy

> I am still a bit confused as to the current flow of the osImageURL; should oc edit machineconfig/00-worker work to render a new one? Currently I edit the generated one.

My understanding is that the source should be edited (which should result in a new generated config). We should not be in the habit of editing the generated config (and if that happened, the MCC should actually roll-out a non-edited config to all nodes -- as that is built from the canonical source(s)).

It might help to add docs along the lines of, "As a user, how do I modify host configuration?" and it points to creating a new (source) machine config (as a layer that will be merged with config we control), and outlining that you should not be editing generated configs (that will ultimately be stomped on by MCC rolled-out source-generated configs anyway).

cc @abhinavdahiya these assumptions are still correct.

@jlebon
Member

jlebon commented Dec 6, 2018

> > I am still a bit confused as to the current flow of the osImageURL; should oc edit machineconfig/00-worker work to render a new one? Currently I edit the generated one.
>
> Hmm, I think you're right that that should trigger a regeneration of a new machineconfig.

OK yeah, this does work for me. After sorting out the MCC issue (@kikisdeliveryservice thanks! I tried out the workaround there and it seems like it worked), and doing oc edit machineconfig 00-worker, the daemon shows:

I1206 19:46:08.991607    4369 update.go:486] Updating OS to http://example.com
I1206 19:46:08.991616    4369 run.go:13] Running: /bin/pivot http://example.com
pivot version 0.0.2
...

@jlebon
Member

jlebon commented Dec 6, 2018

> My understanding is that the source should be edited (which should result in a new generated config).

Yeah, this is mostly for testing stuff out.

> It might help to add docs along the lines of, "As a user, how do I modify host configuration?" and it points to creating a new (source) machine config (as a layer that will be merged with config we control), and outlining that you should not be editing generated configs (that will ultimately be stomped on by MCC rolled-out source-generated configs anyway).

The issue is that when the MCC merges configs, it doesn't replace the base osImageURL:

// It only uses the OSImageURL from first object and ignores it from rest.
(And IIUC, the first config is the 00 one currently generated from the baked-in template, though the osImageURL part of it comes from: ).

Eventually, testing an OS update for hacking could be done by changing the configmap directly (or whatever we settle on in this ticket). (Or at an even higher level, pointing at a custom release payload).

@jlebon
Member

jlebon commented Dec 6, 2018

To add some context to the previous comment: this is strictly for testing changes to osImageURL. Other filesystem updates that are part of the Ignition spec are merged in, so can be done indeed by creating a new source machine config that gets squashed into a new generated machine config.

@kikisdeliveryservice
Contributor

kikisdeliveryservice commented Jan 15, 2019

@cgwalters I just tried an update on bin/openshift-install unreleased-master-78-g63bdb7fce105c6a5f0422b055b8c1164dddc53fd using the config above and it worked as expected.

Logs for ref: http://pastebin.test.redhat.com/696135

Add: this was run on AWS

@kikisdeliveryservice
Contributor

@cgwalters is there a specific order the MCs for the osImageURL updates need to adhere to? Because when applying a 2nd and 3rd config I'm running into errors. Must the newest config come first or last?

@cgwalters
Member Author

Since #279 we take the first non-empty.
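
A sketch of what "first non-empty" means in practice. It assumes the configs are considered in lexicographic order of their names, which is an assumption made for illustration; `merged_os_image_url` and the example URLs are hypothetical, and the real merge lives in the MCC's Go code.

```python
def merged_os_image_url(machine_configs: list) -> str:
    """Pick the osImageURL of the first config, in name order, that sets one."""
    # Assumption: configs are ordered lexicographically by metadata.name.
    for mc in sorted(machine_configs, key=lambda m: m["metadata"]["name"]):
        url = mc.get("spec", {}).get("osImageURL", "")
        if url:
            return url
    return ""


configs = [
    {"metadata": {"name": "00-worker"},
     "spec": {"osImageURL": "registry.example.com/os@sha256:base"}},
    {"metadata": {"name": "01-worker-kubelet"}, "spec": {}},
    {"metadata": {"name": "00-0walters-worker-osimageurl"},
     "spec": {"osImageURL": "registry.example.com/os@sha256:override"}},
]

print(merged_os_image_url(configs))
```

Under that ordering assumption, an override has to sort before 00-worker to win, which is why the test configs use a 00-0 prefix.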

@kikisdeliveryservice
Contributor

Gotcha will try again with proper order for 2nd test MC to see if that was the cause of errors.

@kikisdeliveryservice
Contributor

Ok tried the second config with name: 00-0valters-worker-osimageurl 🤣 to land in the first spot and it updated as I expected.

cgwalters added a commit to cgwalters/machine-config-operator that referenced this issue Jan 18, 2019
Have the MCC take `osImageURL` as provided by the cluster update/release payload
and generate a `00-{master,worker}-osimageurl` MC from it, which ensures
the MCD will update the node to it.

However, we need special handling for the *initial* case where we boot
into a target config, but we may be using an old OS image.

Change the MCC to write the target osImageURL from the MC it uses for
bootstrapping to `/etc/rhcos-initial-pivot-target`.  This will then be
handled by the `rhcos-initial-pivot.service` systemd unit.

Closes: openshift#183
cgwalters added a commit to cgwalters/machine-config-operator that referenced this issue Jan 21, 2019
Have the MCC take `osImageURL` as provided by the cluster update/release payload
and generate a `00-{master,worker}-osimageurl` MC from it, which ensures
the MCD will update the node to it.

However, we need special handling for the *initial* case where we boot
into a target config, but we may be using an old OS image.  Currently
the MCD would treat this as "config drift" and go degraded.

Today we write the node annotations to a file in `/etc` as part of the
rendered Ignition.  Use that as a "bootstrap may be required" flag,
and handle it specially - if we need to pivot, do *just* that and
reboot.

We also clean things up by unlinking that node annotation file; after
that, if the `osImageURL` drifts from the expected config, we'll go
degraded, just like if someone modified a file.

Closes: openshift#183
cgwalters added a commit to cgwalters/origin that referenced this issue Feb 9, 2019
For RHCOS we have two things:

 - The "bootimage" (AMI, qcow2, PXE env)
 - The "oscontainer", now represented as `machine-os-content` in the payload

For initial OpenShift releases (e.g. of the installer) ideally
these are the same (i.e. we don't upgrade OS on boot).

This PR aims to support injecting both data into the release payload.

More information on the "bootimage" and its consumption by the
installer as well as the Machine API Operator:
openshift/installer#987

More information on `machine-os-content`:
openshift/machine-config-operator#183
@cgwalters
Member Author

If today one wants to test an os update, here's an object you can oc create:

apiVersion: v1
kind: List
items:
- apiVersion: machineconfiguration.openshift.io/v1
  kind: MachineConfig
  metadata:
    labels:
      machineconfiguration.openshift.io/role: master
    name: 00-0walters-master-osimageurl
  spec:
    osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:ffe6873d7e322da2c3fc56f49347f2aac264812fcd652b58d54142e1fdc9cecb
    config:
      ignition:
        version: 2.2.0
- apiVersion: machineconfiguration.openshift.io/v1
  kind: MachineConfig
  metadata:
    labels:
      machineconfiguration.openshift.io/role: worker
    name: 00-0walters-worker-osimageurl
  spec:
    osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:ffe6873d7e322da2c3fc56f49347f2aac264812fcd652b58d54142e1fdc9cecb
    config:
      ignition:
        version: 2.2.0
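
If you'd rather generate that object than hand-edit it, here is a small sketch that emits the same List as JSON, which oc create -f also accepts. The `osimageurl_test_configs` helper and its `owner` parameter are hypothetical conveniences; the field layout mirrors the YAML above.

```python
import json


def osimageurl_test_configs(os_image_url: str, owner: str,
                            roles=("master", "worker")) -> dict:
    """Build a List of per-role MachineConfigs that override osImageURL."""
    items = []
    for role in roles:
        items.append({
            "apiVersion": "machineconfiguration.openshift.io/v1",
            "kind": "MachineConfig",
            "metadata": {
                "labels": {"machineconfiguration.openshift.io/role": role},
                # The 00-0 prefix keeps the name sorting ahead of 00-<role>.
                "name": "00-0" + owner + "-" + role + "-osimageurl",
            },
            "spec": {
                "osImageURL": os_image_url,
                "config": {"ignition": {"version": "2.2.0"}},
            },
        })
    return {"apiVersion": "v1", "kind": "List", "items": items}


obj = osimageurl_test_configs(
    "registry.svc.ci.openshift.org/rhcos/maipo@sha256:"
    "ffe6873d7e322da2c3fc56f49347f2aac264812fcd652b58d54142e1fdc9cecb",
    owner="walters",
)
print(json.dumps(obj, indent=2))
```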

@cgwalters
Member Author

This finally landed in #426

7 participants