WIP: Add RHCOS oscontainer into payload, render to 00-$role-osimageurl MC #273

cgwalters · 2019-01-08T19:09:10Z

PR Status:

Fix MachineConfigs can be garbage collected while a node is still booting #301
Figure out whether or not we need special handling for the first master

This closes the gap in getting the RHCOS oscontainer into the update
payload, and having the MCO then render that down into the MachineConfigs.

Introduce a new ConfigMap machine-config-osimageurl that points to
the oscontainer and is applied by the CVO. Then, the controller
renders that into machineconfigs/00-$role-osimageurl which then finally
goes into the rendered config, and should be applied by the daemon.

Closes: #183

openshift-ci-robot · 2019-01-08T19:09:17Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [cgwalters]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

cgwalters · 2019-01-08T19:13:11Z

The high level idea with this so far (still writing this code) is that we have a ConfigMap that points to the oscontainer and is applied by the CVO. Then, the controller renders that into machineconfigs/00-$role-osimageurl which should make it into the final config.

This supercedes #258 and also takes a small bit of #228 in that we're taking the first non-empty osImageURL.

cgwalters · 2019-01-08T20:25:23Z

A very important thing here is that we need to make an immediate choice - do we pin the oscontainer in this repo and then update it via PRs? (Would be very noisy and painful, but give us CI gating) Or in the short term do we have it float :latest or so?

Or, (this would be my preference probably) matching discussion in openshift/installer#987 do we have the release controller "gather" it?

cgwalters · 2019-01-08T20:27:14Z

(Hm, since AIUI the release controller reacts to ImageStream changes, if we fixed the pipeline to push into the main namespace it should then trigger release payloads?)

smarterclayton · 2019-01-08T20:35:06Z

I would recommend pushing to the image stream because it is exactly what ART will do, so we can then get the RHCoS image integrated into OCP exactly like origin more quickly. You can always control how often you push.

install/image-references

smarterclayton · 2019-01-08T20:50:35Z

Instead of pinning or pushing, you can also just manually tag a new one in, or we can set up a periodic job that grabs the latest maipo, tests it, and then promotes it if the test passes (I could set that up in < 1hr)

cgwalters · 2019-01-08T21:53:30Z

or we can set up a periodic job that grabs the latest maipo, tests it, and then promotes it if the test passes (I could set that up in < 1hr)

This one sounds good! I'd be interested in seeing the code and understanding how that works...there's a lot of magic in the ci-operator/release stuff that I haven't yet fully grasped. The "test it" part is blocked a bit on this PR landing though right?

cgwalters · 2019-01-08T22:59:22Z

And it turns out we aren't pushing a :latest today; we could easily but it's probably better to get the "rhcos oscontainer promotion" going.

Although...maybe we really do need to separate out "add rhcos to payload" from "apply updates"?

Here's a possibility, we could add a config option or so to ignore osImageURL changes and flip that on by default in this PR. That way landing this wouldn't impact development immediately.

smarterclayton · 2019-01-08T23:10:57Z

yes - once you have an image, with a payload, it's really easy to script overlaying that test. I can go over what would be necessary there. That would block promoting a new image, but we would still need a job for PRs that overrides the image appropriately (which is not difficult)

cgwalters · 2019-01-09T14:56:47Z

OK this is updated 🆕 - we now have a CONFIG_IGNORE_OSIMAGEURL and use it by default. This will allow us to land this PR to act as a mechanism and then in both local development and CI we can override the env setting to test things out.

Today each MC will contain both an Ignition fragment and an `osImageURL`. Define "merging" as using the first non-empty `osImageURL` so we don't have to be very picky about ordering. This is a smaller version of openshift#228 Prep for: openshift#273

cgwalters · 2019-01-09T16:42:32Z

Hm:

error: unable to create a release: operator "machine-config-operator" failed to map images: image file "/tmp/release-image-0.0.1-2019-01-09-161203356873649/machine-config-operator/image-references" referenced image "machine-os-content" that is not part of the input images

Hmm but it worked in #258

cgwalters · 2019-01-09T18:58:51Z

/test images

cgwalters · 2019-01-09T19:02:39Z

/hold

cgwalters · 2019-01-09T20:20:00Z

/retest

cgwalters · 2019-01-16T21:54:44Z

This PR is held - it's probably OK to just leave it as it won't be merging yet. We have more issues to work through.

kikisdeliveryservice · 2019-01-16T22:00:32Z

Success! Retested using a diff os container and it worked (using image files up to commit 7999a25 )

oc logs -p:
http://pastebin.test.redhat.com/697033

oc logs -f:
http://pastebin.test.redhat.com/697034

cgwalters · 2019-01-16T22:31:12Z

OK so #301 is definitely happening in some of the CI runs from this PR - but is it actually a new issue? As far as I understand it...e2e-aws could pass just fine if a node went degraded (I think). But let's add a test for that to e2e-aws-op.

The other thing (in the new PR status at the top) is...actually I think I am wrong, there's no "initial master" versus "secondary masters", the installer creates them all at once. Which means they should all consistently do the early pivot.

So I think the design here is going to work.

kikisdeliveryservice · 2019-01-16T22:34:59Z

@cgwalters IME with AWS clusters (which is what I'm running everything on), while I see the error syncing messages, my nodes do not degrade bc of them. Is this perhaps a libvirt issue?

cgwalters · 2019-01-16T22:49:45Z

while I see the error syncing messages, my nodes do not degrade bc of them.

Yeah; those error messages aren't relevant to the real problem in #301 - I edited the initial message to remove them. The actual problem in #301 is that if a MC is garbage collected while a node is booting targeting it, the MCD will fail to load it on start, and the node will go degraded. And that's what I saw in at least one CI run.

I think we have this in CI but we're not noticing it. If it's happening we need to fix it. Ref: openshift#301

cgwalters · 2019-01-17T21:39:28Z

OK I rebased 🏄‍♂️ this, squashed the two commits into one for clarity, did s/Restart=always/Restart=on-failure/, and added a test case (rolled in #319 to avoid conflicts and also since I want that test to run too on this PR).

This is still though waiting for openshift/pivot#25 and the corresponding MCD work, but we can at least run what we have through e2e here.

cgwalters · 2019-01-17T22:17:29Z

--- FAIL: TestNoDegraded (0.10s)
	sanity_test.go:79: 3 degraded nodes found

Well, that's a kind of progress! New test successfully failing. But the cluster got GC'd before I could really debug it...

abhinavdahiya · 2019-01-17T22:22:02Z

Why do i see changes to pkg/server? I'm asking because I see no docs. ;)

Secondly I would live to avoid any more diff that is served by the server from the generated machineconfig contents.

cgwalters · 2019-01-17T22:37:59Z

Why do i see changes to pkg/server? I'm asking because I see no docs. ;)

It's to handle the difference between the "bootimage" (e.g. AMI) versus the CVO-driven payload.

#273 (comment)

Will push this in a bit:

* *Early pivot* - `/etc/rhcos-initial-pivot-target`

    While the Ignition content configures the node, it may actually be booted into an older OS image than is specified by the release payload managed by the [Cluster Version Operator](https://github.com/openshift/cluster-version-operator/).  The MCS writes out the `osImageURL` to this file, and the system will (if necessary) performs an "early pivot" before the node has actually joined the cluster.  This then ensures that when the MachineConfigDaemon starts, it will validate the `currentConfig` (including files written by Ignition and the `osImageURL`).

Secondly I would live to avoid any more diff that is served by the server from the generated machineconfig contents.

How would we pass the target data other than writing a file in Ignition?

cgwalters · 2019-01-17T22:41:33Z

It's to handle the difference between the "bootimage" (e.g. AMI) versus the CVO-driven payload.

And to elaborate, this is now required because of #245 - previously the MCD would have just said "oh it's a config difference" and updated, but the problem with that approach was we could get into reboot loops and other issues as described in that PR's commit message.

(And an advantage of this is that we reboot before node joins the cluster, doesn't make sense to land cluster workloads and then immediately evict them and reboot)

abhinavdahiya · 2019-01-17T23:27:27Z

Why do i see changes to pkg/server? I'm asking because I see no docs. ;)

It's to handle the difference between the "bootimage" (e.g. AMI) versus the CVO-driven payload.

#273 (comment)

Will push this in a bit:
* *Early pivot* - `/etc/rhcos-initial-pivot-target`

    While the Ignition content configures the node, it may actually be booted into an older OS image than is specified by the release payload managed by the [Cluster Version Operator](https://github.com/openshift/cluster-version-operator/).  The MCS writes out the `osImageURL` to this file, and the system will (if necessary) performs an "early pivot" before the node has actually joined the cluster.  This then ensures that when the MachineConfigDaemon starts, it will validate the `currentConfig` (including files written by Ignition and the `osImageURL`).
Secondly I would live to avoid any more diff that is served by the server from the generated machineconfig contents.

How would we pass the target data other than writing a file in Ignition?

Why is getting served as addition by server not through the machineconfig?

cgwalters · 2019-01-18T00:52:35Z

Why is getting served as addition by server not through the machineconfig?

You're arguing for having the MCD handle it? Then we'd need to special case the "initial pivot" there - maybe a good way to do that would be to have "bootstrap node annotations exist" as a flag specifying that. That has some advantages; it's more consistent. But one downside as I noted is the node comes online and then reboots right away.

Does anyone else have a strong opinion on this?

abhinavdahiya · 2019-01-18T01:02:40Z

Why is getting served as addition by server not through the machineconfig?

You're arguing for having the MCD handle it? Then we'd need to special case the "initial pivot" there - maybe a good way to do that would be to have "bootstrap node annotations exist" as a flag specifying that. That has some advantages; it's more consistent. But one downside as I noted is the node comes online and then reboots right away.

Does anyone else have a strong opinion on this?

I think i'm confused, any doc detailing the exact flow that this PR is moving towards will help me judge the approach or even provide feedback (if that matters).. currently i am not flowing whats happening and why :(

Have the MCC take `osImageURL` as provided by the cluster update/release payload and generate a `00-{master,worker}-osimageurl` MC from it, which ensures the MCD will update the node to it. However, we need special handling for the *initial* case where we boot into a target config, but we may be using an old OS image. Change the MCC to write the target osImageURL from the MC it uses for bootstrapping to `/etc/rhcos-initial-pivot-target`. This will then be handled by the `rhcos-initial-pivot.service` systemd unit. Closes: openshift#183

cgwalters · 2019-01-18T14:08:05Z

No worries; we've gone through a lot of architecture changes here, and there are now a whole lot of comments on this PR that Github is hiding.

Did you see #273 (comment) ?

To rephrase a bit, if we landed the bit of this PR to add osImageURL to the rendered MC without the MCS changes, then if there was any difference between the AMI and the update payload's oscontainer, the MCD would go degraded - because it validates its expected state on boot.

And see #273 (comment) again for the semantics of "validate".

Options:

"early pivot" as is implemented by this PR right now; MCS provides target osImageURL so we can pivot before any cluster workloads land (including the MCD).
Change the MCD to special case this by noticing it's the "firstboot case" as defined by "picked up node annotations from /etc"

ashcrow · 2019-01-18T15:39:54Z

/test e2e-aws-op

openshift-ci-robot · 2019-01-18T16:07:24Z

@cgwalters: The following test failed, say /retest to rerun them all:

Test name	Commit	Details	Rerun command
ci/prow/e2e-aws-op	`01c309d`	link	`/test e2e-aws-op`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

ashcrow · 2019-01-18T16:26:02Z

level=fatal msg="failed to fetch Cluster: failed to generate asset \"Cluster\": failed to create cluster: failed to apply using Terraform"

ashcrow · 2019-01-18T17:15:55Z

@abhinavdahiya @cgwalters it may be worth chatting about this in a higher bandwidth system

openshift-ci-robot · 2019-01-29T05:25:37Z

@cgwalters: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

cgwalters · 2019-02-11T14:43:46Z

This is obsoleted by #363

openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 8, 2019

openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 8, 2019

openshift-ci-robot requested review from ashcrow and jlebon January 8, 2019 19:09

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 8, 2019

cgwalters force-pushed the images-json-osimageurl branch from a346594 to 0271d40 Compare January 8, 2019 19:09

This was referenced Jan 8, 2019

Design for osImageURL updates - integration with CVO/release payload #183

Closed

DO NOT MERGE - Example of doing replacement #258

Closed

cgwalters force-pushed the images-json-osimageurl branch from 0271d40 to 1882c9e Compare January 8, 2019 20:34

smarterclayton reviewed Jan 8, 2019

View reviewed changes

install/image-references Outdated Show resolved Hide resolved

cgwalters force-pushed the images-json-osimageurl branch from 1882c9e to 79b5001 Compare January 9, 2019 14:49

openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 9, 2019

cgwalters mentioned this pull request Jan 9, 2019

controller: Take first non-empty osImageURL #279

Merged

openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 9, 2019

cgwalters changed the title ~~Add RHCOS oscontainer into payload, render to 00-$role-osimageurl MC~~ WIP: Add RHCOS oscontainer into payload, render to 00-$role-osimageurl MC Jan 16, 2019

openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 16, 2019

test/e2e: Validate that no nodes went degraded

4921826

I think we have this in CI but we're not noticing it. If it's happening we need to fix it. Ref: openshift#301

cgwalters force-pushed the images-json-osimageurl branch from 7999a25 to 662d0fe Compare January 17, 2019 21:32

cgwalters force-pushed the images-json-osimageurl branch from 662d0fe to 01c309d Compare January 18, 2019 14:01

This was referenced Jan 18, 2019

daemon: skip querying currentConfig if possible #312

Merged

WIP: Render osImageURL, handle "bootstrap" case in MCD #324

Closed

openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 29, 2019

cgwalters closed this Feb 11, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Add RHCOS oscontainer into payload, render to 00-$role-osimageurl MC #273

WIP: Add RHCOS oscontainer into payload, render to 00-$role-osimageurl MC #273

cgwalters commented Jan 8, 2019 •

edited

Loading

openshift-ci-robot commented Jan 8, 2019

cgwalters commented Jan 8, 2019 •

edited

Loading

cgwalters commented Jan 8, 2019

cgwalters commented Jan 8, 2019

smarterclayton commented Jan 8, 2019

smarterclayton commented Jan 8, 2019

cgwalters commented Jan 8, 2019

cgwalters commented Jan 8, 2019

smarterclayton commented Jan 8, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 16, 2019

kikisdeliveryservice commented Jan 16, 2019

cgwalters commented Jan 16, 2019

kikisdeliveryservice commented Jan 16, 2019

cgwalters commented Jan 16, 2019 •

edited

Loading

cgwalters commented Jan 17, 2019 •

edited

Loading

cgwalters commented Jan 17, 2019 •

edited

Loading

abhinavdahiya commented Jan 17, 2019 •

edited

Loading

cgwalters commented Jan 17, 2019 •

edited

Loading

cgwalters commented Jan 17, 2019 •

edited

Loading

abhinavdahiya commented Jan 17, 2019

cgwalters commented Jan 18, 2019

abhinavdahiya commented Jan 18, 2019

cgwalters commented Jan 18, 2019

ashcrow commented Jan 18, 2019

openshift-ci-robot commented Jan 18, 2019

ashcrow commented Jan 18, 2019

ashcrow commented Jan 18, 2019

openshift-ci-robot commented Jan 29, 2019

cgwalters commented Feb 11, 2019

WIP: Add RHCOS oscontainer into payload, render to 00-$role-osimageurl MC #273

WIP: Add RHCOS oscontainer into payload, render to 00-$role-osimageurl MC #273

Conversation

cgwalters commented Jan 8, 2019 • edited Loading

PR Status:

openshift-ci-robot commented Jan 8, 2019

cgwalters commented Jan 8, 2019 • edited Loading

cgwalters commented Jan 8, 2019

cgwalters commented Jan 8, 2019

smarterclayton commented Jan 8, 2019

smarterclayton commented Jan 8, 2019

cgwalters commented Jan 8, 2019

cgwalters commented Jan 8, 2019

smarterclayton commented Jan 8, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 9, 2019

cgwalters commented Jan 16, 2019

kikisdeliveryservice commented Jan 16, 2019

cgwalters commented Jan 16, 2019

kikisdeliveryservice commented Jan 16, 2019

cgwalters commented Jan 16, 2019 • edited Loading

cgwalters commented Jan 17, 2019 • edited Loading

cgwalters commented Jan 17, 2019 • edited Loading

abhinavdahiya commented Jan 17, 2019 • edited Loading

cgwalters commented Jan 17, 2019 • edited Loading

cgwalters commented Jan 17, 2019 • edited Loading

abhinavdahiya commented Jan 17, 2019

cgwalters commented Jan 18, 2019

abhinavdahiya commented Jan 18, 2019

cgwalters commented Jan 18, 2019

ashcrow commented Jan 18, 2019

openshift-ci-robot commented Jan 18, 2019

ashcrow commented Jan 18, 2019

ashcrow commented Jan 18, 2019

openshift-ci-robot commented Jan 29, 2019

cgwalters commented Feb 11, 2019

cgwalters commented Jan 8, 2019 •

edited

Loading

cgwalters commented Jan 8, 2019 •

edited

Loading

cgwalters commented Jan 16, 2019 •

edited

Loading

cgwalters commented Jan 17, 2019 •

edited

Loading

cgwalters commented Jan 17, 2019 •

edited

Loading

abhinavdahiya commented Jan 17, 2019 •

edited

Loading

cgwalters commented Jan 17, 2019 •

edited

Loading

cgwalters commented Jan 17, 2019 •

edited

Loading