Remove clusterID from installconfig and move it to cluster #1057

Merged
merged 8 commits into from
Jan 12, 2019
Conversation

staebler
Contributor

#783 rebased on top of #1052

staebler and others added 8 commits January 11, 2019 09:36
None of the options, other than Image, that were in the Libvirt MachinePool were being used.
They have been removed. The Image has been pulled up to the Libvirt Platform, as there was no
way to use a different image for different machine pools.

For consistency with the AWS and OpenStack platforms, the Libvirt MachinePool has been retained,
even though it is empty. The DefaultMachinePlatform has been retained in the Libvirt Platform
as well.

The code in the Master Machines and Worker Machines assets that determines the configuration
to use for the machines has been adjusted for Libvirt to rectify the machine-pool-specific
configuration against the default machine-pool configuration. This is not strictly necessary
since, again, the Libvirt configuration is empty, but it keeps the logic consistent with the
other platforms.

https://jira.coreos.com/browse/CORS-911
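
As a rough illustration of that rectification step, here is a minimal Go sketch; the type and helper names (LibvirtMachinePool, LibvirtPlatform, rectifyLibvirtPool) are illustrative stand-ins, not the installer's actual API.

```go
// Minimal sketch of rectifying a Libvirt machine-pool configuration
// against the platform-level default. Types and names are illustrative,
// not the installer's actual API.
package example

// LibvirtMachinePool is currently empty; it is kept for consistency
// with the AWS and OpenStack platforms.
type LibvirtMachinePool struct{}

// LibvirtPlatform carries the platform-wide Image plus an optional
// default machine-pool configuration.
type LibvirtPlatform struct {
	Image                  string
	DefaultMachinePlatform *LibvirtMachinePool
}

// rectifyLibvirtPool starts from the platform default and would overlay
// any pool-specific settings. With an empty pool type this is a no-op,
// but it mirrors the AWS and OpenStack code paths.
func rectifyLibvirtPool(platform *LibvirtPlatform, pool *LibvirtMachinePool) *LibvirtMachinePool {
	merged := &LibvirtMachinePool{}
	if platform.DefaultMachinePlatform != nil {
		*merged = *platform.DefaultMachinePlatform
	}
	// Pool-specific fields would be copied over merged here; the
	// Libvirt pool currently has none.
	return merged
}
```
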
Added an OPENSHIFT_INSTALL_OS_IMAGE_OVERRIDE env var, with a warning, to override the image that will be used.
With the base-image user query removed from the TUI, this function is no longer required.

This also
- drops the vendored files
- updates the mocks using `hack/go-genmock.sh`
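
A minimal sketch of how such an override hook might look; only the environment-variable name comes from the commit above, while the function and the warning text are illustrative assumptions.

```go
// Sketch of honoring the image-override environment variable with a
// warning. Only the variable name comes from the commit; the function
// and the warning text are illustrative.
package example

import (
	"os"

	"github.com/sirupsen/logrus"
)

// osImageOrOverride returns the pinned RHCOS image unless the override
// environment variable is set, in which case it warns and returns the
// override instead.
func osImageOrOverride(pinnedImage string) string {
	if override, ok := os.LookupEnv("OPENSHIFT_INSTALL_OS_IMAGE_OVERRIDE"); ok && override != "" {
		logrus.Warnf("Overriding the OS image with %s; this is not a supported configuration", override)
		return override
	}
	return pinnedImage
}
```
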
The RHCOS image used for installation must be sourced from the RHCOS build pipeline.
Keeping the image-related fields in install-config allows users to change these values as part of a valid configuration, but we do not want users to configure this option, as
the RHCOS image controls the runtime and kubelet versions we depend on.
ClusterID is now removed from installconfig. The reason is that it should not be possible for a user to override this value. The clusterID is still needed for destroy, and hence it is now a separate asset which gets stored in ClusterMetadata. Other assets needing the clusterID (e.g. legacy manifests) declare a separate dependency on this asset.
For convenience of package dependencies, the asset still lives in the installconfig package.
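
A minimal sketch of the separate-asset idea, assuming a simplified stand-in for the installer's asset interface (the real pkg/asset signatures differ):

```go
// Sketch of ClusterID as its own asset that other assets depend on.
// The Asset interface here is a simplified stand-in for the installer's
// pkg/asset interfaces, not the real signatures.
package example

import "github.com/google/uuid"

// Asset is a simplified stand-in for the installer's asset interface.
type Asset interface {
	Name() string
	Generate() error
}

// ClusterID is generated once and consumed by anything that needs it,
// e.g. the ClusterMetadata used later by `destroy cluster` and the
// legacy manifests.
type ClusterID struct {
	ID string
}

// Name returns a human-friendly name for the asset.
func (c *ClusterID) Name() string { return "Cluster ID" }

// Generate creates a fresh random UUID; users can no longer set it via
// install-config.yaml.
func (c *ClusterID) Generate() error {
	c.ID = uuid.New().String()
	return nil
}
```
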
@openshift-ci-robot openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 12, 2019
@wking
Member

wking commented Jan 12, 2019

/lgtm
/hold

I'll pull the /hold once #1052 lands.

@openshift-ci-robot openshift-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Jan 12, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: staebler, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wking
Member

wking commented Jan 12, 2019

#1052 landed.

/hold cancel
/retest

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 12, 2019
@wking
Member

wking commented Jan 12, 2019

e2e-aws:

error: unable to connect to image repository registry.svc.ci.openshift.org/ci-op-n9v1pkj1/stable@sha256:6cd9842c7897cff8eb05d6e3a9b9dd05de21ae484a68d2b7dba8d71e58b7c22d: Get https://registry.svc.ci.openshift.org/v2/: net/http: TLS handshake timeout

/retest

@abhinavdahiya
Contributor

Failing tests:

[Conformance][Area:Networking][Feature:Router] The HAProxy router converges when multiple routers are writing conflicting status [Suite:openshift/conformance/parallel/minimal]
[Feature:Builds][pullsecret][Conformance] docker build using a pull secret  Building from a template should create a docker build that pulls using a secret run it [Suite:openshift/conformance/parallel/minimal]
[k8s.io] [sig-node] Security Context [Feature:SecurityContext] should support seccomp default which is unconfined [Feature:Seccomp] [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-api-machinery] Servers with support for API chunking should return chunks of results for list calls [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] Burst scaling should run to completion even with unhealthy pods [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]
[sig-storage] Projected should set mode on item file [NodeConformance] [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]

Writing JUnit report to /tmp/artifacts/junit/junit_e2e_20190112-072844.xml

Error: 6 fail, 519 pass, 97 skip (28m54s)

/retest

@openshift-merge-robot openshift-merge-robot merged commit 55f3d9f into openshift:master Jan 12, 2019
@dgoodwin
Contributor

Per the last arch call, Hive and service delivery both need to know the cluster ID before we start provisioning resources. With this change it looks like we can no longer specify our own, but I'm not seeing the cluster metadata on disk when I generate manifests. Did this change make it in before this? Any advice on our best path to know that UUID before we provision? I do see a few mentions of it in the state file and cvo-overrides.yml after generating manifests.

CC @crawford @wking

@crawford
Contributor

@dgoodwin oops, I think this was merged a little prematurely. We'll need to follow up with an easy way to capture that UUID before the provisioning is started.

CC @abhinavdahiya

@wking
Member

wking commented Jan 14, 2019

We'll need to follow up with an easy way to capture that UUID before the provisioning is started.

Instead of just the UUID, can we make a metadata.json asset that is part of the ignition-configs target?
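
For illustration, a minimal sketch of writing such a metadata.json into the asset directory; the Metadata fields and the helper name are assumptions, not the installer's exact types.

```go
// Sketch of persisting metadata.json into the asset directory so that
// consumers can read it before `create cluster` runs. The Metadata
// fields and helper name are assumptions, not the installer's types.
package example

import (
	"encoding/json"
	"os"
	"path/filepath"
)

// Metadata captures the minimum a later destroy needs.
type Metadata struct {
	ClusterName string `json:"clusterName"`
	ClusterID   string `json:"clusterID"`
}

// writeMetadata serializes the metadata and drops it next to the other
// generated assets.
func writeMetadata(assetDir string, m Metadata) error {
	data, err := json.Marshal(m)
	if err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(assetDir, "metadata.json"), data, 0o644)
}
```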

@crawford
Contributor

@dgoodwin Abhinav pointed out that metadata.json will always be written out after the cluster has finished creation (regardless of whether or not it was successful). Is this sufficient? Is the cluster ID needed before we start provisioning?

@wking
Member

wking commented Jan 14, 2019

You need metadata.json before create cluster because bring-your-own-infrastructure will never actually call create cluster, and they'll still want the metadata to be able to destroy clusters later.

@wking
Member

wking commented Jan 15, 2019

You need metadata.json before create cluster because bring-your-own-infrastructure...

Never mind, I was confused. Bring-your-own-infrastructure must also be destroy-your-own-infrastructure. For Hive or anything that is actually calling create cluster, I agree that "metadata.json will be there when we exit" seems to cover things. @dgoodwin, can you remind us why you wanted it before calling create cluster? I've pushed up an implementation here if it turns out we do need this, but realized I had misunderstood the motivation when writing up the commit message.

@dgoodwin
Contributor

The two use cases were (1) service delivery will start receiving telemetry for the cluster while it's installing, but they have no knowledge of the UUID, which is a problem for them, and (2) if Hive fails to upload that UUID after install, we have an orphaned cluster that can't be cleaned up automatically. Writing the metadata.json as an asset is a perfect solution: we can upload it once it's ready, and if that fails, no harm done; we'll just keep retrying.
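
Purely as an illustration of that keep-retrying flow on the consumer side; everything here, including the upload callback and the interval, is hypothetical.

```go
// Hypothetical consumer-side retry loop for uploading metadata.json.
// The upload callback and the 30-second interval are assumptions used
// only to illustrate the "no harm done, keep retrying" flow.
package example

import (
	"context"
	"time"
)

// retryUpload calls upload until it succeeds or the context is
// cancelled, pausing between attempts.
func retryUpload(ctx context.Context, upload func() error) error {
	ticker := time.NewTicker(30 * time.Second)
	defer ticker.Stop()
	for {
		if err := upload(); err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}
```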

wking added a commit to wking/openshift-installer that referenced this pull request Jan 16, 2019
From Devan Goodwin [1]:

  The two use cases were (1) service delivery will start receiving
  telemetry for the cluster while it's installing, but they have no
  knowledge of the UUID which is a problem for them, and (2) if Hive
  fails to upload that UUID after install we have an orphaned cluster
  that can't be cleaned up automatically.  Writing the metadata.json
  as an asset is a perfect solution, we can upload once ready and if
  it fails, no harm done, we'll just keep retrying.

Matthew recommended the no-op load [2]:

  My suggestion is that, for now, Load should return false always.
  The installer will ignore any changes to metadata.json.  In the
  future, perhaps we should introduce a read-only asset that would
  cause the installer to warn (or fail) in the face of changes.

[1]: openshift#1057 (comment)
[2]: openshift#1070 (comment)
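
A minimal sketch of the no-op Load described in that quote, using a simplified stand-in for the installer's file-fetcher interface rather than its real signature:

```go
// Sketch of the always-return-false Load recommended above. FileFetcher
// is a simplified stand-in for the installer's asset file-fetcher
// interface, not its real signature.
package example

// FileFetcher is a simplified stand-in for the installer's file fetcher.
type FileFetcher interface {
	FetchByName(name string) ([]byte, error)
}

// Metadata represents the metadata.json asset.
type Metadata struct{}

// Load always reports that nothing usable was found on disk, so any
// user edits to metadata.json are ignored rather than fed back into
// the asset graph.
func (m *Metadata) Load(f FileFetcher) (found bool, err error) {
	return false, nil
}
```
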
markmc added a commit to markmc/dev-scripts that referenced this pull request Feb 20, 2019
Since 0.10.0, clusterID is no longer part of install-config.yaml

See openshift/installer#1057
hardys pushed a commit to openshift-metal3/dev-scripts that referenced this pull request Feb 20, 2019
Since 0.10.0, clusterID is no longer part of install-config.yaml

See openshift/installer#1057
wking added a commit to wking/openshift-release that referenced this pull request Apr 1, 2019
Catching up with openshift/installer@170fdc2d2c (pkg/asset/: Remove
clusterID from installconfig and move it to cluster, 2018-12-04,
openshift/installer#1057).