
Installer fail with IPI on vSphere with OKD 4.6 #392

Closed
damhau opened this issue Nov 27, 2020 · 29 comments
Labels
platform/vsphere · triage/needs-information (indicates an issue needs more information in order to work on it)

Comments

@damhau

damhau commented Nov 27, 2020

Describe the bug
The installer fails with IPI on vSphere with OKD 4.6 when http_proxy is configured. After reviewing the state of the master nodes, it seems that the hostname is not set.

Version
4.6.0-0.okd-2020-11-27-135746

How reproducible
Can be reproduced 100% on my infrastructure

Log bundle
The installer is unable to gather the log bundle, but I can get logs directly from the nodes if needed.

@vrutkovs
Member

Worked fine here on AWS - https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-aws/1332245347673575424

The installer is unable to gather the log bundle, but I can get logs directly from the nodes if needed.

What was the error message? Was the bootstrap reachable? What was the operator state?

@vrutkovs vrutkovs added the triage/needs-information label Nov 27, 2020
@timbrd

timbrd commented Nov 27, 2020

By "http_proxy", do you mean an egress proxy?
Egress proxy configuration on vsphere is supported only with upi, according to the documentation.

@damhau
Author

damhau commented Nov 30, 2020

@vrutkovs please find the errors from the deployment below.

In addition, the hostname of the bootstrap and master nodes is fedora (it should be the node's name).

time="2020-11-30T17:12:21+01:00" level=debug msg="module.master.vsphere_virtual_machine.vm[2]: Creation complete after 21s [id=42146e11-490d-1197-646d-70a724bc0b40]"
time="2020-11-30T17:12:23+01:00" level=debug msg="module.bootstrap.vsphere_virtual_machine.vm: Creation complete after 22s [id=42142dfe-5591-3577-419e-4cc791da3eed]"
time="2020-11-30T17:12:23+01:00" level=debug
time="2020-11-30T17:12:23+01:00" level=debug msg="Apply complete! Resources: 7 added, 0 changed, 0 destroyed."
time="2020-11-30T17:12:23+01:00" level=debug msg="OpenShift Installer 4.6.0-0.okd-2020-11-27-200126"
time="2020-11-30T17:12:23+01:00" level=debug msg="Built from commit 7b5bd8be9e5f18cc9b8af5f95d629ddd64c015cc"
time="2020-11-30T17:12:23+01:00" level=info msg="Waiting up to 20m0s for the Kubernetes API at https://api.labhtest.gva.icrc.priv:6443..."
time="2020-11-30T17:12:53+01:00" level=debug msg="Still waiting for the Kubernetes API: Get "https://api.labhtest.gva.icrc.priv:6443/version?timeout=32s\": dial tcp 10.64.16.205:6443: i/o timeout"
time="2020-11-30T17:15:35+01:00" level=info msg="API v1.19.0-rc.2.1077+43983cda8af930-dirty up"
time="2020-11-30T17:15:35+01:00" level=info msg="Waiting up to 30m0s for bootstrapping to complete..."
time="2020-11-30T17:45:35+01:00" level=error msg="Cluster operator network Degraded is True with BootstrapError: Internal error while reconciling platform networking resources: Unable to bootstrap OVN, expected amount of control plane nodes (3) do not match found (1): timed out waiting for the condition"
time="2020-11-30T17:45:35+01:00" level=debug msg="Fetching Bootstrap SSH Key Pair..."
time="2020-11-30T17:45:35+01:00" level=debug msg="Loading Bootstrap SSH Key Pair..."
time="2020-11-30T17:45:35+01:00" level=debug msg="Using Bootstrap SSH Key Pair loaded from state file"
time="2020-11-30T17:45:35+01:00" level=debug msg="Reusing previously-fetched Bootstrap SSH Key Pair"
time="2020-11-30T17:45:35+01:00" level=debug msg="Fetching Install Config..."
time="2020-11-30T17:45:35+01:00" level=debug msg="Loading Install Config..."
time="2020-11-30T17:45:35+01:00" level=debug msg=" Loading SSH Key..."
time="2020-11-30T17:45:35+01:00" level=debug msg=" Loading Base Domain..."
time="2020-11-30T17:45:35+01:00" level=debug msg=" Loading Platform..."
time="2020-11-30T17:45:35+01:00" level=debug msg=" Loading Cluster Name..."
time="2020-11-30T17:45:35+01:00" level=debug msg=" Loading Base Domain..."
time="2020-11-30T17:45:35+01:00" level=debug msg=" Loading Platform..."
time="2020-11-30T17:45:35+01:00" level=debug msg=" Loading Pull Secret..."
time="2020-11-30T17:45:35+01:00" level=debug msg=" Loading Platform..."
time="2020-11-30T17:45:35+01:00" level=debug msg="Using Install Config loaded from state file"
time="2020-11-30T17:45:35+01:00" level=debug msg="Reusing previously-fetched Install Config"
time="2020-11-30T17:46:37+01:00" level=error msg="Attempted to gather debug logs after installation failure: failed to get bootstrap and control plane host addresses from "/home/admin/okd_labhtest_install/terraform.tfstate": failed to lookup bootstrap ipv4 address: Post "https://gvavcenterva01p.gva.icrc.priv/sdk\": context deadline exceeded"
time="2020-11-30T17:46:37+01:00" level=fatal msg="Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition"

@damhau
Author

damhau commented Nov 30, 2020

@timbrd, by http proxy I mean the proxy settings described in https://github.com/openshift/installer/blob/master/docs/user/customization.md:
proxy (optional object): The proxy settings for the cluster. If unset, the cluster will not be configured to use a proxy.

httpProxy (optional string): The URL of the proxy for HTTP requests.
httpsProxy (optional string): The URL of the proxy for HTTPS requests.
noProxy (optional string): A comma-separated list of domains and CIDRs for which the proxy should not be used.
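
For reference, in install-config.yaml those settings look roughly like this (a sketch; the host and CIDR values are placeholders):

proxy:
  httpProxy: http://proxy.example.com:3128
  httpsProxy: http://proxy.example.com:3128
  noProxy: .example.com,10.0.0.0/16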

Btw, the same deployment works with 4.5.0-0.okd-2020-10-15-235428.

@damhau
Author

damhau commented Nov 30, 2020

Please find attached a successful 4.5 install log, "4.5.0-0.okd-2020-10-15-235428_openshift_install.log", and a failed 4.6 install log, "4.6.0-0.okd-2020-11-27-200126_openshift_install.log".

Let me know if I can provide additional logs.

4.6.0-0.okd-2020-11-27-200126_openshift_install.log
4.5.0-0.okd-2020-10-15-235428_openshift_install.log

@damhau damhau changed the title Installer fail with IPI on vSphere with OKD 4.6 and http_proxy Installer fail with IPI on vSphere with OKD 4.6 Nov 30, 2020
@damhau
Author

damhau commented Nov 30, 2020

I think that my issue is related to #399 and/or #396.

I have the same symptoms: the hostname is fedora, and the gcp-hostname service is enabled but fails.

@damhau
Author

damhau commented Nov 30, 2020

And here is the output of hostnamectl on a working system:

[root@iikstest-mjc75-master-0 ~]# hostnamectl
   Static hostname: iikstest-mjc75-master-0
         Icon name: computer-vm
           Chassis: vm
        Machine ID: a8a961d3a90d4f7385c996fb0e6d3401
           Boot ID: 48a04d9bcc0b472fa8b9d4bc25dfa653
    Virtualization: vmware
  Operating System: Fedora CoreOS 32.20200629.3.0
       CPE OS Name: cpe:/o:fedoraproject:fedora:32
            Kernel: Linux 5.6.19-300.fc32.x86_64
      Architecture: x86-64

@damhau
Author

damhau commented Nov 30, 2020

And on the failed 4.6 install:

[root@fedora ~]# hostnamectl
   Static hostname: n/a
Transient hostname: fedora
         Icon name: computer-vm
           Chassis: vm
        Machine ID: c055b75af52b4353b68c2992a42bb90c
           Boot ID: 46ea3c39bbc24e04b6a74b4ae6d96195
    Virtualization: vmware
  Operating System: Fedora CoreOS 33.20201124.10.1
       CPE OS Name: cpe:/o:fedoraproject:fedora:33
            Kernel: Linux 5.9.9-200.fc33.x86_64
      Architecture: x86-64

@damhau
Author

damhau commented Nov 30, 2020

And attached, the output of journalctl -xfe on the first master (which holds the VIP of the API service):
4.6.0-0.okd-2020-11-27-200126_journalctl_output.txt

@vrutkovs
Member

failed to lookup bootstrap ipv4 address: Post "https://gvavcenterva01p.gva.icrc.priv/sdk\": context deadline exceeded

Invalid DNS namespace passed to the master? Why can't it find the vSphere API endpoint?

@damhau
Author

damhau commented Dec 1, 2020

What do you mean by DNS namespace?

@damhau
Author

damhau commented Dec 1, 2020

Are you referring to this error message: 2020-11-30T17:46:37+01:00" level=error msg="Attempted to gather debug logs after installation failure: failed to get bootstrap and control plane host addresses from "/home/admin/okd_labhtest_install/terraform.tfstate": failed to lookup bootstrap ipv4 address: Post "https://gvavcenterva01p.gva.icrc.priv/sdk\": context deadline exceeded"

If yes, my understanding (correct me if I'm wrong) is that when the timeout is reached and the installation is considered failed, the installer tries to gather log files by connecting to the bootstrap and master nodes. To find the nodes' addresses it queries the vCenter API, but that request doesn't succeed. (Which is not really surprising, as the nodes are not reporting the correct hostname to vCenter through their VMware Tools because the hostname is set to fedora.)

Can somebody confirm whether my understanding of this error message is correct?

@damhau
Author

damhau commented Dec 1, 2020

I checked this error message in the source code of the installer, and my understanding seems to be correct; please see the code below:

ip, err := waitForVirtualMachineIP(client, moid)
if err != nil {
    return "", errors.Wrap(err, "failed to lookup bootstrap ipv4 address")
}

The function waitForVirtualMachineIP queries the vCenter API until it gets an IP; if it fails, it outputs the error message you mentioned in your comment.

And this is in line with what I see in vCenter: the bootstrap doesn't report its IP, so maybe VMware Tools is not started?
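
For context, here is a minimal sketch of how such a lookup can be done against the vCenter API with the govmomi library. This is illustrative only, not the installer's actual code; the endpoint, credentials, and inventory path are placeholders:

package main

import (
    "context"
    "fmt"
    "net/url"
    "time"

    "github.com/vmware/govmomi"
    "github.com/vmware/govmomi/find"
)

func main() {
    ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
    defer cancel()

    // The vCenter SDK endpoint -- the same /sdk URL that appears in the
    // "context deadline exceeded" error above.
    u, err := url.Parse("https://vcenter.example.com/sdk")
    if err != nil {
        panic(err)
    }
    u.User = url.UserPassword("user@vsphere.local", "password")

    // Connecting fails with "context deadline exceeded" if vCenter is unreachable.
    c, err := govmomi.NewClient(ctx, u, true /* insecure */)
    if err != nil {
        panic(err)
    }

    finder := find.NewFinder(c.Client)
    vm, err := finder.VirtualMachine(ctx, "/Datacenter/vm/cluster-bootstrap")
    if err != nil {
        panic(err)
    }

    // WaitForIP blocks until the guest reports an IPv4 address via VMware Tools,
    // so a bootstrap VM that never starts vmtoolsd never satisfies this call.
    ip, err := vm.WaitForIP(ctx, true)
    if err != nil {
        panic(err)
    }
    fmt.Println("bootstrap IP:", ip)
}

The key point is the last call: the address comes from the guest via VMware Tools, so if vmtoolsd is not running, the lookup can only time out.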

@vrutkovs
Member

vrutkovs commented Dec 1, 2020

Which is not really surprising, as the nodes are not reporting the correct hostname to vCenter through their VMware Tools because the hostname is set to fedora

I'm not too familiar with the vSphere API details on this. Let's check this again when a valid hostname is set (see #394).

@damhau
Author

damhau commented Dec 1, 2020

I'll deploy a new cluster, set the hostname correctly on each VM with hostnamectl, and let you know.

@amelie1979

Upgrading from 4.5 to 4.6 breaks the MachineSet feature: we can't recreate nodes or scale up, and all the new nodes are named fedora (IPI with DHCP). There is no name reservation or hardcoding in DNS like in UPI.

@damhau
Author

damhau commented Dec 1, 2020

I'm not doing an upgrade, it is an install from scratch.

@amelie1979

I have done an upgrade from 4.5 to 4.6; after the upgrade, no new machine can be created because of the hostname (IPI with DHCP not sending the DNS name).

@damhau
Author

damhau commented Dec 1, 2020

@amelie1979, thanks for the input, but I'm not sure I understand how it relates to my issue?

@amelie1979

@amelie1979, thanks for the input, but I'm not sure I understand how it relates to my issue?

I was only adding more info to the issue. :)

@damhau
Author

damhau commented Dec 2, 2020

@amelie1979 ah, OK, sorry about my message. Thanks for your feedback. (I'm not used to opening issues on GitHub.)

@damhau
Author

damhau commented Dec 2, 2020

Which is not really surprising, as the nodes are not reporting the correct hostname to vCenter through their VMware Tools because the hostname is set to fedora

I'm not too familiar with the vSphere API details on this. Let's check this again when a valid hostname is set (see #394)

I've redeployed a new cluster on 4.6 and set the hostname as soon as each VM finished booting, and the result is exactly the same.
One additional piece of info: VMware Tools is not started on the bootstrap node, and I see this error message in journald:

Dec 01 16:12:38 fedora zincati[10449]: Error: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }
Dec 01 16:12:38 fedora zincati[10449]: failed to open file '/etc/zincati/config.d/90-disable-feature.toml'

I've attached the output of journalctl -xfe on the bootstrap node.

Any idea why the bootstrap node doesn't start the vmtoolsd service?

Here is the status of vmtoolsd on one of the master nodes:

[core@labhtest-6bzxj-master-0 ~]$ systemctl status vmtoolsd
● vmtoolsd.service - Service for virtual machines hosted on VMware
     Loaded: loaded (/usr/lib/systemd/system/vmtoolsd.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2020-12-01 16:13:22 UTC; 16h ago
       Docs: http://github.com/vmware/open-vm-tools
   Main PID: 742 (vmtoolsd)
      Tasks: 3 (limit: 19108)
     Memory: 3.8M
        CPU: 57.204s
     CGroup: /system.slice/vmtoolsd.service
             └─742 /usr/bin/vmtoolsd

Dec 01 16:13:22 fedora systemd[1]: Started Service for virtual machines hosted on VMware.

and the state on the bootstrap node:

[core@labhtest-6bzxj-bootstrap ~]$ systemctl status vmtoolsd
Unit vmtoolsd.service could not be found.
[core@labhtest-6bzxj-bootstrap ~]$ 

bootstrap-journald.txt

@vrutkovs
Member

vrutkovs commented Dec 2, 2020

Please don't dump all the issues you find into a single ticket; it's really hard to track what's going on here.

failed to open file '/etc/zincati/config.d/90-disable-feature.toml'

#215

Any idea why the bootstrap node doesn't start the vmtoolsd service?

This service doesn't get installed there. Why does the bootstrap node need it?

@damhau
Author

damhau commented Dec 2, 2020

Please don't dump all the issues you find into a single ticket; it's really hard to track what's going on here.

OK, no problem; please let me know how I should proceed to help solve this issue.
Do you need any logs or anything else?

failed to open file '/etc/zincati/config.d/90-disable-feature.toml'

#215

Any idea why the bootstrap node doesn't start the vmtoolsd service?

This service doesn't get installed there. Why does the bootstrap node need it?

In 4.5 it was installed, and I think it is "needed" by the installer to get the bootstrap IP in order to gather the logs when the install fails; that is why we see the error message "Attempted to gather debug logs after installation failure: failed to get bootstrap and control plane host addresses from".

@darth-hp

darth-hp commented Dec 3, 2020

Me too: 4.5 worked for me, 4.6 fails.

INFO Waiting up to 30m0s for bootstrapping to complete...
E1203 16:45:42.875336    4004 reflector.go:307] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *v1.ConfigMap: Get "https://api.ocp03.zfwx.lan:6443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dbootstrap&resourceVersion=4102&timeoutSeconds=428&watch=true": dial tcp 172.18.128.5:6443: connect: connection refused
[above repeats]
ERROR Attempted to gather debug logs after installation failure: failed to get bootstrap and control plane host addresses from "terraform.tfstate": failed to lookup bootstrap ipv4 address: Post "https://<vcenterhost>/sdk": context deadline exceeded
FATAL Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition

@damhau
Author

damhau commented Dec 7, 2020

I've tried the workaround mentioned in another issue (setting the hostname with hostnamectl, as sketched below), and after rebooting the three masters the installation proceeds.
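
A minimal sketch of that workaround, run on each master (the node name is a placeholder; use each node's expected name):

sudo hostnamectl set-hostname <cluster-id>-master-0
sudo systemctl reboot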

@vrutkovs Is there an estimated timeline for the fix for the hostname problem on vSphere?

@damhau
Author

damhau commented Dec 7, 2020

FYI, this is 100% related to this: openshift/machine-config-operator#2289
I'm waiting on this PR to test again.

@damhau
Author

damhau commented Dec 15, 2020

Solved by #422

Success !!!

NAME                          STATUS   ROLES    AGE     VERSION
labhtest-5lhxr-master-0       Ready    master   4h8m    v1.19.2+7070803-1008
labhtest-5lhxr-master-1       Ready    master   4h8m    v1.19.2+7070803-1008
labhtest-5lhxr-master-2       Ready    master   4h9m    v1.19.2+7070803-1008
labhtest-5lhxr-worker-2g764   Ready    worker   3h54m   v1.19.2+7070803-1008
labhtest-5lhxr-worker-jghdh   Ready    worker   3h54m   v1.19.2+7070803-1008
labhtest-5lhxr-worker-qhpwq   Ready    worker   3h53m   v1.19.2+7070803-1008

@damhau damhau closed this as completed Dec 15, 2020
@darth-hp

The upgrade worked (after running for quite some time), but the 4.6.0-0.okd-2020-12-12-135354 installer fails for me where 4.5.0-0.okd-2020-08-12-020541 was working.

INFO Waiting up to 30m0s for bootstrapping to complete...
E1215 10:48:17.057371   27364 reflector.go:307] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *v1.ConfigMap: Get "https://api.ocp03.zfwx.lan:6443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dbootstrap&resourceVersion=4531&timeoutSeconds=448&watch=true": dial tcp 172.18.128.5:6443: connect: connection refused
E1215 10:48:18.059490   27364 reflector.go:307] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *v1.ConfigMap: Get "https://api.ocp03.zfwx.lan:6443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dbootstrap&resourceVersion=4531&timeoutSeconds=522&watch=true": dial tcp 172.18.128.5:6443: connect: connection refused
E1215 10:48:19.061616   27364 reflector.go:307] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *v1.ConfigMap: Get "https://api.ocp03.zfwx.lan:6443/api/v1/namespaces/kube-system/configmaps?allowWatchBookmarks=true&fieldSelector=metadata.name%3Dbootstrap&resourceVersion=4531&timeoutSeconds=370&watch=true": dial tcp 172.18.128.5:6443: connect: connection refused
ERROR Cluster operator authentication Degraded is True with APIServerDeployment_UnavailablePod::IngressStateEndpoints_MissingSubsets::OAuthServiceCheckEndpointAccessibleController_SyncError::OAuthServiceEndpointsCheckEndpointAccessibleController_SyncError: OAuthServiceCheckEndpointAccessibleControllerDegraded: Get "https://172.30.105.132:443/healthz": dial tcp 172.30.105.132:443: connect: connection refused
OAuthServiceEndpointsCheckEndpointAccessibleControllerDegraded: oauth service endpoints are not ready
IngressStateEndpointsDegraded: No subsets found for the endpoints of oauth-server
APIServerDeploymentDegraded: 3 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver (no pods found with labels "apiserver=true,app=openshift-oauth-apiserver,oauth-apiserver-anti-affinity=true,revision=1")
INFO Cluster operator authentication Progressing is True with APIServerDeployment_NewGeneration: APIServerDeploymentProgressing: deployment/apiserver.openshift-oauth-apiserver: observed generation is 0, desired generation is 1.
INFO Cluster operator authentication Available is False with APIServerDeployment_NoPod::OAuthServiceCheckEndpointAccessibleController_EndpointUnavailable::OAuthServiceEndpointsCheckEndpointAccessibleController_EndpointUnavailable::ReadyIngressNodes_NoReadyIngressNodes: ReadyIngressNodesAvailable: Authentication requires functional ingress which requires at least one schedulable and ready node. Got 0 worker nodes, 3 master nodes, 0 custom target nodes (none are schedulable or ready for ingress pods).
OAuthServiceEndpointsCheckEndpointAccessibleControllerAvailable: Failed to get oauth-openshift enpoints
OAuthServiceCheckEndpointAccessibleControllerAvailable: Get "https://172.30.105.132:443/healthz": dial tcp 172.30.105.132:443: connect: connection refused
APIServerDeploymentAvailable: no apiserver.openshift-oauth-apiserver pods available on any node.
ERROR Cluster operator cluster-autoscaler Degraded is True with MissingDependency: machine-api not ready
INFO Cluster operator ingress Available is False with IngressUnavailable: Not all ingress controllers are available.
INFO Cluster operator ingress Progressing is True with Reconciling: Not all ingress controllers are available.
ERROR Cluster operator ingress Degraded is True with IngressControllersDegraded: Some ingresscontrollers are degraded: ingresscontroller "default" is degraded: DegradedConditions: One or more other status conditions indicate a degraded state: DeploymentAvailable=False (DeploymentUnavailable: The deployment has Available status condition set to False (reason: MinimumReplicasUnavailable) with message: Deployment does not have minimum availability.), DeploymentReplicasMinAvailable=False (DeploymentMinimumReplicasNotMet: 0/2 of replicas are available, max unavailable is 1)
ERROR Cluster operator insights Degraded is True with PeriodicGatherFailed: Source config could not be retrieved: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:serviceaccount:openshift-insights:gather" cannot list resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io "cluster-reader" not found, configmaps is forbidden: User "system:serviceaccount:openshift-insights:gather" cannot list resource "configmaps" in API group "" in the namespace "openshift-config": RBAC: clusterrole.rbac.authorization.k8s.io "cluster-reader" not found, customresourcedefinitions.apiextensions.k8s.io "volumesnapshots.snapshot.storage.k8s.io" is forbidden: User "system:serviceaccount:openshift-insights:gather" cannot get resource "customresourcedefinitions" in API group "apiextensions.k8s.io" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io "cluster-reader" not found, hostsubnets.network.openshift.io is forbidden: User "system:serviceaccount:openshift-insights:gather" cannot list resource "hostsubnets" in API group "network.openshift.io" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io "cluster-reader" not found, pods is forbidden: User "system:serviceaccount:openshift-insights:gather" cannot list resource "pods" in API group "" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io "cluster-reader" not found
INFO Cluster operator insights Disabled is False with AsExpected:
ERROR Cluster operator kube-apiserver Degraded is True with StaticPods_Error: StaticPodsDegraded: pods "kube-apiserver-ocp03-dr2rq-master-0" not found
StaticPodsDegraded: pods "kube-apiserver-ocp03-dr2rq-master-2" not found
INFO Cluster operator kube-apiserver Progressing is True with NodeInstaller: NodeInstallerProgressing: 2 nodes are at revision 0; 1 nodes are at revision 2; 0 nodes have achieved new revision 3
INFO Cluster operator kube-controller-manager Progressing is True with NodeInstaller: NodeInstallerProgressing: 2 nodes are at revision 0; 1 nodes are at revision 7
ERROR Cluster operator kube-scheduler Degraded is True with NodeInstaller_InstallerPodFailed: NodeInstallerDegraded: 1 nodes are failing on revision 2:
NodeInstallerDegraded: static pod of revision 2 has been installed, but is not ready while new revision 3 is pending; 1 nodes are failing on revision 3:
NodeInstallerDegraded: static pod of revision 3 has been installed, but is not ready while new revision 4 is pending; 1 nodes are failing on revision 5:
NodeInstallerDegraded:
INFO Cluster operator kube-scheduler Progressing is True with NodeInstaller: NodeInstallerProgressing: 3 nodes are at revision 0; 0 nodes have achieved new revision 6
INFO Cluster operator kube-scheduler Available is False with StaticPods_ZeroNodesActive: StaticPodsAvailable: 0 nodes are active; 3 nodes are at revision 0; 0 nodes have achieved new revision 6
INFO Cluster operator kube-storage-version-migrator Available is False with _NoMigratorPod: Available: deployment/migrator.openshift-kube-storage-version-migrator: no replicas are available
INFO Cluster operator machine-api Progressing is True with SyncingResources: Progressing towards operator: 4.6.0-0.okd-2020-12-12-135354
INFO Cluster operator machine-api Available is False with Initializing: Operator is initializing
INFO Cluster operator monitoring Progressing is True with RollOutInProgress: Rolling out the stack.
ERROR Cluster operator monitoring Degraded is Unknown with :
INFO Cluster operator monitoring Available is Unknown with :
INFO Cluster operator openshift-apiserver Available is False with APIServices_PreconditionNotReady: APIServicesAvailable: PreconditionNotReady
INFO Cluster operator operator-lifecycle-manager Progressing is True with : Deployed 0.16.1
INFO Cluster operator operator-lifecycle-manager-catalog Progressing is True with : Deployed 0.16.1
INFO Cluster operator operator-lifecycle-manager-packageserver Available is False with :
INFO Cluster operator operator-lifecycle-manager-packageserver Progressing is True with : Working toward 0.16.1
INFO Pulling debug logs from the bootstrap machine
ERROR Attempted to gather debug logs after installation failure: failed to create SSH client: failed to use pre-existing agent, make sure the appropriate keys exist in the agent for authentication: ssh: handshake failed: ssh: unable to authenticate, attempted methods [publickey none], no supported methods remain
FATAL Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition
