How to tell if OLM is ready after OpenShift install? #934

Closed
newgoliath opened this issue Jun 28, 2019 · 5 comments
Labels
triage/unresolved Indicates an issue that can not or will not be resolved.

Comments

@newgoliath

We deploy dozens of OpenShift 4 clusters daily for our students. Sometimes we need to deploy OLM-based operators immediately after install. We do all of this with Ansible. Often OLM is not yet ready, so the playbooks that deploy these operators time out and fail our deployments. We'd like to add a readiness check for this.

What is the best indication that OLM is ready to service subscriptions, etc.?

In [1] I see `kubectl -n local get deployments`. Would this work reliably with OCP 4.1 if I change it to `oc get deployments -n openshift-operator-lifecycle-manager`?

[1] https://github.com/operator-framework/operator-lifecycle-manager/blob/master/Documentation/install/install.md#run-locally-with-minikube

Thanks!
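
One way to turn the `oc get deployments` idea above into a scripted check is to parse the JSON output and require every Deployment in the namespace to report as many available replicas as it requests. The following is a minimal sketch, assuming the input is the text produced by `oc get deployments -n openshift-operator-lifecycle-manager -o json`. The function name `deployments_ready` is hypothetical, and nothing in this thread confirms that Deployment availability alone is a sufficient readiness signal for OLM.

```python
import json


def deployments_ready(deployment_list_json: str) -> bool:
    """Return True if every Deployment in the list has all requested replicas available.

    Expects the JSON text of a Kubernetes DeploymentList (assumption: obtained via
    `oc get deployments -n openshift-operator-lifecycle-manager -o json`).
    """
    items = json.loads(deployment_list_json).get("items") or []
    if not items:
        # Nothing has been created in the namespace yet: treat as not ready.
        return False
    for item in items:
        # spec.replicas defaults to 1 when it is not set explicitly.
        wanted = item.get("spec", {}).get("replicas", 1)
        # status.availableReplicas is omitted from the output while it is zero.
        available = item.get("status", {}).get("availableReplicas") or 0
        if available < wanted:
            return False
    return True
```

A caller could capture the command output with `subprocess.run(..., capture_output=True, text=True)` and pass `stdout` to this function; keeping the parsing separate from the command invocation makes it possible to test without a cluster.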

@ecordell
Member

Thanks for the question @newgoliath -

For 4.1+ clusters, the cluster itself should not signal that it's ready unless OLM is up and ready to service requests.

Can you give an example of some commands that are taking longer than you expect or failing?

@newgoliath
Author

newgoliath commented Jun 28, 2019

This subscription for operator-metering has hung a few times:

https://github.com/redhat-cop/agnosticd/blob/development/ansible/roles/ocp4-workload-metering/templates/subscription.j2

Per the Ansible task below, we give it 5 minutes.

https://github.com/redhat-cop/agnosticd/blob/development/ansible/roles/ocp4-workload-metering/tasks/workload.yml#L21

I find that when I apply the same Ansible role to a cluster that has been up for more than an hour, it works flawlessly.

Unfortunately, our system immediately deletes failed clusters, so grabbing logs is a bit of a challenge. A remote log collector is in our future.
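
The five-minute wait described above amounts to a poll-with-deadline loop, which can be sketched generically. This is an illustration, not the linked Ansible task: `wait_until` and its parameters are hypothetical names, and the actual check (for example, running `oc` and inspecting the Subscription) is left to the caller. The sleep and clock functions are injectable so the loop can be exercised without a cluster or real delays.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it returns True or `timeout` seconds have elapsed.

    Returns True as soon as `check` succeeds, False once the deadline passes.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            # Deadline reached without success: report failure to the caller.
            return False
        sleep(interval)
```

With the defaults, this polls every 10 seconds for up to 300 seconds, matching the 5 minutes mentioned above. A longer `timeout` for freshly installed clusters would be one way to accommodate the slower start observed there.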

@stale

stale bot commented Feb 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Feb 26, 2020
openshift-ci-robot added the triage/unresolved label and removed the wontfix label Feb 27, 2020
@stale

stale bot commented Apr 27, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@ecordell
Member

ecordell commented Jun 5, 2020

Is this still an issue? Please re-open if so.

ecordell closed this as completed Jun 5, 2020