OLM Unable to Upgrade Through Multiple Versions #755

Closed
dgoodwin opened this issue Mar 13, 2019 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@dgoodwin

We currently have two CSVs:

NAME                             DISPLAY   VERSION           REPLACES                         PHASE
hive-operator.v0.1.519-5d3c2e7   Hive      0.1.519-5d3c2e7   hive-operator.v0.1.517-8c61fe5   Succeeded
hive-operator.v0.1.531-efbce36   Hive      0.1.531-efbce36   hive-operator.v0.1.526-12c88f1   Failed

The second one appears to have failed due to: Warning OwnerConflict 3m (x105 over 28m) operator-lifecycle-manager crd owner conflict: CRD owned by another ClusterServiceVersion

There are two install plans:

NAME            CSV                              SOURCE   APPROVAL    APPROVED
install-hwqn5   hive-operator.v0.1.531-efbce36            Automatic   true
install-xtgcp   hive-operator.v0.1.519-5d3c2e7            Automatic   true

The installed 519 version should be upgraded through several versions to reach the desired 531: https://gist.github.com/9b09a651fca981ac6c072d04a5b920c0
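
To illustrate what "upgraded through several versions" means here, the following is a minimal sketch of walking the `replaces` links from the desired CSV back to the installed one. It is not OLM's actual resolver; the function name `upgrade_path` and the `op.v1` ... `op.v4` catalog entries are hypothetical and stand in for the real chain linked above.

```python
def upgrade_path(replaces, installed, target):
    """Return the CSVs from `installed` to `target`, oldest first.

    `replaces` maps a CSV name to the name of the CSV it replaces
    (or None for the first entry in the channel).
    """
    path = [target]
    seen = {target}
    current = target
    # Follow the replaces links backwards until the installed CSV is reached.
    while current != installed:
        current = replaces.get(current)
        if current is None or current in seen:
            raise LookupError(
                f"{installed} is not reachable from {target} via replaces"
            )
        seen.add(current)
        path.append(current)
    path.reverse()
    return path


# Hypothetical catalog: each entry replaces the one before it.
catalog = {
    "op.v4": "op.v3",
    "op.v3": "op.v2",
    "op.v2": "op.v1",
    "op.v1": None,
}

print(upgrade_path(catalog, installed="op.v2", target="op.v4"))
```

Under this model, an installed CSV that is several entries behind the desired one is expected to pass through every intermediate CSV in order rather than jump straight to the target.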

The contents of the image registry catalog can be found here: https://github.com/app-sre/saas-hive-operator-bundle/tree/staging/hive

In the OLM operator log we found:

time="2019-03-13T17:50:55Z" level=info msg="syncing CSV" csv=hive-operator.v0.1.531-efbce36 id=78ooM namespace=hive phase=Pending
time="2019-03-13T17:50:55Z" level=info msg="checking hive-operator.v0.1.519-5d3c2e7"
time="2019-03-13T17:50:55Z" level=info msg="checking hive-operator.v0.1.531-efbce36"
time="2019-03-13T17:50:55Z" level=info msg="csv in operatorgroup" csv=hive-operator.v0.1.531-efbce36 id=YvK8F namespace=hive phase=Pending
time="2019-03-13T17:51:03Z" level=info msg="retrying hive/hive-operator.v0.1.531-efbce36"
E0313 17:51:03.845415       1 queueinformer_operator.go:155] Sync "hive/hive-operator.v0.1.531-efbce36" failed: CRD owned by another ClusterServiceVersion
time="2019-03-13T17:51:03Z" level=info msg="syncing CSV" csv=hive-operator.v0.1.531-efbce36 id=A6Fkd namespace=hive phase=Failed
time="2019-03-13T17:51:03Z" level=info msg="checking hive-operator.v0.1.531-efbce36"
time="2019-03-13T17:51:03Z" level=info msg="checking hive-operator.v0.1.519-5d3c2e7"
time="2019-03-13T17:51:03Z" level=info msg="csv in operatorgroup" csv=hive-operator.v0.1.531-efbce36 id=cDabD namespace=hive phase=Failed
time="2019-03-13T17:51:03Z" level=warning msg="needs reinstall: AnnotationsMissing: annotations on deployment don't match. couldn't find createdAt: 2019-03-13T16:52:10Z" csv=hive-operator.v0.1.531-efbce36 id=cDabD namespace=hive phase=Failed strategy=deployment

@jzelinskie
Contributor

jzelinskie commented Mar 13, 2019

Here's an excerpt of the catalog operator logs:

xtgcp = v519 of hive
hwqn5 = v531 of hive

time="2019-03-12T16:46:42Z" level=info msg="retrying hive"
E0312 16:46:42.733882       1 queueinformer_operator.go:155] Sync "hive" failed: no catalog sources available
time="2019-03-12T16:46:42Z" level=info msg="building connection to registry" currentSource="{hive-catalog hive}" id=aF4dI source=hive-catalog
time="2019-03-12T16:46:42Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{hive-catalog hive}" id=aF4dI source=hive-catalog
time="2019-03-12T16:46:42Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=mWzrv source=olm-operators
time="2019-03-12T16:46:42Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=mWzrv source=olm-operators
time="2019-03-12T16:47:02Z" level=warning msg="no installplan found with matching manifests, creating new one" id=DGilW namespace=hive
time="2019-03-12T16:47:02Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=InyrJ source=olm-operators
time="2019-03-12T16:47:02Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=InyrJ source=olm-operators
time="2019-03-12T16:47:02Z" level=info msg=syncing id=JUFTw ip=install-xtgcp namespace=hive phase=
time="2019-03-12T16:47:02Z" level=info msg="skip processing installplan without status - subscription sync responsible for initial status" id=JUFTw ip=install-xtgcp namespace=hive phase=
time="2019-03-12T16:47:02Z" level=info msg=syncing id=5Qxge ip=install-xtgcp namespace=hive phase=Installing
time="2019-03-12T16:47:04Z" level=info msg="retrying olm"
E0312 16:47:04.406219       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-12T16:47:05Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=J4lfy source=olm-operators
time="2019-03-12T16:47:05Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=J4lfy source=olm-operators
time="2019-03-12T16:47:06Z" level=info msg="retrying olm"
E0312 16:47:06.209334       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-12T16:47:07Z" level=info msg=syncing id=npfIU ip=install-xtgcp namespace=hive phase=Complete

... thousands of lines of sync "olm" failing and checking the existing completed installplan ...

E0313 16:53:29.623070       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:53:50Z" level=info msg="couldn't get from queue" key=hive/hive-catalog-q78gm queue=pod
time="2019-03-13T16:53:50Z" level=info msg="removed client for deleted catalogsource" source="{hive-catalog hive}"
time="2019-03-13T16:53:50Z" level=info msg="couldn't get from queue" key=hive/hive-catalog queue=catsrc
time="2019-03-13T16:53:50Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=E2Op7 source=olm-operators
time="2019-03-13T16:53:50Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=E2Op7 source=olm-operators
time="2019-03-13T16:54:10Z" level=info msg="retrying hive"
E0313 16:54:10.608044       1 queueinformer_operator.go:155] Sync "hive" failed: no catalog sources available
time="2019-03-13T16:54:10Z" level=warning msg="couldn't find service in cache" service=hive-catalog
time="2019-03-13T16:54:10Z" level=info msg="retrying olm"
E0313 16:54:10.649661       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:54:10Z" level=info msg="retrying hive"
E0313 16:54:10.655086       1 queueinformer_operator.go:155] Sync "hive" failed: no catalog sources available
time="2019-03-13T16:54:10Z" level=info msg="building connection to registry" currentSource="{hive-catalog hive}" id=q9HpN source=hive-catalog
time="2019-03-13T16:54:10Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{hive-catalog hive}" id=q9HpN source=hive-catalog
time="2019-03-13T16:54:10Z" level=info msg="retrying olm"
E0313 16:54:10.705620       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:54:16Z" level=info msg="retrying hive"
E0313 16:54:16.676838       1 queueinformer_operator.go:155] Sync "hive" failed: no catalog sources available
time="2019-03-13T16:54:16Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=C1zgg source=olm-operators
time="2019-03-13T16:54:16Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=C1zgg source=olm-operators
time="2019-03-13T16:54:16Z" level=info msg="retrying olm"
E0313 16:54:16.712184       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:54:36Z" level=info msg="building connection to registry" currentSource="{hive-catalog hive}" id=lV2O8 source=hive-catalog
time="2019-03-13T16:54:36Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{hive-catalog hive}" id=lV2O8 source=hive-catalog
time="2019-03-13T16:54:36Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=NJ7AR source=olm-operators
time="2019-03-13T16:54:36Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=NJ7AR source=olm-operators
time="2019-03-13T16:54:56Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=5hnb9 source=olm-operators
time="2019-03-13T16:54:56Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=5hnb9 source=olm-operators
time="2019-03-13T16:54:59Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=lNBXK source=olm-operators
time="2019-03-13T16:54:59Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=lNBXK source=olm-operators
time="2019-03-13T16:54:59Z" level=info msg="retrying olm"
E0313 16:54:59.807628       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:55:19Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=zFeon source=olm-operators
time="2019-03-13T16:55:19Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=zFeon source=olm-operators
time="2019-03-13T16:55:19Z" level=warning msg="no installplan found with matching manifests, creating new one" id=M6QWA namespace=hive
time="2019-03-13T16:55:19Z" level=info msg=syncing id=T+8Ib ip=install-hwqn5 namespace=hive phase=
time="2019-03-13T16:55:19Z" level=info msg="skip processing installplan without status - subscription sync responsible for initial status" id=T+8Ib ip=install-hwqn5 namespace=hive phase=
time="2019-03-13T16:55:19Z" level=info msg=syncing id=1aZ0T ip=install-hwqn5 namespace=hive phase=Installing
time="2019-03-13T16:55:22Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=cMp7P source=olm-operators
time="2019-03-13T16:55:22Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=cMp7P source=olm-operators
time="2019-03-13T16:55:23Z" level=info msg="retrying olm"
E0313 16:55:23.422555       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:55:24Z" level=info msg=syncing id=Al3SE ip=install-hwqn5 namespace=hive phase=Complete
time="2019-03-13T16:55:43Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=AH3/U source=olm-operators
time="2019-03-13T16:55:43Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=AH3/U source=olm-operators
time="2019-03-13T16:56:03Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=8VuP4 source=olm-operators
time="2019-03-13T16:56:03Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=8VuP4 source=olm-operators
time="2019-03-13T16:56:03Z" level=info msg="retrying olm"
E0313 16:56:03.049626       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:56:23Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=S5kW7 source=olm-operators
time="2019-03-13T16:56:23Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=S5kW7 source=olm-operators
time="2019-03-13T16:56:43Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=ROJIK source=olm-operators
time="2019-03-13T16:56:43Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=ROJIK source=olm-operators
time="2019-03-13T16:57:03Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=MbKCt source=olm-operators
time="2019-03-13T16:57:03Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=MbKCt source=olm-operators
time="2019-03-13T16:57:03Z" level=info msg="retrying olm"
E0313 16:57:03.111272       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:57:24Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=xLafX source=olm-operators
time="2019-03-13T16:57:24Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=xLafX source=olm-operators
time="2019-03-13T16:57:24Z" level=info msg="retrying olm"
E0313 16:57:24.231445       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available
time="2019-03-13T16:57:44Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=88BDS source=olm-operators
time="2019-03-13T16:57:44Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=88BDS source=olm-operators
time="2019-03-13T16:58:02Z" level=info msg=syncing id=fCtA3 ip=install-hwqn5 namespace=hive phase=Complete
time="2019-03-13T16:58:02Z" level=info msg=syncing id=vOv5Z ip=install-xtgcp namespace=hive phase=Complete
time="2019-03-13T16:58:04Z" level=info msg="building connection to registry" currentSource="{olm-operators olm}" id=E7P6d source=olm-operators
time="2019-03-13T16:58:04Z" level=info msg="client hasn't yet become healthy, attempt a health check" currentSource="{olm-operators olm}" id=E7P6d source=olm-operators
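
Since most of the log is retry noise, a small filter can help pull out only the install plan phase changes. This is a rough sketch, assuming the logrus-style `key=value` lines shown above and that a `msg=syncing` line with an `ip` field reports an install plan's current phase; the helper names `parse_logfmt` and `installplan_phases` are made up for illustration.

```python
import shlex


def parse_logfmt(line):
    """Parse one logrus-style key=value line into a dict.

    Lines that do not start with time= (for example the klog "E0313 ..."
    lines) are ignored and yield an empty dict.
    """
    if not line.startswith("time="):
        return {}
    try:
        tokens = shlex.split(line)  # honours double-quoted values
    except ValueError:
        return {}
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def installplan_phases(lines):
    """Return (time, installplan, phase) each time an install plan changes phase."""
    last_phase = {}
    changes = []
    for line in lines:
        fields = parse_logfmt(line.strip())
        if fields.get("msg") != "syncing" or "ip" not in fields:
            continue
        name, phase = fields["ip"], fields.get("phase", "")
        if last_phase.get(name) != phase:
            last_phase[name] = phase
            changes.append((fields["time"], name, phase))
    return changes


sample = [
    'time="2019-03-12T16:47:02Z" level=info msg=syncing id=JUFTw ip=install-xtgcp namespace=hive phase=',
    'time="2019-03-12T16:47:02Z" level=info msg=syncing id=5Qxge ip=install-xtgcp namespace=hive phase=Installing',
    'E0312 16:47:04.406219       1 queueinformer_operator.go:155] Sync "olm" failed: no catalog sources available',
    'time="2019-03-12T16:47:07Z" level=info msg=syncing id=npfIU ip=install-xtgcp namespace=hive phase=Complete',
]

for change in installplan_phases(sample):
    print(change)
```

Repeated syncs that report the same phase are dropped, so the thousands of "Complete" re-checks collapse to a single entry per install plan.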

@ecordell
Member

Thanks @dgoodwin and @jzelinskie. This should be fixed by #756.

I didn't recognize it at first, but this is a bug that's been on our docket for a while. Thanks for bumping the priority!

@dgoodwin
Author

Awesome, thank you for the quick turnaround.

Could the bug you have in mind have fixed itself? Subsequent pipeline builds seem to have resolved it; we now have:

$ k get csv
NAME                             DISPLAY   VERSION           REPLACES                         PHASE
hive-operator.v0.1.519-5d3c2e7   Hive      0.1.519-5d3c2e7   hive-operator.v0.1.517-8c61fe5   Failed
hive-operator.v0.1.536-4163fc2   Hive      0.1.536-4163fc2   hive-operator.v0.1.533-cc95376   Succeeded
$ k get installplan
NAME            CSV                              SOURCE   APPROVAL    APPROVED
install-6g5g7   hive-operator.v0.1.536-4163fc2            Automatic   true
install-hwqn5   hive-operator.v0.1.531-efbce36            Automatic   true
install-wpmxq   hive-operator.v0.1.533-cc95376            Automatic   true
install-xtgcp   hive-operator.v0.1.519-5d3c2e7            Automatic   true

@jzelinskie added the kind/bug label Mar 14, 2019
@ecordell
Member

This is fixed by #761.

Thanks for the detailed reports!
