
none of the deployment works with okd 3.11 #715

Closed · gacopl opened this issue Feb 16, 2019 · 18 comments

gacopl commented Feb 16, 2019

The latest deployment for OKD makes packageserver restart.
The 0.7.4 deployment puts the operator into CrashLoopBackOff.

Catalogs don't work, and applying a CSV alone to openshift-operators fails: the cluster console shows a failed status and no Deployments are created.

Any ideas how to install OLM with OKD 3.11?

njhale (Member) commented Feb 16, 2019

@gacopl Could you try using the manifests from the 0.8.1 release and removing all container arguments from 0000_50_olm_06-olm-operator.deployment.yaml before you apply them to the cluster?
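For anyone following along, the suggested edit boils down to deleting the args block from the olm-operator container in that file. A rough sketch of the result is below; apart from the file's purpose, everything here is illustrative (the actual name, namespace, and pinned image come from the 0.8.1 release manifest, not from this sketch):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: olm-operator                                  # illustrative; use the name from the release manifest
  namespace: openshift-operator-lifecycle-manager     # namespace used by the OKD/OCP manifests
spec:
  template:
    spec:
      containers:
        - name: olm-operator
          image: quay.io/operator-framework/olm@sha256:<digest pinned by the release>
          # args: [...]   <- the entire args block is what gets removed before applying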

ron1 (Contributor) commented Feb 16, 2019

I would have expected CI to catch this type of bug. Is that not the case?

njhale (Member) commented Feb 16, 2019 via email

Our CI is geared towards OpenShift 4.0, and we do not re-test older releases.

gacopl (Author) commented Feb 17, 2019

@njhale OK, that seemed to work; at least OLM is not dying now. I still need to figure out why I get RequirementsNotMet when installing the CSV.

njhale (Member) commented Feb 17, 2019

@gacopl The requirement status section of the CSV's status should tell you exactly what's missing on the cluster for the CSV to run. You can either create the missing resources manually, or, if you have an OLM CatalogSource that contains your CSV, you can create a Subscription, which will attempt to resolve and apply them for you.

All resource generation besides APIServices and Deployments is now handled by the catalog-operator and requires a Subscription.
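For reference, that requirement status can be read straight off the CSV, e.g. with oc get csv <csv-name> -n openshift-operators -o yaml. A failing CSV looks roughly like the sketch below (values are illustrative placeholders, not taken from this cluster, and exact fields may vary by OLM version):

status:
  phase: Pending
  reason: RequirementsNotMet
  requirementStatus:
    - group: apiextensions.k8s.io
      version: v1beta1
      kind: CustomResourceDefinition
      name: examples.example.com        # placeholder: a CRD the CSV owns or requires
      status: NotPresent                # anything other than Present blocks the install
    - group: ""
      version: v1
      kind: ServiceAccount
      name: example-operator            # placeholder
      status: Present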

gacopl (Author) commented Feb 17, 2019

Thanks @njhale, I'm trying to wrap my head around catalogs. After installation only the packageserver catalog is present, and I want to try out community operators, specifically Couchbase from the certified operators. How can I add catalogs? I see the packageserver is served over gRPC from some pod. In Enterprise OCP 3.11 there were special ConfigMaps, but that is more than half a year old.

gacopl (Author) commented Feb 17, 2019

I understand that the CSV is parsed and an InstallPlan is created, but this only happens when you subscribe to something from a catalog. How can I add more content, or a new catalog, to test out the patches I made to a CSV?

ron1 (Contributor) commented Feb 17, 2019


@njhale Thanks for the feedback. Given that OLM is in Tech Preview for OCP 3.11, is the intention to keep the latest version of OLM working on OCP 3.11, or are all efforts focused exclusively on deployments to OKD/OCP 4.0 pre-releases?

njhale (Member) commented Feb 21, 2019

@ron1 We are really just restricted by whether we depend on any backwards-incompatible Kubernetes changes, so we can't guarantee the latest OLM image and manifests will work with OpenShift installations based on older Kubernetes versions. Our previous release manifests are tied to specific OLM image digests, so if a version of those manifests works on 3.11 it should continue to work, unless the manifests were changed somewhere along the way (I need to double-check that this hasn't already happened).

njhale (Member) commented Feb 21, 2019


@gacopl If you just want to try out community-operators with a newer version of OLM (0.8.1), you can create the following CatalogSource in the namespace you have OLM running in:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: community-operators
spec:
  displayName: Community Operators
  image: quay.io/njhale/community-operators@sha256:37f1dd6ab4f1082af9d8f9ef028a2be4fb2837c5a75ba59bd127ebc723bfee8d
  publisher: community-operators
  sourceType: grpc

When you create a new subscription to the operator you want, be sure to specify the correct sourceNamespace field.
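As an illustration, such a Subscription might look like the sketch below; the operator name, channel, and namespaces are placeholders to adapt, and sourceNamespace must point at wherever the community-operators CatalogSource above was created:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: my-operator                      # placeholder
  namespace: openshift-operators         # placeholder: namespace the operator should run in
spec:
  source: community-operators            # matches the CatalogSource name above
  sourceNamespace: openshift-operator-lifecycle-manager   # placeholder: namespace holding the CatalogSource
  name: my-operator                      # placeholder: package name in the catalog
  channel: alpha                         # placeholder: a channel that package publishes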

ron1 (Contributor) commented Feb 22, 2019

@njhale Does it make sense that 0000_50_olm_14-operatorstatus.yaml fails to apply on OCP 3.11? I see that this file exists for OCP but not OKD. Am I correct that this file exclusively targets OCP 4.0?

Also, would you mind describing the process you used to create the community-operators image referenced above? Am I correct that you used a variation of https://github.com/operator-framework/operator-registry/blob/master/upstream.Dockerfile with some additional steps?

Finally, when I created the community-operators CatalogSource you provided above, all of my packageserver pods immediately started panicking with the following stack trace. Any thoughts?

$ oc logs packageserver-8df5d696c-kphcx
time="2019-02-22T18:21:44Z" level=info msg="Using in-cluster kube client config"
time="2019-02-22T18:21:44Z" level=info msg="package-server configured to watch namespaces []"
time="2019-02-22T18:21:44Z" level=info msg="Using in-cluster kube client config"
time="2019-02-22T18:21:44Z" level=info msg="connection established. cluster-version: v1.11.0+d4cacc0"
time="2019-02-22T18:21:44Z" level=info msg="operator ready"
time="2019-02-22T18:21:44Z" level=info msg="starting informers..."
time="2019-02-22T18:21:44Z" level=info msg="waiting for caches to sync..."
I0222 18:21:44.273402       1 reflector.go:202] Starting reflector *v1alpha1.CatalogSource (5m0s) from github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:112
I0222 18:21:44.273439       1 reflector.go:240] Listing and watching *v1alpha1.CatalogSource from github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:112
E0222 18:21:44.278692       1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/asm_amd64.s:573
/usr/local/go/src/runtime/panic.go:502
/usr/local/go/src/runtime/panic.go:63
/usr/local/go/src/runtime/signal_unix.go:388
/go/src/github.com/operator-framework/operator-lifecycle-manager/pkg/api/apis/operators/v1alpha1/catalogsource_types.go:46
/go/src/github.com/operator-framework/operator-lifecycle-manager/pkg/package-server/provider/registry.go:166
/go/src/github.com/operator-framework/operator-lifecycle-manager/pkg/package-server/provider/registry.go:85
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/controller.go:195
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/shared_informer.go:554
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:203
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:203
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/shared_informer.go:548
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/shared_informer.go:546
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/shared_informer.go:390
/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71
/usr/local/go/src/runtime/asm_amd64.s:2361
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x13a93e8]

goroutine 27 [running]:
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x107
panic(0x161a300, 0x26270a0)
	/usr/local/go/src/runtime/panic.go:502 +0x229
github.com/operator-framework/operator-lifecycle-manager/pkg/api/apis/operators/v1alpha1.(*RegistryServiceStatus).Address(0x0, 0xc42024c5a0, 0x1853ec6)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/pkg/api/apis/operators/v1alpha1/catalogsource_types.go:46 +0x38
github.com/operator-framework/operator-lifecycle-manager/pkg/package-server/provider.(*RegistryProvider).catalogSourceAdded(0xc420233260, 0x181d4a0, 0xc420376e00)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/pkg/package-server/provider/registry.go:166 +0x2a1
github.com/operator-framework/operator-lifecycle-manager/pkg/package-server/provider.(*RegistryProvider).(github.com/operator-framework/operator-lifecycle-manager/pkg/package-server/provider.catalogSourceAdded)-fm(0x181d4a0, 0xc420376e00)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/pkg/package-server/provider/registry.go:85 +0x3e
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(0xc420252600, 0xc420252610, 0xc420252620, 0x181d4a0, 0xc420376e00)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/controller.go:195 +0x49
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1.1(0x0, 0x0, 0x0)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/shared_informer.go:554 +0x21a
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff(0x989680, 0x3ff0000000000000, 0x3fb999999999999a, 0x5, 0xc420715df0, 0x429b19, 0xc4204f60d0)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:203 +0x9c
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache.(*processorListener).run.func1()
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/shared_informer.go:548 +0x81
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc4205e6f68)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc420715f68, 0xdf8475800, 0x0, 0x15e1801, 0xc4200be240)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc4205e6f68, 0xdf8475800, 0xc4200be240)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache.(*processorListener).run(0xc4207f6980)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/shared_informer.go:546 +0x78
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache.(*processorListener).(github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache.run)-fm()
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/client-go/tools/cache/shared_informer.go:390 +0x2a
github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc4201647d0, 0xc420504100)
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x4f
created by github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start
	/go/src/github.com/operator-framework/operator-lifecycle-manager/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62

njhale (Member) commented Mar 4, 2019

@ron1 Your first point is correct. 0000_50_olm_14-operatorstatus.yaml is a custom resource for reporting second-level operator (SLO) status to OCP 4.0's cluster-version-operator (no relation to OLM's CSVs), which is not present in any version < 4.0, in OKD, or upstream.

You are also correct that https://github.com/operator-framework/operator-registry/blob/master/upstream.Dockerfile is the "basis" for how I packaged community-operators as an OLM catalog - it provides an example of how to build an OLM operator-registry image, which is OLM's preferred way to package operator catalog content. We have a PR in flight that should merge soon to update the docs in that repo to better reflect this.

As for the third issue: it seems like the version of OLM being used is older than what's currently in master. From the provided panic log, registry.go:166 is where the nil pointer is dereferenced, but in master that line is a log call, and there is now a check early in that function that bails out if the RegistryStatus is nil.

njhale (Member) commented Mar 6, 2019

@gacopl I just tested the 0.7.4 OKD manifests against a Kubernetes 1.11 cluster (the version used in OKD 3.11) and everything worked fine. I even generated a Subscription to etcd, which resolved correctly and installed etcd:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd
  namespace: openshift-operator-lifecycle-manager
spec:
  source: rh-operators
  sourceNamespace: openshift-operator-lifecycle-manager
  name: etcd
  channel: alpha

How did you install OKD? At this point, I'm reasonably sure OLM's manifests and images are good for 0.7.4.

gacopl (Author) commented Mar 6, 2019

A basic oc cluster up of 3.11. The 0.8.x branch worked for me after applying the args fix.

ron1 (Contributor) commented Mar 6, 2019

@njhale Does OLM 0.7.4 support the latest CSV schemas used by operators currently in the community-operators repository? Also, does it support the operator-registry-based CatalogSources with sourceType grpc you described in your prior comment?

njhale (Member) commented Mar 6, 2019

@ron1

  1. CSVs are somewhat forward-compatible - newer fields like InstallModes won't be respected.
  2. 0.7.4 does not support operator-registry-based CatalogSources; it relies on ConfigMap-backed catalogs instead (see the sketch below).
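For contrast, a rough, hedged sketch of what a ConfigMap-backed catalog of that era looked like is below; exact field names varied across early releases, the resource names are placeholders, and the actual catalog data is omitted:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: my-operators                     # placeholder
  namespace: openshift-operator-lifecycle-manager
spec:
  sourceType: internal                   # ConfigMap-backed, as opposed to sourceType: grpc above
  configMap: my-operators                # ConfigMap that holds the catalog content
  displayName: My Operators
  publisher: example
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-operators
  namespace: openshift-operator-lifecycle-manager
data:
  customResourceDefinitions: |-
    # YAML list of the CRDs referenced by the CSVs (omitted)
  clusterServiceVersions: |-
    # YAML list of the CSVs themselves (omitted)
  packages: |-
    # YAML list of package/channel entries pointing at CSV names (omitted)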

ron1 (Contributor) commented Mar 8, 2019

@njhale Given that OLM has changed significantly between 0.7.4 and 0.8.1+ including grpc CatalogSources, OperatorGroups, InstallModes, etc., and given that Operators currently in the community-operators repo are likely being tested only against OLM 0.8.1+, would you expect OLM 0.7.4 to reliably manage the current set of Operators in the community-operators repo? If so, what is the best way to assemble/deploy community-operators/upstream-community-operators into a ConfigMap-based CatalogSource for use by OLM 0.7.4?

ecordell (Member) commented

Current status for 3.11:

  • There are currently no plans to backport newer OLM releases to the openshift-ansible installer for 3.11.
  • We have some groups successfully using the latest upstream release on 3.11 when needed, but that is not an "officially" supported path for OpenShift 3.11.

If you need to play with OLM and are okay with those caveats on 3.11, you might try the upstream installation instructions: https://github.com/operator-framework/operator-lifecycle-manager/releases/tag/0.10.0
