Use cluster-scoped cache only when required #3868

Merged on Oct 22, 2020 (1 commit)

charith-elastic
Contributor

When the operator is installed with the restricted profile, the controller-runtime cache starts logging errors because the operator does not have permission to access resources outside the namespaces it manages.

E1022 10:08:01.940788       1 reflector.go:178] pkg/mod/k8s.io/client-go@v0.18.6/tools/cache/reflector.go:125: Failed to list *v1.Elasticsearch: elasticsearches.elasticsearch.k8s.elastic.co is forbidden: User "system:serviceaccount:elastic-system:elastic-operator" cannot list resource "elasticsearches" in API group "elasticsearch.k8s.elastic.co" at the cluster scope
E1022 10:08:03.129949       1 reflector.go:178] pkg/mod/k8s.io/client-go@v0.18.6/tools/cache/reflector.go:125: Failed to list *v1beta1.EnterpriseSearch: enterprisesearches.enterprisesearch.k8s.elastic.co is forbidden: User "system:serviceaccount:elastic-system:elastic-operator" cannot list resource "enterprisesearches" in API group "enterprisesearch.k8s.elastic.co" at the cluster scope

This is due to a change introduced in #3810 to allow watching cluster resources by default. It should be made conditional instead.
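As a rough illustration of what "conditional" could mean here, the sketch below only adds the empty string (the value Kubernetes clients treat as "all namespaces") to the list of cached namespaces when cluster-scoped access is actually needed. The helper name `cacheNamespaces` and the `needClusterScope` flag are hypothetical and are not taken from the operator's code; how the operator decides that cluster-scoped access is required is not shown.

```go
package main

import "fmt"

// allNamespaces mirrors metav1.NamespaceAll, which is the empty string.
const allNamespaces = ""

// cacheNamespaces is a hypothetical helper: it returns the namespaces the
// cache should cover. The empty string is appended only when the caller
// says cluster-scoped access is required, so a restricted install never
// triggers cluster-wide list requests.
func cacheNamespaces(managed []string, needClusterScope bool) []string {
	// Copy the input so the caller's slice is never modified.
	out := append([]string(nil), managed...)
	if needClusterScope {
		out = append(out, allNamespaces)
	}
	return out
}

func main() {
	managed := []string{"team-a", "team-b"}
	fmt.Printf("%q\n", cacheNamespaces(managed, false)) // ["team-a" "team-b"]
	fmt.Printf("%q\n", cacheNamespaces(managed, true))  // ["team-a" "team-b" ""]
}
```

With the flag off, the list contains only the managed namespaces, which is the behaviour a restricted-profile install needs.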

charith-elastic added the >bug, v1.3.0, and exclude-from-release-notes labels on Oct 22, 2020
@charith-elastic
Contributor Author

Jenkins test this please

@david-kow
Contributor

I might be missing something, but why would we query for ES/Ent at the cluster scope if they are namespaced resources? I thought we had established that we don't do that.

@charith-elastic
Contributor Author

charith-elastic commented Oct 22, 2020

Last time I checked, the controller-runtime caches used client-side filtering. So, they would create unfiltered informers for all the kinds they are interested in and fetch all the objects from the server. This is why the operator used to get OOM killed in very large clusters with lots of deployed resources (which we plastered over by increasing the memory limit). The issue is still open: kubernetes-sigs/controller-runtime#244

In this case, the presence of the empty string in the list of namespaces would result in the cache trying to eagerly fetch all the cluster-scoped resources it is interested in.
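To make the effect of the empty string concrete, here is a small sketch of how the namespace value changes the list request an informer ends up issuing, assuming the standard Kubernetes REST path layout for namespaced custom resources. The `listPath` function is illustrative only and is not part of client-go or controller-runtime.

```go
package main

import "fmt"

// listPath builds the REST path of a LIST request for a namespaced custom
// resource. An empty namespace yields the all-namespaces path, which RBAC
// evaluates "at the cluster scope".
func listPath(group, version, resource, namespace string) string {
	if namespace == "" {
		return fmt.Sprintf("/apis/%s/%s/%s", group, version, resource)
	}
	return fmt.Sprintf("/apis/%s/%s/namespaces/%s/%s", group, version, namespace, resource)
}

func main() {
	// One request per entry in the namespace list; "" is the problematic one.
	for _, ns := range []string{"team-a", ""} {
		fmt.Println(listPath("elasticsearch.k8s.elastic.co", "v1", "elasticsearches", ns))
	}
}
```

The second path is the one a namespace-restricted service account is not allowed to list, which matches the "cannot list resource ... at the cluster scope" errors in the logs above.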

@david-kow
Contributor

> In this case, the presence of the empty string in the list of namespaces would result in the cache trying to eagerly fetch all the cluster-scoped resources it is interested in.

Oh, that's surprising (for me)! LGTM then.
