Global vs. namespace vs. configurable operators #374
Some aspects to consider (notes from discussions with @nkvoll):
---
Some additional thoughts on global vs. namespace operator RBAC permissions, with 2 use cases. Based on those 2 use cases, I think we should move away from the global/namespace operator split, and consider deploying N operators with different permissions and configuration options.

### Use cases

**Case 1: there's only one namespace I can use**

Use case: I belong to organization X, and I have access to a single namespace of organization X's k8s cluster. I need to do everything in that single namespace. I think we should be able to deploy the operator(s) in the namespace, to handle resources in that same namespace, with permissions restricted to that namespace only (service account + roles + role bindings). In this particular case:
Potential solution:

**Case 2: I don't want to grant your global operator cluster-wide admin permissions**

Use case: I belong to organization X; we have several namespaces for different purposes, and we want to perform CCR/CCS across multiple ES clusters in multiple namespaces. But there is no way I'm going to run your operator with super-large cluster-wide permissions on all namespaces. I think this is a valid use case for any production-grade large k8s cluster. In this particular case:
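For Case 1, the "service account + roles + role bindings" restricted to a single namespace could look like the following. This is a minimal illustrative sketch, not the project's actual manifests: the names, the namespace `team-a`, and the exact rule list are assumptions.

```yaml
# Hypothetical example: everything lives in, and only grants access to,
# the single namespace "team-a".
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-operator
  namespace: team-a
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role            # a Role, not a ClusterRole: scoped to team-a only
metadata:
  name: elastic-operator
  namespace: team-a
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "secrets", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: elastic-operator
  namespace: team-a
subjects:
  - kind: ServiceAccount
    name: elastic-operator
    namespace: team-a
roleRef:
  kind: Role
  name: elastic-operator
  apiGroup: rbac.authorization.k8s.io
```

With this setup the operator's service account cannot read or write anything outside `team-a`, which is exactly the restriction the use case asks for.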
So far, the license controller that we intended to run in the global operator requires read/write access to secrets in all namespaces. Potential solutions:
Outcomes from those 2 use cases
### Proposal

**Making it all configurable**

I think we should drop the "global/namespace" operator concept, because it's not good enough to represent what we want to achieve. Instead, move to a single operator concept with multiple configuration options:
Potential large-scale company setup we could achieve here:
By having this fully-configurable operator, I think it's then easier to extract a few "common" deployment setups that users may like to use (we can still do something similar to our original 1 global + N namespace operators here). Also, I don't think we need an operator to manage operators. I think it's fine to just have people deploy the operators they need, without having them watch each other.

**Technical concerns**
I'm sorry for the super long post here 😄

---
@sebgl awesome analysis! Reading this though, I think there are only small semantic differences between a global operator as we understood it up to now (access to all namespaces) and what is proposed here. I think the concepts outlined in the global operator proposal are still valid, but we maybe need the ability to restrict its domain to a set of namespaces. I also think we always said that a single operator deployment should be a supported mode of operation (with all controllers in one process). Maybe "global" is just a misleading name at this point, and we are really talking about a super-operator, parent, or multi-namespace operator (all bad names, I know) in order to express the fact that we want the ability to run certain control loops just once for multiple namespaces (licensing) and solve cross-namespace concerns (CCR/CCS).

---
What other operators out there seem to be doing: tl;dr either:
(but they don't have cross-cluster features)

**etcd-operator**

RBAC guide: https://github.com/coreos/etcd-operator/blob/master/doc/user/rbac.md

They suggest 2 ways of deploying the operator:

**prometheus-operator**

RBAC guide: https://github.com/coreos/prometheus-operator/blob/master/Documentation/rbac.md

**airflow-operator**

Seems to be using kubebuilder's default cluster-wide permissions (one operator per cluster): https://github.com/GoogleCloudPlatform/airflow-operator/tree/master/config/rbac

**KubeDB database operators**

Install doc: https://kubedb.com/docs/0.9.0/setup/install/

**MongoDB operator**

Seems to be running in its own namespace, and managing resources in its own namespace: https://github.com/mongodb/mongodb-enterprise-kubernetes

**Oracle MySQL operator**

Install doc: https://github.com/oracle/mysql-operator/blob/master/docs/tutorial.md#configuration

**NATS operator**

Install doc: https://github.com/nats-io/nats-operator

**GCP Spark operator**

Helm chart doc: https://github.com/helm/charts/tree/master/incubator/sparkoperator

**Vault Operator**

---
Decision Drivers
---
I'll make sure to turn what we discussed today and this post into a formal "configurable-operator" proposal.

---
@sebgl 👍 Also just to document the outcome of our meeting:
---
Regarding multi-namespace watches in the controller-runtime:

There is already an issue open for it, as a follow-up to the one-namespace restriction: kubernetes-sigs/controller-runtime#218

operator-sdk folks seem to want that feature as well, and might contribute it to the controller-runtime: operator-framework/operator-sdk#767

Meanwhile, the issue above suggests an interesting workaround: implement our own Manager that embeds the controller-runtime Manager, but override the cache to support something like prometheus-operator's MultiListWatcher. My take on it would be to:
---
Let's keep this issue open as a meta-issue for the configurable operator. ARD design proposal. 2 child issues:
---
Admission control via webhooks slightly complicates the picture here.

Assumption: the user wants to lock down the RBAC permissions for the operator as much as possible.

**Without multi-namespace watches**
**With multi-namespace watches**
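One partial mitigation for the webhook case (my own assumption, not a decision from the discussion above): webhook configurations are cluster-scoped resources, but Kubernetes lets them carry a `namespaceSelector`, so a single configuration can at least restrict which namespaces' requests reach the operator's webhook server. A hypothetical fragment, with all names illustrative:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: elastic-webhook           # hypothetical name
webhooks:
  - name: validation.elastic.example.com   # hypothetical
    # Only intercept requests from namespaces labeled for this operator,
    # instead of covering the whole cluster.
    namespaceSelector:
      matchLabels:
        elastic-operator: enabled
    clientConfig:
      service:
        name: elastic-webhook-server
        namespace: elastic-system
        path: /validate
    rules:
      - apiGroups: ["elasticsearch.k8s.elastic.co"]
        apiVersions: ["*"]
        operations: ["CREATE", "UPDATE"]
        resources: ["elasticsearches"]
    admissionReviewVersions: ["v1"]
    sideEffects: None
```

Note this does not remove the need for cluster-scoped permissions to create the webhook configuration itself; it only narrows which namespaces are affected by it.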
---
Hi all, in our OpenShift production environment we manage more than 6K namespaces in a multi-tenant way. From our security point of view it is critical to have operators with support for per-namespace deployment (with Roles and RoleBindings on that namespace).

---
Hey @roldancer, ECK supports two modes of deployment: at a global level watching all namespaces, as well as on a per-namespace level. Does this satisfy your requirement for deploying in OpenShift?

---
We discussed offline how our initial "global" and "namespace" operator concepts are a bit irrelevant and confusing in practice, and decided to remove the existing pre-built manifests via #2254 to only keep the all-in-one version. The same global vs. namespace configuration can still be achieved by customizing the operator manifests. The operator can:
This goes in the direction of the above discussion. https://github.com/elastic/cloud-on-k8s/blob/master/docs/design/0005-configurable-operator.md discusses the need for an easier way to configure the operator yaml manifests. I'm closing this issue now.
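As a sketch of what "customizing the operator manifests" could mean in practice: restricting which namespaces the operator watches comes down to passing a namespaces option to the operator container in the all-in-one Deployment. The flag name, image tag, and namespaces below are illustrative assumptions; check the current manifests and docs for the real ones.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: elastic-operator
  namespace: elastic-system
spec:
  replicas: 1
  selector:
    matchLabels: {app: elastic-operator}
  template:
    metadata:
      labels: {app: elastic-operator}
    spec:
      serviceAccountName: elastic-operator
      containers:
        - name: manager
          image: docker.elastic.co/eck/eck-operator:x.y.z  # placeholder tag
          args:
            # Illustrative flag: watch only these two namespaces instead
            # of the cluster-wide default.
            - --namespaces=team-a,team-b
```

Combined with Roles and RoleBindings in each watched namespace (instead of a ClusterRole), this reproduces the "namespace operator" setup from the customized all-in-one manifest.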
@sebgl Hi, I landed here searching for how to deploy the operator and CRDs inside a single namespace. It seems like this discussion is more about how to provide the option of global vs. namespace, and implies both are possible currently? So does that mean it is currently possible to deploy the CRDs and the operator to a single namespace (requiring nothing outside the namespace) with some tweaks to the all-in-one, as an interim solution for those of us limited to a single namespace on a multi-tenant cluster? If it is possible, is there an example? I imagine the ClusterRoles and ClusterRoleBindings need to become Roles and RoleBindings, and I imagine there are a few more tweaks?

---
Copy-paste of my answer from https://discuss.elastic.co/t/deploy-eck-to-a-single-namespace/211495/8:

I think (not tested in a while) this is possible but requires some tweaking. However, you can only deploy the CRDs cluster-wide; AFAIK it is not possible to limit a CustomResourceDefinition resource to a particular namespace, and it looks like it's not going to be supported anytime soon. I understand this feels a bit hacky. I'm opening an issue to track this so we can come up with an easier way to generate your own flavor of the manifests: #2406

---
One concept we discussed a few times is to have the global operator responsible for managing namespace operators.
It would need to:
Some questions we need to tackle: