
Global vs. namespace vs. configurable operators #374

Closed
sebgl opened this issue Feb 11, 2019 · 15 comments
Labels
discuss We need to figure this out

Comments

@sebgl (Contributor) commented Feb 11, 2019

One concept we discussed a few times is to have the global operator responsible for managing namespace operators.

It would need to:

  • Watch Stack, Elasticsearchcluster, Kibana resources
  • Spawn a namespace operator in a given namespace dynamically
    • And create its RBAC resources (service account, roles, role bindings) - we can probably use a single ClusterRole here, mapped to individual namespaced service accounts via non cluster-wide RoleBindings.
  • Remove namespace operators from namespaces when not needed anymore

Some questions we need to tackle:

  • The global operator service account needs permissions to create ServiceAccount, ClusterRole, RoleBindings, ReplicaSet, etc. in any namespace. Is it OK to have such a high level of privileges on all namespaces? Do we want an option to restrict it to a list of namespaces only?
  • Versioning: should the global operator auto-update running namespace operators to the latest version (i.e. the version of the global operator itself, since we use the same Docker image)?
  • If we already have a global operator running with high privileges, is there really a benefit in deploying individual namespace operators in each namespace? Maybe having everything already handled by the global operator itself is good enough.
  • Deploy the namespace operator from the global operator code: reuse our yaml file, or handle everything as code?
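
Regarding the last point, here is a minimal sketch of what the "handle everything as code" option could look like for the RBAC part, assuming a recent client-go; all resource names (namespace-operator, the shared ClusterRole, etc.) are hypothetical and only illustrate the shape of the approach:

```go
package operator

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// setupNamespaceOperatorRBAC creates a ServiceAccount in the target namespace
// and binds it to a shared ClusterRole through a namespaced RoleBinding, so
// the namespace operator's permissions stay scoped to that namespace.
func setupNamespaceOperatorRBAC(ctx context.Context, c kubernetes.Interface, ns string) error {
	sa := &corev1.ServiceAccount{
		ObjectMeta: metav1.ObjectMeta{Name: "namespace-operator", Namespace: ns}, // hypothetical name
	}
	if _, err := c.CoreV1().ServiceAccounts(ns).Create(ctx, sa, metav1.CreateOptions{}); err != nil {
		return err
	}

	rb := &rbacv1.RoleBinding{
		ObjectMeta: metav1.ObjectMeta{Name: "namespace-operator", Namespace: ns},
		Subjects: []rbacv1.Subject{{
			Kind:      rbacv1.ServiceAccountKind,
			Name:      "namespace-operator",
			Namespace: ns,
		}},
		RoleRef: rbacv1.RoleRef{
			APIGroup: rbacv1.GroupName,
			Kind:     "ClusterRole",
			Name:     "namespace-operator", // shared ClusterRole, assumed to already exist
		},
	}
	_, err := c.RbacV1().RoleBindings(ns).Create(ctx, rb, metav1.CreateOptions{})
	return err
}
```

Handling already-existing resources, updates, and cleanup when a namespace operator is removed would still be needed on top of this.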
@sebgl sebgl added the discuss and >enhancement labels and removed the >enhancement label Feb 11, 2019
@pebrc (Collaborator) commented Feb 12, 2019

Some aspects to consider (notes from discussions with @nkvoll):

  • Targeted size of the deployment
    • SaaS scale (10k+ clusters): a single operator on a single host might not be able to handle that
      • limited number of ephemeral ports to connect to the individual clusters to retrieve cluster state etc.
      • resiliency and performance issues when reconciling that many clusters in a single process (even if we parallelize the reconciliation loop)
    • smaller deployments might be perfectly fine using just a single operator
  • If we target different use cases here in terms of scale, do we in effect develop two different products (and how can we avoid that)?
  • Can we 'fix' the issue that the global operator needs a lot of privileges by explicitly restricting it to a few namespaces and leaving the creation of the necessary RBAC resources to the user (maybe tool-supported)? E.g. do we actually need a ClusterRole or would a set of Roles for the namespaces the 'global' operator is supposed to manage suffice?

@sebgl (Contributor, Author) commented Feb 12, 2019

Some additional thoughts on global vs. namespace operator RBAC permissions, with 2 use cases.

Based on those 2 use cases, I think we should move away from the global/namespace operator split, to consider deploying N operators with different permissions and configuration options.

Use cases

Case 1: there's only one namespace I can use

Use case: I belong to organization X, I have access to a single namespace of organization X k8s cluster. I need to do everything in that single namespace.

I think we should be able to deploy the operator(s) in the namespace, to handle resources in that same namespace, with restricted permissions to that namespace only (service account + roles + role bindings).

In this particular case:

  • the user probably can't even create ClusterRoles (must be Roles in that namespace)
  • we still want what we identified as the global operator's responsibilities: licensing, CCS/CCR, etc.
  • if the user is ready to pay for those features but can't use them because of Organization X k8s management policies, it's a pretty bad user experience.
  • what's the point of having both namespace + global operators in the same namespace here?

Potential solution:
Here, I think we want a single operator with all controllers baked in, restricted to that single namespace.

Case 2: I don't want to grant your global operator cluster-wide admin permissions

Use case: I belong to organization X, we have several namespaces for different purposes, we want to perform CCR/CCS across multiple ES clusters in multiple namespaces. But there is no way I'm going to run your operator with super-large cluster-wide permissions on all namespaces.

I think this is a valid use case for any production-grade large k8s cluster.

In this particular case:

  • we don't want the global operator to have read access to all secrets in all clusters
  • we don't want the global operator to have create permissions on Pods (and ReplicaSets, StatefulSets, Deployments, etc.), because of pod privilege escalation: even if the operator does not have read access to a namespace's secrets, being able to deploy a pod in that namespace is enough to consider all secrets of that namespace accessible (just run our own code in the pod with the default service account, and tada, we can access secrets). This means the global operator cannot create namespace operators here.
  • it's probably OK for the global operator to have read/write access to our CRDs (elasticsearchcluster, kibana, etc.), but not more than that

So far, the license controller that we intended to run in the global operator requires read/write access to secrets in all namespaces.

Potential solutions:

  • restrict the global operator to a set of namespaces instead of all namespaces
  • make sure the global operator only accesses limited cluster-wide resources (our CRDs only, no secrets, no pod creation)
  • consider that's too bad, but not our problem #thuglife

Outcomes from those 2 use cases

  1. I think the global operator cannot be global as we intended. We must be able to restrict it to one or multiple namespaces, and still have it fully-featured (licensing, CCS, CCR, etc. limited to clusters in the managed namespaces).
  2. It also means we could have several global operators running in a k8s cluster (in different namespaces).
  3. If we have this "restricted global operator" running in a given namespace, there's maybe no real benefit in also having namespace operators running alongside.

Proposal

Making it all configurable

I think we should drop the "global/namespace" operator concept, because it's not good enough to represent what we want to achieve. Instead, move to a single operator concept with multiple configuration options:

  • namespace it should be deployed in
  • list of namespaces it should manage resources in
  • which controllers it is running (see the sketch after this list)
    • elasticsearch
    • kibana
    • es/kibana associations
    • licensing
    • ccs/ccr
  • appropriate RBAC permissions according to namespaces and operator roles
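
To make the idea concrete, here is a minimal sketch of how such a configurable operator entrypoint could look, with hypothetical --namespaces and --controllers flags (these names do not reflect an actual operator CLI):

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

func main() {
	// Hypothetical flags: which namespaces to manage and which controllers to run.
	namespaces := flag.String("namespaces", "", "comma-separated namespaces to manage (empty means all)")
	controllers := flag.String("controllers", "elasticsearch,kibana,association", "comma-separated controllers to enable")
	flag.Parse()

	enabled := map[string]bool{}
	for _, c := range strings.Split(*controllers, ",") {
		enabled[strings.TrimSpace(c)] = true
	}

	fmt.Printf("managing namespaces %q\n", *namespaces)

	// The manager would be created here (scoped to the requested namespaces),
	// and only the requested controllers registered with it:
	if enabled["elasticsearch"] {
		// register the Elasticsearch controller
	}
	if enabled["licensing"] {
		// register the licensing controller
	}
	if enabled["ccr"] || enabled["ccs"] {
		// register cross-cluster controllers
	}
}
```

The generated RBAC manifests would then only grant the permissions that the selected controllers and namespaces actually require.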

Potential large-scale company setup we could achieve here:

  • team X have their own full-options operator running in team X namespace (and restricted to this namespace)
  • team Y and Z use multiple namespaces, they have one operator managing those multiple namespaces
  • team Y and Z want to scale more and maybe do A-B testing on a new operator version, they deploy a second operator for a single namespace, not covered by the previous operator.
  • the company wants to apply enterprise licenses to all clusters of teams X, Y and Z. What they'll do here is deploy the operator with the licensing controller only, with permissions on team X, Y, Z namespaces.
  • the company wants to perform cross-cluster search across team Y and team X clusters. They run an operator configured with the CCR controller in their own separate namespace, with permissions to access team Y and team X namespaces.

By having this fully-configurable operator, I think it's then easier to extract a few "common" deployment setups that users may like to use (we can still do something similar to our original 1 global + N namespace operators here).

Also, I don't think we need an operator to manage operators. I think it's fine to just have people deploying the operators they need, without having them watch each other.

Technical concerns

  1. So far, the controller-runtime allows watching either:
    • resources in all namespaces (needs a ClusterRoleBinding)
    • resources in a single namespace
      I think watching a list of namespaces is a valid use case we can propose/contribute to.
  2. The combination of controllers and RBAC permissions required to deploy the operator is too complex to be handled by hand. We need a tool to generate all those resources automatically (./elastic-operator-cli generate operator --controllers=elasticsearch,licensing --namespace=teamX,teamY). It would still be the responsibility of a human to then run kubectl apply -f generated-operator.yaml.
  3. Human mistakes could lead to having more than one operator responsible for a single cluster. E.g. we already have an operator managing the cluster, then we add another one to manage cluster licenses but accidentally configure it to also run the elasticsearch controller. Should we rely on some kind of basic leader election here? "Basic" in the sense that it's probably OK to have concurrent activity for a limited timespan considering our reconciliation loop model.
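
On point 3, a sketch of how leader election could be enabled through controller-runtime; the lock name and namespace are illustrative, and note that only operator instances sharing the same lock ID and namespace exclude each other, so this mitigates duplicated deployments of the same operator rather than arbitrary misconfigurations:

```go
package operator

import (
	ctrl "sigs.k8s.io/controller-runtime"
)

// newManager builds a manager with leader election enabled: only the instance
// holding the lock actively reconciles; a short overlap during failover is
// acceptable given the level-based reconciliation model discussed above.
func newManager() (ctrl.Manager, error) {
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:          true,
		LeaderElectionID:        "elastic-operator-leader", // hypothetical lock name
		LeaderElectionNamespace: "elastic-system",          // hypothetical namespace holding the lock
	})
}
```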

I'm sorry for the super long post here 😄
If this (or part of this) makes sense, I'll be happy to turn it to a formal design proposal.

@pebrc (Collaborator) commented Feb 12, 2019

@sebgl awesome analysis! Reading this, though, I think that there are only small semantic differences between a global operator as we understood it up to now (access to all namespaces via ClusterRole) and the more restricted notion of global that we seem to be converging towards in this discussion.

I think the concepts outlined in the global operator proposal are still valid, but we need maybe the ability to restrict its domain to a set of namespaces.

I also think that we always said that a single operator deployment should be a supported mode of operation (with all controllers in one process).

Maybe global is just a misleading name at this point and we are really talking about a super-operator, parent or multi-namespace operator (all bad names, I know) in order to express the fact that we want the ability to run certain control loops just once for multiple namespaces (licensing) and solve cross-namespace concerns (CCR/CCS).

@pebrc pebrc added this to the Alpha milestone Feb 12, 2019
@sebgl (Contributor, Author) commented Feb 12, 2019

What other operators out there seem to be doing:

tl;dr either:

  • one operator manages cluster-wide resources
  • one operator per namespace, manages resources in its own namespace

(but they don't have cross-cluster features)


etcd-operator

RBAC guide: https://github.com/coreos/etcd-operator/blob/master/doc/user/rbac.md

They suggest 2 ways of deploying the operator:

  • in a single namespace, with Role and RoleBindings on that namespace
  • cluster-wide, with ClusterRole and ClusterRoleBindings. In this setup, it's also responsible for creating the CRDs.
    They provide a script to help generate corresponding yaml files.

prometheus-operator

RBAC guide: https://github.com/coreos/prometheus-operator/blob/master/Documentation/rbac.md
Design doc: https://github.com/coreos/prometheus-operator/blob/master/Documentation/design.md
It seems limited to watching and handling resources in a single namespace (the one it's deployed in). But it needs cluster-wide permissions to deploy the CRD (not super clear, to be honest).

airflow-operator

Seems to be using kubebuilder's default cluster-wide permission (one operator per cluster): https://github.com/GoogleCloudPlatform/airflow-operator/tree/master/config/rbac

KubeDB database operators

Install doc: https://kubedb.com/docs/0.9.0/setup/install/
Runs in the kube-system namespace by default, with cluster-wide access.

MongoDB operator

Seems to be running in its own namespace, and manages resources in its own namespace: https://github.com/mongodb/mongodb-enterprise-kubernetes

Oracle MySQL operator

Install doc: https://github.com/oracle/mysql-operator/blob/master/docs/tutorial.md#configuration
Can be installed either cluster-wide or per-namespace (manages resources in its own namespace).

NATS operator

Install doc: https://github.com/nats-io/nats-operator
Can be installed either cluster-wide or per-namespace (manages resources in its own namespace).

GCP Spark operator

Helm chart doc: https://github.com/helm/charts/tree/master/incubator/sparkoperator
A single operator with cluster-wide permissions.

Vault Operator
Link: https://github.com/coreos/vault-operator
One operator per namespace, manages resources in its own namespace.

@nkvoll (Member) commented Feb 13, 2019

Decision Drivers

  • Scalability (down): Must be able to scale down to single-cluster deployments without being overly complex to manage.
  • Scalability (up): Must be able to scale up with large k8s installations and manage tens of thousands of clusters simultaneously.
    • In any sufficiently large installation with clusters under load there is going to be high variance between response times for different ES API calls, and one cluster's responsiveness should not be able to negatively affect the operations of any other cluster.
  • Security: The solution should have an easy to understand story around credentials and service accounts.
    • As far as possible, adhere to the principle of least access: we should not require more permissions than strictly necessary for the operators to accomplish what they need to.

@sebgl (Contributor, Author) commented Feb 13, 2019

I'll make sure to turn what we discussed today and this post into a formal "configurable-operator" proposal.

@pebrc (Collaborator) commented Feb 13, 2019

@sebgl 👍

Also just to document the outcome of our meeting:

  • @sebgl will open an issue in controller-runtime to explore the possibility of multi-namespace controller managers (we also have the backup strategy of running multiple controller managers)
  • we will create an ADR that de-emphasizes 'global' vs. 'namespace' in favor of 'configurable operators' and probably supersedes the existing global operators ADR
  • we will work on tooling to create the necessary k8s resources from an input configuration for the desired operator deployment style
  • we will investigate controller-runtime leader election to make sure overlapping roles of multiple operators don't create problems

@sebgl (Contributor, Author) commented Feb 13, 2019

Regarding multi-namespaces watches in the controller-runtime:

There is already an issue open for it, as a follow-up for the one-namespace restriction: kubernetes-sigs/controller-runtime#218
Looks like it's planned for the long term 👍

operator-sdk folks seem to want that feature as well, and might contribute to the controller-runtime: operator-framework/operator-sdk#767

Meanwhile, the issue above suggests an interesting workaround: implement our own Manager that embeds the controller-runtime Manager, but override the cache to support something like prometheus-operator's MultiListWatcher.

My take on it would be to:

  1. Try implementing the multi-namespaces watches in the controller-runtime itself and create a PR upstream.
  2. If 1. turns out not to work that well, use our own cache implementation (the workaround described above).
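
For reference, later controller-runtime versions ended up shipping a multi-namespace cache along these lines (it did not exist when this was written, and the API has since evolved again), which would make the wiring look roughly like this sketch:

```go
package operator

import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

// newMultiNamespaceManager restricts the manager's cache, and therefore its
// watches, to an explicit list of namespaces instead of the whole cluster.
func newMultiNamespaceManager(namespaces []string) (ctrl.Manager, error) {
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		NewCache: cache.MultiNamespacedCacheBuilder(namespaces),
	})
}
```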

@sebgl sebgl changed the title Make the global operator operate namespace operators Global vs. namespace vs. configurable operators Feb 13, 2019
@sebgl (Contributor, Author) commented Feb 14, 2019

Let's keep this issue open as a meta-issue for the configurable operator ADR design proposal.

2 child issues:

@pebrc pebrc modified the milestones: Alpha, Beta Mar 14, 2019
@pebrc (Collaborator) commented Mar 29, 2019

Admission control via webhooks slightly complicates the picture here.

Assumption: the user wants to lock down the RBAC permissions for the operator as much as possible

Without multi-namespace watches

  1. generate the webhook and supporting manifests (secret/service) statically, e.g. via "Configurable operator CLI tool to generate yaml files" (#404), and run the operator restricted to a single namespace without webhook auto-install
  2. give the operator the privilege to install a webhook, use webhook auto-install and restrict the operator to a single namespace: In this case the operator has to run in the same namespace as the ES clusters it is managing (no separate control plane namespace) due to the controller-runtime restriction

With multi-namespace watches

  1. as above, statically generate all manifests, including the CA for the webhook, either manually or via the CLI from "Configurable operator CLI tool to generate yaml files" (#404) (safest option)
  2. allow the operator to install webhooks and run the operator in a separate control plane namespace managing ES clusters in a separate namespace (needs cluster-wide access to webhooks)

@roldancer commented

Hi all, in our OpenShift production environment we manage more than 6K namespaces in a multi-tenant way. From our security point of view it is critical to have operators with support for per-namespace deployment (with Role and RoleBindings on that namespace).

@agup006 commented Sep 16, 2019

Hey @roldancer, ECK supports two modes of deployment - at a global level watching all namespaces, as well as on a per-namespace level. Does this satisfy your requirement for deploying in OpenShift?

@sebgl (Contributor, Author) commented Dec 12, 2019

We discussed offline how our initial "global" and "namespace" operator concepts are a bit irrelevant and confusing in practice, and decided to remove the existing pre-built manifests via #2254 to only keep the all-in-one version.

The same global vs. namespace configuration can still be achieved by customizing the operator manifests. The operator can:

  • enable or disable webhook support
  • be deployed in namespace X
  • manage resources in 1 namespace, N namespaces, or all namespaces
  • live alongside other operator instances that manage resources in different namespaces

This goes in the direction of the above discussion. https://github.com/elastic/cloud-on-k8s/blob/master/docs/design/0005-configurable-operator.md discusses the need for an easier way to configure the operator yaml manifests.

I'm closing this issue now.

@sebgl sebgl closed this as completed Dec 12, 2019
@rogerschlachter commented Jan 9, 2020

@sebgl Hi, I landed here searching for how to deploy the operator and CRDs inside a single namespace. It seems like this discussion is more about how to provide the option of global vs namespace and implies both are possible currently?

So does that mean it is currently possible to deploy the CRDs and the operator to a single namespace (requiring nothing outside the namespace) with some tweaks to the all-in-one as an interim solution for those of us limited to a single namespace on a multi-tenant cluster?

If it is possible, is there an example? I imagine the ClusterRoles and ClusterRoleBindings need to become Roles and RoleBindings and I imagine there are a few more tweaks?

@sebgl (Contributor, Author) commented Jan 10, 2020

Copy-paste of my answer from https://discuss.elastic.co/t/deploy-eck-to-a-single-namespace/211495/8:

I think (not tested in a while) this is possible but requires some tweaking.
Basically you can change the ClusterRole and ClusterRoleBinding from the all-in-one manifests to their corresponding Role and RoleBinding translations, with your desired namespace.
Then, you can patch the operator StatefulSet manifest to be deployed in the desired namespace. Also make sure the --namespaces flag of the operator cmd matches the namespace you want the operator to work with (probably the same one it's deployed in).

However, you can only deploy the CRDs cluster-wide; AFAIK it is not possible to limit a CustomResourceDefinition resource to a particular namespace. And it looks like it's not going to be supported anytime soon.

I understand this feels a bit hacky. I'm opening an issue to track this so we can come up with an easier way to generate your own flavor of the manifests: #2406
