
Extend the ability for CAPI to filter what CRs it reconciles #7775

Open
richardcase opened this issue Dec 19, 2022 · 24 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@richardcase
Member

User Story

As an operator
I would like to be able to run multiple instances of CAPI (and its providers) in the management cluster
for operational reasons like multi-tenancy.

As an operator
I would like to be able to have more control over which CRs instances of CAPI controllers watch & reconcile
for operational reasons like multi-tenancy.

Detailed Description

As a result of #4119 (issue #4004), it's possible to tell an instance of CAPI to only reconcile CRs that have a specific value for the cluster.x-k8s.io/watch-label label. This means that, when filtering, every instance of CAPI must correspond to exactly one value of this label.

It is not currently possible to say that instance 1 of CAPI reconciles CRs with label value val1 and instance 2 of CAPI reconciles CRs where the label value is not val1.

It would be good to have the ability to provide a wider range of "filters". For my scenario, I'm interested in support for == and !=, but it could easily extend to || or &&.

Anything else you would like to add:

The current filtering based on the label (and the WatchFilterValue) is limited by the characters allowed in a label value (i.e. !, =, and && are not supported) and by the fact that it's a single label.

One option to support != is to look for a prefix in the label value, like cluster.x-k8s.io/watch-label: not system. But this seems less than ideal for a number of reasons.

Another option is to support label selectors (or some other mechanism) in the CAPI controllers.
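For illustration, a minimal sketch of what a serialized label selector already gives us (plain k8s.io/apimachinery, nothing CAPI-specific; names and the selector string are illustrative):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// A serialized selector can express "not val1", which a single
	// watch-label value cannot.
	sel, err := labels.Parse("cluster.x-k8s.io/watch-label!=val1")
	if err != nil {
		panic(err)
	}

	// An object with any other value (or without the label at all) matches
	// and would therefore be reconciled by this instance.
	fmt.Println(sel.Matches(labels.Set{"cluster.x-k8s.io/watch-label": "val2"})) // true
	fmt.Println(sel.Matches(labels.Set{"cluster.x-k8s.io/watch-label": "val1"})) // false
}
```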

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 19, 2022
@sbueringer
Member

Sounds okay to me.

@richardcase just a question about a related topic, out of curiosity: do you know how webhooks are used in those scenarios? (i.e. are all provider instances called, or only one of them?)

This question shouldn't block this issue, as the issue is about extending existing functionality.

@richardcase
Member Author

Good question @sbueringer, I hadn't really considered that... but it's something of interest now you've mentioned it 🤣 Will have a think and get back to you.

@fabriziopandini
Member

/triage accepted
If the problem is multi-tenancy, in the sense of infrastructure tenants, our recommended way to achieve this is to use a single instance of the controller while leveraging different credentials.

Based on my experience, running many instances of the same controllers will lead to more problems than benefits, because in Kubernetes today we have a single type space (CRDs and webhooks) and this conflicts with managing the lifecycle of many instances. One day, when https://github.com/kcp-dev/kcp is upstream, this will be much simpler.

However, we have already committed to making this possible despite well-known issues, and this seems like an iteration on that.
If we have to improve the semantics for filtering, let me throw a crazy idea on the table.
What about deprecating the current watch-label and introducing something like watch-label-selector (a serialized label selector), so we give full power to users while leveraging core Kubernetes capabilities/concepts?

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 21, 2022
@enxebre
Member

enxebre commented Jan 2, 2023

What are the limitations that prevent this use case from being supported by leveraging --namespace to scope what each CAPI controller watches?
Also, can we elaborate further on the use case that motivates this need?

@richardcase
Member Author

The --namespace approach has a similar limitation to the cluster.x-k8s.io/watch-label label in that it's a 1:1 mapping from the filter value to the CAPI instance. I'm after something a bit more flexible, as an example:

  • CAPI instance 1 reconciles cluster definitions with label value XYZ
  • CAPI instance 2 reconciles cluster definitions without label value XYZ

@hzxuzhonghu
Member

What we need is not only filtering on labels. In Kurator, we also need to filter by the spec, for example the kubeconfigControlPlane reference.

@hzxuzhonghu
Member

So I propose passing a filter function to the Reconciler, so that every reconciler can have its own custom filter.

Today it is not very friendly, because it requires all resources to have the same label.
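Roughly what I mean, as a sketch only (the Filter field and the reconciler shape here are illustrative, not what #8016 actually adds):

```go
package controllers

import (
	"context"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

type ClusterReconciler struct {
	client.Client
	// Filter is a hypothetical per-instance hook; nil means "reconcile everything".
	// Because it sees the full fetched object, it can also filter on spec fields
	// (after a type assertion), not just labels.
	Filter func(obj client.Object) bool
}

func (r *ClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	cluster := &clusterv1.Cluster{}
	if err := r.Get(ctx, req.NamespacedName, cluster); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	if r.Filter != nil && !r.Filter(cluster) {
		// Not ours: another controller instance is responsible for this object.
		return ctrl.Result{}, nil
	}
	// ...normal reconciliation...
	return ctrl.Result{}, nil
}
```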

@sbueringer
Member

sbueringer commented Jan 31, 2023

Regarding the discussion from here: https://github.com/kubernetes-sigs/cluster-api/pull/8016/files#r1090501708

Filters should be applied before actually entering the reconcile loop (e.g. while setting up controllers)

But then how can we filter out resources that are triggered by another kind of resource?

I think @hzxuzhonghu is right.

All predicates that we can pass into the controller builder or the controller when using For / Owns / Watches / Watch etc. only filter the incoming events. The "pipeline" for watches looks something like this:

  • Watch adds an eventhandler (usually pkg/source/internal.EventHandler)
    • internal.EventHandler wraps a handler.EventHandler (usually an enqueueRequestsFromMapFunc)
  • internal.EventHandler receives an event
  • internal.EventHandler iterates through predicates and drops the event if one predicate returns false
  • Otherwise it calls the Create/Update/.. func on the handler.EventHandler (which is usually an enqueueRequestsFromMapFunc)
  • enqueueRequestsFromMapFunc maps the incoming object to a reconcile.Request and adds it to the queue
  • Eventually the reconcile.Request is taken from the queue and Reconcile on our reconciler is called

tl;dr all the predicates we add are only applied to the events. They are not applied to the final objects we reconcile.

I'm not aware of a mechanism in CR today which allows us to filter based on the reconcile.Requests and especially not on the full objects (@vincepri please correct me if I'm wrong).

Some options:

  1. Add the Filter parameter to all our reconcilers and use it during Reconcile after we get the reconciled object (your PR: ✨ Added filter function to controller reconciler #8016)
  2. Implement a "wrapper" Reconciler around all our reconcilers which basically does the same (get the object and filter); this could work with generics, see the sketch after this list
  3. Implement something in controller-runtime (not sure how exactly)
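A rough sketch of option 2, assuming the wrapper is told how to construct the concrete object (names are illustrative, nothing like this exists today):

```go
package controllers

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// filteredReconciler fetches the object, applies the filter, and only
// delegates to the wrapped reconciler if the object matches.
type filteredReconciler[T client.Object] struct {
	client.Client
	inner  reconcile.Reconciler
	newObj func() T     // e.g. func() *clusterv1.Cluster { return &clusterv1.Cluster{} }
	keep   func(T) bool // the per-instance filter
}

func (r *filteredReconciler[T]) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
	obj := r.newObj()
	if err := r.Get(ctx, req.NamespacedName, obj); err != nil {
		return reconcile.Result{}, client.IgnoreNotFound(err)
	}
	if !r.keep(obj) {
		// Filtered out before the wrapped reconciler ever runs.
		return reconcile.Result{}, nil
	}
	return r.inner.Reconcile(ctx, req)
}
```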

cc @vincepri WDYT?

@hzxuzhonghu
Member

@sbueringer @vincepri any conclusion

@sbueringer
Member

I posted my take, can't answer for others.

@hzxuzhonghu
Member

cc @fabriziopandini If you are OK with #7775 (comment), can we let #8016 go ahead?

@richardcase
Member Author

@sbueringer - great write-up and suggested options 🥇

When I created this issue, I had in mind a way of filtering events on any attribute of client.Object, so limited to things like namespace, labels, annotations, etc. and not anything in spec/status, and then supporting operators like == and !=. The idea being that I can say "only reconcile Clusters where the watch label value != val1".

Something like this could ultimately translate down to predicates (like the WatchFilterValue does), unless I'm mistaken.
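As a rough sketch (the helper is illustrative, loosely modelled on the existing WatchFilterValue-based predicates):

```go
package predicates

import (
	"k8s.io/apimachinery/pkg/labels"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// MatchesSelector turns a label selector into a predicate, so the same
// ==/!= semantics can be applied at event-filtering time.
func MatchesSelector(sel labels.Selector) predicate.Funcs {
	match := func(obj client.Object) bool {
		return sel.Matches(labels.Set(obj.GetLabels()))
	}
	return predicate.Funcs{
		CreateFunc:  func(e event.CreateEvent) bool { return match(e.Object) },
		UpdateFunc:  func(e event.UpdateEvent) bool { return match(e.ObjectNew) },
		DeleteFunc:  func(e event.DeleteEvent) bool { return match(e.Object) },
		GenericFunc: func(e event.GenericEvent) bool { return match(e.Object) },
	}
}
```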

@alexander-demicev
Contributor

@fabriziopandini @richardcase watch-label-selector seems like a good approach for our use case. How do we structure it? We could possibly reuse https://github.com/kubernetes/apimachinery/blob/master/pkg/apis/meta/v1/types.go#L1203 and all the logic that comes with parsing it, but it might be too complicated for an annotation.
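For reference, a minimal sketch of what reusing the structured type could look like (the existing LabelSelectorAsSelector helper does the parsing; the selector here is illustrative):

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// Structured form of "cluster.x-k8s.io/watch-label notin (val1)".
	sel, err := metav1.LabelSelectorAsSelector(&metav1.LabelSelector{
		MatchExpressions: []metav1.LabelSelectorRequirement{{
			Key:      "cluster.x-k8s.io/watch-label",
			Operator: metav1.LabelSelectorOpNotIn,
			Values:   []string{"val1"},
		}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(sel.Matches(labels.Set{"cluster.x-k8s.io/watch-label": "val2"})) // true
}
```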

@vincepri
Member

vincepri commented Jul 6, 2023

Going back to the original use case:

I would like to be able to run multiple instances of CAPI (and its providers) in the management cluster
for operational reasons like multi-tenancy.

Controller runtime has some mechanics today to filter events, although if the goal is to have hard tenancy requirements, data should probably be logically separated in different namespaces first, and then caches should pre-filter on those namespaces.

There is some missing flexibility in all of this; for example, adding or removing watched namespaces at runtime isn't something controller runtime supports, but there may be ways to get around that, e.g. with dynamic reloading based on RBAC (kubernetes-sigs/controller-runtime#2176) if that gets implemented.

In general, I'd start by looking at the problem through a controller-runtime lens and list the Cluster API use case as a primary goal.
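For the namespace pre-filtering part, a sketch against a recent controller-runtime (the cache options have been renamed across releases, so take the field names with a grain of salt):

```go
package main

import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

func main() {
	// This manager's cache only ever lists/watches the namespaces below;
	// everything else is invisible to its controllers.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{
			DefaultNamespaces: map[string]cache.Config{
				"tenant-a": {},
				"tenant-b": {},
			},
		},
	})
	if err != nil {
		panic(err)
	}
	_ = mgr // controller setup (SetupWithManager etc.) would follow here
}
```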

@Danil-Grigorev
Member

Could #8982 be a way to solve it? It allows adding event filtering for all the discussed use cases, with a different level of complexity.

@vincepri
Member

Maybe? We should start from a small proposal in controller runtime first before implementing something very custom in Cluster API; the client-side predicates don't solve any of the hard tenancy requirements, though. For example, the cache would still hold objects that a specific manager shouldn't reconcile; they're just filtered before being passed along.

@alexander-demicev
Contributor

Is it possible to somehow filter objects based on a logical expression? This way the solution to the problem can be more flexible. Caches won’t hold the object if the expression evaluates to true.

@vincepri
Member

The cache can be filtered by namespace, labels, or fields, although these are very simple equality parameters; supporting more dynamic behavior than that would require changes in client-go and potentially the api-server as well. Ultimately we'd want the internal cache to follow RBAC rules: results returned from list/watch calls should be pre-filtered based on those parameters.
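Concretely, with a recent controller-runtime the cache-level knobs look roughly like this (field names may differ between releases; the selector is illustrative):

```go
package main

import (
	"k8s.io/apimachinery/pkg/labels"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func main() {
	// Only Clusters matching the selector are ever stored in this manager's
	// cache; the filtering is applied to the underlying list/watch calls.
	sel, err := labels.Parse("cluster.x-k8s.io/watch-label!=val1")
	if err != nil {
		panic(err)
	}
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{
			ByObject: map[client.Object]cache.ByObject{
				&clusterv1.Cluster{}: {Label: sel},
			},
		},
	})
	if err != nil {
		panic(err)
	}
	_ = mgr
}
```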

@Danil-Grigorev
Member

I think that to provide full flexibility in solving this, an expression selector would have to be added all the way down in apimachinery ListOptions, then implemented in client-go and exposed in controller-runtime via the ByObject option. That is a lot of work. kubernetes/kubernetes#53459 talks about the need for a more generic filtering mechanism, and they suggest doing this on local controller caches anyway, so that might be the way, though a complicated one.

The RBAC approach sounds interesting, but I'm not sure it provides the granularity this issue describes. You can't do logical expressions or support != on some label key, and you can't always modify the objects you want to ignore so that they fit an existing rule set.

@vincepri
Member

Are we trying to solve a hard tenancy problem, or a sharding one?

@richardcase
Member Author

Are we trying to solve a hard tenancy problem, or a sharding one?

For our use case, it's more a case of sharding. We'd like to have multiple instances of the controllers, each reconciling a subset of cluster definitions, where each instance isn't limited to a single watch filter or namespace.

@vincepri
Member

That's a bit easier to think through, although I'd start by opening an issue in Controller Runtime first. We'd need a small proposal: we could add an experimental feature that allows controllers to shard data automatically based on a subset of parameters. We'd still be limited to what the api-server can offer if we care about caching only specific data, although that use case becomes an optimization more than anything for this goal.

@fabriziopandini
Member

/priority backlog

@k8s-ci-robot k8s-ci-robot added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Apr 12, 2024
@fabriziopandini fabriziopandini added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label May 3, 2024
@sbueringer
Member

We should merge this discussion with #11192 #11193
