Why is DelegatingClient needed? #180

Closed
M00nF1sh opened this issue Oct 24, 2018 · 19 comments
Labels
kind/design: Categorizes issue or PR as related to design.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@M00nF1sh

M00nF1sh commented Oct 24, 2018

From the code, Manager.GetClient returns a DelegatingClient that reads from the cache for structured objects but reads directly from the API server for unstructured objects.

This prevents us from using the client before the controller loops start (since the cache needs to be synced first). Our use case is loading configuration from a ConfigMap before the controller loops run.

From my perspective, users should use manager.GetCache if they need caching behavior (and we can do the delegation there to read directly for unstructured objects). For manager.GetClient, it would be better to always read directly, without the cache.
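
For readers who have not looked at the delegating client, the routing described above amounts to roughly the following. This is a hand-written sketch, not the library's actual code, and it uses the v0.1-era client.Reader signature (Get takes a runtime.Object; newer releases use client.Object instead):

	package example

	import (
		"context"

		"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
		"k8s.io/apimachinery/pkg/runtime"
		"sigs.k8s.io/controller-runtime/pkg/client"
	)

	// splitGet sketches the delegation: unstructured objects are read live from
	// the API server, everything else is served from the informer-backed cache.
	// cachedReader and liveReader are both client.Reader implementations,
	// e.g. the manager's cache and a client built with client.New.
	func splitGet(ctx context.Context, cachedReader, liveReader client.Reader, key client.ObjectKey, obj runtime.Object) error {
		if _, ok := obj.(*unstructured.Unstructured); ok {
			return liveReader.Get(ctx, key, obj)
		}
		return cachedReader.Get(ctx, key, obj)
	}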

@DirectXMan12
Contributor

We're trying to make it as easy as possible to use the "correct" Kubernetes pattern. So you don't have to manage two separate objects -- you just get the correct behavior off the bat. This does mean that for things like loading configuration from specific objects, you do need to manually instantiate the client.

You do raise a valid point here, and it's difficult to balance the "can we make it really easy so the user just doesn't have to care" and "is this magic going to trip people up". Did this trip you up or cause significant confusion? I'm happy to have a conversation on this feature, and/or to see if we can make the pitfalls of the default client clearer. I definitely don't want using controller-runtime to be confusing or painful :-)
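
To make "manually instantiate the client" concrete, here is a minimal sketch of reading a ConfigMap before mgr.Start, using a separate non-caching client built from the manager's rest.Config. The ConfigMap namespace and name are placeholders, and the exact client.Options fields have shifted slightly between controller-runtime releases:

	package example

	import (
		"context"

		corev1 "k8s.io/api/core/v1"
		"sigs.k8s.io/controller-runtime/pkg/client"
		"sigs.k8s.io/controller-runtime/pkg/manager"
	)

	// loadStartupConfig reads a ConfigMap before mgr.Start is called, using a
	// non-caching client that talks directly to the API server and therefore
	// never waits on a cache sync.
	func loadStartupConfig(mgr manager.Manager) (*corev1.ConfigMap, error) {
		directClient, err := client.New(mgr.GetConfig(), client.Options{Scheme: mgr.GetScheme()})
		if err != nil {
			return nil, err
		}

		var cm corev1.ConfigMap
		key := client.ObjectKey{Namespace: "default", Name: "my-controller-config"} // placeholder names
		if err := directClient.Get(context.TODO(), key, &cm); err != nil {
			return nil, err
		}
		return &cm, nil
	}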

@DirectXMan12
Contributor

/kind design

@k8s-ci-robot k8s-ci-robot added the kind/design Categorizes issue or PR as related to design. label Oct 25, 2018
@M00nF1sh
Author

M00nF1sh commented Oct 25, 2018

The major source of confusion is that the manager exposes both GetClient and GetCache (both of which provide the Reader interface), which naturally made me think that one provides uncached behavior and the other provides cached behavior.
If we could merge them, it would be less confusing.

Also, I think loading configuration from ConfigMaps before the controller loops start is a pretty common use case. Maybe we can make WaitForCacheSync happen inside manager.New (or another explicit method call) instead of manager.Start? (I think it's not ideal for users to manually initialize a Kubernetes client just to fulfill this use case.)

@DirectXMan12
Contributor

@droot thoughts here?

Yeah, I definitely see how it's a common use case. We can probably make the naming a bit clearer, and introducing an explicit GetRemoteOnlyClient helper (terrible name, just a placeholder) might help too.

@DirectXMan12
Contributor

DirectXMan12 commented Oct 25, 2018

The problem with starting the caches in manager.New is that it makes it impossible to safely add indexes (at least the way we do it now, IIRC), which can be super useful for cleaning up bits of code. It's a bit of an advanced use case, but nice to have.
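
For context on the indexing point: registering a field index has to happen before the cache's informers are running, which is why starting the cache inside manager.New would get in the way. A minimal sketch, using the pre-context IndexField signature of that era (newer releases take a context.Context and a client.Object), with an example field name:

	package example

	import (
		corev1 "k8s.io/api/core/v1"
		"k8s.io/apimachinery/pkg/runtime"
		"sigs.k8s.io/controller-runtime/pkg/manager"
	)

	// registerPodNodeIndex adds a field index on spec.nodeName so that cached
	// List calls can filter Pods by node. Indexes must be registered before
	// the cache's informers start, i.e. before mgr.Start.
	func registerPodNodeIndex(mgr manager.Manager) error {
		return mgr.GetFieldIndexer().IndexField(&corev1.Pod{}, "spec.nodeName", func(o runtime.Object) []string {
			pod := o.(*corev1.Pod)
			return []string{pod.Spec.NodeName}
		})
	}

Cached List calls can then filter Pods on that field; the exact ListOption helper for doing so has also changed names across releases.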

@M00nF1sh
Author

M00nF1sh commented Oct 25, 2018

Another tricky solution I can think of is to make manager.GetClient read directly at first and switch to the delegating cache implementation once the cache is synced 😸 (maybe via a flag in the delegating client). Then all use cases are satisfied, transparently to users.
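
A minimal sketch of that idea, purely illustrative rather than existing controller-runtime code: a reader that serves from the live client until the cache reports synced, then flips to the cache. It shows only Get (List would delegate the same way) and again uses the older Reader signature:

	package example

	import (
		"context"
		"sync/atomic"

		"k8s.io/apimachinery/pkg/runtime"
		"sigs.k8s.io/controller-runtime/pkg/client"
	)

	// switchingReader reads from the API server until markSynced is called,
	// then switches to the cache-backed reader. This is the "flag in the
	// delegating client" idea from the comment above, not library code.
	type switchingReader struct {
		synced int32 // 0 = cache not yet synced, 1 = synced
		cache  client.Reader
		live   client.Reader
	}

	// markSynced would be called once WaitForCacheSync returns.
	func (r *switchingReader) markSynced() { atomic.StoreInt32(&r.synced, 1) }

	func (r *switchingReader) Get(ctx context.Context, key client.ObjectKey, obj runtime.Object) error {
		if atomic.LoadInt32(&r.synced) == 1 {
			return r.cache.Get(ctx, key, obj)
		}
		return r.live.Get(ctx, key, obj)
	}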

@DirectXMan12
Contributor

So, the other "nice" thing that the reader does is that it auto-adds new watches if you request a new object. This might be too magic, but so far it seems to have worked out well. Your suggestion, while not necessarily bad in the absence of this logic, would cause the behavior of the client to become inconsistent when paired with the auto-add.

@M00nF1sh
Author

M00nF1sh commented Oct 25, 2018

I just checked the auto-add implementation, and I don't think the behavior of the client would be inconsistent.

The auto-add behavior isn't necessary for Get/List operations from the user's perspective (it's an implementation detail needed when Get/List is served by the cache).

And when users need watch behavior, they have to use cache.GetInformers, which sets up auto-add correctly. 😸

@shawn-hurley

This might be a really bad idea, but what if, before starting the controllers, we use a direct client, and then during Start we switch to a cached client?

That allows for the setup phase and then does the right thing for the controllers.

@M00nF1sh
Author

M00nF1sh commented Oct 25, 2018

Happy to contribute if we go for the solution of switching the read client 😸.

But the problem of both manager.GetClient and manager.GetCache exposing a reader still exists. I'm a big fan of the Zen of Python: "There should be one -- and preferably only one -- obvious way to do it." 😄

@shawn-hurley

I am going to take the other side of the argument here. I think that exposing the cache helps advanced users, and making the client "smart", with a default of using the cache on reads, makes sense 90% of the time. This reduces the cognitive overhead for new users and helps them get the right outcome.

What about something like the status writer: a client.NoCache().Get(...)?
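
For illustration only, the hypothetical accessor being floated might look like the interface sketch below. Nothing named NoCache existed in controller-runtime at the time; the analogy is simply to how Status() returns a dedicated writer:

	package example

	import "sigs.k8s.io/controller-runtime/pkg/client"

	// Client is a hypothetical extension of the controller-runtime Client
	// interface (sketch only; never added under this name).
	type Client interface {
		client.Client

		// NoCache returns a Reader that always talks to the API server,
		// bypassing the informer cache, analogous to how Status() returns
		// a StatusWriter.
		NoCache() client.Reader
	}

Usage would then read as something like mgr.GetClient().NoCache().Get(ctx, key, &cm) for the occasional direct read.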

@droot
Contributor

droot commented Oct 25, 2018

The major source of confusion is that the manager exposes both GetClient and GetCache (both of which provide the Reader interface), which naturally made me think that one provides uncached behavior and the other provides cached behavior.

This is good feedback. Now that I read it, it is confusing.

For our next major revision, we can probably try to do the following:

  • Remove the GetCache API from the manager
  • Introduce a NoCache method on the Client interface (as @shawn-hurley suggested above)

@shawn-hurley

@droot Where would I get these methods (https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/cache/cache.go#L51-L55) if we remove the GetCache() interface?

Maybe a GetInformer() method could be added and an Informer interface exposed?

@nkvoll

nkvoll commented Dec 11, 2018

So, the other "nice" thing that the reader does is that it auto-adds new watches if you request a new object. This might be too magic, but so far it seems to have worked out well

FWIW, coming in as a (relatively new) user of controller-runtime, this is something that caught me off-guard: if I Get any object, a list/watch will be set up behind the scenes for me.

Doesn't this have implications if there are a lot of objects (e.g. Pods) and I only want to:

  1. Get a single one: I end up watching every Pod (in all namespaces, unless I namespace the controller?)
  2. Watch a subset (e.g. only Pods labelled foo: bar): I end up watching as above (see the example below for a possible start of a workaround)

If an informer has been auto-added, there is also no way for me to indicate that it is no longer needed, so my operator will possibly keep watching events for resources that are no longer relevant.

Perhaps it's just me, but I'm currently struggling a little to find out how to set up a watch with a selector, and I'm not sure how it would work out with the current cache behavior (e.g. only one informer per GVK, so there seems to be a strict 1-to-1 mapping between a GVK and an informer, unless I create my own ListWatch, informer, and cache reader manually, meaning I have to be careful to manage reads correctly between the default client and my own cache reader).

See example code (not complete) for the manual lifting required:

	// Assumed imports and aliases, inferred from the snippet:
	//   informersCoreV1 "k8s.io/client-go/informers/core/v1"
	//   cache           "k8s.io/client-go/tools/cache"
	//   v1              "k8s.io/apimachinery/pkg/apis/meta/v1"
	//   plus kubernetes, labels, source, handler, manager, and time.
	// `mgr` is the manager and `c` is the controller this code configures.

	// Build a plain client-go clientset from the manager's rest.Config.
	clientSet, err := kubernetes.NewForConfig(mgr.GetConfig())
	if err != nil {
		return err
	}

	// Create a Secret informer restricted to one namespace and a label
	// selector, bypassing the manager's cache entirely.
	informer := informersCoreV1.NewFilteredSecretInformer(
		clientSet,
		"default",
		10*time.Hour,
		cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc},
		func(options *v1.ListOptions) {
			options.LabelSelector = labels.Set{"foo": "bar"}.String()
		},
	)

	// Feed events from the hand-rolled informer into the controller's queue.
	if err := c.Watch(
		&source.Informer{
			Informer: informer,
		},
		&handler.EnqueueRequestForObject{},
	); err != nil {
		return err
	}

	// Run the informer as part of the manager's lifecycle.
	if err := mgr.Add(manager.RunnableFunc(func(stopCh <-chan struct{}) error {
		informer.Run(stopCh)
		return nil
	})); err != nil {
		return err
	}

	// TODO: not sure how to actually go from the above Informer to a cache.Cache,
	// i.e. make it usable through the controller-runtime client.Reader interface.
In this scenario I cannot use mgr.GetClient() to do any reads of Secrets, because that would set up a watch on all Secrets, so I have to fall back to using the good ol' ClientSet?

What I (as an operator developer) would like to be able to do (beyond the current patterns of setting up namespace-wide watches and auto-adding watches for all resources of a given GVK when getting):

  • Get + watch for single objects, and a way to stop watching. (Use case: the user can provide a reference to a named resource I should use when reconciling, e.g. a ConfigMap containing some config, but I don't want to watch all ConfigMaps in the namespace.)
  • List + watch for objects with filters (field selectors, label selectors), and a way to stop watching. (Use case: multiple controllers not all having to watch all the resources of each watched kind.)
  • Get without watching (no caching); see the sketch after this list. (Use case: one-off gets of a resource, e.g. where the gets are rare and unlikely to be repeated, or where updates through watches would cause more load than sporadic gets.)
  • List without watching (with or without selectors, no caching). (Use case: same as above, but for selected resources.)
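
As a footnote to the last two bullets: later controller-runtime releases (well after this thread) expose a non-caching reader directly on the manager, which covers the get/list-without-watching asks. A minimal sketch assuming such a release; the namespace, name, and label values are placeholders:

	package example

	import (
		"context"

		corev1 "k8s.io/api/core/v1"
		"sigs.k8s.io/controller-runtime/pkg/client"
		"sigs.k8s.io/controller-runtime/pkg/manager"
	)

	// readWithoutWatching reads straight from the API server via the manager's
	// API reader and never starts an informer or watch.
	func readWithoutWatching(ctx context.Context, mgr manager.Manager) error {
		apiReader := mgr.GetAPIReader()

		var cm corev1.ConfigMap
		if err := apiReader.Get(ctx, client.ObjectKey{Namespace: "default", Name: "my-config"}, &cm); err != nil {
			return err
		}

		var secrets corev1.SecretList
		return apiReader.List(ctx, &secrets, client.InNamespace("default"), client.MatchingLabels{"foo": "bar"})
	}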

Apologies if this is a bit long and more suitable to its own issue(s), but it felt very relevant to the caching behavior.

EDIT: I've taken this comment and tried splitting some of the functionality asks into separate issues (#244, #245, #246, #247), but they are all related to this issue in one way or another, and I'm not sure how the main maintainers would like this to be handled.

shawn-hurley pushed a commit to shawn-hurley/controller-runtime that referenced this issue Jan 2, 2019
Adding a new interface `APIClient` to expose a non-caching reader interface as part of the client.

fixes kubernetes-sigs#180
fixes kubernetes-sigs#245
@DirectXMan12
Contributor

I think your concern is relevant here. This is one of the things that made me a bit nervous initially, but it is a bit of a corner case, and generally the auto-caching follows the behavior we encourage in Kubernetes (always cache, except in really rare cases). Let's deal with the cases in the issues you've opened up -- I think splitting the use cases out was a good idea.

@DirectXMan12
Contributor

(And yes, there's currently no good way to limit or filter watches. It's something we need to solve.)

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 28, 2019
jcrossley3 added a commit to openshift-knative/knative-serving-operator that referenced this issue May 12, 2019
This also introduces the idea of configuring the resources *before*
they're applied, which in hindsight makes more sense, i.e. making the
first configmap the controllers see the correct one.

We may want to do the same for the config in the Install CR, but
because I'm still unsure whether having the config in there at all is
a good idea, I'm keeping it post-install, mostly because the logging
is set up to show previous values of whatever it overwrites. Doing
that pre-install would require manifestival enhancements.

Another thing pre-install config solves is swimming upstream against
the controller-runtime caching client, which occasionally fails to
find newly-created resources as the informer doesn't yet know about
them.

See kubernetes-sigs/controller-runtime#180
for more info.
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 28, 2019
@DirectXMan12
Contributor

Closing this master issue in favor of the sub-issues.
