
Manager.Add does not start Cache after Manager has already started #1673

Closed
jsanda opened this issue Sep 25, 2021 · 8 comments · Fixed by #1681
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@jsanda
Contributor

jsanda commented Sep 25, 2021

I am working on a multi-cluster operator project. We followed the changes described in https://github.com/kubernetes-sigs/controller-runtime/blob/master/designs/move-cluster-specific-code-out-of-manager.md, initializing a Cluster for each remote k8s cluster. It works great, so thanks for that :)

I would like to add (and remove) Clusters at runtime, after the Manager has already started. It doesn't look like that is possible today. I have been trying to get more familiar with the Manager code to get a better sense of what I would need to do to manage the Cluster on my own. It definitely looks non-trivial; maybe I am wrong, but it looks like I would have to reimplement a fair bit of logic from controllerManager.

This brings me to my question: are there any plans to support adding a Cluster to the Manager after it has been started? What are some of the caveats with doing this?
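For reference, here is roughly what our per-cluster setup looks like when everything is wired up before Start. This is only a minimal sketch of the pattern from the design doc; remoteCfg stands in for however the remote kubeconfig is actually obtained:

package main

import (
	"k8s.io/client-go/rest"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cluster"
)

func run(remoteCfg *rest.Config) error {
	// Manager for the local (primary) cluster.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
	if err != nil {
		return err
	}

	// Cluster for a remote kube-apiserver, as in the design doc.
	remoteCluster, err := cluster.New(remoteCfg)
	if err != nil {
		return err
	}

	// cluster.Cluster implements Runnable; when added before Start,
	// the Manager starts it (and its cache) with everything else.
	if err := mgr.Add(remoteCluster); err != nil {
		return err
	}

	return mgr.Start(ctrl.SetupSignalHandler())
}

Adding the Cluster before mgr.Start works today; this issue is about doing the same after Start.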

@alvaroaleman
Member

Hey @jsanda, actually adding a new cluster is something I'd have expected to already work 🤔 So it seems there is a bug.

Removing one isn't currently supported; we would have to make sure that we can properly stop everything without leaks, and think about how we want to add that to our API.
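To illustrate the stopping problem, here is a rough, purely hypothetical sketch of the kind of wrapper people reach for (the name stoppableCluster is made up). Note that cancelling does not remove anything from the Manager's internal bookkeeping, which is exactly the leak concern:

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/cluster"
)

// stoppableCluster is a hypothetical wrapper that derives its own
// cancellable context, so the wrapped Cluster can be stopped without
// stopping the Manager.
type stoppableCluster struct {
	cluster.Cluster
	cancel context.CancelFunc
}

// Start derives a child context and keeps its cancel func.
func (s *stoppableCluster) Start(ctx context.Context) error {
	ctx, s.cancel = context.WithCancel(ctx)
	return s.Cluster.Start(ctx)
}

// Stop cancels the child context. The Manager still holds a reference
// to the wrapper, so this only stops goroutines; it is not a removal.
func (s *stoppableCluster) Stop() {
	if s.cancel != nil {
		s.cancel()
	}
}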

@FillZpp
Contributor

FillZpp commented Sep 29, 2021

> actually adding a new cluster is something I'd have expected to already work

Although we can create a new cluster and add it to an existing manager, it is added as a plain Runnable, and there can be only one main cluster:

type controllerManager struct {
	// cluster holds a variety of methods to interact with a cluster. Required.
	cluster cluster.Cluster
}

It somehow triggers my obsession... Could we change it to a list and provide something like AddCluster(cfg) to add clusters after the manager has started? @alvaroaleman
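Something like this, purely as a hypothetical sketch (none of these fields or methods exist in controller-runtime today):

// Hypothetical sketch only -- controllerManager has no clusters slice
// or AddCluster method.
func (cm *controllerManager) AddCluster(cfg *rest.Config, opts ...cluster.Option) (cluster.Cluster, error) {
	c, err := cluster.New(cfg, opts...)
	if err != nil {
		return nil, err
	}
	cm.mu.Lock()
	defer cm.mu.Unlock()
	// A clusters slice would replace the single cluster field above.
	cm.clusters = append(cm.clusters, c)
	if cm.started {
		// If the Manager is already running, start the new cluster
		// immediately and surface errors like any other Runnable.
		go func() {
			if err := c.Start(cm.internalCtx); err != nil {
				cm.errChan <- err
			}
		}()
	}
	return c, nil
}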

@jsanda
Contributor Author

jsanda commented Sep 29, 2021

I guess the title is a bit misleading. I know that a Runnable can be added after the Manager is started, but my understanding is that it won't be started. I want to add a Cluster to a Manager after the Manager has already started and then have the Cluster started.

@jsanda jsanda changed the title Add cluster after Manager has already started Add and start cluster after Manager has already started Sep 29, 2021
@alvaroaleman
Member

> I want to add a Cluster to a Manager after the Manager has already started and then have the Cluster started.

Yeah that should work. It seems the issue is that we don't start the cache:

// Add the runnable to the leader election or the non-leaderelection list
if leRunnable, ok := r.(LeaderElectionRunnable); ok && !leRunnable.NeedLeaderElection() {
shouldStart = cm.started
cm.nonLeaderElectionRunnables = append(cm.nonLeaderElectionRunnables, r)
} else if hasCache, ok := r.(hasCache); ok {
	// Note: shouldStart is never set in this branch, so a cache added
	// after the Manager has started never gets started.
	cm.caches = append(cm.caches, hasCache)
} else {
shouldStart = cm.startedLeader
cm.leaderElectionRunnables = append(cm.leaderElectionRunnables, r)
}
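A hedged sketch of one possible shape of a fix, mirroring what the Manager already does for caches started during Start (the actual fix landed in #1681 and may differ):

} else if hasCache, ok := r.(hasCache); ok {
	cm.caches = append(cm.caches, hasCache)
	if cm.started {
		// Sketch: start the late-added cache and block until it
		// syncs, like the Manager does for caches added before Start.
		go func() {
			if err := hasCache.GetCache().Start(cm.internalCtx); err != nil {
				cm.errChan <- err
			}
		}()
		if !hasCache.GetCache().WaitForCacheSync(cm.internalCtx) {
			return fmt.Errorf("failed waiting for cache to sync")
		}
	}
}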

@FillZpp
Contributor

FillZpp commented Sep 29, 2021

> but my understanding is that it won't be started

@jsanda Actually it will be started. When a Cluster is added to a started Manager (and the Cluster does not require the resource lock), the Manager will start it immediately. https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/manager/internal.go#L221

BTW, it is also odd that whether the new cluster gets started is decided by the resource lock acquired for the previous cluster; they could be different K8s clusters.
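As an aside, a Runnable can opt out of that gate via the LeaderElectionRunnable interface; a hypothetical wrapper (nonLeaderCluster is made up here) would look like:

import "sigs.k8s.io/controller-runtime/pkg/cluster"

// nonLeaderCluster is a hypothetical wrapper that opts a remote
// Cluster out of the primary cluster's leader-election gate.
type nonLeaderCluster struct {
	cluster.Cluster
}

// NeedLeaderElection satisfies manager.LeaderElectionRunnable.
// Returning false tells the Manager to start this Runnable without
// waiting for the resource lock on the primary cluster.
func (nonLeaderCluster) NeedLeaderElection() bool { return false }

Note this would route the cluster through the non-leader-election branch quoted above rather than the caches list, so it only illustrates the gating, not a fix.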

@jsanda
Contributor Author

jsanda commented Sep 29, 2021

> Yeah that should work. It seems the issue is that we don't start the cache

That's what I observed from prior testing.

@jsanda
Contributor Author

jsanda commented Oct 1, 2021

@alvaroaleman I may have some questions along the way, but I would be happy to work on a PR for this.

@alvaroaleman
Member

@jsanda sure, feel free to take a stab

@jsanda jsanda changed the title Add and start cluster after Manager has already started Manager.Add does not start Cache after Manager has already started Oct 4, 2021