Multiple cluster affinity groups not working as expected #4990
Comments
I think this is a bug. See karmada/pkg/scheduler/scheduler.go, line 530 at commit d676996:
affinityIndex does not always start from zero. Should the design be that the workload always stays on the backup cluster and only moves to the primary cluster when there is a problem with the backup, or should it transfer back to the old (primary) cluster once the primary recovers? Or should the user be allowed to choose how this is handled?
IMO, primary is primary for a reason, and backup is usually intended to hold the workload temporarily until the primary recovers. OTOH, I can see an option, e.g. ...
Hi @vicaya, as you describe, this is the expected behavior. When multiple cluster groups are scheduled, if the current group is not suitable, the next group is enabled and there is no fallback. How about trying this:

```yaml
placement:
  clusterAffinities:
    - affinityName: primary
      clusterNames:
        - c0
    - affinityName: backup
      clusterNames:
        - c0
        - c1
```
Hi @XiShanYongYe-Chang ...
Hi @dominicqi, you can try the rebalance feature. It will be released in v1.10, the day after tomorrow. |
Are you talking about #4840? Are you saying that with the same config as above, the rebalancer will move the workload back to primary? I also tried using staticWeightList along with maxGroups: 1 to make sure all the replicas stay in the primary cluster with the higher weight, but that doesn't work after failover either. I hope the rebalancer would make this work as well, at the expense of verbosity and clarity. The primary/backup scenario is such a common use case that it would be great if multiple cluster affinity groups worked out of the box as intended.
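For context, the staticWeightList / maxGroups: 1 attempt mentioned above might look roughly like the sketch below. This is only an assumed illustration, not the reporter's actual policy: the cluster names c0/c1 come from the thread, while the policy name, the httpbin Deployment selector, and the specific weights are hypothetical.

```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: httpbin-weighted        # hypothetical name
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: httpbin             # the simple test workload mentioned in the report
  placement:
    clusterAffinity:
      clusterNames:
        - c0
        - c1
    spreadConstraints:
      - spreadByField: cluster
        maxGroups: 1            # keep all replicas in a single cluster
        minGroups: 1
    replicaScheduling:
      replicaSchedulingType: Divided
      replicaDivisionPreference: Weighted
      weightPreference:
        staticWeightList:
          - targetCluster:
              clusterNames:
                - c0
            weight: 2           # assumed: primary gets the higher weight
          - targetCluster:
              clusterNames:
                - c1
            weight: 1
```

The intent of maxGroups: 1 here is to pin all replicas to a single cluster, with the static weights steering that choice toward c0; per the comment above, this still did not bring the workload back to the primary after failover.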
What happened:
According to https://karmada.io/docs/userguide/scheduling/resource-propagating/#multiple-cluster-affinity-groups ,
there are 2 potential use cases: 1. local bursts to cloud; 2. primary failover to backup. I tested use case 2 with the following policy and a simple httpbin workload:
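The original policy is not preserved in this excerpt. A minimal sketch of a primary-failover-to-backup policy of this shape, assuming the c0/c1 cluster names used elsewhere in the thread and a hypothetical httpbin Deployment, could be:

```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: httpbin-primary-backup   # hypothetical name
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: httpbin
  placement:
    clusterAffinities:
      - affinityName: primary    # tried first
        clusterNames:
          - c0
      - affinityName: backup     # used once the primary group becomes unschedulable
        clusterNames:
          - c1
```

As discussed in the comments above, the scheduler walks the affinity groups in order and does not fall back to an earlier group once a later one has been selected, which is the behavior this issue is about.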
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
See the above steps to reproduce the problem. It's as minimal as you can get.
Anything else we need to know?:
Please provide a working example policy for the primary-failover-to-backup use case. Make sure the workload moves back to primary from backup when the primary is ready again.
Environment:
kubectl-karmada or karmadactl version (the result of kubectl-karmada version or karmadactl version): 1.9.1