Improve the user experience of Failover #5150

XiShanYongYe-Chang · 2024-07-06T11:29:13Z

What would you like to be added:

Improve the user experience of Failover

Why is this needed:

The Failover and GracefulEviction features are currently in the Beta phase, which means they are enabled by default.

There is a scenario where users propagate configuration resources by directly specifying the cluster names. When a cluster is disconnected from the Karmada control plane for several hours, it is identified as NotReady. Once the cluster recovers, the configuration resources on that cluster are deleted unexpectedly. If this occurs in a production environment, it could lead to serious consequences.

Therefore, we need to optimize the Failover feature for this scenario to provide users with a more stable and reliable experience.


whitewindmills · 2024-07-08T01:54:17Z

Are you saying that resources on unhealthy member clusters are deleted because they are migrated to other member clusters? If so, how can we improve it?

XiShanYongYe-Chang · 2024-07-08T02:08:04Z

Are you saying that resources on unhealthy member clusters are deleted because they are migrated to other member clusters?

That's it. One thing to note: the resource is not migrated to another cluster; it is only moved out of the failed cluster.

If so, how can we improve it?

I hope to hear everyone's opinion.

whitewindmills · 2024-07-08T02:29:28Z

One thing to note: the resource is not migrated to another cluster; it is only moved out of the failed cluster.

Let me guess how that happened: was there no suitable cluster?

XiShanYongYe-Chang · 2024-07-08T02:36:13Z

In other words, the target clusters are listed explicitly in the policy, and the configuration resources are propagated to exactly those clusters.

whitewindmills · 2024-07-08T02:44:32Z

Does it look like this? If cluster foo becomes unhealthy, the configuration stays stuck in that cluster until the cluster becomes healthy again, and then it is deleted. Am I right?

apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: test-pp
  namespace: default
spec:
  resourceSelectors:
    - apiVersion: v1
      kind: ConfigMap
      name: conf
  placement:
    clusterAffinity:
      clusterNames:
      - foo
      - bar
  ...

XiShanYongYe-Chang · 2024-07-08T03:21:42Z

Yes, you are right.

whitewindmills · 2024-07-08T06:07:57Z

Yes, this is a noteworthy case: we would prefer not to delete resources when there is no new cluster to migrate them to.
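A minimal sketch of the check suggested above, written as standalone Go. This is not Karmada's implementation: `hasReplacement` and its parameters are hypothetical names invented for illustration, and the sketch assumes the scheduler could know the policy's candidate clusters, the clusters already hosting the resource, and each cluster's health.

```go
package main

import "fmt"

// hasReplacement is a hypothetical helper (not a Karmada API).
// It reports whether some healthy candidate cluster, other than the
// failed one and not already hosting the resource, could take it over.
func hasReplacement(failed string, candidates, scheduled []string, healthy map[string]bool) bool {
	// Record the clusters that already host the resource.
	inUse := make(map[string]bool, len(scheduled))
	for _, c := range scheduled {
		inUse[c] = true
	}
	for _, c := range candidates {
		if c != failed && healthy[c] && !inUse[c] {
			return true
		}
	}
	return false
}

func main() {
	healthy := map[string]bool{"foo": false, "bar": true, "baz": true}

	// Case from this thread: clusterNames lists foo and bar, and both
	// already host the ConfigMap, so there is no new cluster to move to.
	fmt.Println(hasReplacement("foo", []string{"foo", "bar"}, []string{"foo", "bar"}, healthy))

	// Hypothetical variant: baz is also a candidate and is still unused.
	fmt.Println(hasReplacement("foo", []string{"foo", "bar", "baz"}, []string{"foo", "bar"}, healthy))
}
```

Under this assumption, eviction from the failed cluster would be skipped whenever `hasReplacement` returns false, so the resource on foo would be left in place after the cluster recovers.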

XiShanYongYe-Chang added the kind/feature label on Jul 6, 2024.
