native retain replicas should look at observedObject's labels #5072

Open
a7i opened this issue Jun 20, 2024 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@a7i
Contributor

a7i commented Jun 20, 2024

I created a COP to add the retain-replicas label in the hope that the member cluster replicas will be retained. This label is not in the Karmada resource template.

apiVersion: policy.karmada.io/v1alpha1
kind: ClusterOverridePolicy
metadata:
  labels:
    role: retain-replicas
  name: retain-replicas
spec:
  overrideRules:
  - overriders:
      labelsOverrider:
      - operator: add
        value:
          resourcetemplate.karmada.io/retain-replicas: "true"
    targetCluster:
      clusterNames:
      - bar-baz
  resourceSelectors:
  - apiVersion: apps/v1
    kind: Deployment
    labelSelector:
      matchLabels:
        foo.org/duplicate: "true"

What happened:

The native retainWorkloadReplicas looks at the desiredObject's labels:

labels, _, err := unstructured.NestedStringMap(desired.Object, "metadata", "labels")
if err != nil {
    return nil, fmt.Errorf("failed to get metadata.label from desired.Object: %+v", err)
}
if label, labelExist := labels[util.RetainReplicasLabel]; labelExist && label == util.RetainReplicasValue {

What you expected to happen:

I expect it to look at the observedObject's labels.
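
For illustration, a minimal sketch of the expected check, assuming the retain function also receives the member cluster's current object (called observed here) alongside desired; this is only the shape of the change, not the actual implementation:

// Sketch: check the observed (member cluster) object's labels
// instead of the desired object's labels.
labels, _, err := unstructured.NestedStringMap(observed.Object, "metadata", "labels")
if err != nil {
    return nil, fmt.Errorf("failed to get metadata.labels from observed.Object: %+v", err)
}
if label, labelExist := labels[util.RetainReplicasLabel]; labelExist && label == util.RetainReplicasValue {
    // keep the replica count currently reported by the member cluster
}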

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Karmada version: 1.10.1
  • kubectl-karmada or karmadactl version (the result of kubectl-karmada version or karmadactl version):
  • Others:
a7i added the kind/bug label Jun 20, 2024
@XiShanYongYe-Chang
Member

XiShanYongYe-Chang commented Jun 21, 2024

Hi @a7i, thanks. I think you are right. Would you like to submit a PR?

I thought wrong.

@RainbowMango
Member

Oh, hold on guys.

The desired object should be the one after being overridden by the COP, which means it should have the label resourcetemplate.karmada.io/retain-replicas. Is there something wrong with the COP?

Can you help check whether the label is in the manifest of the relevant Work object?

@a7i
Contributor Author

a7i commented Jun 21, 2024

It is certainly in the Work object:

spec:
  suspend: false
  workload:
    manifests:
    - apiVersion: apps/v1
      kind: Deployment
      metadata:
        annotations:
          clusterpropagationpolicy.karmada.io/name: duplicate
          resourcebinding.karmada.io/name: gatekeeper-controller-manager-deployment
          resourcebinding.karmada.io/namespace: gatekeeper-system
          resourcetemplate.karmada.io/managed-annotations: clusterpropagationpolicy.karmada.io/name,resourcebinding.karmada.io/name,resourcebinding.karmada.io/namespace,resourcetemplate.karmada.io/managed-annotations,resourcetemplate.karmada.io/managed-labels,resourcetemplate.karmada.io/uid,work.karmada.io/conflict-resolution,work.karmada.io/name,work.karmada.io/namespace
          resourcetemplate.karmada.io/managed-labels: app.kubernetes.io/managed-by,app.kubernetes.io/name,clusterpropagationpolicy.karmada.io/name,clusterpropagationpolicy.karmada.io/permanent-id,foo.org/duplicate,karmada.io/managed,product,project,resourcebinding.karmada.io/permanent-id,resourcetemplate.karmada.io/retain-replicas,work.karmada.io/permanent-id
          resourcetemplate.karmada.io/uid: 0e9bd222-a216-4951-b731-652ab17d051e
          work.karmada.io/conflict-resolution: overwrite
          work.karmada.io/name: gatekeeper-controller-manager-75b56bc84c
          work.karmada.io/namespace: karmada-es-redacted-eks
        labels:
          clusterpropagationpolicy.karmada.io/name: duplicate
          clusterpropagationpolicy.karmada.io/permanent-id: 48d755a2-d5fe-499d-ab73-c64a35a9156b
          foo.org/duplicate: "true"                                     # redacted other internal labels
          karmada.io/managed: "true"
          resourcebinding.karmada.io/permanent-id: d3e7a12c-8db0-45a8-badd-d84ae657d92c
          resourcetemplate.karmada.io/retain-replicas: "true"
          work.karmada.io/permanent-id: 588239b5-90e3-496d-946e-247bf4d4fb1c

What's the significance of resourcetemplate.karmada.io/retain-replicas being in managed-labels?

@RainbowMango
Member

What's the significance of resourcetemplate.karmada.io/retain-replicas being in managed-labels?

It has nothing to do with the annotation resourcetemplate.karmada.io/managed-labels. What matters is the label resourcetemplate.karmada.io/retain-replicas: "true" in the manifest.

The desired object that retainWorkloadReplicas looks at comes exactly from the manifest.
That is to say, retainWorkloadReplicas should work, since the label resourcetemplate.karmada.io/retain-replicas: "true" exists in the desired object.

@RainbowMango
Member

I don't know why it isn't working on your side, but I just made a test with the Get Started example.

Step 1: Create a similar ClusterOverridePolicy based on the one in this issue description:

apiVersion: policy.karmada.io/v1alpha1
kind: ClusterOverridePolicy
metadata:
  labels:
    role: retain-replicas
  name: retain-replicas
spec:
  overrideRules:
  - overriders:
      labelsOverrider:
      - operator: add
        value:
          resourcetemplate.karmada.io/retain-replicas: "true"
    targetCluster:
      clusterNames:
      - member1
  resourceSelectors:
  - apiVersion: apps/v1
    kind: Deployment

Step 2: Update the replicas of the sample Deployment on member1 from 1 to 9.

I can see that the replicas aren't changed back to 1, which means the retention works:
[rainbowmango]# kubectl get deployments.apps 
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   9/9     9            9           3d23h

@a7i
Contributor Author

a7i commented Jun 25, 2024

@RainbowMango I believe you need to modify the replicas on the Karmada apiserver to reproduce my issue.

  • modify replicas on karmada-apiserver
  • observe that member1 replicas are not modified (this is failing)

I will try to get a reproducible setup this week.

@RainbowMango
Member

Yeah, I did another test that modified the replicas on the Karmada apiserver, and the retention still works.

  1. Deploy the sample PropagationPolicy and Deployment:
-bash-5.0# kubectl apply -f https://raw.githubusercontent.com/karmada-io/karmada/master/samples/nginx/propagationpolicy.yaml
propagationpolicy.policy.karmada.io/nginx-propagation created
-bash-5.0# kubectl apply -f https://raw.githubusercontent.com/karmada-io/karmada/master/samples/nginx/deployment.yaml
deployment.apps/nginx created
  2. Then apply the ClusterOverridePolicy as per #5072 (comment).
  3. By now, Karmada propagates two replicas and the member1 cluster gets 1 replica:
-bash-5.0# kubectl get deployments.apps 
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   2/2     2            2           83s
-bash-5.0# 
-bash-5.0# karmadactl get deployments
NAME    CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE   ADOPTION
nginx   member1   1/1     1            1           94s   Y
nginx   member2   1/1     1            1           94s   Y
  4. Modify the Deployment on the Karmada apiserver to scale replicas from 2 to 4:
-bash-5.0# kubectl scale deployment nginx --replicas 4
deployment.apps/nginx scaled
  5. You can see that the member2 cluster got one more replica, but the replica count on member1 didn't change:
-bash-5.0# karmadactl get deployments
NAME    CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE     ADOPTION
nginx   member2   2/2     2            2           4m22s   Y
nginx   member1   1/1     1            1           4m22s   Y

@Chase-Marino

Chase-Marino commented Jun 27, 2024

Hi @RainbowMango, here are some reproducible steps to create the retain issue we are seeing, using HPA.

In short:

  1. Create and propagate a Deployment and an HPA (scaling disabled, just a fixed replica count x).
  2. Create an OP for the Deployment to apply the retain label (nothing should change at this point); see the sketch after this list.
  3. Create an OP to update the HPA's fixed replica count to y.
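
For reference, a minimal sketch of what the step-2 OverridePolicy could look like; the resource name and namespace here are assumptions (chosen to match the reproduction further down), and the actual policies used are in the retain-hpa-test repo linked later in this thread:

apiVersion: policy.karmada.io/v1alpha1
kind: OverridePolicy
metadata:
  name: retain-replicas
  namespace: retain-test
spec:
  resourceSelectors:
  - apiVersion: apps/v1
    kind: Deployment
    name: sample-deployment
  overrideRules:
  - targetCluster:
      clusterNames:
      - member1
    overriders:
      labelsOverrider:
      - operator: add
        value:
          resourcetemplate.karmada.io/retain-replicas: "true"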

The expected behavior at this point is that the source cluster Deployment will have x replicas, as controlled by the original HPA, and the member cluster Deployment (with retain) will have y replicas, controlled by the member cluster HPA (which has been updated by the OP).

The observed behavior is that both the source and member cluster Deployments constantly flip between x and y replicas.

[Screenshot: replica counts over time; src on top, member1 on bottom; x=2, y=4.]

@RainbowMango
Member

Hi @Chase-Marino, I tried to reproduce it with the environment launched by hack/local-up-karmada.sh, but failed to reproduce it. My steps are as follows:

S1: Create a namespace named retain-test

[rainbowmango]# kubectl create namespace retain-test
namespace/retain-test created

S2: Apply the Deployment, HPA, and PropagationPolicy

[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/deployment.yaml
deployment.apps/sample-deployment created
[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/hpa.yaml
horizontalpodautoscaler.autoscaling/test-retain-replicas created
[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/pp.yaml
propagationpolicy.policy.karmada.io/test-retain-propagation created

At this point, I can see the HPA is up and running and has successfully scaled the deployment on member1 to 2 replicas:

[rainbowmango]# karmadactl get pods -n retain-test
NAME                                 CLUSTER   READY   STATUS    RESTARTS   AGE
sample-deployment-84d8c49fc8-br842   member1   1/1     Running   0          3m52s
sample-deployment-84d8c49fc8-dhcpl   member1   1/1     Running   0          3m37s

Note that the replica count of the Deployment on Karmada is still 1, because the HPA on Karmada doesn't work (the HPA controller is not enabled on the Karmada control plane):

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    resourcetemplate.karmada.io/retain-replicas: "true"
    role: retain-replicas
  name: sample-deployment
  namespace: retain-test
spec:
  replicas: 1
  # ...
status:
  availableReplicas: 2
  observedGeneration: 2
  readyReplicas: 2
  replicas: 2
  updatedReplicas: 2

S3: Apply the OPs

[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/retain-label-op.yaml
overridepolicy.policy.karmada.io/retain-replicas created
[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/hpa-replicas-op.yaml
overridepolicy.policy.karmada.io/retain-replicas configured

Now I can see the Deployment has been scaled to 4 replicas on the member cluster, as the min/max replicas of the HPA are fixed at 4.

[rainbowmango]# karmadactl get deployments -n retain-test 
NAME                CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE     ADOPTION
sample-deployment   member1   4/4     4            4           3h11m   Y

The observed behavior is that both the source and member cluster Deployments constantly flip between x and y replicas.

This does not happen on my side.

So, if I remember correctly, in your environment you use the host cluster's kube-apiserver as the Karmada API server; please confirm whether that is true.

In addition, can you help check the options of karmada-controller-manager, especially --controllers?
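
If it helps, a quick way to dump those flags (assuming a typical install where karmada-controller-manager runs as a Deployment in the karmada-system namespace of the host cluster):

kubectl -n karmada-system get deployment karmada-controller-manager -o jsonpath='{.spec.template.spec.containers[0].command}'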

@RainbowMango
Member

Maybe you can try to do it again with the environment launched by hack/local-up-karmada.sh.
