native retain replicas should look at observedObject's labels #5072

Open
a7i opened this issue Jun 20, 2024 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@a7i
Contributor

a7i commented Jun 20, 2024

I created a COP to add the retain-replicas label in the hope that the member cluster replicas will be retained. This label is not in the Karmada resource template.

apiVersion: policy.karmada.io/v1alpha1
kind: ClusterOverridePolicy
metadata:
  labels:
    role: retain-replicas
  name: retain-replicas
spec:
  overrideRules:
  - overriders:
      labelsOverrider:
      - operator: add
        value:
          resourcetemplate.karmada.io/retain-replicas: "true"
    targetCluster:
      clusterNames:
      - bar-baz
  resourceSelectors:
  - apiVersion: apps/v1
    kind: Deployment
    labelSelector:
      matchLabels:
        foo.org/duplicate: "true"

What happened:

The native retainWorkloadReplicas looks at the desiredObject's labels:

labels, _, err := unstructured.NestedStringMap(desired.Object, "metadata", "labels")
if err != nil {
    return nil, fmt.Errorf("failed to get metadata.label from desired.Object: %+v", err)
}
if label, labelExist := labels[util.RetainReplicasLabel]; labelExist && label == util.RetainReplicasValue {

What you expected to happen:

I expect it to look at the observedObject's labels.
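
For illustration, a minimal sketch of the expected check, assuming the retain function also receives the member cluster's current object (called observed here) alongside desired; this is only the shape of the change, not the actual implementation:

// Sketch: check the observed (member cluster) object's labels
// instead of the desired object's labels.
labels, _, err := unstructured.NestedStringMap(observed.Object, "metadata", "labels")
if err != nil {
    return nil, fmt.Errorf("failed to get metadata.labels from observed.Object: %+v", err)
}
if label, labelExist := labels[util.RetainReplicasLabel]; labelExist && label == util.RetainReplicasValue {
    // keep the replica count currently reported by the member cluster
}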

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Karmada version: 1.10.1
  • kubectl-karmada or karmadactl version (the result of kubectl-karmada version or karmadactl version):
  • Others:
a7i added the kind/bug label Jun 20, 2024
@XiShanYongYe-Chang
Member

XiShanYongYe-Chang commented Jun 21, 2024

Hi @a7i, thanks. I think you are right. Would you like to submit a PR?

I thought wrong.

@RainbowMango
Member

Oh, hold on guys.

The desired object should be the one after being overridden by the COP, which means it should have the label resourcetemplate.karmada.io/retain-replicas. Is there something wrong with the COP?

Can you help check whether the label is in the manifest of the relevant Work object?

@a7i
Contributor Author

a7i commented Jun 21, 2024

It is certainly in the Work object:

spec:
  suspend: false
  workload:
    manifests:
    - apiVersion: apps/v1
      kind: Deployment
      metadata:
        annotations:
          clusterpropagationpolicy.karmada.io/name: duplicate
          resourcebinding.karmada.io/name: gatekeeper-controller-manager-deployment
          resourcebinding.karmada.io/namespace: gatekeeper-system
          resourcetemplate.karmada.io/managed-annotations: clusterpropagationpolicy.karmada.io/name,resourcebinding.karmada.io/name,resourcebinding.karmada.io/namespace,resourcetemplate.karmada.io/managed-annotations,resourcetemplate.karmada.io/managed-labels,resourcetemplate.karmada.io/uid,work.karmada.io/conflict-resolution,work.karmada.io/name,work.karmada.io/namespace
          resourcetemplate.karmada.io/managed-labels: app.kubernetes.io/managed-by,app.kubernetes.io/name,clusterpropagationpolicy.karmada.io/name,clusterpropagationpolicy.karmada.io/permanent-id,foo.org/duplicate,karmada.io/managed,product,project,resourcebinding.karmada.io/permanent-id,resourcetemplate.karmada.io/retain-replicas,work.karmada.io/permanent-id
          resourcetemplate.karmada.io/uid: 0e9bd222-a216-4951-b731-652ab17d051e
          work.karmada.io/conflict-resolution: overwrite
          work.karmada.io/name: gatekeeper-controller-manager-75b56bc84c
          work.karmada.io/namespace: karmada-es-redacted-eks
        labels:
          clusterpropagationpolicy.karmada.io/name: duplicate
          clusterpropagationpolicy.karmada.io/permanent-id: 48d755a2-d5fe-499d-ab73-c64a35a9156b
          foo.org/duplicate: "true"                                     # redacted other internal labels
          karmada.io/managed: "true"
          resourcebinding.karmada.io/permanent-id: d3e7a12c-8db0-45a8-badd-d84ae657d92c
          resourcetemplate.karmada.io/retain-replicas: "true"
          work.karmada.io/permanent-id: 588239b5-90e3-496d-946e-247bf4d4fb1c

What's the significance of resourcetemplate.karmada.io/retain-replicas being in managed-labels?

@RainbowMango
Member

What's the significance of resourcetemplate.karmada.io/retain-replicas being in managed-labels?

It has nothing to do with the annotation resourcetemplate.karmada.io/managed-labels. What matters is the label resourcetemplate.karmada.io/retain-replicas: "true" in the manifest.

The desired object that retainWorkloadReplicas looks at comes exactly from the manifest.
That is to say, retainWorkloadReplicas should work, since the label resourcetemplate.karmada.io/retain-replicas: "true" exists in the desired object.

@RainbowMango
Member

I don't know why it isn't working on your side, but I just made a test with the Get Started example.

Step 1: Create a similar ClusterOverridePolicy based on the one in this issue description:

apiVersion: policy.karmada.io/v1alpha1
kind: ClusterOverridePolicy
metadata:
  labels:
    role: retain-replicas
  name: retain-replicas
spec:
  overrideRules:
  - overriders:
      labelsOverrider:
      - operator: add
        value:
          resourcetemplate.karmada.io/retain-replicas: "true"
    targetCluster:
      clusterNames:
      - member1
  resourceSelectors:
  - apiVersion: apps/v1
    kind: Deployment

Step 2: Update the replicas of the sample Deployment on member1 from 1 to 9.

I can see that the replicas aren't changed back to 1, which means the retention works:
[rainbowmango]# kubectl get deployments.apps 
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   9/9     9            9           3d23h

@a7i
Contributor Author

a7i commented Jun 25, 2024

@RainbowMango I believe you need to modify the replicas on the Karmada apiserver to reproduce my issue.

  • modify replicas on karmada-apiserver
  • observe that member1 replicas are not modified (this is failing)

I will try to get a reproducible setup this week.

@RainbowMango
Member

Yeah, I did another test that modified the replicas on the Karmada apiserver, and the retention still works.

  1. Deploy the sample PropagationPolicy and Deployment:
-bash-5.0# kubectl apply -f https://raw.githubusercontent.com/karmada-io/karmada/master/samples/nginx/propagationpolicy.yaml
propagationpolicy.policy.karmada.io/nginx-propagation created
-bash-5.0# kubectl apply -f https://raw.githubusercontent.com/karmada-io/karmada/master/samples/nginx/deployment.yaml
deployment.apps/nginx created
  2. Then apply the ClusterOverridePolicy as per #5072 (comment).
  3. By now, Karmada propagates two replicas and the member1 cluster gets 1 replica:
-bash-5.0# kubectl get deployments.apps 
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   2/2     2            2           83s
-bash-5.0# 
-bash-5.0# karmadactl get deployments
NAME    CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE   ADOPTION
nginx   member1   1/1     1            1           94s   Y
nginx   member2   1/1     1            1           94s   Y
  4. Modify the Deployment on the Karmada apiserver to scale replicas from 2 to 4:
-bash-5.0# kubectl scale deployment nginx --replicas 4
deployment.apps/nginx scaled
  5. You can see that the member2 cluster got one more replica, but the replica count on member1 didn't change:
-bash-5.0# karmadactl get deployments
NAME    CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE     ADOPTION
nginx   member2   2/2     2            2           4m22s   Y
nginx   member1   1/1     1            1           4m22s   Y

@Chase-Marino

Chase-Marino commented Jun 27, 2024

Hi @RainbowMango, here are some reproducible steps to create the retain issue we are seeing, using HPA.

In short:

  1. Create and propagate a Deployment and an HPA (scaling disabled, just a fixed replica count x).
  2. Create an OP for the Deployment to apply the retain label (nothing should change at this point); see the sketch after this list.
  3. Create an OP to update the HPA's fixed replica count to y.
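
For reference, a minimal sketch of what the step-2 OverridePolicy could look like; the resource name and namespace here are assumptions (chosen to match the reproduction further down), and the actual policies used are in the retain-hpa-test repo linked later in this thread:

apiVersion: policy.karmada.io/v1alpha1
kind: OverridePolicy
metadata:
  name: retain-replicas
  namespace: retain-test
spec:
  resourceSelectors:
  - apiVersion: apps/v1
    kind: Deployment
    name: sample-deployment
  overrideRules:
  - targetCluster:
      clusterNames:
      - member1
    overriders:
      labelsOverrider:
      - operator: add
        value:
          resourcetemplate.karmada.io/retain-replicas: "true"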

The expected behavior at this point is that the source cluster Deployment will have x replicas, as controlled by the original HPA, and the member cluster Deployment (with retain) will have y replicas, controlled by the member cluster HPA (which has been updated by the OP).

The observed behavior is that both the source and member cluster Deployments constantly flip between x and y replicas.

[Screenshot: replica counts over time; src on top, member1 on bottom; x=2, y=4.]

@RainbowMango
Member

Hi @Chase-Marino, I tried to reproduce it with the environment launched by hack/local-up-karmada.sh, but failed to reproduce it. My steps are as follows:

S1: Create a namespace named retain-test

[rainbowmango]# kubectl create namespace retain-test
namespace/retain-test created

S2: Apply the Deployment, HPA, and PropagationPolicy

[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/deployment.yaml
deployment.apps/sample-deployment created
[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/hpa.yaml
horizontalpodautoscaler.autoscaling/test-retain-replicas created
[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/pp.yaml
propagationpolicy.policy.karmada.io/test-retain-propagation created

At this point, I can see the HPA is up and running and has successfully scaled the deployment on member1 to 2 replicas:

[rainbowmango]# karmadactl get pods -n retain-test
NAME                                 CLUSTER   READY   STATUS    RESTARTS   AGE
sample-deployment-84d8c49fc8-br842   member1   1/1     Running   0          3m52s
sample-deployment-84d8c49fc8-dhcpl   member1   1/1     Running   0          3m37s

Note that the replica count of the Deployment on Karmada is still 1, because the HPA on Karmada doesn't work (the HPA controller is not enabled on the Karmada control plane):

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    resourcetemplate.karmada.io/retain-replicas: "true"
    role: retain-replicas
  name: sample-deployment
  namespace: retain-test
spec:
  replicas: 1
  # ...
status:
  availableReplicas: 2
  observedGeneration: 2
  readyReplicas: 2
  replicas: 2
  updatedReplicas: 2

S3: Apply the OPs

[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/retain-label-op.yaml
overridepolicy.policy.karmada.io/retain-replicas created
[rainbowmango]# kubectl apply -f https://raw.githubusercontent.com/Chase-Marino/retain-hpa-test/main/hpa-replicas-op.yaml
overridepolicy.policy.karmada.io/retain-replicas configured

Now I can see the Deployment has been scaled to 4 replicas on the member cluster, as the min/max replicas of the HPA are fixed at 4.

[rainbowmango]# karmadactl get deployments -n retain-test 
NAME                CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE     ADOPTION
sample-deployment   member1   4/4     4            4           3h11m   Y

The observed behavior is that both the source and member cluster Deployments constantly flip between x and y replicas.

This does not happen on my side.

So, if I remember correctly, in your environment you use the host cluster's kube-apiserver as the Karmada API server; please confirm whether that is true.

In addition, can you help check the options of karmada-controller-manager, especially --controllers?
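
If it helps, a quick way to dump those flags (assuming a typical install where karmada-controller-manager runs as a Deployment in the karmada-system namespace of the host cluster):

kubectl -n karmada-system get deployment karmada-controller-manager -o jsonpath='{.spec.template.spec.containers[0].command}'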

@RainbowMango
Member

Maybe you can try to do it again with the environment launched by hack/local-up-karmada.sh.
