add userguide for workload-rebalancer
Signed-off-by: chaosi-zju <chaosi@zju.edu.cn>
chaosi-zju committed May 25, 2024
1 parent c76b569 commit e2df75e
Showing 3 changed files with 586 additions and 0 deletions.
263 changes: 263 additions & 0 deletions docs/tutorials/workload-rebalancer.md
@@ -0,0 +1,263 @@
---
title: Workload Rebalancer
---

## Objectives

In general, once the replicas of a workload have been scheduled, the scheduling result stays inert and the replica
distribution does not change. If, in some special scenario, you want to actively trigger a fresh rescheduling,
you can do so with Workload Rebalancer.

This guide shows how to use Workload Rebalancer to trigger a rescheduling.

## Prerequisites

### Karmada with multiple clusters has been installed

Run the command:

```shell
git clone https://github.com/karmada-io/karmada
cd karmada
hack/local-up-karmada.sh
export KUBECONFIG=~/.kube/karmada.config:~/.kube/members.config
```
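
Once the script finishes, it can help to confirm that the member clusters are registered and ready before continuing. The sketch below assumes that `kubectl --context karmada-apiserver get clusters` prints a header row containing a `READY` column; the helper name `count_ready_clusters` is made up for this guide and is not part of Karmada.

```bash
# Count clusters whose READY column is "True" in `kubectl get clusters`
# output read from stdin. The READY column is located from the header row,
# so the exact column order is not assumed.
# count_ready_clusters is an illustrative helper, not a Karmada command.
count_ready_clusters() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "READY") col = i; next }
    col && $col == "True" { n++ }
    END { print n + 0 }
  '
}

# Usage (requires the environment created above):
#   kubectl --context karmada-apiserver get clusters | count_ready_clusters
```

For this guide the printed number should be at least 2, since the tutorial uses the member1 and member2 clusters.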

> **Note:**
>
> Before starting this guide, you need at least three Kubernetes clusters: one for the Karmada control plane and two as member clusters.
> For convenience, we use the [hack/local-up-karmada.sh](https://karmada.io/docs/installation/#install-karmada-for-development-environment) script to quickly prepare these clusters.
>
> After the above command has run, the Karmada control plane is installed with multiple member clusters.

## Tutorial

### Step 1: create a Deployment

First, prepare a Deployment named `demo-deploy-1`. Create a new file `deployment.yaml` with the following content:

<details>
<summary>deployment.yaml</summary>

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deploy-1
  labels:
    app: test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-deploy-1
  template:
    metadata:
      labels:
        app: demo-deploy-1
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - image: nginx
          name: demo-deploy-1
          resources:
            limits:
              cpu: 10m
              memory: 10Mi
---
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: default-pp
spec:
  placement:
    clusterTolerations:
      - effect: NoExecute
        key: workload-rebalancer-test
        operator: Exists
        tolerationSeconds: 0
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        dynamicWeight: AvailableReplicas
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
```
</details>

Then run the following command to create those resources:

```bash
kubectl --context karmada-apiserver apply -f deployment.yaml
```

You can check whether this step succeeded as follows:

```bash
$ kubectl --context karmada-apiserver get deploy demo-deploy-1
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
demo-deploy-1   3/3     3            3           3m18s

$ kubectl --context member1 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-dv6xw   1/1     Running   0          3m18s
demo-deploy-1-784cd456bf-fgjn7   1/1     Running   0          3m18s

$ kubectl --context member2 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-856rf   1/1     Running   0          3m18s
```

Thus, 2 replicas were propagated to the member1 cluster and 1 replica to the member2 cluster.
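
Since the same per-cluster check recurs in later steps, a small helper can summarize the distribution. This is only a sketch: it assumes the default `kubectl get po` table layout with `STATUS` as the third column, and `count_running` is a name invented for this guide.

```bash
# Count the pods reported as Running in `kubectl get po` output read from
# stdin. Assumes the default table layout (STATUS is the third column).
# count_running is an illustrative helper, not part of kubectl or Karmada.
count_running() {
  awk 'NR > 1 && $3 == "Running" { n++ } END { print n + 0 }'
}

# Usage against the member clusters from this guide (requires a live setup):
#   for c in member1 member2; do
#     echo "$c: $(kubectl --context "$c" get po | count_running)"
#   done
```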

### Step 2: add a `NoExecute` taint to the member1 cluster to simulate cluster failover

* Run the following command to add a `NoExecute` taint to the member1 cluster:

```bash
$ karmadactl --karmada-context=karmada-apiserver taint clusters member1 workload-rebalancer-test:NoExecute
cluster/member1 tainted
```

The taint triggers a rescheduling due to cluster failover, and all replicas are propagated to the member2 cluster,
as you can see:

```bash
$ kubectl --context member1 get po
No resources found in default namespace.

$ kubectl --context member2 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-856rf   1/1     Running   0          5m27s
demo-deploy-1-784cd456bf-b5977   1/1     Running   0          35s
demo-deploy-1-784cd456bf-pqthv   1/1     Running   0          35s
```

* Run the following command to remove the `NoExecute` taint from the member1 cluster:

```bash
$ karmadactl --karmada-context=karmada-apiserver taint clusters member1 workload-rebalancer-test:NoExecute-
cluster/member1 untainted
```

Removing the taint does not change the replica distribution, because the scheduling result is inert:
all replicas stay in the member2 cluster.
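
To confirm that the taint was really added or removed, you can inspect the Cluster object on the control plane. The following is a minimal sketch, assuming the taints are stored in the Cluster's `spec.taints` field; `cluster_taints` is a hypothetical helper name used only here.

```bash
# Print the taints currently set on a member cluster as "<key>:<effect>"
# lines, read from the Cluster object's spec.taints field (assumption).
# cluster_taints is an illustrative helper name, not a Karmada command.
cluster_taints() {
  kubectl --context karmada-apiserver get cluster "$1" \
    -o jsonpath='{range .spec.taints[*]}{.key}:{.effect}{"\n"}{end}'
}

# Usage (requires the environment from this guide):
#   cluster_taints member1
```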

### Step 3: apply a WorkloadRebalancer to trigger rescheduling

To trigger rescheduling of the above resources, create a new file `workload-rebalancer.yaml`
with the following content:

```yaml
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  name: demo
spec:
  workloads:
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
```

Then run the following command to apply it:

```bash
kubectl --context karmada-apiserver apply -f workload-rebalancer.yaml
```

The output `workloadrebalancer.apps.karmada.io/demo created` indicates that the resource was created successfully.

### Step 4: check the status of WorkloadRebalancer

Run the following command:

```bash
$ kubectl --context karmada-apiserver get workloadrebalancer demo -o yaml
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  ...
  creationTimestamp: "2024-05-22T11:16:10Z"
  name: demo
  ...
spec:
  ...
status:
  finishTime: "2024-05-22T11:16:10Z"
  observedGeneration: 1
  observedWorkloads:
  - result: Successful
    workload:
      apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
```

You can observe the rescheduling result in the `status.observedWorkloads` field of `workloadrebalancer/demo`.
As you can see, `Deployment/demo-deploy-1` was rescheduled successfully.
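
When a WorkloadRebalancer lists several workloads, reading the full YAML becomes tedious. As a sketch, the per-workload results can be extracted with a `kubectl` JSONPath template over the `status.observedWorkloads` field shown above; `rebalancer_results` is a helper name invented for illustration.

```bash
# Print one "<kind>/<name>: <result>" line per entry in
# status.observedWorkloads of a WorkloadRebalancer.
# rebalancer_results is an illustrative helper name, not a Karmada command.
rebalancer_results() {
  kubectl --context karmada-apiserver get workloadrebalancer "$1" \
    -o jsonpath='{range .status.observedWorkloads[*]}{.workload.kind}/{.workload.name}: {.result}{"\n"}{end}'
}

# Usage (requires the environment from this guide):
#   rebalancer_results demo
```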

### Step 5: observe the real effect of WorkloadRebalancer

You can observe the actual replica distribution of `Deployment/demo-deploy-1`:

```bash
$ kubectl --context member1 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-82kt6   1/1     Running   0          89s
demo-deploy-1-784cd456bf-k9fhl   1/1     Running   0          89s

$ kubectl --context member2 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-856rf   1/1     Running   0          9m23s
```

As you can see, rescheduling happened: 2 replicas migrated back to the member1 cluster, while the 1 replica in the member2 cluster remained unchanged.

In addition, you can observe a schedule event emitted by `default-scheduler`, such as:

```bash
$ kubectl --context karmada-apiserver describe deployment demo-deploy-1
...
Events:
  Type    Reason                  Age                From                                Message
  ----    ------                  ----               ----                                -------
  ...
  Normal  ScheduleBindingSucceed  31s                default-scheduler                   Binding has been scheduled successfully. Result: {member2:2, member1:1}
  Normal  GetDependenciesSucceed  31s                dependencies-distributor            Get dependencies([]) succeed.
  Normal  SyncSucceed             31s                execution-controller                Successfully applied resource(default/demo-deploy-1) to cluster member1
  Normal  AggregateStatusSucceed  31s (x4 over 31s)  resource-binding-status-controller  Update resourceBinding(default/demo-deploy-1-deployment) with AggregatedStatus successfully.
  Normal  SyncSucceed             31s                execution-controller                Successfully applied resource(default/demo-deploy-1) to cluster member2
```

### Step 6: update and auto-clean WorkloadRebalancer

If you want the WorkloadRebalancer resource to be cleaned up automatically in the future, edit it and set the
`spec.ttlSecondsAfterFinished` field to `300`, for example:

```yaml
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  name: demo
spec:
  ttlSecondsAfterFinished: 300
  workloads:
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
```

After you apply this modification, the WorkloadRebalancer resource will be deleted automatically after 300 seconds.
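
If you prefer not to edit the manifest file, the same field can be set with a patch. The sketch below uses a JSON merge patch via `kubectl patch`; the helper name `set_rebalancer_ttl` and its argument order are assumptions made for this example.

```bash
# Set spec.ttlSecondsAfterFinished on an existing WorkloadRebalancer with a
# JSON merge patch, leaving spec.workloads untouched.
# set_rebalancer_ttl is an illustrative helper name, not a Karmada command.
# Arguments: $1 = WorkloadRebalancer name, $2 = TTL in seconds.
set_rebalancer_ttl() {
  kubectl --context karmada-apiserver patch workloadrebalancer "$1" \
    --type merge -p "{\"spec\":{\"ttlSecondsAfterFinished\":$2}}"
}

# Usage (requires the environment from this guide):
#   set_rebalancer_ttl demo 300
```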