diff --git a/solutions/kueue-admission-check/README.md b/solutions/kueue-admission-check/README.md index 321a0670f..13f976023 100644 --- a/solutions/kueue-admission-check/README.md +++ b/solutions/kueue-admission-check/README.md @@ -1,6 +1,8 @@ -# OCM Kueue Admission Check Controller +# Set up MultiKueue with the OCM Kueue Admission Check Controller -This script outlines the creation of an external [Kueue Admission Check Controller](https://kueue.sigs.k8s.io/docs/concepts/admission_check/) integrating OCM `Placement` results with [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/). The controller reads OCM `Placement` decisions and generates corresponding `MultiKueueConfig` and `MultiKueueCluster` resources, streamlining the setup of the [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) environment and enabling users to select clusters based on custom criteria. +This guide demonstrates how to use the external OCM [Kueue Admission Check Controller](https://kueue.sigs.k8s.io/docs/concepts/admission_check/), which integrates OCM `Placement` results with [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) for intelligent multi-cluster job scheduling. +The controller reads OCM `Placement` decisions and generates the corresponding `MultiKueueConfig` and `MultiKueueCluster` resources, streamlining the setup of the [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) environment and enabling users to select clusters based on custom criteria. +We'll walk through several user stories that showcase the flexibility of this integration. ## Background @@ -26,98 +28,129 @@ REF: [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/), [Admissi REF: [Setup a MultiKueue environment](https://kueue.sigs.k8s.io/docs/tasks/manage/setup_multikueue/#multikueue-specific-kubeconfig) -## Prerequsite +## Prerequisites -Set up the enviroment by running the following command: +1.
A Kubernetes environment with OCM installed on a hub cluster and at least three managed clusters. +2. [Kueue](https://kueue.sigs.k8s.io/docs/installation/) deployed across all clusters. +3. [Managed-serviceaccount](https://github.com/open-cluster-management-io/managed-serviceaccount), [cluster-permission](https://github.com/open-cluster-management-io/cluster-permission) and [resource-usage-collect-addon](https://github.com/open-cluster-management-io/addon-contrib/tree/main/resource-usage-collect-addon) installed on the managed clusters. +You can set up all of the above by running: ```bash ./setup-env.sh ``` +After that, you can verify your setup. -### User Stories +- Check the managed clusters. -#### Story 1 - -As an admin, I want to automate [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) configuration across both manager and worker clusters, so that I can streamline the setup process without manual intervention. - -#### Story 2 - -As an admin, I want to use OCM `Placement` results for scheduling, so that clusters with specific attributes, like those with the `nvidia-t4` GPU accelerator label, are automatically selected and converted into a [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) for targeted workload deployment. +```bash +kubectl get mcl +NAME HUB ACCEPTED MANAGED CLUSTER URLS JOINED AVAILABLE AGE +cluster1 true https://cluster1-control-plane:6443 True True 72s +cluster2 true https://cluster2-control-plane:6443 True True 72s +cluster3 true https://cluster3-control-plane:6443 True True 72s +``` +- Verify the installed addons. +```bash +kubectl get mca -A +NAMESPACE NAME AVAILABLE DEGRADED PROGRESSING +cluster1 managed-serviceaccount True False +cluster1 resource-usage-collect True False +cluster2 managed-serviceaccount True False +cluster2 resource-usage-collect True False +cluster3 managed-serviceaccount True False +cluster3 resource-usage-collect True False +``` +- Confirm Kueue is running on the clusters.
+```bash +kubectl get pods -n kueue-system --context kind-hub # Same for managed clusters. +NAME READY STATUS RESTARTS AGE +kueue-controller-manager-87bd7888b-f65rc 2/2 Running 11 (14m ago) 2d +``` -#### Story 3 +- On the hub cluster, check `ClusterProfiles`. +```bash +kubectl get clusterprofile -A +NAMESPACE NAME AGE +open-cluster-management cluster1 2d +open-cluster-management cluster2 2d +open-cluster-management cluster3 2d +``` +- The `ClusterProfile` status contains credentials that Kueue can use. +```bash +kubectl get clusterprofile -A -ojson | jq '.items[] | .metadata.name, .status.credentials[]' +"cluster1" +{ + "accessRef": { + "kind": "Secret", + "name": "kueue-admin-cluster1-kubeconfig", + "namespace": "kueue-system" + }, + "consumer": "kueue-admin" +} +"cluster2" +{ + "accessRef": { + "kind": "Secret", + "name": "kueue-admin-cluster2-kubeconfig", + "namespace": "kueue-system" + }, + "consumer": "kueue-admin" +} +"cluster3" +{ + .. +} +``` +- On the hub cluster, check that a secret holding the `kubeconfig` for each managed cluster has been created under the `kueue-system` namespace. +```bash +kubectl get secret -n kueue-system +NAME TYPE DATA AGE +kueue-admin-cluster1-kubeconfig Opaque 1 2d +kueue-admin-cluster2-kubeconfig Opaque 1 2d +kueue-admin-cluster3-kubeconfig Opaque 1 2d +kueue-webhook-server-cert Opaque 4 2d +``` -As an admin, I want to leverage OCM's `AddonPlacementScore` for dynamic workload scheduling, so that clusters with higher GPU scores, indicating clusters with more GPU resources, are selected and converted into a [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/), which automatically adjusts by adding or removing clusters as scores change. +## User Stories -## Example #### Story 1 -Below is an example of how you can use OCM Admission Check Controller to automate the Multikueue setup process, `MultiKueueConfigs` and `MultiKueueClusters` are generated dynamically based on OCM `Placement` decisions.
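The credential lookup above can also be scripted when wiring Kueue to many clusters. Below is a minimal offline sketch of the same `jq` filtering, run against a trimmed, illustrative sample of the `clusterprofile` JSON (the sample object and its values are assumptions, not real cluster output):

```shell
# Illustrative, trimmed stand-in for `kubectl get clusterprofile -A -ojson` output.
profiles='{"items":[{"metadata":{"name":"cluster1"},"status":{"credentials":[{"consumer":"kueue-admin","accessRef":{"kind":"Secret","name":"kueue-admin-cluster1-kubeconfig","namespace":"kueue-system"}}]}}]}'

# Keep only credentials minted for the "kueue-admin" consumer and print the
# Secret each one points at, as "<cluster>: <secret-name>".
echo "$profiles" | jq -r '.items[]
  | .metadata.name as $cluster
  | .status.credentials[]
  | select(.consumer == "kueue-admin")
  | "\($cluster): \(.accessRef.name)"'
# Prints: cluster1: kueue-admin-cluster1-kubeconfig
```

The same filter applied to live output would list one Secret per managed cluster, matching the `kubectl get secret -n kueue-system` listing above.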
+As an admin, I want to automate [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) configuration across multiple clusters, so that I can streamline the setup process without manual intervention. -```yaml -apiVersion: kueue.x-k8s.io/v1beta1 -kind: ResourceFlavor -metadata: - name: "default-flavor" ---- -apiVersion: kueue.x-k8s.io/v1beta1 -kind: ClusterQueue -metadata: - name: "cluster-queue" -spec: - namespaceSelector: {} # match all. - resourceGroups: - - coveredResources: ["cpu", "memory","nvidia.com/gpu"] - flavors: - - name: "default-flavor" - resources: - - name: "cpu" - nominalQuota: 9 - - name: "memory" - nominalQuota: 36Gi - - name: "nvidia.com/gpu" - nominalQuota: 3 - admissionChecks: - - multikueue - - ocm-multikueue ---- -apiVersion: kueue.x-k8s.io/v1beta1 -kind: LocalQueue -metadata: - namespace: "default" - name: "user-queue" -spec: - clusterQueue: "cluster-queue" ---- -apiVersion: kueue.x-k8s.io/v1beta1 -kind: AdmissionCheck -metadata: - name: multikueue -spec: - controllerName: kueue.x-k8s.io/multikueue - parameters: - apiGroup: kueue.x-k8s.io - kind: MultiKueueConfig # Automates the process of setting up `MultiKueueConfig` and `MultiKueueCluster`. - name: ocm-multikueue ---- -apiVersion: kueue.x-k8s.io/v1beta1 -kind: AdmissionCheck -metadata: - name: ocm-multikueue -spec: - controllerName: open-cluster-management.io/placement - parameters: - apiGroup: cluster.open-cluster-management.io - kind: Placement - name: placement-sample1 # An example placement to select clusters labeled with "nvidia-tesla-t4" GPU accelerator. +- With the help of the `ClusterProfile` API, we can easily set up the MultiKueue environment.
+```bash kubectl apply -f ./multikueue-setup-demo1.yaml ``` +- After that, check the status of the `MultiKueueClusters`, `AdmissionChecks` and `ClusterQueues`. +```bash +kubectl get multikueuecluster -A -ojson | jq '.items[] | .metadata.name, .status.conditions' +kubectl get admissionchecks -ojson | jq '.items[] | .metadata.name, .status.conditions' +kubectl get clusterqueues -ojson | jq '.items[] | .metadata.name, .status.conditions' +``` +- Deploy a job to MultiKueue. +```bash +kubectl create -f ./job-demo1.yaml +``` +- Check the workload on the managed clusters. +```bash +kubectl get workload --context kind-cluster1 +kubectl get workload --context kind-cluster2 +``` +#### Story 2 +As an admin, I want to use OCM `Placement` results for scheduling, so that clusters with specific attributes, like those with the `nvidia-t4` GPU accelerator label, are automatically selected and converted into a [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/) for targeted workload deployment. -`placememt-sample1` selects clusters with the `nvidia-tesla-t4` accelerator label. - +- You can manually add the accelerator labels to the clusters. +```bash +kubectl label managedcluster cluster2 accelerator=nvidia-tesla-t4 +kubectl label managedcluster cluster3 accelerator=nvidia-tesla-t4 +``` +The `Placement` in `placement-demo2-1.yaml` selects clusters with the `nvidia-tesla-t4` accelerator label. ```yaml apiVersion: cluster.open-cluster-management.io/v1beta1 kind: Placement metadata: - name: placement-sample1 + name: placement-demo2 namespace: kueue-system spec: clusterSets: @@ -132,10 +165,45 @@ spec: labelSelector: matchLabels: accelerator: nvidia-tesla-t4 - ``` +- Bind the cluster set to the Kueue namespace and verify the bindings. +```bash +clusteradm clusterset bind spoke --namespace kueue-system +clusteradm get clustersets +``` +- Apply the placement policy. +```bash +kubectl apply -f placement-demo2-1.yaml +``` +- Apply the MultiKueue setup configuration.
+```bash +kubectl apply -f ./multikueue-setup-demo2.yaml +``` +- Check the generated `MultiKueueConfig` and `MultiKueueClusters`. +```bash +kubectl get multikueueconfig +kubectl get multikueuecluster +``` +- After that, check the status of the `MultiKueueClusters`, `AdmissionChecks` and `ClusterQueues`. +```bash +kubectl get multikueuecluster -A -ojson | jq '.items[] | .metadata.name, .status.conditions' +kubectl get admissionchecks -ojson | jq '.items[] | .metadata.name, .status.conditions' +kubectl get clusterqueues -ojson | jq '.items[] | .metadata.name, .status.conditions' +``` +- Create a job requesting GPU resources and submit it to MultiKueue. +```bash +kubectl create -f ./job-demo2.yaml +``` +- Check the workload on the managed clusters. +```bash +kubectl get workload --context kind-cluster2 +kubectl get workload --context kind-cluster3 +``` +#### Story 3 -`placememt-sample2` selects clusters with the `nvidia-tesla-t4` accelerator label, and select one cluster with the highest GPU-score, indicating having more GPU resources. +As an admin, I want to leverage OCM's `AddonPlacementScore` for dynamic workload scheduling, so that clusters with higher GPU scores, indicating clusters with more GPU resources, are selected and converted into a [MultiKueue](https://kueue.sigs.k8s.io/docs/concepts/multikueue/), which automatically adjusts by adding or removing clusters as scores change. + +`placement-demo2-2` selects clusters with the `nvidia-tesla-t4` accelerator label, and picks the one cluster with the highest GPU score, i.e. the one with the most GPU resources available. ```yaml apiVersion: cluster.open-cluster-management.io/v1beta1 kind: Placement metadata: - name: placement-sample1 + name: placement-demo2 namespace: kueue-system spec: clusterSets: @@ -166,11 +234,27 @@ spec: resourceName: resource-usage-score scoreName: gpuAvailable weight: 1 - ``` - - - +- You can manually edit the GPU resources on the managed clusters' nodes for testing.
+```bash +kubectl edit-status node cluster2-control-plane --context kind-cluster2 +kubectl edit-status node cluster3-control-plane --context kind-cluster3 +``` +- Apply the changes in the `Placement` to update MultiKueue dynamically. +```bash +kubectl apply -f ./placement-demo2-2.yaml +``` +- Review the updates to the `MultiKueueConfig` and `MultiKueueClusters`. +```bash +kubectl get multikueueconfig +kubectl get multikueuecluster +``` +- Create a job for the updated MultiKueue and check the workload. +```bash +kubectl create -f ./job-demo2.yaml +kubectl get workload --context kind-cluster2 +kubectl get workload --context kind-cluster3 +``` ## Design Details @@ -193,7 +277,7 @@ spec: parameters: apiGroup: cluster.open-cluster-management.io kind: Placement # Placement is under kueue-system namespace. - name: placement-sample1 + name: placement-demo2 ``` ### Changes in the Configuration Process with OCM Admission Check Controller diff --git a/solutions/kueue-admission-check/bak/clusterpermission.yaml b/solutions/kueue-admission-check/bak/clusterpermission.yaml index fa2aaf888..be27abb6e 100644 --- a/solutions/kueue-admission-check/bak/clusterpermission.yaml +++ b/solutions/kueue-admission-check/bak/clusterpermission.yaml @@ -60,4 +60,4 @@ spec: subject: kind: ServiceAccount name: multikueue-sa - namespace: open-cluster-management-agent-addon \ No newline at end of file + namespace: open-cluster-management-agent-addon diff --git a/solutions/kueue-admission-check/bak/kueue-mwrs-0.7.1.yaml b/solutions/kueue-admission-check/bak/kueue-mwrs-0.7.1.yaml index b3ec732b3..834c766eb 100644 --- a/solutions/kueue-admission-check/bak/kueue-mwrs-0.7.1.yaml +++ b/solutions/kueue-admission-check/bak/kueue-mwrs-0.7.1.yaml @@ -12590,4 +12590,4 @@ spec: resources: - workloads - workloads/status - sideEffects: None \ No newline at end of file + sideEffects: None diff --git a/solutions/kueue-admission-check/bak/placement.yaml b/solutions/kueue-admission-check/bak/placement.yaml index
e530a4a30..63f6da576 100644 --- a/solutions/kueue-admission-check/bak/placement.yaml +++ b/solutions/kueue-admission-check/bak/placement.yaml @@ -11,4 +11,4 @@ spec: - key: cluster.open-cluster-management.io/unreachable operator: Exists - key: cluster.open-cluster-management.io/unavailable - operator: Exists \ No newline at end of file + operator: Exists diff --git a/solutions/kueue-admission-check/job-demo1.yaml b/solutions/kueue-admission-check/job-demo1.yaml new file mode 100644 index 000000000..598e81bed --- /dev/null +++ b/solutions/kueue-admission-check/job-demo1.yaml @@ -0,0 +1,25 @@ +apiVersion: batch/v1 +kind: Job +metadata: + generateName: demo1-job + namespace: default + labels: + kueue.x-k8s.io/queue-name: user-queue-demo1 +spec: + parallelism: 1 + completions: 1 + suspend: true + template: + spec: + containers: + - name: dummy-job + image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0 + args: ["30s"] + resources: + requests: + cpu: 1 + memory: "200Mi" + limits: + cpu: 1 + memory: "200Mi" + restartPolicy: Never diff --git a/solutions/kueue-admission-check/job-demo2.yaml b/solutions/kueue-admission-check/job-demo2.yaml new file mode 100644 index 000000000..b8c16b9c2 --- /dev/null +++ b/solutions/kueue-admission-check/job-demo2.yaml @@ -0,0 +1,27 @@ +apiVersion: batch/v1 +kind: Job +metadata: + generateName: demo2-job + namespace: default + labels: + kueue.x-k8s.io/queue-name: "user-queue-demo2" +spec: + parallelism: 1 + completions: 1 + suspend: true + template: + spec: + containers: + - name: dummy-job + image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0 + args: ["600s"] + resources: + requests: + cpu: 1 + memory: "200Mi" + nvidia.com/gpu: "1" + limits: + cpu: 1 + memory: "200Mi" + nvidia.com/gpu: "1" # This job requires one GPU. 
+ restartPolicy: Never diff --git a/solutions/kueue-admission-check/multikueue-setup-demo1.yaml b/solutions/kueue-admission-check/multikueue-setup-demo1.yaml new file mode 100644 index 000000000..3d4888c03 --- /dev/null +++ b/solutions/kueue-admission-check/multikueue-setup-demo1.yaml @@ -0,0 +1,71 @@ +apiVersion: kueue.x-k8s.io/v1beta1 +kind: ResourceFlavor +metadata: + name: "default-flavor-demo1" +--- +apiVersion: kueue.x-k8s.io/v1beta1 +kind: ClusterQueue +metadata: + name: "cluster-queue-demo1" +spec: + namespaceSelector: {} # match all. + resourceGroups: + - coveredResources: ["cpu", "memory"] + flavors: + - name: "default-flavor-demo1" + resources: + - name: "cpu" + nominalQuota: 9 + - name: "memory" + nominalQuota: 36Gi + admissionChecks: + - multikueue-demo1 +--- +apiVersion: kueue.x-k8s.io/v1beta1 +kind: LocalQueue +metadata: + namespace: "default" + name: "user-queue-demo1" +spec: + clusterQueue: "cluster-queue-demo1" +--- +apiVersion: kueue.x-k8s.io/v1beta1 +kind: AdmissionCheck +metadata: + name: multikueue-demo1 +spec: + controllerName: kueue.x-k8s.io/multikueue + parameters: + apiGroup: kueue.x-k8s.io + kind: MultiKueueConfig + name: multikueue-config-demo1 +--- +apiVersion: kueue.x-k8s.io/v1alpha1 +kind: MultiKueueConfig +metadata: + name: multikueue-config-demo1 +spec: + clusters: + - multikueue-demo1-cluster1 + - multikueue-demo1-cluster2 +--- +apiVersion: kueue.x-k8s.io/v1alpha1 +kind: MultiKueueCluster +metadata: + name: multikueue-demo1-cluster1 +spec: + kubeConfig: + locationType: Secret + location: kueue-admin-cluster1-kubeconfig + # a secret called "kueue-admin-cluster1-kubeconfig" should be created in the namespace the kueue + # controller manager runs into, holding the kubeConfig needed to connect to the + # worker cluster in the "kubeconfig" key; +--- +apiVersion: kueue.x-k8s.io/v1alpha1 +kind: MultiKueueCluster +metadata: + name: multikueue-demo1-cluster2 +spec: + kubeConfig: + locationType: Secret + location: 
kueue-admin-cluster2-kubeconfig diff --git a/solutions/kueue-admission-check/multikueue-setup-demo2.yaml b/solutions/kueue-admission-check/multikueue-setup-demo2.yaml index 84327ad7f..d6c2a19ff 100644 --- a/solutions/kueue-admission-check/multikueue-setup-demo2.yaml +++ b/solutions/kueue-admission-check/multikueue-setup-demo2.yaml @@ -40,7 +40,7 @@ spec: controllerName: kueue.x-k8s.io/multikueue parameters: apiGroup: kueue.x-k8s.io - kind: MultiKueueConfig + kind: MultiKueueConfig # Automates the process of setting up `MultiKueueConfig` and `MultiKueueCluster`. name: ocm-multikueue --- apiVersion: kueue.x-k8s.io/v1beta1 @@ -52,4 +52,4 @@ spec: parameters: apiGroup: cluster.open-cluster-management.io kind: Placement - name: placement-sample1 + name: placement-demo2 # An example placement to select clusters labeled with "nvidia-tesla-t4" GPU accelerator. diff --git a/solutions/kueue-admission-check/placement-sample1-1.yaml b/solutions/kueue-admission-check/placement-demo2-1.yaml similarity index 94% rename from solutions/kueue-admission-check/placement-sample1-1.yaml rename to solutions/kueue-admission-check/placement-demo2-1.yaml index b1e03d111..8d58c64a4 100644 --- a/solutions/kueue-admission-check/placement-sample1-1.yaml +++ b/solutions/kueue-admission-check/placement-demo2-1.yaml @@ -1,7 +1,7 @@ apiVersion: cluster.open-cluster-management.io/v1beta1 kind: Placement metadata: - name: placement-sample1 + name: placement-demo2 namespace: kueue-system spec: clusterSets: diff --git a/solutions/kueue-admission-check/placement-sample1-2.yaml b/solutions/kueue-admission-check/placement-demo2-2.yaml similarity index 96% rename from solutions/kueue-admission-check/placement-sample1-2.yaml rename to solutions/kueue-admission-check/placement-demo2-2.yaml index 7edea720b..4934613b2 100644 --- a/solutions/kueue-admission-check/placement-sample1-2.yaml +++ b/solutions/kueue-admission-check/placement-demo2-2.yaml @@ -1,7 +1,7 @@ apiVersion:
cluster.open-cluster-management.io/v1beta1 kind: Placement metadata: - name: placement-sample1 + name: placement-demo2 namespace: kueue-system spec: clusterSets: diff --git a/solutions/kueue-admission-check/setup-env.sh b/solutions/kueue-admission-check/setup-env.sh index 372a73e5f..1e7a7d437 100755 --- a/solutions/kueue-admission-check/setup-env.sh +++ b/solutions/kueue-admission-check/setup-env.sh @@ -20,9 +20,9 @@ c3ctx="kind-${c3}" #kind delete cluster --name ${c3} kind create cluster --name "${hub}" --image kindest/node:v1.29.0@sha256:eaa1450915475849a73a9227b8f201df25e55e268e5d619312131292e324d570 -kind create cluster --name "${c1}" --config=cluster-config.yaml -kind create cluster --name "${c2}" --config=cluster-config.yaml -kind create cluster --name "${c3}" --config=cluster-config.yaml +kind create cluster --name "${c1}" --image kindest/node:v1.29.0@sha256:eaa1450915475849a73a9227b8f201df25e55e268e5d619312131292e324d570 +kind create cluster --name "${c2}" --image kindest/node:v1.29.0@sha256:eaa1450915475849a73a9227b8f201df25e55e268e5d619312131292e324d570 +kind create cluster --name "${c3}" --image kindest/node:v1.29.0@sha256:eaa1450915475849a73a9227b8f201df25e55e268e5d619312131292e324d570 echo "Initialize the ocm hub cluster" @@ -73,7 +73,7 @@ kubectl create -f env/multicluster.x-k8s.io_authtokenrequests.yaml kubectl create -f env/multicluster.x-k8s.io_clusterprofiles.yaml echo "Install managed-serviceaccount" -cd /Users/zhe/Documents/OCM/managed-serviceaccount +cd /path/to/managed-serviceaccount # TODO: Replace here with your actual path. 
helm uninstall -n open-cluster-management-addon managed-serviceaccount || true helm install \ -n open-cluster-management-addon --create-namespace \ @@ -92,28 +92,23 @@ kubectl apply -f env/placement.yaml || true kubectl apply -f env/mg-sa-cma-0.6.0.yaml || true echo "Install cluster-permission" -cd /Users/zhe/Documents/OCM/cluster-permission-main +cd /path/to/OCM/cluster-permission # TODO: Replace here with your actual path. make install make deploy cd - echo "Install resource-usage-collect-addon" -cd /Users/zhe/Documents/OCM/addon-contrib/resource-usage-collect-addon +cd /path/to/addon-contrib/resource-usage-collect-addon # TODO: Replace here with your actual path. IMAGE_NAME=zheshen/resource-usage-collect-addon:latest make deploy cd - -echo "Enable multiqueue on the hub" +echo "Enable MultiKueue on the hub" kubectl patch deployment kueue-controller-manager -n kueue-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args", "value": ["--config=/controller_manager_config.yaml", "--zap-log-level=2", "--feature-gates=MultiKueue=true"]}]' echo "Setup queue on the spoke" kubectl apply -f env/single-clusterqueue-setup-mwrs.yaml -kubectl label managedcluster cluster2 accelerator=nvidia-tesla-t4 -kubectl label managedcluster cluster3 accelerator=nvidia-tesla-t4 echo "Setup credentials for clusterprofile" kubectl apply -f env/authtokenrequest-c1.yaml kubectl apply -f env/authtokenrequest-c2.yaml kubectl apply -f env/authtokenrequest-c3.yaml - -echo "kubectl edit-status node cluster2-control-plane --context ${c2ctx}" -echo "kubectl edit-status node cluster3-control-plane --context ${c3ctx}"
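The `kubectl patch` step in `setup-env.sh` replaces the kueue-controller-manager's argument list wholesale to turn on the `MultiKueue` feature gate. As a sanity check of that JSON-patch semantics, here is a self-contained sketch that applies the equivalent transformation with `jq` to a minimal, illustrative deployment-shaped object (not a real Deployment manifest):

```shell
# Illustrative stand-in for the kueue-controller-manager Deployment.
deploy='{"spec":{"template":{"spec":{"containers":[{"name":"manager","args":["--config=/controller_manager_config.yaml","--zap-log-level=2"]}]}}}}'

# The JSON patch replaces the whole args array; model that "replace" op with a
# jq assignment, then print the last arg to confirm the feature gate is set.
echo "$deploy" | jq -r '.spec.template.spec.containers[0].args
  = ["--config=/controller_manager_config.yaml","--zap-log-level=2","--feature-gates=MultiKueue=true"]
  | .spec.template.spec.containers[0].args[-1]'
# Prints: --feature-gates=MultiKueue=true
```

Because the patch replaces rather than appends, any args added to the live Deployment by other tooling would be dropped; keep the replacement list in sync with the deployed configuration.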