[BUG] Elastic Quota Management not working as expected #2078

Open
taraszka opened this issue May 31, 2024 · 2 comments
Labels
area/koord-manager, area/koord-scheduler, kind/bug, kind/question

Comments

taraszka commented May 31, 2024

What happened:

I've created a parent quota and two child quotas:

❯ cat eqs-test1.yaml eqs-pod1.yaml eqs-pod2.yaml
apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: test1-quota
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/parent: ""
    quota.scheduling.koordinator.sh/is-parent: "true"
spec:
  max:
    cpu: 500m
    memory: 1Gi
  min:
    cpu: 10m
    memory: 1Gi
---
apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: pod1-quota
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/parent: "test1-quota"
    quota.scheduling.koordinator.sh/is-parent: "false"
spec:
  max:
    cpu: 500m
    memory: 1Gi
  min:
    cpu: 1m
    memory: 128Mi
---
apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: pod2-quota
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/parent: "test1-quota"
    quota.scheduling.koordinator.sh/is-parent: "false"
spec:
  max:
    cpu: 500m
    memory: 1Gi
  min:
    cpu: 1m
    memory: 128Mi
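
A quick way to confirm all three quotas were admitted is to list the CRD objects; the output below is illustrative rather than captured from this cluster:

❯ kubectl get elasticquota -n test1
NAME          AGE
test1-quota   108s
pod1-quota    105s
pod2-quota    102s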

I also created two test pods:

❯ cat pod1-test1.yaml pod2-test1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod1-test1
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/name: "pod1-quota"
spec:
  schedulerName: koord-scheduler
  containers:
  - command:
    - sleep
    - 365d
    image: ubuntu
    imagePullPolicy: IfNotPresent
    name: curlimage
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 512Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
  name: pod2-test1
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/name: "pod2-quota"
spec:
  schedulerName: koord-scheduler
  containers:
  - command:
    - sleep
    - 365d
    image: ubuntu
    imagePullPolicy: IfNotPresent
    name: curlimage
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 512Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  restartPolicy: Always

On both pods, I ran cpuburn to see how the quota behaves. Each pod used its full 500m.

What you expected to happen:

I expected both pods to share the parent quota's 500m, so that when running at the same time each would use 250m instead of 500m (100% utilization of the parent quota's max, divided between the two child quotas).

How to reproduce it (as minimally and precisely as possible):

Apply the above yaml files and run cpuburn on both pods. Observe htop on both.
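
Concretely, the steps look like the following sketch. cpuburn is not part of the stock ubuntu image, so a `yes` busy-loop is used here as a stand-in (an assumption, not the exact command from the report):

❯ kubectl apply -f eqs-test1.yaml -f eqs-pod1.yaml -f eqs-pod2.yaml
❯ kubectl apply -f pod1-test1.yaml -f pod2-test1.yaml
# burn one core's worth of cpu in each pod (the busy-loop keeps running after exec returns)
❯ kubectl exec -n test1 pod1-test1 -- sh -c 'yes > /dev/null &'
❯ kubectl exec -n test1 pod2-test1 -- sh -c 'yes > /dev/null &'
# watch usage via htop on the node, or (if metrics-server is installed):
❯ kubectl top pod -n test1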

Anything else we need to know?:

Environment:

  • App version: 1.4.1
  • Kubernetes version (use kubectl version): v1.28.7
  • Install details (e.g. helm install args): default
  • Node environment (for koordlet/runtime-proxy issue):
    • Containerd/Docker version: v1.7.11
    • OS version: Rocky Linux release 9.4 (Blue Onyx)
    • Kernel version: 5.14.0-362.24.1.el9_3.0.1.x86_64
    • Cgroup driver: systemd
  • Others:
taraszka added the kind/bug label May 31, 2024
saintube added the area/koord-scheduler, area/koord-manager, kind/question labels Jun 4, 2024
saintube (Member) commented Jun 4, 2024

@taraszka Do you mean that the ElasticQuota should cap the actual cpu usage of a running pod so that it does not exceed the max quota? I'm afraid not, since the quota takes effect at pod creation and scheduling time, not at runtime. The real usage of a scheduled pod is limited via Linux cgroups, which correspond to the pod's cpu limit.
/cc @shaloulcy @ZiMengSheng
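
For illustration, on a cgroup v2 node a 500m cpu limit materializes as a cgroup quota of 50000µs per 100000µs period, independent of any ElasticQuota. The path and values below are what one would expect for these pods, not output captured from this cluster:

❯ kubectl exec -n test1 pod1-test1 -- cat /sys/fs/cgroup/cpu.max
50000 100000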

taraszka (Author) commented Jun 4, 2024


I see.
