[BUG] Elastic Quota Management not working as expected #2078

Open
taraszka opened this issue May 31, 2024 · 2 comments
Labels
area/koord-manager, area/koord-scheduler, kind/bug, kind/question

Comments

taraszka commented May 31, 2024

What happened:

I've created a parent quota and two child quotas:

❯ cat eqs-test1.yaml eqs-pod1.yaml eqs-pod2.yaml
apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: test1-quota
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/parent: ""
    quota.scheduling.koordinator.sh/is-parent: "true"
spec:
  max:
    cpu: 500m
    memory: 1Gi
  min:
    cpu: 10m
    memory: 1Gi
---
apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: pod1-quota
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/parent: "test1-quota"
    quota.scheduling.koordinator.sh/is-parent: "false"
spec:
  max:
    cpu: 500m
    memory: 1Gi
  min:
    cpu: 1m
    memory: 128Mi
---
apiVersion: scheduling.sigs.k8s.io/v1alpha1
kind: ElasticQuota
metadata:
  name: pod2-quota
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/parent: "test1-quota"
    quota.scheduling.koordinator.sh/is-parent: "false"
spec:
  max:
    cpu: 500m
    memory: 1Gi
  min:
    cpu: 1m
    memory: 128Mi
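
A quick way to confirm all three quotas were admitted is to list the CRD objects; the output below is illustrative rather than captured from this cluster:

❯ kubectl get elasticquota -n test1
NAME          AGE
test1-quota   108s
pod1-quota    105s
pod2-quota    102s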

I also created two test pods:

❯ cat pod1-test1.yaml pod2-test1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod1-test1
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/name: "pod1-quota"
spec:
  schedulerName: koord-scheduler
  containers:
  - command:
    - sleep
    - 365d
    image: ubuntu
    imagePullPolicy: IfNotPresent
    name: curlimage
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 512Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
  name: pod2-test1
  namespace: test1
  labels:
    quota.scheduling.koordinator.sh/name: "pod2-quota"
spec:
  schedulerName: koord-scheduler
  containers:
  - command:
    - sleep
    - 365d
    image: ubuntu
    imagePullPolicy: IfNotPresent
    name: curlimage
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 512Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  restartPolicy: Always

On both pods, I ran cpuburn to see how the quota behaves. Each pod used its full 500m.

What you expected to happen:

I expected both pods to share the parent quota's 500m, so that when running at the same time each would use 250m instead of 500m (100% utilization of the parent quota's max, divided between the two child quotas).

How to reproduce it (as minimally and precisely as possible):

Apply the above yaml files and run cpuburn on both pods. Observe htop on both.
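
Concretely, the steps look like the following sketch. cpuburn is not part of the stock ubuntu image, so a `yes` busy-loop is used here as a stand-in (an assumption, not the exact command from the report):

❯ kubectl apply -f eqs-test1.yaml -f eqs-pod1.yaml -f eqs-pod2.yaml
❯ kubectl apply -f pod1-test1.yaml -f pod2-test1.yaml
# burn one core's worth of cpu in each pod (the busy-loop keeps running after exec returns)
❯ kubectl exec -n test1 pod1-test1 -- sh -c 'yes > /dev/null &'
❯ kubectl exec -n test1 pod2-test1 -- sh -c 'yes > /dev/null &'
# watch usage via htop on the node, or (if metrics-server is installed):
❯ kubectl top pod -n test1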

Anything else we need to know?:

Environment:

  • App version: 1.4.1
  • Kubernetes version (use kubectl version): v1.28.7
  • Install details (e.g. helm install args): default
  • Node environment (for koordlet/runtime-proxy issue):
    • Containerd/Docker version: v1.7.11
    • OS version: Rocky Linux release 9.4 (Blue Onyx)
    • Kernel version: 5.14.0-362.24.1.el9_3.0.1.x86_64
    • Cgroup driver: systemd
  • Others:
taraszka added the kind/bug label May 31, 2024
saintube added the area/koord-scheduler, area/koord-manager, kind/question labels Jun 4, 2024
saintube (Member) commented Jun 4, 2024

@taraszka Do you mean that the ElasticQuota should cap the actual cpu usage of a running pod so that it does not exceed the max quota? I'm afraid not, since the quota takes effect at pod creation and scheduling time, not at runtime. The real usage of a scheduled pod is limited via Linux cgroups, which correspond to the pod's cpu limit.
/cc @shaloulcy @ZiMengSheng
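
For illustration, on a cgroup v2 node a 500m cpu limit materializes as a cgroup quota of 50000µs per 100000µs period, independent of any ElasticQuota. The path and values below are what one would expect for these pods, not output captured from this cluster:

❯ kubectl exec -n test1 pod1-test1 -- cat /sys/fs/cgroup/cpu.max
50000 100000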

taraszka (Author) commented Jun 4, 2024


I see.
