What would you like to be added:
It would be nice if the ResourceQuota plugin could account for ResourceQuota limits when estimating the maximum replica count.
Why is this needed:
Estimation currently ignores limits because pb.ReplicaRequirements only supports setting resource requests. This can cause problems if there are burstable resources on the destination namespace. Recently I hit an edge case where the ResourceQuota on my destination namespace had enough remaining requests, but not enough remaining limits, to schedule an additional pod. This caused the FlinkDeployment to fail:
mszacillo@control-plane-1:~$ k get resourcequota -n workspace
NAME AGE REQUEST LIMIT
pods-free 57d requests.cpu: 28800m/30, requests.memory: 86354Mi/180Gi limits.cpu: 29100m/30, limits.memory: 86682Mi/180Gi
Scheduling an additional TaskManager pod requires 1 CPU and 4096m of memory. This fits within the ResourceQuota's requests (28800m + 1000m = 29800m, under the 30 CPU cap) but not within its limits (29100m + 1000m = 30100m, over the 30 CPU cap), so pod creation is rejected:
Could not create pod mszacillo-karmada-taskmanager-2-4, exception: java.util.concurrent.CompletionException: org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/workspace/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "mszacillo-karmada-taskmanager-2-4" is forbidden: exceeded quota: pods-free, requested: limits.cpu=1, used: limits.cpu=29100m, limited: limits.cpu=30.
Potential Options:
One option is to extend the pb.ReplicaRequirements API to include resource limits, and then check limits as well as requests during the ResourceQuota plugin estimation. Another option is to check the resource's requests against the available limits on the relevant ResourceQuota when calculating maxReplicas.