
How to use the allocation policy #1774

Closed
aq2013 opened this issue Jul 8, 2024 · 2 comments

Comments


aq2013 commented Jul 8, 2024

Describe the support request
We deployed the GPU device plugin with allocation-policy balanced and shared-dev-num 2. Each node has one Intel GPU Flex 140 card, so the cluster has 2 nodes with a Flex 140 card and each of those nodes exposes 2 gpu.intel.com/i915 resources.

Now we deployed 2 GPU applications, each requesting 1 gpu.intel.com/i915. As I understand it, when the allocation-policy is balanced ("balanced mode spreads workloads among GPU devices"), these 2 applications should be scheduled to 2 different nodes with GPU cards. But we found that both applications were scheduled to the same node.
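
For illustration, a minimal sketch of such a workload (the name and image are placeholders, not the actual manifests); each Pod requests one shared i915 slot:

    # Hypothetical sketch of one of the GPU applications; only the resource
    # request matters here (name and image are placeholders).
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: gpu-app-1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: gpu-app-1
      template:
        metadata:
          labels:
            app: gpu-app-1
        spec:
          containers:
          - name: app
            image: example.com/gpu-app:latest
            resources:
              limits:
                gpu.intel.com/i915: 1   # one shared GPU slot per Pod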

device-plugin:

Args:
      -shared-dev-num=2
      -enable-monitoring
      -allocation-policy=balanced
      -v=5

worknode-104:
[Screenshot: 2024-07-08 14:43:41]

worknode-105:
[Screenshot: 2024-07-08 14:44:31]

2 applications on one node:
[Screenshot: 2024-07-03 16:32:02]

System (please complete the following information if applicable):

  • OS version: [e.g. Ubuntu 22.04]
  • Kernel version: [e.g. Linux 5.15]
  • Device plugins version: [e.g. v0.29.0]
  • Hardware info: [e.g. Flex 140 gpu]

tkatila commented Jul 9, 2024

Hi @aq2013 and thanks for the issue.

Now we deployed 2 GPU applications, each requesting 1 gpu.intel.com/i915. As I understand it, when the allocation-policy is balanced ("balanced mode spreads workloads among GPU devices"), these 2 applications should be scheduled to 2 different nodes with GPU cards. But we found that both applications were scheduled to the same node.

The GPU plugin works at the node level, so it can only affect GPU selection among the GPUs under its control. Thus the "balanced" mode only applies within a node, not across the whole cluster. For example, when a user deploys two Pods and they happen to be scheduled to the same node, the GPU plugin will place one on GPU1 and the other on GPU2.

Depending on your goal, I can think of two ways to get where you'd want to be:

Also, the Flex 140 should have two GPUs per physical card. With shared-dev-num=2, there should be 4 i915 resources per node.
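
As a rough sketch, with both GPUs detected and shared-dev-num=2, the node's allocatable resources would be expected to show something like this (illustrative excerpt of kubectl get node <node> -o yaml, not actual output):

    status:
      allocatable:
        gpu.intel.com/i915: "4"   # 2 GPUs x shared-dev-num 2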

EDIT: GAS doesn't solve the problem either. Its "balancedResource" also works at node level.


aq2013 commented Aug 9, 2024

Thanks for the reply. We have now added pod anti-affinity rules to schedule the Pods to different nodes. I will close this issue.
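
For illustration, a minimal sketch of this kind of pod anti-affinity (the label name/value is a placeholder, not the actual manifests):

    # Hypothetical sketch: require Pods carrying the same app label to land
    # on different nodes (label value is a placeholder).
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: gpu-app
          topologyKey: kubernetes.io/hostname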

aq2013 closed this as completed Aug 9, 2024