Enable dynamic GPU scheduling #79

Open
ksatzke opened this issue Jul 30, 2020 · 2 comments · May be fixed by #87

Assignees
ksatzke

Labels
design: The issue is related to the high-level architecture
env/kubernetes: To indicate something specific to Kubernetes setup of KNIX
feature_request: New feature request
help wanted: Extra attention is needed
in progress: This issue is already being fixed

Comments

ksatzke (Collaborator) commented Jul 30, 2020

Currently, when KNIX components are deployed via the Helm charts, their resource limits are fixed at deployment time, like so:

resources:
  limits:
    cpu: 1
    memory: 2Gi
  requests:
    cpu: 1
    memory: 1Gi

For each workflow deployment, GPU support should likewise be configurable at workflow deployment time. This would allow a workflow's requirement to run on GPUs instead of CPUs to be defined dynamically, and would let KNIX schedule the workflow onto a node that still has sufficient GPU cores available, like so:

resources:
  limits:
    cpu: 1
    memory: 2Gi
    nvidia.com/gpu: 1 # requesting 1 GPU
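For context on the scheduling side: Kubernetes treats nvidia.com/gpu as an extended resource advertised by the NVIDIA device plugin, so a pod carrying such a limit is only placed on a node that still has unallocated GPUs, and GPU limits cannot be overcommitted. A minimal sketch of how the limit could end up in a rendered sandbox Deployment is shown below; all names (wf-sandbox, knix/sandbox) are placeholders for illustration, not actual KNIX identifiers.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wf-sandbox            # placeholder name for a workflow sandbox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wf-sandbox
  template:
    metadata:
      labels:
        app: wf-sandbox
    spec:
      containers:
      - name: sandbox
        image: knix/sandbox   # placeholder image name
        resources:
          limits:
            cpu: 1
            memory: 2Gi
            nvidia.com/gpu: 1 # pod is only scheduled onto a node with a free GPU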
Concretely, this requires:
  • add the option to define GPU requirements per workflow to the GUI
  • store the workflow requirement limits together with the workflow data
  • extend the management service to evaluate and handle workflow requirement limits for GPUs and to handle GPU scheduling
  • add node labelling capabilities to KNIX (see the sketch after this list)
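For the node labelling and GPU scheduling items above, a possible sketch, assuming GPU nodes get a label such as knix-gpu=true (a placeholder key, not an existing KNIX convention, e.g. applied with kubectl label nodes <node> knix-gpu=true) and that the management service renders a matching nodeSelector into the sandbox pod spec whenever a workflow declares GPU requirements:

spec:
  nodeSelector:
    knix-gpu: "true"          # placeholder label applied by the proposed node labelling step
  containers:
  - name: sandbox             # placeholder container name
    resources:
      limits:
        nvidia.com/gpu: 1     # restricts placement to nodes with a free GPU

The nvidia.com/gpu limit alone already keeps the pod off nodes without free GPUs; the nodeSelector would additionally let KNIX steer GPU workflows to a labelled subset of nodes.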
ksatzke added the feature_request, help wanted, design, in progress, and env/kubernetes labels on Jul 30, 2020
ksatzke self-assigned this on Jul 30, 2020
iakkus (Member) commented Jul 30, 2020

These need to be done in the feature/GPU_support_extended branch, right?

ksatzke (Collaborator, Author) commented Jul 30, 2020

Right, if we can agree on the issue, we can do the implementation in this branch to extend KNIX GPU support.

ksatzke linked a pull request on Oct 12, 2020 that will close this issue