H2O Driverless AI Helm - Error when deploying on ICP 3.1.1 / Kubernetes version 1.11.3 . Need Helm Chart Update #20

bmarolleau · 2019-01-17T10:55:55Z

Hello,
Deploying on ICP 3.1.1 fails with a pod scheduling error at helm install (no problem with ICP 2.x). The cause is that GPU management changed in the latest Kubernetes versions.
With ICP 3.1 and 3.1.1 (Kubernetes 1.11 and later), nvidia.com/gpu should be used instead of alpha.kubernetes.io/nvidia-gpu.
Here is a modified Helm chart that works in my environment.
The critical part is the requests/limits lines in the chart's templates/deployment.yaml file:

          resources:
            limits:
            {{- if and (eq (.Capabilities.KubeVersion.Major|int) 1) (lt (.Capabilities.KubeVersion.Minor|int) 11) }}
              alpha.kubernetes.io/nvidia-gpu: {{ .Values.resources.limits.gpu }}
            {{- else }}
              nvidia.com/gpu: {{ .Values.resources.limits.gpu }}
            {{- end }}
              memory: {{ .Values.resources.limits.memory }}
            requests:
            {{- if and (eq (.Capabilities.KubeVersion.Major|int) 1) (lt (.Capabilities.KubeVersion.Minor|int) 11) }}
              alpha.kubernetes.io/nvidia-gpu: {{ .Values.resources.requests.gpu }}
            {{- else }}
              nvidia.com/gpu: {{ .Values.resources.requests.gpu }}
            {{- end }}
              memory: {{ .Values.resources.requests.memory }}
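
For reference, the template above reads four values from the chart's values.yaml. A minimal sketch of the matching section could look like this; the key names come from the template, but the numbers are only placeholders I picked for illustration and not the chart's real defaults:

```yaml
# values.yaml (sketch): keys match the .Values.resources.* lookups in the template.
# All numbers below are illustrative assumptions, adjust them to your cluster.
resources:
  limits:
    gpu: 1          # rendered as nvidia.com/gpu on K8s >= 1.11, alpha.kubernetes.io/nvidia-gpu before
    memory: 64Gi
  requests:
    gpu: 1          # keep equal to limits.gpu
    memory: 64Gi
```

Note that for nvidia.com/gpu Kubernetes expects the GPU request and limit to be equal when both are set, so it is probably safest to keep the two gpu values in sync.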

Here is the modified file to be placed in the templates folder of the helm chart, as an example:
deployment.zip

CreatureDev · 2019-04-10T21:14:48Z

This issue will be fixed in the latest release

bmarolleau changed the title ~~Error when deploying on ICP 3.1.1 / Kubernetes version 1.11.3 . Need Helm Chart Update~~ DriverlessAI Helm - Error when deploying on ICP 3.1.1 / Kubernetes version 1.11.3 . Need Helm Chart Update Jan 17, 2019

bmarolleau changed the title ~~DriverlessAI Helm - Error when deploying on ICP 3.1.1 / Kubernetes version 1.11.3 . Need Helm Chart Update~~ H2O Driverless AI Helm - Error when deploying on ICP 3.1.1 / Kubernetes version 1.11.3 . Need Helm Chart Update Jan 17, 2019
