About this fork

Fork of the GCP nvidia-gpu-device-plugin.

This fork advertises multiple fake GPUs for each real GPU, which allows a single GPU to be shared between multiple pods through the Kubernetes device plugin API.
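The duplication idea can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `fakeDevice` type, the `duplicateDevices` function, and the `<real ID>-<index>` naming scheme are all assumptions made for the example, and the real plugin works with the Kubernetes device plugin API types instead.

```go
package main

import "fmt"

// fakeDevice is a hypothetical stand-in for the device entries a device
// plugin reports to the kubelet. It is not a type from this repository.
type fakeDevice struct {
	ID     string // ID advertised to the kubelet (naming scheme is assumed)
	RealID string // real GPU that backs this fake device
}

// duplicateDevices returns `factor` fake devices for each real GPU ID.
// A factor below 1 yields no devices.
func duplicateDevices(realIDs []string, factor int) []fakeDevice {
	if factor < 1 {
		return nil
	}
	devices := make([]fakeDevice, 0, len(realIDs)*factor)
	for _, id := range realIDs {
		for i := 0; i < factor; i++ {
			devices = append(devices, fakeDevice{
				ID:     fmt.Sprintf("%s-%d", id, i),
				RealID: id,
			})
		}
	}
	return devices
}

func main() {
	// One real GPU advertised as three fake devices.
	for _, d := range duplicateDevices([]string{"nvidia0"}, 3) {
		fmt.Println(d.ID, "->", d.RealID)
	}
}
```

Every fake device keeps a reference to its real GPU, so an allocation of any fake ID can be resolved back to the physical device.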

The goal is to schedule pods on GPUs until the GPU memory is full (GPU memory bin-packing).
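To make the bin-packing goal concrete, here is a small sketch of first-fit packing on a single GPU. It only illustrates the intent stated above; it is not how the plugin or the Kubernetes scheduler is implemented, and the `packPods` function and the memory figures are hypothetical.

```go
package main

import "fmt"

// packPods walks the pod memory requests in order and admits each pod that
// still fits in the remaining GPU memory (first-fit on one GPU).
// It returns the indices of the admitted pods and the memory left over.
func packPods(capacityMiB int, requestsMiB []int) (admitted []int, freeMiB int) {
	freeMiB = capacityMiB
	for i, r := range requestsMiB {
		if r < 0 {
			continue // ignore invalid requests
		}
		if r <= freeMiB {
			admitted = append(admitted, i)
			freeMiB -= r
		}
	}
	return admitted, freeMiB
}

func main() {
	// Hypothetical 16000 MiB GPU and four pods with assumed memory needs.
	admitted, free := packPods(16000, []int{6000, 6000, 6000, 4000})
	fmt.Println("admitted pods:", admitted, "free MiB:", free)
}
```

In this example the third pod does not fit once the first two are placed, while the fourth pod fills the remaining memory exactly.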

See here for more details (notably how to use it on GKE).

See also deepomatic/shared-gpu-nvidia-k8s-device-plugin for the same feature based on NVIDIA's own Kubernetes GPU Device Plugin.

Limits

This is a substantial workaround for the current situation, and it has several drawbacks. The Kubernetes scheduler doesn't know how the underlying real GPUs are shared between the deepomatic.com/shared-gpu resources it allocates among Pods:

There is no way to control or guarantee how pods are spread among real GPUs. The current workaround is to limit each node to one real GPU and to schedule indirectly via other resources such as memory (assuming that memory usage correlates with GPU usage or GPU memory usage).
With multiple real GPUs per node, requesting multiple shared GPUs for one Pod doesn't make sense, as there is no guarantee that the Pod will be allocated shared GPUs from different real GPUs.

Roadmap

For proper scheduling, this device plugin will advertise SharedGPUMemory as a Kubernetes Extended Resource. Since the SharedGPUMemory resource is defined at the Node level (instead of at the Device level), this effectively supports only one GPU per node.

Configuration

You can control the number of fake GPU devices declared by this device plugin for each real GPU by adding -gpu-duplication-factor N to the container command in the daemonset definition (default: 100).
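As an illustration of how such a flag could be read, the sketch below parses -gpu-duplication-factor with Go's standard `flag` package. The flag name and its default of 100 come from the text above; the `parseDuplicationFactor` function, the flag set name, and the check that the value is at least 1 are assumptions for this example and may not match the plugin's actual code.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseDuplicationFactor reads -gpu-duplication-factor from args.
// The lower bound of 1 is an assumption made for this sketch.
func parseDuplicationFactor(args []string) (int, error) {
	fs := flag.NewFlagSet("nvidia_gpu", flag.ContinueOnError)
	factor := fs.Int("gpu-duplication-factor", 100,
		"number of fake GPU devices advertised per real GPU")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	if *factor < 1 {
		return 0, fmt.Errorf("gpu-duplication-factor must be at least 1, got %d", *factor)
	}
	return *factor, nil
}

func main() {
	factor, err := parseDuplicationFactor(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("duplication factor:", factor)
}
```

Running the program with `-gpu-duplication-factor 8` would report a factor of 8, and running it with no arguments falls back to the default.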

Hardware Accelerators in GKE

This repository is a collection of installation recipes and integration utilities for consuming Hardware Accelerators in Google Kubernetes Engine.

This is not an official Google product.

More details on the nvidia-gpu-device-plugin are here.

