This repo is now archived. Go to https://github.com/IBM/power-device-plugin
Power Device Plugin to add protected devices into a non-privileged container. The Power Device Plugin uses the Kubernetes Device Plugin in order to add specific devices to the given Pod.
The device plugin only installs on the workers.
- To deploy the device plugin:
# kustomize build manifests | oc apply -f -
project.project.openshift.io/power-device-plugin created
serviceaccount/power-device-plugin created
clusterrolebinding.rbac.authorization.k8s.io/power-device-plugin created
daemonset.apps/power-device-plugin created
To debug the running plugin, you can use:
export GRPC_GO_LOG_VERBOSITY_LEVEL=99
export GRPC_GO_LOG_SEVERITY_LEVEL=info
Thse are commented out in the DaemonSet.
You can check the kubelet behavior using:
# journalctl -u kubelet
...
7446 handler.go:95] "Registered client" name="power-dev-plugin/dev"
wrapper[7446]: I1219 04:32:20.722778 7446 manager.go:230] "Device plugin connected" resourceName="power-dev-plugin/dev"
wrapper[7446]: I1219 04:32:20.723559 7446 client.go:93] "State pushed for device plugin" resource="power-dev-plugin/dev" re>
wrapper[7446]: I1219 04:32:20.726284 7446 manager.go:279] "Processed device updates for resource" resourceName="power-dev-p>
wrapper[7446]: I1219 04:32:27.293908 7446 setters.go:333] "Updated capacity for device plugin" plugin="power-dev-plugin/dev>
To debug the socket connection:
- Connect to the worker
ssh core@worker-0
- Change to Root
sudo -s
- Check the socket is live
nc -U /var/lib/kubelet/device-plugins/power-dev.csi.ibm.com-reg.sock
In practice, the plugin only used around 1m core, and 18mi memory as shown below:
[root@bastion ~]# oc adm top pods -n power-device-plugin
NAME CPU(cores) MEMORY(bytes)
power-dev-mutate-c7f8b4689-rfhhf 1m 18Mi
power-device-plugin-2fsdv 1m 20Mi
power-device-plugin-jwqmm 1m 20Mi
power-device-plugin-mkpbv 1m 21Mi
power-device-plugin-pp4vw 1m 19Mi
power-device-plugin-x9szk 1m 21Mi
power-device-plugin-zkxkz 1m 20Mi
- To deploy the sample:
kustomize build examples | oc apply -f -
The build includes multiple architectures: linux/amd64
, linux/ppc64le
, linux/s390x
.
The build uses the ubi9/ubi:9.4 image.
Jan 01 16:25:32 worker-0 kubenswrapper[36781]: E0117 16:25:32.788348 36781 client.go:90] "ListAndWatch ended unexpectedly for device plugin" err="rpc error: code = Unavailable desc = error reading from server: EOF" resource="power-dev-plugin/dev"
indicates a problem with the socket and you'll want to enable logging per Editing kubelet log level verbosity and gathering logs
Command is...
echo -e "[Service]\nEnvironment=\"KUBELET_LOG_LEVEL=8\"" > /etc/systemd/system/kubelet.service.d/30-logging.conf
You should see why it's failing there.
# oc describe nodes | grep power-dev-pl
power-dev-plugin/dev: 742
power-dev-plugin/dev: 742
power-dev-plugin/dev 0 0
power-dev-plugin/dev: 742
power-dev-plugin/dev: 742
power-dev-plugin/dev 1 1
power-dev-plugin/dev: 928
power-dev-plugin/dev: 928
power-dev-plugin/dev 0 0
power-dev-plugin/dev: 928
power-dev-plugin/dev: 928
power-dev-plugin/dev 0 0
power-dev-plugin/dev: 928
power-dev-plugin/dev: 928
power-dev-plugin/dev 0 0
power-dev-plugin/dev: 0
power-dev-plugin/dev: 0
power-dev-plugin/dev 0 0
power-dev-plugin/dev: 928
power-dev-plugin/dev: 928
power-dev-plugin/dev 0 0
- https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/pkg/deviceplugin/manager.go#L96
- https://github.com/kairen/simple-device-plugin/tree/master
- https://github.com/kubernetes/kubelet/tree/master/pkg/apis/deviceplugin/v1beta1
- https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#examples
- https://github.com/k8stopologyawareschedwg/sample-device-plugin/blob/main/pkg/deviceplugin/deviceplugin.go