Allow Google accelerators (i.e. GPUs) on workers #161

dghubble · 2018-03-11T20:21:38Z

Warning: This does not magically make GPUs work on Container Linux or Kubernetes. It simply allows advanced users to begin experimenting with them. As the comments imply, this feature is unofficial, undocumented, and unsupported, and it may be changed or removed at any time.

Caveats:

Requires changes to Google Cloud default quotas
Requires terraform-provider-google 1.6.0 or higher to handle a count of "0" GPUs properly
Requires compiling your own kernel modules on Container Linux. (It's possible; I've done it. There are just lots of rough edges.)
Some instances will never be created, because no GPU model is uniformly available across zones and Typhoon automatically randomizes workers across the zones within a region. For now, we just have to fiddle with the count until GCP learns to create an instance only in a zone where it can actually be created.
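
To make the last caveat concrete, here is a minimal sketch of why some workers never come up. It is not part of this change. The zone names and the availability table are hypothetical placeholders, and uniform random placement across zones is an assumption made for illustration; check the real per-zone GPU availability for your region before relying on any numbers.

```python
from typing import Dict, List, Set


def zones_supporting(gpu: str, availability: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted zones whose accelerator set includes `gpu`."""
    return sorted(zone for zone, gpus in availability.items() if gpu in gpus)


def expected_stuck_workers(
    count: int, gpu: str, availability: Dict[str, Set[str]]
) -> float:
    """Expected number of workers landing in a zone that lacks `gpu`.

    Assumes each worker is placed uniformly at random across all zones
    in the table (an illustrative assumption, not a documented guarantee).
    """
    if not availability:
        raise ValueError("availability must list at least one zone")
    usable = len(zones_supporting(gpu, availability))
    return count * (1 - usable / len(availability))


# Hypothetical availability table: zone names and contents are made up.
EXAMPLE_AVAILABILITY: Dict[str, Set[str]] = {
    "example-region-a": {"nvidia-tesla-k80"},
    "example-region-b": {"nvidia-tesla-k80", "nvidia-tesla-p100"},
    "example-region-c": set(),
    "example-region-f": {"nvidia-tesla-p100"},
}

if __name__ == "__main__":
    gpu = "nvidia-tesla-p100"
    print(zones_supporting(gpu, EXAMPLE_AVAILABILITY))
    print(expected_stuck_workers(4, gpu, EXAMPLE_AVAILABILITY))
```

With this made-up table, only two of the four zones offer the P100, so about half of the requested workers would be expected to stay un-created. That is why the worker count has to be set higher than the number of GPU nodes actually wanted.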

dghubble · 2018-03-12T01:00:43Z

Mon Mar 12 01:00:04 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.25                 Driver Version: 390.25                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    29W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

dghubble added the platform/google-cloud label Mar 11, 2018

Allow Google accelerators (i.e. GPUs) on workers

2592a0a

dghubble force-pushed the google-accelerators branch from a1653b5 to 2592a0a March 12, 2018 00:21

dghubble merged commit 2592a0a into master Mar 12, 2018

dghubble deleted the google-accelerators branch March 12, 2018 06:59
