[UX/Catalog] Add DEVICE_MEM info to GCP GPUs. #3375
Conversation
Thanks @concretevitamin!
'H100': 80 * 1024,
'P4': 8 * 1024,
'T4': 16 * 1024,
'V100': 16 * 1024,
Should we add P100 too?
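For reference, GCP's P100 offers 16 GB of device memory, so assuming the dict stores MiB (as the * 1024 multipliers above suggest), the added entry would presumably look like:
'P100': 16 * 1024,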
BTW, generating a catalog from this branch removed GCP P100 from my sky show-gpus P100 output:
# Master catalog
(base) ➜ ~ sky show-gpus P100
GPU   QTY  CLOUD  INSTANCE_TYPE       DEVICE_MEM  vCPUs  HOST_MEM  HOURLY_PRICE  HOURLY_SPOT_PRICE  REGION
P100  1    Azure  Standard_NC6s_v2    -           6      112GB     $ 2.070       $ 0.207            eastus
P100  2    Azure  Standard_NC12s_v2   -           12     224GB     $ 4.140       $ 0.414            eastus
P100  4    Azure  Standard_NC24rs_v2  -           24     448GB     $ 9.108       $ 0.911            eastus
P100  4    Azure  Standard_NC24s_v2   -           24     448GB     $ 8.280       $ 0.828            eastus
P100  1    GCP    n1-highmem-8        -           8      52GB      $ 1.933       $ 0.679            us-central1
P100  2    GCP    n1-highmem-16       -           16     104GB     $ 3.866       $ 1.357            us-central1
P100  4    GCP    n1-highmem-32       -           32     208GB     $ 7.733       $ 2.714            us-central1
P100  1    OCI    VM.GPU2.1           16GB        24     72GB      $ 1.275       -                  eu-amsterdam-1
P100  2    OCI    BM.GPU2.2           16GB        56     256GB     $ 2.550       -                  eu-amsterdam-1
# Catalog from this branch
(base) ➜ ~ sky show-gpus P100
GPU   QTY  CLOUD  INSTANCE_TYPE       DEVICE_MEM  vCPUs  HOST_MEM  HOURLY_PRICE  HOURLY_SPOT_PRICE  REGION
P100  1    Azure  Standard_NC6s_v2    -           6      112GB     $ 2.070       $ 0.207            eastus
P100  2    Azure  Standard_NC12s_v2   -           12     224GB     $ 4.140       $ 0.414            eastus
P100  4    Azure  Standard_NC24rs_v2  -           24     448GB     $ 9.108       $ 0.911            eastus
P100  4    Azure  Standard_NC24s_v2   -           24     448GB     $ 8.280       $ 0.828            eastus
P100  1    OCI    VM.GPU2.1           16GB        24     72GB      $ 1.275       -                  eu-amsterdam-1
P100  2    OCI    BM.GPU2.2           16GB        56     256GB     $ 2.550       -                  eu-amsterdam-1
More generally, does this change handle any new GPUs that may get added but are not updated in the name_to_gpu_memory_in_mib dict?
Here's the catalog that got generated for me for reference.
Great catch! Fixed both the P100 issue and the missing-GPU-info-not-shown issue (tested with this repro).
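For illustration only, a minimal sketch of the fallback behavior being described (the dict and helper names here are hypothetical, not the actual patch): a GPU absent from the mapping keeps its catalog row, and DEVICE_MEM is simply left unset and rendered as '-'.

# Hedged sketch, not the actual patch: look up device memory with a fallback.
_NAME_TO_GPU_MEMORY_IN_MIB = {'P100': 16 * 1024, 'T4': 16 * 1024}

def get_gpu_memory_in_mib(gpu_name: str):
    # dict.get() returns None for GPUs not in the mapping, so a newly added
    # accelerator still appears in the catalog, with DEVICE_MEM shown as '-'.
    return _NAME_TO_GPU_MEMORY_IN_MIB.get(gpu_name)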
Thanks @concretevitamin! LGTM.
Previously, DEVICE_MEM was missing in sky show-gpus GCP results. This was because GCP APIs didn't explicitly return such info. Now:
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_compatibility_tests.sh
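As an aside, here is a minimal sketch (pandas, with hypothetical column and variable names, not the repo's actual schema) of the general approach the description implies: since the GCP APIs do not report device memory, a hard-coded name-to-memory mapping fills a DEVICE_MEM-style column, and accelerators missing from the mapping become NaN and print as '-'.

import pandas as pd

# Hypothetical name -> device memory (MiB) mapping, mirroring the diff above.
name_to_gpu_memory_in_mib = {
    'H100': 80 * 1024,
    'P100': 16 * 1024,
    'P4': 8 * 1024,
    'T4': 16 * 1024,
    'V100': 16 * 1024,
}

# Toy catalog rows; 'AcceleratorName' / 'MemoryInfoMiB' are illustrative names.
df = pd.DataFrame({'AcceleratorName': ['P100', 'T4', 'SOME_NEW_GPU']})

# Unknown names map to NaN rather than dropping the row, so newly added GPUs
# still show up in the catalog, with their device memory rendered as '-'.
df['MemoryInfoMiB'] = df['AcceleratorName'].map(name_to_gpu_memory_in_mib)
print(df)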