fix: `nb` is set to the total number of devices when `nb` is -1 #4209
Conversation
Codecov Report

@@           Coverage Diff            @@
##           master   #4209    +/-  ##
=======================================
- Coverage      93%     93%     -0%
=======================================
  Files         111     111
  Lines        8067    8127    +60
=======================================
+ Hits         7488    7536    +48
- Misses        579     591    +12
@ssaru Nice find :) I was wondering if we could just ignore
@lezwon Moreover, I think having two options is too ambiguous. My suggestion is to integrate them.

P.S. Thank you for your effort 👍

import torch

def pick_multiple_gpus(nb):
    # -1 means "use every device", so expand it to the total device count
    nb = torch.cuda.device_count() if nb == -1 else nb
    picked = []
    for _ in range(nb):
        picked.append(pick_single_gpu(exclude_gpus=picked))
    return picked

def pick_single_gpu(exclude_gpus: list):
    for i in range(torch.cuda.device_count()):
        if i in exclude_gpus:
            continue
        # Try to allocate on device:
        device = torch.device(f"cuda:{i}")
        try:
            torch.ones(1).to(device)
        except RuntimeError:
            continue
        return i
    raise RuntimeError("No GPUs available.")

(cc. @inmoonlight)
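The exclude-list loop above can be illustrated without a GPU. The sketch below is torch-free: `FakeCuda` is a hypothetical stand-in for `torch.cuda` (not part of any real library), used only to show how `-1` expands to the device count and how already-picked or occupied devices are skipped:

```python
class FakeCuda:
    """Hypothetical stand-in for torch.cuda, for illustration only."""

    def __init__(self, total, busy):
        self.total = total      # number of devices on the machine
        self.busy = set(busy)   # indices whose allocation fails (occupied)

    def device_count(self):
        return self.total

    def try_allocate(self, i):
        # Mimics torch.ones(1).to(device) failing on an occupied GPU.
        if i in self.busy:
            raise RuntimeError("CUDA out of memory")


def pick_single_gpu(cuda, exclude_gpus):
    for i in range(cuda.device_count()):
        if i in exclude_gpus:
            continue
        try:
            cuda.try_allocate(i)
        except RuntimeError:
            continue
        return i
    raise RuntimeError("No GPUs available.")


def pick_multiple_gpus(cuda, nb):
    # The fix discussed here: -1 now means "all devices" instead of
    # iterating range(-1), which would select nothing.
    nb = cuda.device_count() if nb == -1 else nb
    picked = []
    for _ in range(nb):
        picked.append(pick_single_gpu(cuda, exclude_gpus=picked))
    return picked


# 4 devices, device 1 occupied: asking for 2 skips the busy one.
print(pick_multiple_gpus(FakeCuda(total=4, busy=[1]), 2))   # [0, 2]
# 3 free devices, nb=-1 expands to all of them.
print(pick_multiple_gpus(FakeCuda(total=3, busy=[]), -1))   # [0, 1, 2]
```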
Also, how do you propose we combine them?
As stated in the document (described below), how about this way?

# specifies all GPUs regardless of their availability
Trainer(gpus=-1, auto_select_gpus=False)

# specifies all "available" GPUs. If only one GPU is not occupied, uses that one.
Trainer(gpus=-1, auto_select_gpus=True)

(cc. @ssaru)
@inmoonlight yep. This PR would do that. If @ssaru can add a test, we can go ahead and merge it. :)
LGTM, just needs a test!
Edit:
#2852 Might have to look at this pending PR as well to make sure everything works
1. test the combination of the `auto_select_gpus` and `gpus` options using Trainer
2. test the `pick_multiple_gpus` function directly

Refs: #4207
Hello @ssaru! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-10-29 09:25:05 UTC
I finished my work as below.

(cc. @inmoonlight)
The above test scenario throws a MisconfigurationException("GPUs requested but none are available.") at pytorch_lightning/utilities/device_parser.py:75 (quoted below), because the machine doesn't have a GPU.

pytorch_lightning/utilities/device_parser.py:75

gpus = _normalize_parse_gpu_string_input(gpus)
gpus = _normalize_parse_gpu_input_to_list(gpus)
if not gpus:
    raise MisconfigurationException("GPUs requested but none are available.")

For more exact testing, it's better to instantiate Trainer with

What do you think?
@ssaru sounds good!
Thank you for the PR. LGTM!
great extension, would be nice to have it in the changelog :)
* fix: `nb` is set to the total number of devices when `nb` is -1. Refs: #4207
* feat: add test code
  1. test the combination of the `auto_select_gpus` and `gpus` options using Trainer
  2. test the `pick_multiple_gpus` function directly
  Refs: #4207
* docs: modify contents in `Select GPU devices`. Refs: #4207
* refactor: reflect the result of review. Refs: #4207
* refactor: reflect the result of review. Refs: #4207
* Update CHANGELOG.md

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <55400948+s-rog@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
(cherry picked from commit b459fd2)
What does this PR do?
Fixes #4207 (issue)
Before
With `auto_select_gpus`, if you set `gpus=-1`, a MisconfigurationException("GPUs requested but none are available.") is raised in ./pytorch_lightning/tuner/auto_gpu_select.py.
After
`nb` is set to the total number of devices when `nb` is -1, in ./pytorch_lightning/tuner/auto_gpu_select.py.
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃
(cc. @inmoonlight)