GPUs requested but none are available #3542
Comments
Hi! Thanks for your contribution! Great first issue!
Mind checking whether you have installed the CUDA version of your PT as
@shanhaidexiamo what does
Sorry to interrupt, but I'm experiencing the same issue. The device_count() returns This is the env
GCP command to create a similar instance
@kyoungrok0517 mind sharing this output, just to check that you have properly installed PT and drivers...
@Borda That returns
@kyoungrok0517 good catch, mind sending a PR?
@williamFalcon is working on the parsing of gpus for DDP. The error is most likely because they are not correctly passed to or parsed in the child process.
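To see what a child process actually receives, one option is to print the environment it inherits. The sketch below assumes the GPU selection reaches the child through the `CUDA_VISIBLE_DEVICES` environment variable; nothing in this thread confirms that this is how Lightning passes it, and `visible_devices_in_child` is a hypothetical helper written only for illustration.

```python
import os
import subprocess
import sys


def visible_devices_in_child(value):
    """Spawn a child Python process and return the CUDA_VISIBLE_DEVICES it sees.

    `value` is the string to export to the child, or None to leave the
    variable unset. Hypothetical helper, not part of any library.
    """
    env = dict(os.environ)
    if value is None:
        env.pop("CUDA_VISIBLE_DEVICES", None)
    else:
        env["CUDA_VISIBLE_DEVICES"] = value

    # The child only reports its own view of the variable.
    code = "import os; print(os.environ.get('CUDA_VISIBLE_DEVICES', '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("child sees:", visible_devices_in_child("0,1"))
    print("child sees:", visible_devices_in_child(None))
```

If the child reports a different value than the parent intended, the GPUs were lost somewhere between the two processes rather than in the driver or the PyTorch install.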
Hmm... so am I too late for a PR? I've never done this before, so I'd be grateful if you could guide me on how it works. Should I open a pull request even though I don't know how to fix it? Please tell me. I'd like to help.
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!
Hi all, [EDIT] I meant I am now having the same issue. I'm running my job on a server with 8 GPUs. When I run @Borda, @awaelchli: do you know if this was ever fixed? Thanks!
@fishbotics What else is running on the gpus? The original issue reported here was fixed by #4209, I believe.
My server has 8 GPUs, but when I use the Trainer class and set gpus = -1, I get the runtime error "GPUs requested but none are available". When I check the GPUs with torch, the number of GPUs is 8 and cuda.is_available is true. Can anyone tell me what's wrong?
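For anyone reproducing the check described above, here is a minimal sketch that collects the relevant values in one place. `cuda_report` is a hypothetical helper name; the calls it makes (`torch.cuda.is_available()`, `torch.cuda.device_count()`, `torch.version.cuda`) are standard PyTorch, and including `CUDA_VISIBLE_DEVICES` is an assumption about what might be relevant, not a confirmed cause of this issue.

```python
import os


def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup in this process.

    Hypothetical helper for debugging; returns a plain dict so the
    result is easy to paste into an issue.
    """
    report = {
        # None means the variable is not set in this process.
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_installed": False,
        "torch_version": None,
        "built_with_cuda": None,
        "cuda_available": False,
        "device_count": 0,
    }
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing more to query.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = bool(torch.cuda.is_available())
    report["device_count"] = int(torch.cuda.device_count())
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running this in the same environment, and ideally the same launch command, as the training script makes it easier to tell whether the GPUs are missing for the whole process or only inside the Trainer.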