[tune] tune.ray() gives repeated status without any further execution #17359
Comments
What most likely happens here is that you are requesting resources for your trials that your cluster cannot fulfill. Can you provide your training code (or at least the call to ...)?
The problem on the Tune side is that we currently don't emit any warning when a resource request cannot be fulfilled, so Ray Tune waits forever for resources that will never become available. This is the same underlying issue as #16425, and we're working on resolving it. However, that work will only fix the missing warning: to make your code run, you will have to either adjust your resource requests or add more resources to the cluster.
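To make the "request cannot be fulfilled" case concrete, here is a minimal sketch of the comparison involved. The helper name `unfulfillable` and the numbers are hypothetical, not part of Ray. It assumes the available resources are given as a dict keyed by `CPU`/`GPU`, which is the shape `ray.cluster_resources()` returns, and that the request uses the lowercase `cpu`/`gpu` keys of `resources_per_trial`.

```python
from typing import Dict


def unfulfillable(requested: Dict[str, float],
                  available: Dict[str, float]) -> Dict[str, float]:
    """Return how much of each requested resource the cluster is missing.

    `requested` uses lowercase keys ("cpu", "gpu"); `available` uses the
    uppercase keys ("CPU", "GPU"). An empty result means one trial fits.
    """
    shortfall = {}
    for name, amount in requested.items():
        key = name.upper()
        have = available.get(key, 0.0)  # a missing key means none available
        if amount > have:
            shortfall[key] = amount - have
    return shortfall


# Hypothetical CPU-only machine. With Ray running, this dict would come
# from ray.cluster_resources() instead of being written by hand.
available = {"CPU": 8.0}

# A trial asking for one GPU can never be scheduled on this machine.
print(unfulfillable({"cpu": 1, "gpu": 1}, available))

# Requesting zero GPUs fits, so nothing is reported as missing.
print(unfulfillable({"cpu": 1, "gpu": 0}, available))
```

If the first call reports a shortfall, either lower the per-trial request or add the missing resource to the cluster.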
Also cc @xwjiang2010, who is working on fixing the warning message.
Here is my code:
ray.init()
I specified GPU as 0. I get the following error:
Failure # 1 (occurred at 2021-07-27_11-22-36)
So the initial problem was fixed by setting GPUs to 0: your machine doesn't seem to have a GPU, or it was not detected by the system (e.g. by CUDA). The error you're seeing now stems from a trainable class that does not implement all abstract methods. Your trainable class should implement at least a ... You can share your ...
This is my RayModel class:
It seems your ... You can fix it like this:
It's working now. Thanks.
Yes, we deprecated the old Trainable classes about a year ago (on July 1st, 2020, in d35f0e4#diff-e1d889098f6b27e0d88ba206b0689d77c1a320d58697d98933decde97fd3cac8) and have been emitting a deprecation warning since then. Glad we could resolve your problem. I'll close this issue, but feel free to add to it or reopen it if any questions remain.
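For anyone hitting the same abstract-method error with an older class, here is a rough sketch of how to spot old-style hooks. It assumes the class still defines the underscore-prefixed method names from the old Trainable API. The helper `deprecated_hooks` and both example classes are hypothetical, and a real class would subclass `ray.tune.Trainable` rather than stand alone.

```python
# Old underscore-prefixed Trainable hooks and the names that replaced them.
RENAMED_HOOKS = {
    "_setup": "setup",
    "_train": "step",
    "_save": "save_checkpoint",
    "_restore": "load_checkpoint",
    "_stop": "cleanup",
}


def deprecated_hooks(cls):
    """Map each old-style hook defined directly on `cls` to its new name."""
    return {old: new for old, new in RENAMED_HOOKS.items() if old in vars(cls)}


class OldStyleModel:  # hypothetical stand-in for a pre-2020 Trainable
    def _setup(self, config):
        self.lr = config["lr"]

    def _train(self):
        return {"loss": self.lr}


class NewStyleModel:  # same logic written against the current hook names
    def setup(self, config):
        self.lr = config["lr"]

    def step(self):
        return {"loss": self.lr}


# Lists the methods that would need renaming on the old-style class.
print(deprecated_hooks(OldStyleModel))

# Nothing to rename on the new-style class.
print(deprecated_hooks(NewStyleModel))
```

The check only looks at methods defined on the class itself, not on its base classes, so it is a quick lint rather than a full compatibility test.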
I am using tune.ray() for hyperparameter tuning in PyCharm. When I execute the script file, my code prints the status repeatedly and nothing further gets executed.
PyCharm: 2020.2.3
Python: 3.6.8
torch: 1.9.0
ray: 1.5.0
Any insights will be helpful.