[tune] Better error message for all pending trials #16850

Closed
richardliaw opened this issue Jul 2, 2021 — with Slack · 5 comments · Fixed by #17533
Labels: tune (Tune-related issues)

Comments


richardliaw commented Jul 2, 2021

What is the problem?

I've seen numerous cases where every trial is reported as "pending" and nothing ever starts.

We should detect whether we're on an autoscaling cluster and print a relevant status message. cc @krfricke
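
A minimal sketch of the kind of check proposed above, written without any Ray imports. The function name `pending_warning`, the `threshold_s` parameter, and the message wording are all hypothetical and only illustrate the idea; they are not Tune's API.

```python
from typing import Iterable, Optional

PENDING = "PENDING"  # assumed status label, used only in this sketch


def pending_warning(
    statuses: Iterable[str],
    seconds_since_progress: float,
    on_autoscaling_cluster: bool,
    threshold_s: float = 60.0,  # hypothetical grace period before warning
) -> Optional[str]:
    """Return a status message if every trial has been pending too long."""
    statuses = list(statuses)
    # Nothing to report if there are no trials or at least one is not pending.
    if not statuses or any(s != PENDING for s in statuses):
        return None
    # Give the scheduler some time before complaining.
    if seconds_since_progress < threshold_s:
        return None
    if on_autoscaling_cluster:
        return (
            "All trials are pending. The cluster may still be scaling up; "
            "check that the autoscaler can provide the requested resources."
        )
    return (
        "All trials are pending. The requested resources may exceed what "
        "is available; consider lowering the per-trial resource request."
    )
```

Called once per status-report interval, this returns `None` while things look normal, so the caller only prints when there is something to say.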

@architkulkarni architkulkarni added the tune Tune-related issues label Jul 2, 2021
@krfricke krfricke self-assigned this Jul 6, 2021

krfricke commented Jul 6, 2021

Yes, this will be very helpful and will presumably help uncover other problems as well (e.g. with placement group scheduling).

@richardliaw

@xwjiang2010 FYI


xwjiang2010 commented Aug 4, 2021

Just want to add what I observed in my repro setup.

It seems that before migrating to placement groups (PG), we have this. After migrating to PG, we return here, and Executor.has_resources_for_trial() returns True even though the resources are clearly not available.

@krfricke


krfricke commented Aug 5, 2021

Yes, this was a request by the autoscaler team to ensure that we request unavailable resources (by scheduling placement groups) so that autoscaling is triggered. We might add a check there so that it only returns True if is_ray_cluster() is True.
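
A rough sketch of what such a check could look like, using a stand-in class rather than Tune's real executor. `SketchExecutor`, its fields, and the fallback comparison against locally available resources are assumptions made for illustration; only the names `has_resources_for_trial` and `is_ray_cluster` come from the discussion above.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SketchExecutor:
    """Hypothetical stand-in for a trial executor, not Tune's class."""

    available: Dict[str, float]          # resources currently free
    is_ray_cluster: Callable[[], bool]   # injected so the sketch runs without Ray

    def has_resources_for_trial(self, requested: Dict[str, float]) -> bool:
        # On a cluster, always say yes so the placement group is requested
        # and the autoscaler has a chance to add nodes.
        if self.is_ray_cluster():
            return True
        # Otherwise (assumed fallback) compare against what is actually free.
        return all(
            self.available.get(name, 0.0) >= amount
            for name, amount in requested.items()
        )


local = SketchExecutor({"CPU": 2.0}, is_ray_cluster=lambda: False)
cluster = SketchExecutor({"CPU": 2.0}, is_ray_cluster=lambda: True)
```

With this shape, `local.has_resources_for_trial({"CPU": 4.0})` is False, so a caller could surface a clear message, while the cluster variant still returns True and leaves room for autoscaling.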
