
[Tune] Don't recommend tune.run API in logging messages when using the Tuner #33642

Merged: 8 commits merged into ray-project:master from the tune/resume_vs_restore branch on Mar 28, 2023

Conversation

justinvyu (Contributor) commented Mar 23, 2023:

Why are these changes needed?

This PR changes some logs to use the correct Tune entrypoint, depending on whether the user is running with tune.run or tuner.fit(). Certain args, like config vs. param_space, differ between the two, and the restoration logic is also different. This PR also reduces the amount of redundant logs printed on restoration. Finally, it fixes the auto Ray init log so that it actually shows up, which will help users figure out how to customize ray.init options.
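
For reference, a hedged side-by-side of the two entrypoints this PR distinguishes (the hyperparameter dict is an arbitrary illustration):

from ray import tune


def train_fn(config):
    pass


# Old entrypoint: tune.run takes `config`.
tune.run(train_fn, config={"lr": 0.01})

# New entrypoint: Tuner takes `param_space`.
tuner = tune.Tuner(train_fn, param_space={"lr": 0.01})
tuner.fit()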

Problem

Before this change, pressing Ctrl+C on your experiment would print a message telling you to restore with tune.run, even when the experiment was launched with the Tuner. Specifying an invalid mode in TuneConfig would likewise reference tune.run.

import time

from ray import tune


def train_fn(config):
    time.sleep(10)


tuner = tune.Tuner(train_fn)
tuner.fit()

After interrupting with Ctrl+C, the old message read:

... continue running with `tune.run(resume=True)`.
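
With this change, the interrupt message points at the Tuner restoration path instead. A minimal sketch of resuming via the Tuner API, reusing train_fn from the example above (the experiment path is a hypothetical placeholder, not a value from the PR):

from ray import tune

# Tuner.restore picks up an interrupted experiment from its results
# directory; the path and re-passed trainable below are assumptions.
tuner = tune.Tuner.restore(
    path="~/ray_results/train_fn_2023-03-23_12-00-00",
    trainable=train_fn,
)
tuner.fit()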

Auto Ray init log example

Initializing Ray automatically. For cluster usage or custom Ray initialization, call `ray.init(...)` before `Tuner(...)`.
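
As the log suggests, initializing Ray before constructing the Tuner gives full control over the init options. A minimal sketch (the resource count is an arbitrary assumption):

import ray
from ray import tune


def train_fn(config):
    pass


# Custom Ray initialization must happen before Tuner(...) is created;
# otherwise Tune initializes Ray automatically with default options.
ray.init(num_cpus=8, ignore_reinit_error=True)

tuner = tune.Tuner(train_fn)
tuner.fit()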

Related issue number

Closes #31478

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Comment on lines 794 to 797
logger.debug(
    "TrialRunner resumed, ignoring new add_experiment but "
    "updating trial resources."
)
justinvyu (author):

What is this talking about?

Contributor:

Any new experiments/configurations passed to tune.run will be ignored (we only continue the current state). This happens when people pass tune.run(different_experiment) while resuming.

However, overwriting trainables is now the default workflow.

Can we maybe:

  • Detect whether an Experiment was passed or just a trainable (in this block: if not isinstance(exp, Experiment))
  • If an Experiment, continue to use the INFO message (maybe with updated wording)
  • Else, don't print anything

Alternatively, we can keep it as is in the PR. I don't think anybody really passes Experiments anyway, and the message was unhelpful to begin with. (A sketch of the suggested detection follows below.)
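
A hedged sketch of the suggested detection (the helper name is invented here; exp, Experiment, and logger usage are inferred from the discussion, not copied from the diff):

import logging

from ray.tune.experiment import Experiment

logger = logging.getLogger(__name__)


def _maybe_log_resume_notice(exp):
    # Only surface an INFO message when a full Experiment object was
    # passed; a bare trainable is the default Tuner flow, so stay silent.
    if isinstance(exp, Experiment):
        logger.info(
            "TrialRunner resumed: the newly passed experiment is ignored; "
            "only trial resources are updated."
        )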

justinvyu (author):

I've kept it as a DEBUG and improved the message a bit. It was a bit more complicated to tell whether the user passed in an Experiment, since all trainables get converted to an Experiment; keeping it at the DEBUG log level felt good enough.

woshiyyya (Member) left a comment:

Looks good 🙂️

@@ -226,6 +227,7 @@ def run(
    _remote: Optional[bool] = None,
    # Passed by the Tuner.
    _remote_string_queue: Optional[Queue] = None,
    _tuner_api: bool = False,
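
For context, a hedged sketch of what the flag does at this PR's revision: the Tuner internally forwards it to run() so log messages can be worded for the right entrypoint (user code should never pass it; the call below is only an illustration):

from ray import tune


def train_fn(config):
    pass


# Private flag added by this PR; shown here only to illustrate its effect.
tune.run(train_fn, _tuner_api=True)
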
woshiyyya (Member):

Is there any way to determine the entrypoint from internal state, instead of passing this flag explicitly?

justinvyu (author):

Hmm, it's a bit hard. I did this because it seems like we have some special Tuner flags already, but maybe I could use a double-underscore __tuner_api to make sure users really don't use this thing.

Contributor:

Hm, I think @xwjiang2010 is also doing some context passing for entry point detection? Just checking to avoid duplicate work here.

justinvyu (author):

Looks like there is no telemetry for Tuner vs. tune.run yet! This flag can also be used in a future telemetry PR.

    }
    if _tuner_api
    else {
        "entrypoint": "tune.run(...)",
woshiyyya (Member) commented Mar 23, 2023:

When do users typically call tune.run()?

justinvyu (author) commented Mar 23, 2023:

It's the old Tune API (before Tuner) that we will deprecate at some point in the future.

woshiyyya (Member):

Got it.

krfricke (Contributor) left a comment:

This looks good to me, thanks! Ping me for merge

justinvyu added the tests-ok label ("The tagger certifies test failures are unrelated and assumes personal liability.") on Mar 25, 2023
gjoliver merged commit 950fc33 into ray-project:master on Mar 28, 2023
justinvyu deleted the tune/resume_vs_restore branch on April 10, 2023 at 16:25
elliottower pushed a commit to elliottower/ray that referenced this pull request on Apr 22, 2023
…the `Tuner` (ray-project#33642)

Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: elliottower <elliot@elliottower.com>
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request on May 4, 2023
…the `Tuner` (ray-project#33642)

Signed-off-by: Justin Yu <justinvyu@berkeley.edu>
Signed-off-by: Jack He <jackhe2345@gmail.com>
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Development

Successfully merging this pull request may close these issues:

Cannot set root temporary path with ray tune
5 participants