Rename CLI tools and move to proper entrypoint #396

joecummings · 2024-02-20T19:40:02Z

Context

As mentioned in #370, _scripts is currently being packaged as its own standalone package. The proper way to handle this is to move it into the torchtune package and specify it as an entry_point. This PR is the first in a collection of changes to properly package TorchTune.

Why did you rename _scripts/ to _cli? Everything currently contained in the _scripts/ dir is related to the cli tool. As such, it makes sense to rename the dir. Once #388 lands, the _cli_utils dir will also be deleted along w/ recipe_utils.py and config_utils.py. Then the tune.py command will live at the same level under the _cli dir as the rest of the sub commands.

Do I need to read all 19 files? NO, most of these are just moved files. Be sure to check out the setup.py file, though, and some of the changes to tune.py.

Changelog

Rename _scripts to _cli
Move _cli under torchtune pkg dir
Update all file paths
Change tune into a proper python file and modify setup.py to run it as an entry_point
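
For reviewers less familiar with entry points: a console_scripts entry of the form `name = module:function` tells the installer to generate a `name` executable that imports the module and calls the function. Below is a minimal sketch of how such a string is resolved, assuming Python 3.9+ and using only importlib.metadata from the standard library. The `json:dumps` target is a stand-in chosen so the snippet runs without torchtune installed; it is not part of this PR.

```python
import json
from importlib.metadata import EntryPoint

# The entry point string this PR adds to setup.py.
tune_ep = EntryPoint(
    name="tune",
    value="torchtune._cli.cli_utils.tune:main",
    group="console_scripts",
)

# The part before the colon is the module to import,
# the part after it is the callable to invoke.
module_name = tune_ep.module
function_name = tune_ep.attr

# Stand-in entry point (an assumption for illustration only) that points
# at a stdlib function, so load() can be shown without torchtune present.
demo_ep = EntryPoint(name="demo", value="json:dumps", group="console_scripts")
loaded = demo_ep.load()  # imports `json` and returns its `dumps` attribute

print(module_name, function_name, loaded.__name__)
```

After a pip install, the generated `tune` script does roughly the equivalent of `load()` followed by calling the returned function, which is why tune.py needs a `main()` rather than an `if __name__ == "__main__":` block.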

Test plan

pytest tests/torchtune/_cli

The steps taken in the test are listed below; a checkmark indicates a run equivalent to the behavior before these code changes:

Fresh conda install, tune ls, tune full_finetune --config alpaca_llama2_full_finetune
Pip uninstall, re-install, tune ls, tune full_finetune --config alpaca_llama2_full_finetune

netlify · 2024-02-20T19:40:21Z

✅ Deploy Preview for torchtune-preview ready!

Name	Link
🔨 Latest commit	`48cf03b`
🔍 Latest deploy log	https://app.netlify.com/sites/torchtune-preview/deploys/65d6425524e3230008db4c35
😎 Deploy Preview	https://deploy-preview-396--torchtune-preview.netlify.app

NicolasHug · 2024-02-20T23:06:21Z

torchtune/_cli/cli_utils/tune.py

+        sys.argv = [str(cmd)] + args.recipe_args
+        runpy.run_path(str(cmd), run_name="__main__")


Now that the _script folder is no more, does it still make sense to rely on runpy and on manually building cmd to run those files?

E.g. instead of building

cmd = pkg_path / "_cli" / "cli_utils" / "recipe_utils.py"

and then calling it with runpy.run_path, we should just be able to call the stuff we need in torchtune._cli.cli_utils.recipe_utils from within Python (i.e. right here)?

I'm not sure why this was done this way originally, but maybe we don't need that anymore?
CC @kartikayk @ebsmothers

This doesn't need to be addressed for this PR since this is pre-existing logic. But I suspect that whole logic could deserve a revamp.
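
To make the contrast above concrete, here is a minimal sketch of the two approaches: running a file through runpy with a patched sys.argv versus calling a function directly. The script contents, `run_as_script`, and `recipe_main` are hypothetical stand-ins, not the actual torchtune recipe code.

```python
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical recipe script; stands in for a file such as recipe_utils.py.
SCRIPT = """
import sys
result = "ran with " + " ".join(sys.argv[1:])
"""


def run_as_script(path, recipe_args):
    """Run a file the way tune.py currently does: patch sys.argv, then runpy."""
    old_argv = sys.argv
    sys.argv = [str(path)] + list(recipe_args)
    try:
        # run_path executes the file and returns its resulting globals dict.
        return runpy.run_path(str(path), run_name="__main__")
    finally:
        sys.argv = old_argv  # restore so the caller's argv is not clobbered


def recipe_main(recipe_args):
    """The same logic exposed as a plain function that can be imported."""
    return "ran with " + " ".join(recipe_args)


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "recipe.py"
    script.write_text(SCRIPT)
    script_globals = run_as_script(script, ["--config", "demo"])

script_result = script_globals["result"]
direct_result = recipe_main(["--config", "demo"])
print(script_result == direct_result)
```

The direct call avoids building a filesystem path and mutating global state, which is the simplification being suggested now that the code lives inside the torchtune package.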

It might be worth considering relying on argparse's subcommands: https://docs.python.org/dev/library/argparse.html#sub-commands

Many programs split up their functionality into a number of sub-commands, for example, the svn program can invoke sub-commands like svn checkout, svn update, and svn commit. Splitting up functionality this way can be a particularly good idea when a program performs several different functions which require different kinds of command-line arguments [...]
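
As a rough illustration of what the sub-command approach could look like for tune, here is a minimal sketch using argparse's add_subparsers. The handlers and the `repo_id` argument are hypothetical placeholders, not the real behavior of ls.py or download.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="tune")
    # required=True makes argparse error out if no sub-command is given.
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Hypothetical `tune ls` handler.
    ls_parser = subparsers.add_parser("ls", help="List available recipes")
    ls_parser.set_defaults(func=lambda args: "listing recipes")

    # Hypothetical `tune download <repo_id>` handler.
    dl_parser = subparsers.add_parser("download", help="Download a model")
    dl_parser.add_argument("repo_id")
    dl_parser.set_defaults(func=lambda args: f"downloading {args.repo_id}")

    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Each sub-parser registered its own handler under `func`.
    return args.func(args)


print(main(["ls"]))
print(main(["download", "some/model"]))
```

With this layout each sub-command owns its arguments and help text, and the top-level `main()` only dispatches, so no manual path building is needed.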

Yeah I agree with @NicolasHug on both points here. The subparser logic seems like it could be a viable approach for tackling our different tune subcommands. Feel free to file a follow-up issue for this

Great points - filed #397 for a follow-up.

ebsmothers

Thanks for making these changes! Generally looks good to me. A couple questions on locations of things:

(1) I'm a bit confused about the structure of torchtune/_cli. For example why is the primary entry point a CLI util but its subcommands (like ls.py) are not?

(2) What about the tests directory? Now that we are moving _cli down a level in the directory hierarchy, should we be doing the same for tests/_cli?

ebsmothers · 2024-02-21T05:30:40Z

torchtune/_cli/cli_utils/tune.py

@@ -1,4 +1,8 @@
-#!/usr/bin/env python3
+# Copyright (c) Meta Platforms, Inc. and affiliates.


Thank you for renaming this to a .py file, that was honestly driving me insane

I do it for my fans.

ebsmothers · 2024-02-21T05:59:38Z

torchtune/_cli/cli_utils/tune.py

    return total > script_args

-if __name__ == "__main__":
+
+def main():
    parser = get_args_parser()
    _update_parser_help(parser)
    args = parser.parse_args()

    distributed_args = _is_distributed_args(args)
    cmd = args.recipe
    if not cmd.endswith(".py"):


Dumb q: what are we doing here if the command does end in .py? (E.g. I wanna run tune my_local_recipe.py)

This logic will be entirely revamped in a follow-up PR. Just trying to do minimal changes here.

ebsmothers · 2024-02-21T06:06:30Z

torchtune/_cli/cli_utils/tune.py

+        sys.argv = [str(cmd)] + args.recipe_args
+        runpy.run_path(str(cmd), run_name="__main__")


Yeah I agree with @NicolasHug on both points here. The subparser logic seems like it could be a viable approach for tackling our different tune subcommands. Feel free to file a follow-up issue for this

NicolasHug

Thank you Joe,

This LGTM as a first step. I have left some comments, each of which would deserve its own follow-up (Evan's as well), but it's best to merge this PR now since it already provides a net improvement.

I can open the follow-up issues if that helps, LMK.

NicolasHug · 2024-02-21T10:10:27Z

setup.py

-    description="Package for finetuning LLMs and diffusion models using native PyTorch",
+    entry_points={
+        "console_scripts": [
+            "tune = torchtune._cli.cli_utils.tune:main",


Since the _cli namespace was added, _cli.cli_utils could probably become _cli.utils.

True, but I plan on getting rid of that subdirectory ASAP.

NicolasHug · 2024-02-21T10:12:02Z

torchtune/_cli/hf_upload/LICENSE

What is the plan for those files in hf_upload? They're not available from tune at the moment. Should they be?

upload is still an open issue; according to our roadmap, it's somewhere between P0.5 and P1. I know it's a bit of a cop-out, but I guess my response is "probably, but I'm going to cross this bridge when we get there".

joecummings · 2024-02-21T14:15:20Z

@ebsmothers

(1) I'm a bit confused about the structure of torchtune/_cli. For example why is the primary entry point a CLI util but its subcommands (like ls.py) are not?

The way it's currently structured, tune.py is the entrypoint and it batches out subcommands to things like ls.py, download.py.

(2) What about the tests directory? Now that we are moving _cli down a level in the directory hierarchy, should we be doing the same for tests/_cli?

Good catch, I'll go ahead and make this change in this PR.

ebsmothers · 2024-02-21T14:47:59Z

The way it's currently structured, tune.py is the entrypoint and it batches out subcommands to things like ls.py, download.py

Sorry to clarify this point: my question is more around the locations of these files. The cli_utils subdirectory contains config_utils.py and recipe_utils.py, which makes sense. Meanwhile, the subcommands ls.py and download.py are just under _cli/. To me, tune.py is logically more similar to the second set of files than to the first set, so why is it grouped with the utilities instead of its own subcommands?

joecummings · 2024-02-21T16:08:41Z

The way it's currently structured, tune.py is the entrypoint and it batches out subcommands to things like ls.py, download.py

Sorry to clarify this point: my question is more around the locations of these files. The cli_utils subdirectory contains config_utils.py and recipe_utils.py, which makes sense. Meanwhile, the subcommands ls.py and download.py are just under _cli/. To me, tune.py is logically more similar to the second set of files than to the first set, so why is it grouped with the utilities instead of its own subcommands?

Oh this will be changed. Waiting until landing 'tune cp' bc then I can just remove the whole subdir.

joecummings added 2 commits February 20, 2024 11:33

Move _scripts to _cli and under the torchtune pkg

8bbfdb0

Add file deletions and modify setup

fc3219d

facebook-github-bot added the CLA Signed label Feb 20, 2024

joecummings added 3 commits February 20, 2024 11:42

Lint

fb2ae1b

stringify

fe1a1ad

Fix weird parsing around strings

ca290ea

joecummings marked this pull request as ready for review February 20, 2024 20:31

joecummings requested review from NicolasHug, ebsmothers and kartikayk February 20, 2024 20:31

NicolasHug reviewed Feb 20, 2024


ebsmothers reviewed Feb 21, 2024


NicolasHug approved these changes Feb 21, 2024


joecummings mentioned this pull request Feb 21, 2024

Revamp logic in tune CLI #397

Closed

Move testing files to proper place

48cf03b

joecummings merged commit 5ae6169 into main Feb 21, 2024
17 checks passed

joecummings deleted the move-scripts-to-proper-entrypoint branch February 21, 2024 19:12

kartikayk mentioned this pull request Feb 25, 2024

Get rid of _scripts, make tune an entrypoint #370

Closed
