[Recipes] Bunch of refactoring #511
Conversation
CI status (Dr. CI): ✅ No failures as of commit 0b4f481 with merge base 4f73f75. Artifacts and rendered test results: hud.pytorch.org/pr/pytorch/torchtune/511
-    opt_state_dict=checkpoint_dict[utils.OPT_KEY]
-    if self._resume_from_checkpoint
-    else None,
+    opt_state_dict=(
I don't think our linters that run in CI are consistent with VSCode autoformatters, hence changes like these sneaking in. Shall we look into this? cc @ebsmothers @NicolasHug
VSCode autoformatter or pre-commit hooks? If the former I think it's low-pri tbh
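For context, a minimal self-contained sketch of the two stylings being discussed; `setup_optimizer`, `OPT_KEY`, and the surrounding values are hypothetical stand-ins, and only the `opt_state_dict` conditional mirrors the actual diff:

```python
# Hypothetical stand-in for the recipe's optimizer setup.
def setup_optimizer(opt_state_dict=None):
    return opt_state_dict

checkpoint_dict = {"opt_state": {"lr": 1e-4}}
OPT_KEY = "opt_state"
resume_from_checkpoint = True

# Hand-written style: a bare multi-line conditional expression as the kwarg value.
setup_optimizer(
    opt_state_dict=checkpoint_dict[OPT_KEY]
    if resume_from_checkpoint
    else None,
)

# Autoformatter style: the same expression wrapped in parentheses, which is the
# behavior-preserving formatting change that shows up in the diff.
setup_optimizer(
    opt_state_dict=(
        checkpoint_dict[OPT_KEY] if resume_from_checkpoint else None
    ),
)
```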
     if (
-        cfg.full_bf16
+        self._training_precision == torch.bfloat16
Note we also have this utility now
I don't think I can use that util as-is since it will bail out for CPU devices, but I want this recipe to run on CPU for current CI.
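A rough sketch of what a CPU-tolerant variant of that check could look like; the function name and structure below are assumptions for illustration, not the torchtune utility being referenced:

```python
import torch


def check_bf16_support(device: torch.device, dtype: torch.dtype) -> None:
    # Hypothetical variant of the check: only validate CUDA bf16 support when the
    # recipe actually runs on GPU, so CPU-only CI runs are not rejected.
    if dtype != torch.bfloat16:
        return
    if device.type == "cuda" and not torch.cuda.is_bf16_supported():
        raise RuntimeError(
            "bf16 training requires a GPU with native bf16 support (e.g. Ampere or newer)."
        )


# Usage sketch mirroring the new condition in the diff above.
training_precision = torch.bfloat16  # would be derived from the recipe's dtype config
device = torch.device("cpu")
check_bf16_support(device, training_precision)
if training_precision == torch.bfloat16:
    pass  # recipe-specific bf16 handling would go here
```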
Context
- Ensure recipes handle `dtype` consistently: fp16 is disabled across recipes, bf16 support is checked, etc.

Changelog
- Replace the `full_bf16` flag with a `dtype` config field, which now maps to the "full" low precision as mentioned in [RFC] Configuring low precision training in torchtune #504.
- Log memory stats via `log.info(memory_stats_log(...))` (a rough sketch follows below).
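As an illustration of the memory-stats logging mentioned in the changelog, here is a hedged stand-in; the helper below is written for this sketch and may differ in name, signature, and reported fields from the actual torchtune utility:

```python
import logging

import torch

log = logging.getLogger(__name__)


def memory_stats_log(prefix: str, device: torch.device) -> str:
    # Stand-in helper: report allocated/reserved/peak CUDA memory for the given device.
    if device.type != "cuda":
        return f"{prefix}: memory stats are only reported for CUDA devices"
    gib = 1024 ** 3
    return (
        f"{prefix}: "
        f"allocated {torch.cuda.memory_allocated(device) / gib:.2f} GiB, "
        f"reserved {torch.cuda.memory_reserved(device) / gib:.2f} GiB, "
        f"peak {torch.cuda.max_memory_allocated(device) / gib:.2f} GiB"
    )


# Usage mirroring the changelog item: log a snapshot, e.g. after model init.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
log.info(memory_stats_log("Memory stats after model init", device))
```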
Test plan

Follow-ups