Add Phi3 Mini 4K Instruct Model to torchtune #876
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/876
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 872911a with merge base 3890200.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
# pip install bitsandbytes
#
# To launch on a single device, run the following command from root:
# tune run full_finetune_single_device --config recipes/config/phi3/mini_full_low_memory.yaml
Suggested change:
- # tune run full_finetune_single_device --config recipes/config/phi3/mini_full_low_memory.yaml
+ # tune run full_finetune_single_device --config phi3/mini_full_low_memory.yaml
Oh, I haven't added this to the registry, so this won't work. I'll let you take care of that.
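(For context, a minimal sketch of what registering the new config could look like so that `--config phi3/mini_full_low_memory` resolves. The registry module location and the `Recipe`/`Config` dataclass structure below are assumptions for illustration, not the exact torchtune code in this PR.)

```python
# Hypothetical sketch of a recipe/config registry entry (names are assumptions).
from dataclasses import dataclass, field
from typing import List


@dataclass
class Config:
    name: str        # the name users pass to --config
    file_path: str   # path relative to the bundled configs directory


@dataclass
class Recipe:
    name: str
    file_path: str
    configs: List[Config] = field(default_factory=list)


full_finetune_single_device = Recipe(
    name="full_finetune_single_device",
    file_path="full_finetune_single_device.py",
    configs=[
        # ... existing configs ...
        Config(
            name="phi3/mini_full_low_memory",
            file_path="phi3/mini_full_low_memory.yaml",
        ),
    ],
)
```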
# You can add specific overrides through the command line. For example
# to override the checkpointer directory while launching training
# you can run:
# tune run full_finetune_single_device --config recipes/config/phi3/mini_full_low_memory.yaml checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
Suggested change:
- # tune run full_finetune_single_device --config recipes/config/phi3/mini_full_low_memory.yaml checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
+ # tune run full_finetune_single_device --config phi3/mini_full_low_memory.yaml checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
same
x_out = rope_phi3(input)

# check the numerics of the computed tensor
assert_expected(x_out.mean(), tensor(-0.0005), atol=1e-4)
Can you include in the PR where these numbers come from?
Info's in the gist! Don't really want to replicate all of the info in the context again
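(For readers without access to the gist: expected values like `tensor(-0.0005)` in tests of this kind are typically produced by running the module once on a fixed-seed input and hard-coding the resulting summary statistics. A rough sketch of that process; the import path, tensor shapes, and the `Phi3RotaryPositionalEmbeddings` constructor arguments are assumptions, not the exact setup used for this PR.)

```python
# Sketch of how reference numerics for the unit test can be generated.
import torch

# assumed import path for the module added in this PR
from torchtune.models.phi3 import Phi3RotaryPositionalEmbeddings

torch.manual_seed(0)  # fixed seed so the hard-coded expected values are reproducible
bsz, seq_len, num_heads, head_dim = 4, 2048, 32, 96  # assumed Phi-3 Mini-like shapes
x = torch.randn(bsz, seq_len, num_heads, head_dim)

rope = Phi3RotaryPositionalEmbeddings(dim=head_dim, max_seq_len=4096)
x_out = rope(x)

# summary statistics to compare against in assert_expected(...)
print(x_out.mean(), x_out.sum(), x_out.max())
```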
from torch import nn, Tensor


class Phi3RotaryPositionalEmbeddings(nn.Module):
You're a hero
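(For context on why a separate class is needed: Phi-3 uses the Hugging Face style "rotate half" RoPE formulation, which pairs the first and second halves of each head dimension, rather than the interleaved even/odd pairing used by torchtune's existing Llama RoPE. A minimal sketch of that formulation, assuming standard RoPE defaults; it is not the exact code in this PR.)

```python
# Minimal sketch of HF-style "rotate half" rotary position embeddings.
import torch
from torch import nn, Tensor


class RotateHalfRoPE(nn.Module):
    def __init__(self, dim: int, max_seq_len: int = 4096, base: int = 10_000) -> None:
        super().__init__()
        # inverse frequencies, one per pair of dimensions
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        t = torch.arange(max_seq_len).float()
        freqs = torch.outer(t, inv_freq)         # [max_seq_len, dim // 2]
        emb = torch.cat((freqs, freqs), dim=-1)  # [max_seq_len, dim]
        self.register_buffer("cos", emb.cos(), persistent=False)
        self.register_buffer("sin", emb.sin(), persistent=False)

    @staticmethod
    def _rotate_half(x: Tensor) -> Tensor:
        # pair the first half of the head dim with the second half
        x1, x2 = x.chunk(2, dim=-1)
        return torch.cat((-x2, x1), dim=-1)

    def forward(self, x: Tensor) -> Tensor:
        # x: [b, s, n_h, h_d]
        seq_len = x.size(1)
        cos = self.cos[:seq_len].view(1, seq_len, 1, -1)
        sin = self.sin[:seq_len].view(1, seq_len, 1, -1)
        return x * cos + self._rotate_half(x) * sin
```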
num_kv_heads=self._config["num_key_value_heads"],
dim=self._config["hidden_size"],
)
if self._model_type == ModelType.PHI3_MINI:
Does capitalization change this? B/c in the config it's PHI_MINI
and in the enum it's phi_mini
Yup, it matters!
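(A quick illustration of why capitalization matters here. The enum below is a stand-in for illustration, not the actual torchtune `ModelType` definition.)

```python
from enum import Enum


class ModelType(Enum):
    # stand-in member; the real enum lives in torchtune
    PHI3_MINI = "phi3_mini"


ModelType["PHI3_MINI"]    # OK: lookup by member name
ModelType("phi3_mini")    # OK: lookup by value
# ModelType["phi3_mini"]  # KeyError: name lookup is case-sensitive
```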
Mostly nits! Thanks @kartikayk
def phi3_tokenizer(path: str) -> SentencePieceTokenizer:
    tokenizer = SentencePieceTokenizer(path)
    tokenizer.pad_id = 32000
Looking at the HF config it says that eos_id and pad_id are the same, is that right?
Yes, but also will let @joecummings confirm that!
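(One way to double-check this directly against the Hugging Face config, assuming the weights come from the microsoft/Phi-3-mini-4k-instruct repo on the Hub; whether `trust_remote_code` is required depends on the installed `transformers` version.)

```python
# Sanity-check the special token ids straight from the HF config.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True
)
# Per the discussion above, both are expected to be 32000.
print("eos_token_id:", cfg.eos_token_id)
print("pad_token_id:", cfg.pad_token_id)
```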
TODO: The implementation below can be made more efficient
for inference.
"""
# input tensor has shape [b, s, n_h, n_d]
(based on the above docstring)
Suggested change:
- # input tensor has shape [b, s, n_h, n_d]
+ # input tensor has shape [b, s, n_h, h_d]
oh good catch!
Looks good!
Important note: the tokenizer still needs some work; this will be a follow-up PR.
Context
What is the purpose of this PR? Is it to
- add a new feature
- fix a bug
- update tests and/or documentation
- other (please add here)
Please link to any issues this PR addresses.
Changelog
This PR adds support for the Phi3 Mini 4K Instruct model to torchtune. Specifically, we add the following:
- A Phi3RotaryPositionalEmbeddings module implementing Phi-3's rotary position embeddings
- A phi3_tokenizer builder wrapping SentencePieceTokenizer with Phi-3's special token ids
- Checkpoint conversion support via a new ModelType.PHI3_MINI
- Recipe configs, including phi3/mini_full_low_memory.yaml for single-device full finetuning
Test plan
Please make sure to do each of the following if applicable to your PR. (If you're not sure about any one of these just ask and we will happily help.)
- run pre-commit hooks and linters (make sure you've first installed via pre-commit install)
- run unit tests via pytest tests
- run recipe tests via pytest tests -m integration_test
Unit Tests
Detailed comparisons with reference implementation
Full-finetune Recipe