Mistral testing #888
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/888
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 190cc8a with merge base bec7bab.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
… model. Added comparison scripts and verified correctness.
I've updated scripts for the rest of the mistral components. I still need to write the comparison that involves mapping state dicts between the two implementations, update the unit test, and (potentially) add LoRA comparisons.
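(For anyone following along: "mapping state dicts" here just means renaming parameter keys so that the reference implementation's weights load into the torchtune model. A minimal sketch of the idea; the key names below are made up for illustration and are not the actual mistral parameter names:)

```python
# Hypothetical key map from the reference implementation's parameter
# names to torchtune's; the real mistral names will differ.
_KEY_MAP = {
    "attention.wq.weight": "attn.q_proj.weight",
    "attention.wk.weight": "attn.k_proj.weight",
    "attention.wv.weight": "attn.v_proj.weight",
    "attention.wo.weight": "attn.output_proj.weight",
}


def map_state_dict(reference_sd: dict) -> dict:
    """Rename reference keys so the torchtune model can load the weights."""
    return {_KEY_MAP.get(key, key): value for key, value in reference_sd.items()}


# Usage (illustrative):
# torchtune_model.load_state_dict(map_state_dict(reference_model.state_dict()))
```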
…d unit test with expected value
Okay, all seems good. We now have a unit test for the base model with an expected value. For the unfortunate reviewer seeing my +1160 line PR (I hope you read this first!):
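For reviewers unfamiliar with the pattern: a unit test "with expected value" fixes all randomness, runs the model once, and pins a scalar summary of the output. A hedged sketch of the shape such a test takes; the builder arguments and the expected constant here are placeholders, not the values in this PR:

```python
import torch

from torchtune.models.mistral import mistral


def test_mistral_forward_expected_value():
    # Fix all randomness so weight init and inputs are reproducible.
    torch.manual_seed(16)
    # Tiny hypothetical config for speed; the real test picks its own sizes.
    model = mistral(
        vocab_size=100,
        num_layers=2,
        num_heads=4,
        num_kv_heads=2,
        embed_dim=64,
        intermediate_dim=128,
        max_seq_len=128,
    )
    tokens = torch.randint(0, 100, (2, 16))
    out = model(tokens)
    assert out.shape == (2, 16, 100)
    # 1.2345 is a placeholder; the real value is computed once against the
    # reference implementation and then hard-coded into the test.
    torch.testing.assert_close(
        out.mean(), torch.tensor(1.2345), atol=1e-4, rtol=1e-4
    )
```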
@@ -0,0 +1,186 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
This is the main file used for comparing implementations.
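For readers who haven't opened the file: the general shape of such a comparison script is seed everything, load the (mapped) reference weights into the torchtune model, run both on the same tokens, and assert the outputs match. A rough sketch under those assumptions; the helper names here are illustrative, not the PR's actual code:

```python
import torch


def compare_models(torchtune_model, reference_model, map_state_dict, vocab_size):
    """Run both models on identical inputs and check the outputs agree.

    `map_state_dict` is whatever key-renaming helper the script defines
    (see the mapping sketch above); its real signature may differ.
    """
    torch.manual_seed(0)
    # Identical weights on both sides: copy the reference weights across.
    torchtune_model.load_state_dict(map_state_dict(reference_model.state_dict()))
    tokens = torch.randint(0, vocab_size, (1, 8))
    with torch.no_grad():
        expected = reference_model(tokens)
        actual = torchtune_model(tokens)
    torch.testing.assert_close(actual, expected, atol=1e-5, rtol=1e-5)
    print(f"outputs match; mean logit = {actual.mean().item():.6f}")
```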
Thanks for all this extensive testing!
I think we wanna find the right balance of rigorous testing and maintenance here. So while I don't want your work to be in vain, I wonder if we should just add those comparison scripts that differ nontrivially from the Llama2 ones, and for the other components point to the Llama2 ones. In this case that would mean keeping `compare_mistral` and `compare_feedforward` (since you mentioned the feedforward isn't tested under Llama2). Then you can add a readme to …
Not in vain at all - I learnt lots! I've updated and added a README.
Looks good! Two small nits; with green CI this is good to merge.
Co-authored-by: ebsmothers <ebs@meta.com>
…into mistral-tests
Thanks again for your review @ebsmothers :)
Context
What is the purpose of this PR?
Please link to any issues this PR addresses.
#848
Changelog
I've started adding scripts to verify the implementation of `mistral`. I'm using the reference implementation from the official repo. There's another implementation in the repo which uses `xformers` for the attention mechanism, but it's not straightforward to replicate; I ran into lots of issues when I initially tried.

So far, I've added a script to compare the attention implementation. I've verified that the attention implementation produces consistent outputs using `python -m tests.torchtune.models.mistral.scripts.compare_attention`. I'll be keeping the reference implementation in `tests/torchtune/models/mistral/scripts/mistral_reference.py`.
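At a high level, that attention comparison follows the same pattern as the other scripts: feed identical seeded activations to both attention modules and compare the outputs. A simplified sketch of the pattern; the call signatures of the real modules differ (for example, the reference attention also takes rotary-embedding inputs):

```python
import torch


def compare_attention(torchtune_attn, reference_attn, embed_dim=64, seq_len=16):
    """Drive both attention modules with the same random activations."""
    torch.manual_seed(42)
    x = torch.randn(1, seq_len, embed_dim)
    with torch.no_grad():
        reference_out = reference_attn(x)
        torchtune_out = torchtune_attn(x)
    # A tight tolerance catches subtle mismatches, e.g. in how RoPE is applied.
    torch.testing.assert_close(torchtune_out, reference_out, atol=1e-6, rtol=1e-6)
    print(f"attention outputs match; max abs value = {torchtune_out.abs().max():.6f}")
```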
Next steps
I'm generally following this process: the plan is to continue copying and testing the components of the mistral implementation, then testing the models as a whole by implementing a mapping from `torchtune.models.mistral` into the reference implementation. Finally, I'll add unit tests to integrate into CI.

Good to make sure I'm not too far off the mark :)