Update get_unpad_data
patching for multipack
#2013
Conversation
modeling_arch = importlib.import_module(module_name)
modeling_arch._get_unpad_data = get_unpad_data  # pylint: disable=protected-access
if hasattr(modeling_arch, "_get_unpad_data"):
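The diff above patches a private helper in a dynamically imported transformers modeling module. A minimal, self-contained sketch of that pattern (the `get_unpad_data` body and the `patch_unpad_data` helper name are illustrative, not axolotl's actual API):

```python
import importlib


def get_unpad_data(attention_mask):
    """Stand-in for axolotl's multipack-aware unpad helper (hypothetical body)."""
    ...


def patch_unpad_data(module_name: str) -> bool:
    # Import the modeling module for the architecture, then replace its
    # private _get_unpad_data only when that symbol actually exists.
    modeling_arch = importlib.import_module(module_name)
    if hasattr(modeling_arch, "_get_unpad_data"):
        modeling_arch._get_unpad_data = get_unpad_data  # pylint: disable=protected-access
        return True
    return False
```

Guarding with `hasattr` keeps the patch a no-op on modules that never defined the helper, which is the situation the review comments below discuss.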
Is any handling needed if `_get_unpad_data` is not available? For example, throw an error to say packing is not available?
Wanted to get an opinion on this; I am not sure myself. Maybe it can be an error for known model types where we expect it, and a warning for others?
Hm, I think it's good to have a check in case upstream changes a model type we support without our knowledge.

    else:
        if model_type in list_of_known_working_types:
            raise Error
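Fleshing out the suggestion above, one way to combine the two behaviors (error for known-working model types, warning for unknown ones) could look like this; the list contents and function name are hypothetical, not axolotl's actual code:

```python
import logging

LOG = logging.getLogger(__name__)

# Illustrative list of model types whose modeling code is known to expose
# _get_unpad_data; not axolotl's real list.
KNOWN_PATCHABLE_MODEL_TYPES = {"llama", "mistral", "mixtral"}


def check_unpad_patchable(modeling_arch, model_type: str) -> bool:
    if hasattr(modeling_arch, "_get_unpad_data"):
        return True
    if model_type in KNOWN_PATCHABLE_MODEL_TYPES:
        # Upstream likely changed a model type we support: fail loudly.
        raise RuntimeError(
            f"_get_unpad_data not found for {model_type}; "
            "multipack sample packing is unavailable"
        )
    # Unknown model type: warn and continue without packing.
    LOG.warning("multipack patching skipped for model type %s", model_type)
    return False
```

This catches the case the comment raises: upstream silently renaming the helper in a supported architecture surfaces as a hard error instead of silently broken packing.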
I think once we address the last 2 suggestions, this should be good to go. We should also make sure we manually run the multigpu tests before merging this PR.
I think we should add a small e2e test for this. Perhaps similar to
Manually Tested CI:
Edit: Having some issues verifying the last two runs; re-running. Edit 2: Double-checked, those tests are working.
* Update `get_unpad_data` patching for multipack
* Update src/axolotl/utils/models.py
* Update src/axolotl/utils/models.py
* Add test case

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Wing Lian <wing@axolotl.ai>
Fixes #1991 (at least attempts to)
The code for checking whether a model has remote code is flawed: it assumes that if someone is passing `trust_remote_code`, the model must have remote code. That causes the `get_unpad_data` patching to be missed when someone passes `trust_remote_code: True`, breaking multipack sample packing with flash attention. This PR changes the check to rely on `auto_map` defined in the model's `config.json`.

Additionally, this removes the pre-4.43 transformers patching code.
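A minimal sketch of the intended check, assuming only that a model with remote code declares an `auto_map` section in its config (the function name is hypothetical; axolotl's real implementation may differ):

```python
def has_remote_code(model_config) -> bool:
    # A model actually ships remote code only when its config.json defines
    # an auto_map section mapping Auto* classes to custom modules. Merely
    # passing trust_remote_code=True does not imply remote code exists.
    auto_map = getattr(model_config, "auto_map", None)
    return bool(auto_map)
```

With this, a user setting `trust_remote_code: True` on a stock architecture no longer short-circuits the `get_unpad_data` patching, since the decision follows the config rather than the user flag.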
How has this been tested?
So far, manually, but it would be nice to add tests. First, I would like to get some review feedback.
Requesting review from @winglian and @NanoCode012