Throw a warning if only the IGNORE field is mismatched #252
f"Buffer {seq_param_name} does not match between seq_model and tlens_model"
)
if seq_param_name.endswith("IGNORE"):
    print(
Please make this a warnings.warn call.
Ideally would fix here too :)
Line 129 in 7ac5fec
print(
Implemented logging as requested on Slack, see here for screenshots.
Only throw a warning if the IGNORE field is mismatched, instead of raising an error.
Description
Allows loading transformers that use a different causal masking strategy (an IGNORE value of -1e5 rather than -inf). This is useful for loading models saved before the convention change, provided they don't break with -1e5 (such as mod add models). It should not be used for Pythia.
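To illustrate why -1e5 is only safe for some models, here is a minimal sketch in plain Python. It is not code from this repository: `masked_softmax` and its arguments are hypothetical names, and it assumes the IGNORE value is written into masked attention scores before the softmax.

```python
import math

def masked_softmax(scores, allowed, ignore_value):
    """Softmax over `scores`, with disallowed positions set to `ignore_value`.

    Hypothetical helper for illustration only; real models do this on tensors.
    """
    masked = [s if ok else ignore_value for s, ok in zip(scores, allowed)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(m - peak) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]

allowed = [True, True, False]  # the last position is causally masked

# Moderate scores: -1e5 underflows to a weight of exactly 0, same as -inf.
moderate = [1.0, 2.0, 3.0]
same = masked_softmax(moderate, allowed, -1e5) == masked_softmax(
    moderate, allowed, float("-inf")
)

# Extreme scores: the "masked" -1e5 is now the largest entry and takes all the weight.
extreme = [-3e5, -2e5, 0.0]
leaky = masked_softmax(extreme, allowed, -1e5)
```

With moderate scores the two conventions agree, which is the case this PR relies on for mod add models. With scores far below -1e5, the masked position leaks through, which is why the relaxed check should not be used blindly.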
Prints a warning whenever this occurs.
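A rough sketch of the intended behaviour, using `warnings.warn` as the reviewer asked. This is not the actual diff: `check_buffers_match`, the plain-dict buffers, the scalar `!=` comparison, and the `ValueError` exception type are all assumptions made to keep the example self-contained. The real check operates on model buffers (tensors).

```python
import warnings

def check_buffers_match(seq_buffers, tlens_buffers):
    """Compare buffers by name; warn on IGNORE mismatches, raise on any other.

    Hypothetical helper: buffers are name -> float mappings here, not tensors.
    """
    for seq_param_name, seq_value in seq_buffers.items():
        if seq_value == tlens_buffers[seq_param_name]:
            continue
        message = (
            f"Buffer {seq_param_name} does not match between seq_model and tlens_model"
        )
        if seq_param_name.endswith("IGNORE"):
            # Only the masking constant differs: warn instead of failing.
            warnings.warn(message)
        else:
            # Assumed exception type; the PR only says an error is raised.
            raise ValueError(message)
```

Using `warnings.warn` rather than `print` lets callers filter, silence, or escalate the message with the standard `warnings` filters.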
Motivation and Context
I sometimes need to load old mod add models and want to be able to reproduce their results. This keeps backwards compatibility.
How Has This Been Tested?
I can load models now; I checked that mod add does not have large attention scores.
Does this PR introduce a breaking change?
No