Throw a warning if only the IGNORE field is mismatched #252

Merged
merged 6 commits into from
Dec 8, 2023

Conversation

stefan-apollo
Collaborator

Only throw a warning if the IGNORE field is mismatched, instead of raising an error.

Description

Allows loading transformers that use a different causal masking strategy (-1e5 rather than -inf). This is useful for loading models saved before the convention change, provided they do not break with -1e5 (such as the mod add models). It should not be used for Pythia.

Prints a warning whenever this occurs.
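
As a rough illustration of the behaviour described above, the check could look like the following. This is a minimal sketch, not the code in this PR: the function name `check_buffers_match`, the use of plain dicts of float lists in place of model buffers, and the choice of `ValueError` are all assumptions made for the example.

```python
import warnings


def check_buffers_match(seq_buffers, tlens_buffers):
    """Compare named buffers from two models (hypothetical helper).

    A mismatch in a buffer whose name ends in "IGNORE" only triggers a
    warning; any other mismatch raises an error.
    """
    for name, seq_value in seq_buffers.items():
        tlens_value = tlens_buffers[name]
        matches = len(seq_value) == len(tlens_value) and all(
            a == b for a, b in zip(seq_value, tlens_value)
        )
        if matches:
            continue
        message = f"Buffer {name} does not match between seq_model and tlens_model"
        if name.endswith("IGNORE"):
            # Masking-value mismatch (e.g. -1e5 vs -inf): warn and carry on.
            warnings.warn(message)
        else:
            raise ValueError(message)


# The IGNORE buffers differ, so this warns instead of raising.
check_buffers_match(
    {"attn.IGNORE": [float("-inf")], "attn.mask": [1.0, 0.0]},
    {"attn.IGNORE": [-1e5], "attn.mask": [1.0, 0.0]},
)
```

Keeping the error for every other buffer means only the masking-value difference is tolerated.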

Motivation and Context

I sometimes need to load old mod add models and want to be able to reproduce their results, so this is for backwards compatibility.

How Has This Been Tested?

I can load models now; I checked that mod add does not have large attention scores.
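
Why the size of the attention scores matters can be sketched with a toy masked softmax. This is an illustrative example in plain Python, not code from the repository; `softmax`, `masked_softmax`, and the example scores are assumptions chosen to show the effect.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax(scores, keep_mask, ignore_value):
    """Replace masked-out scores with ignore_value, then apply softmax."""
    masked = [s if keep else ignore_value for s, keep in zip(scores, keep_mask)]
    return softmax(masked)


keep_mask = [True, True, False]  # the last position is causally masked

# Moderate scores: -1e5 and -inf give the same attention pattern.
small = [1.0, 2.0, 3.0]
print(masked_softmax(small, keep_mask, float("-inf")))
print(masked_softmax(small, keep_mask, -1e5))

# Scores far below -1e5: the finite mask value now dominates, and the
# masked position receives the attention weight.
large = [-2e5, -2e5, 0.0]
print(masked_softmax(large, keep_mask, float("-inf")))  # [0.5, 0.5, 0.0]
print(masked_softmax(large, keep_mask, -1e5))           # [0.0, 0.0, 1.0]
```

With moderate scores the two masking values are interchangeable, which is the situation the check above is meant to confirm for the mod add models.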

Does this PR introduce a breaking change?

No

f"Buffer {seq_param_name} does not match between seq_model and tlens_model"
)
if seq_param_name.endswith("IGNORE"):
print(
Collaborator

Please make this a warnings.warn call.

Collaborator

Ideally would fix here too :)

print(

@stefan-apollo
Collaborator Author

Implemented logging as requested on Slack, see here for screenshots.

@stefan-apollo stefan-apollo merged commit c0a5f16 into main Dec 8, 2023
1 check passed
@stefan-apollo stefan-apollo changed the title Only throw a warning if the IGNORE field is mismatched Throw a warning if only the IGNORE field is mismatched Dec 8, 2023
nix-apollo added a commit that referenced this pull request Dec 11, 2023
* origin/main:
  Utility to get rib activations (#242)
  Only throw a warning if the IGNORE field is mismatched (#252)
  Normalize after squaring in jacobian squared (#247)
  Add option to add node_labels to graph (#250)