Post-init model patching fix #280
Merged
lancerts previously approved these changes on Sep 29, 2024
shimizust force-pushed the sshimizu/patching-fix branch from fa98b78 to 2b6317e on September 30, 2024 19:01
lancerts approved these changes on Sep 30, 2024
@ByronHsu Confirmed all tests pass locally on A100 with transformers 4.44.2, if you can force merge.
tyler-romero pushed a commit to tyler-romero/Liger-Kernel that referenced this pull request on Oct 1, 2024
## Summary
- Previously, the pre-trained weights were not being loaded if patching model post-initialization
- Instead of loading weights, just patch the model instance module's forward method (see linkedin#279)

## Testing Done
- In convergence tests, check that pre-init patching and post-init patching match results from original model
- Hardware Type: A100
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence --> most tests working, waiting for other fixes for all tests to pass
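
The instance-level patching described in the Summary can be illustrated with a minimal sketch. It uses plain Python stand-ins, not Liger-Kernel or transformers code: `ToyNorm`, `fast_forward`, and `patch_instance_forward` are hypothetical names chosen for illustration, and the actual change in this PR may differ in detail. The point is only that rebinding `forward` on an existing instance leaves its already-loaded weights in place.

```python
import types


class ToyNorm:
    """Hypothetical stand-in for a model submodule.

    `weight` plays the role of a pre-trained parameter.
    """

    def __init__(self, weight):
        self.weight = weight

    def forward(self, x):
        return [self.weight * v for v in x]


def fast_forward(self, x):
    """Hypothetical replacement forward: same math, different implementation."""
    w = self.weight
    return list(map(lambda v: w * v, x))


def patch_instance_forward(module, new_forward):
    """Rebind `forward` on this one instance only.

    No new object is constructed, so attributes such as `weight`
    (the loaded weights in the real setting) are left untouched.
    """
    module.forward = types.MethodType(new_forward, module)
    return module


# Pretend 2.5 was loaded from a checkpoint before patching.
norm = ToyNorm(weight=2.5)
untouched = ToyNorm(weight=2.5)

before = norm.forward([1.0, 2.0])
patch_instance_forward(norm, fast_forward)
after = norm.forward([1.0, 2.0])
```

Comparing `before` and `after` mirrors, in miniature, the convergence check listed under Testing Done: the patched instance should produce the same results as the original one, and other instances of the class keep the original `forward`.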