Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weights are not copied when model is already instantiated #279

Closed
ByronHsu opened this issue Sep 27, 2024 · 1 comment
Closed

Weights are not copied when model is already instantiated #279

ByronHsu opened this issue Sep 27, 2024 · 1 comment
Assignees

Comments

@ByronHsu
Copy link
Collaborator

ByronHsu commented Sep 27, 2024

🐛 Describe the bug

In #199, we introduced an api to replace the already instantiated nn.module with liger module. However, the weights are not copied over, so it causes issue #268 and #257.

Solutions

  1. only patch with the forward function of liger module
  2. if patching globally with the forward function of liger module, do we still need the post-init patching?

Reproduce

No response

Versions

na

@shimizust
Copy link
Collaborator

Thanks for finding the issue and suggesting a solution @ByronHsu . Here is the fix (waiting on other fixes to make all the convergence tests pass): #280

We still need post-init patching--patching the forward method globally won't replace the forward method of the modules on the already instantiated model.

ByronHsu pushed a commit that referenced this issue Sep 30, 2024
## Summary
- Previously, the pre-trained weights were not being loaded if patching
model post-initialization
- Instead of loading weights, just patch the model instance module's
forward method (see #279)

## Testing Done
- In convergence tests, check that pre-init patching and post-init
patching match results from original model

- Hardware Type: A100
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence --> most tests
working, waiting for other fixes for all tests to pass
tyler-romero pushed a commit to tyler-romero/Liger-Kernel that referenced this issue Oct 1, 2024
## Summary
- Previously, the pre-trained weights were not being loaded if patching
model post-initialization
- Instead of loading weights, just patch the model instance module's
forward method (see linkedin#279)

## Testing Done
- In convergence tests, check that pre-init patching and post-init
patching match results from original model

- Hardware Type: A100
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence --> most tests
working, waiting for other fixes for all tests to pass
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants