[Fix] Fix AttributeError in Qwen2.5 LoRA: 'Qwen2ForCausalLM' object has no attribute 'get_hidden_dim' #1536

Merged · 7 commits merged into sgl-project:main on Oct 3, 2024

Conversation

@mssongit (Contributor) commented Sep 29, 2024

Motivation

This PR fixes an AttributeError that occurs when using LoRA with the Qwen2.5 model. The error stems from calling get_module_name and get_hidden_dim on the Qwen2ForCausalLM model, which implements neither method. This breaks LoRA initialization, specifically when preallocating memory for the LoRA adapters.

Modifications

Add fallback logic for get_module_name and get_hidden_dim (a sketch of the approach follows below).
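
For illustration, a minimal sketch of such a fallback, assuming `get_hidden_dim`, when present, takes a module name and returns that module's hidden dimension, and that `config.hidden_size` is a usable default; the helper name `safe_get_hidden_dim` is hypothetical, not the actual `lora_manager.py` code:

```python
def safe_get_hidden_dim(base_model, module_name: str) -> int:
    # Hypothetical helper illustrating the fallback described in this PR.
    # Prefer the model's own hook when it exists.
    if hasattr(base_model, "get_hidden_dim"):
        return base_model.get_hidden_dim(module_name)
    # Fallback for models such as Qwen2ForCausalLM that do not define it:
    # use the model config's hidden size instead (assumption: this is the
    # right dimension for the module in question).
    return base_model.config.hidden_size
```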

Checklist

  • Format the code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@merrymercy (Contributor) commented
@mssongit Could you fix the lint error?

@mssongit (Contributor, author) commented

> @mssongit Could you fix the lint error?

Sure! I’ve already fixed the lint error by ensuring the code properly checks for the existence of the get_hidden_dim method and falls back to config.hidden_size when necessary. I've also run the linter (black and others), and the issue should now be resolved. Let me know if you need any further adjustments or checks!

@merrymercy merrymercy requested a review from Ying1123 September 30, 2024 03:29
@Ying1123 Ying1123 changed the title Fix AttributeError in Qwen2.5 LoRA: 'Qwen2ForCausalLM' object has no attribute 'get_hidden_dim' [Fix] Fix AttributeError in Qwen2.5 LoRA: 'Qwen2ForCausalLM' object has no attribute 'get_hidden_dim' Sep 30, 2024
@Ying1123 (Member) commented
Hi @mssongit, thanks for the contribution! Could you provide a lora adapter path for testing?

@mssongit (Contributor, author) commented Oct 1, 2024

> Hi @mssongit, thanks for the contribution! Could you provide a lora adapter path for testing?

The base model I am using is Qwen/Qwen2.5-14B-Instruct, and the LoRA adapter has been uploaded to mssongit/Qwen2.5-14B-SFT-LoRA. Feel free to use that for testing!

Commits

- AttributeError: 'Qwen2ForCausalLM' object has no attribute 'get_hidden_dim'
- …'Qwen2ForCausalLM' object has no attribute 'get_hidden_dim'
  - Added a check for the existence of `get_hidden_dim` in `base_model`.
  - Fallback to `base_model.config.hidden_size` if `get_hidden_dim` is not available.
  - This ensures backward compatibility and prevents potential errors in models that don't implement `get_hidden_dim`.
- …'Qwen2ForCausalLM' object has no attribute 'get_module_name'

### Description
- Applied `black` formatting to `lora_manager.py` to ensure consistent code style and pass pre-commit linting checks.
- No functional changes were made, only code style adjustments for compliance with `black` standards.

### Changes
- Reformatted the `lora_manager.py` file using `black`.
- Ensured that the code adheres to the project's style guidelines.

### Testing
- Ran `pre-commit` hooks to ensure `black` formatting and other linting checks pass successfully.
- Optimized `get_hidden_dim` calls: the function is now called only once per module and its result reused, avoiding redundant recomputation.
- Optimized CUDA memory allocation: `torch.empty` allocations now check whether `module_A` and `module_B` already exist in `A_buffer` and `B_buffer`, respectively, preventing redundant allocations (see the sketch after this list).
- Removed redundant code: refactored duplicated memory allocation and `get_stacked_multiply` calls, improving readability and maintainability.
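
A sketch of that guarded-allocation pattern, with hypothetical buffer names, shapes, and dtype chosen only for illustration, not the merged code:

```python
import torch

def ensure_buffer(buffers, key, shape, dtype=torch.float16, device=None):
    # Allocate with torch.empty only if no buffer exists for `key` yet,
    # so repeated calls do not re-allocate.
    if key not in buffers:
        buffers[key] = torch.empty(shape, dtype=dtype, device=device)
    return buffers[key]

# Illustrative usage with made-up LoRA dimensions (rank 16, hidden size 5120):
A_buffer, B_buffer = {}, {}
ensure_buffer(A_buffer, "q_proj", (16, 5120))
ensure_buffer(B_buffer, "q_proj", (5120, 16))
ensure_buffer(A_buffer, "q_proj", (16, 5120))  # no-op: already allocated
```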
- Modified the logic to correctly handle `self.origin_target_modules`, which is a list of module-name strings, not actual module objects.
- Updated the `self.target_modules` initialization (sketched below):
  - If `base_model` has a `get_module_name` method, use it to map `origin_target_modules`.
  - Otherwise, compare the names in `self.origin_target_modules` against those returned by `named_modules()`.
- This resolves the errors seen with models such as Qwen2.5 that do not implement `get_module_name`.
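
A sketch of that mapping fallback; the name `resolve_target_modules` and the substring-matching rule are assumptions for illustration, not the merged code:

```python
def resolve_target_modules(base_model, origin_target_modules):
    # Use the model's own mapping hook when it is implemented.
    if hasattr(base_model, "get_module_name"):
        return {base_model.get_module_name(n) for n in origin_target_modules}
    # Fallback: keep each requested name that actually occurs in one of the
    # paths returned by named_modules(), e.g. "q_proj" matches
    # "model.layers.0.self_attn.q_proj".
    module_paths = [path for path, _ in base_model.named_modules()]
    return {
        target
        for target in origin_target_modules
        if any(target in path for path in module_paths)
    }
```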
@Ying1123 Ying1123 force-pushed the patch-2 branch 4 times, most recently from 42aad95 to 1b70d9c on October 2, 2024, 21:46
@Ying1123 (Member) commented Oct 2, 2024

Hi @mssongit, I have modified your PR directly, so please take a look. There is currently an issue with the base model "Qwen/Qwen2.5-14B-Instruct", which I'm still investigating; the fix for the base model will come in another PR.

@Ying1123 Ying1123 merged commit e6852b0 into sgl-project:main Oct 3, 2024
11 checks passed