Fix huggingface GA issue for llama (#333)
## Summary

Fixes #322.

This PR introduces a new `lce_forward` compatible with `transformers>=4.46.0` (after the gradient accumulation fix) while preserving backward compatibility. Specifically, the original flce is left untouched and a new one is written for `4.46.0`. If the installed HF version is `<4.46.0`, a deprecation warning is logged and the old flce is used as a fallback.

```python
if transformer_version >= version.parse("4.46.0"):
    modeling_llama.LlamaForCausalLM.forward = llama_lce_forward
else:  # if version < 4.46.0
    logger.warning(
        "Support for transformers versions < 4.46.0 will soon be discontinued due to issues with incorrect gradient accumulation. "
        "Please consider upgrading to avoid potential issues. See details: huggingface/transformers#34191"
    )
    modeling_llama.LlamaForCausalLM.forward = llama_lce_forward_deprecated
```

For more context on the gradient accumulation fix, see huggingface/transformers#34191.

## TODO

- [ ] Broadcast the changes to all models once the effect is verified.

## Testing Done

- Hardware Type: <BLANK>
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence
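For reviewers who have not followed the gradient accumulation discussion, here is a toy illustration of the kind of discrepancy involved. It is a sketch, not code from this PR, and it assumes the issue is the one commonly described for the linked upstream fix: averaging the mean loss of each micro-batch differs from a token-weighted mean when micro-batches hold different numbers of non-padded tokens. The loss values and the helper names (`naive_accumulated_loss`, `token_weighted_loss`) are hypothetical.

```python
# Toy per-token losses for two micro-batches of unequal length.
# The numbers are made up purely to make the gap visible.
micro_batches = [
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],  # 6 non-padded tokens
    [8.0, 8.0],                      # 2 non-padded tokens
]


def mean(values):
    return sum(values) / len(values)


def naive_accumulated_loss(batches):
    """Average each micro-batch on its own, then average those averages."""
    return mean([mean(batch) for batch in batches])


def token_weighted_loss(batches):
    """Sum every token loss and divide once by the total token count."""
    total_loss = sum(sum(batch) for batch in batches)
    total_tokens = sum(len(batch) for batch in batches)
    return total_loss / total_tokens


naive = naive_accumulated_loss(micro_batches)    # (2.0 + 8.0) / 2
reference = token_weighted_loss(micro_batches)   # (12.0 + 16.0) / 8
print(f"naive={naive}, token-weighted={reference}")
```

With equal-length micro-batches the two quantities coincide, which is why the discrepancy only shows up when sequence lengths vary across accumulation steps.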
Showing 3 changed files with 150 additions and 2 deletions.
File renamed without changes.