Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) (#339)
* Fix XFormers attention for Llama-2 70B (GQA)

  Updated the XFormers monkeypatch to handle GQA as used in Llama-2 70B. All of the updated code is taken directly from the Transformers library's modeling_llama.py: huggingface/transformers@07360b6#diff-06392bad3b9e97be9ade60d4ac46f73b6809388f4d507c2ba1384ab872711c51 (the key/value expansion is sketched below).

* Catch configs without pretraining_tp

* Whitespace bug fix

  A command had accidentally been moved out of its if-else block.

* pre-commit formatting fixes

  Thanks to @winglian
1 changed file with 66 additions and 17 deletions.
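The core of the GQA fix is tiling the key/value heads up to the query-head count before the attention call: Llama-2 70B has 64 query heads but only 8 key/value heads, so K and V must each be repeated 8x to line up. Below is a minimal sketch following the `repeat_kv` helper in Transformers' modeling_llama.py that the commit pulls from; the demo head counts and the `getattr` guard at the end are illustrative assumptions, not the exact patched code.

```python
import torch


def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Equivalent of torch.repeat_interleave(hidden_states, n_rep, dim=1):
    (batch, num_key_value_heads, seq_len, head_dim)
    -> (batch, num_key_value_heads * n_rep, seq_len, head_dim)."""
    batch, num_key_value_heads, slen, head_dim = hidden_states.shape
    if n_rep == 1:
        # Plain multi-head attention: query and K/V head counts already match.
        return hidden_states
    # Insert a repeat axis, broadcast-expand it, then fold it into the head axis.
    hidden_states = hidden_states[:, :, None, :, :].expand(
        batch, num_key_value_heads, n_rep, slen, head_dim
    )
    return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim)


# Tiny demo with Llama-2 70B head counts (64 query heads, 8 KV heads).
batch, seq_len, head_dim = 2, 16, 128
num_attention_heads, num_key_value_heads = 64, 8
n_rep = num_attention_heads // num_key_value_heads  # 8

key_states = torch.randn(batch, num_key_value_heads, seq_len, head_dim)
key_states = repeat_kv(key_states, n_rep)
assert key_states.shape == (batch, num_attention_heads, seq_len, head_dim)

# The "catch configs without pretraining_tp" bullet amounts to a defaulted
# attribute lookup, since older LlamaConfig objects lack the field:
# pretraining_tp = getattr(config, "pretraining_tp", 1)
```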