Hello, Qwen2 implements its attention computation with GQA. Our LLaMA implementation already supports GQA, so it can handle GQA models such as LLaMA-3. Since the Qwen2 architecture is very similar to LLaMA, you can add Qwen2 support by extending our LLaMA implementation.
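For reference, GQA simply shares each key/value head across a group of query heads, which are expanded to the full query-head count before standard attention. Below is a minimal, self-contained sketch of that idea; it is illustrative only (not this repo's code), and the tensor names and sizes are assumptions:

```python
import torch
import torch.nn.functional as F

def repeat_kv(x: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Expand (batch, num_kv_heads, seq, head_dim) to (batch, num_kv_heads * n_rep, seq, head_dim)."""
    b, h_kv, s, d = x.shape
    if n_rep == 1:
        return x
    return x[:, :, None, :, :].expand(b, h_kv, n_rep, s, d).reshape(b, h_kv * n_rep, s, d)

# Illustrative sizes: 32 query heads sharing 8 KV heads (as in LLaMA-3-8B).
batch, seq, head_dim = 1, 16, 128
num_q_heads, num_kv_heads = 32, 8

q = torch.randn(batch, num_q_heads, seq, head_dim)
k = torch.randn(batch, num_kv_heads, seq, head_dim)
v = torch.randn(batch, num_kv_heads, seq, head_dim)

# Repeat each KV head so every query head has a matching K/V head, then attend as usual.
k = repeat_kv(k, num_q_heads // num_kv_heads)
v = repeat_kv(v, num_q_heads // num_kv_heads)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 32, 16, 128])
```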
Hi, which transformers version is this meant for? I am using the latest (4.46.2) but get: `TypeError: LlamaRotaryEmbedding.forward() got an unexpected keyword argument 'seq_len'`. It looks like an older version such as 4.37 may be required, but I quantized LLaMA-3.1-8B-Instruct.
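For context, that error usually comes from the rotary-embedding API change in newer transformers: releases around 4.37 and earlier accepted a `seq_len` keyword in `LlamaRotaryEmbedding.forward()`, while recent ones (such as 4.46.x) expect `position_ids` instead. Pinning the older release should avoid it; alternatively, a small compatibility wrapper could paper over the difference. This is a hedged sketch under those assumptions, and `get_cos_sin` is a hypothetical helper name, not part of this repo or of transformers:

```python
# Simplest fix is to pin the older release the code was written against:
#   pip install "transformers==4.37.2"
import torch

def get_cos_sin(rotary_emb, x, seq_len):
    """Call a LlamaRotaryEmbedding with whichever signature the installed transformers expects."""
    position_ids = torch.arange(seq_len, device=x.device).unsqueeze(0)
    try:
        # Newer transformers: forward(x, position_ids)
        return rotary_emb(x, position_ids)
    except TypeError:
        # Older transformers (<= 4.37): forward(x, seq_len=...)
        return rotary_emb(x, seq_len=seq_len)
```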