[Feature] support Nemo #2094

Open

lihan opened this issue Jul 21, 2024 · 4 comments

lihan commented Jul 21, 2024

Motivation

Mistral NeMo is the best multilingual open-source small LM.

https://mistral.ai/news/mistral-nemo/

Related resources

No response

Additional context

No response

Collaborator

zhyncs commented Jul 21, 2024

The structure is essentially the same as Llama's, so supporting it shouldn't be too difficult. Please stay tuned. Ref: https://github.com/vllm-project/vllm/pull/6548/files

maxin9966 commented Jul 21, 2024

@zhyncs Does it support the AWQ format?

lvhan028 assigned AllentDan

Collaborator

AllentDan commented Jul 25, 2024

AllentDan@d14ce34

I found that Mistral and Llama use different input_dim values for the attention.output layer. Do we need another type or name argument to distinguish the model architecture? @lvhan028 @lzhangzz
https://github.com/huggingface/transformers/blob/main/src/transformers/models/mistral/modeling_mistral.py#L199
https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L303

Collaborator

lzhangzz commented Jul 31, 2024

The input dim of attention.output should be computed as head_num * head_dim. The use of hidden_units_ is a bug.
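
To illustrate the point above, here is a minimal Python sketch of why the two computations diverge. The `AttnConfig` class and `attn_output_input_dim` helper are hypothetical names made up for this example and are not part of the project's code. The NeMo-like numbers (hidden size 5120, 32 heads, head dim 128) are assumed from the published model config, so treat them as illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttnConfig:
    # Hypothetical config holder, for illustration only.
    hidden_size: int
    head_num: int
    head_dim: Optional[int] = None  # None: derive it from hidden_size


def attn_output_input_dim(cfg: AttnConfig) -> int:
    """Input dim of attention.output, computed as head_num * head_dim."""
    head_dim = cfg.head_dim
    if head_dim is None:
        # Llama-style default: head_dim is implied by the hidden size.
        head_dim = cfg.hidden_size // cfg.head_num
    return cfg.head_num * head_dim


# Llama-like: head_num * head_dim happens to equal hidden_size.
llama_like = AttnConfig(hidden_size=4096, head_num=32)

# NeMo-like (assumed values): an explicit head_dim breaks that equality.
nemo_like = AttnConfig(hidden_size=5120, head_num=32, head_dim=128)

print(attn_output_input_dim(llama_like), llama_like.hidden_size)
print(attn_output_input_dim(nemo_like), nemo_like.hidden_size)
```

With a Llama-like config the two quantities coincide, which is presumably why using the hidden size went unnoticed. Once head_dim is set explicitly, only head_num * head_dim gives the right shape.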

AllentDan mentioned this issue

Fix hidden size and support mistral nemo #2215

Merged
