
Add NormalizedConfig support for qwen, baichuan, chatglm #1490

Open · wants to merge 8 commits into base: main
5 changes: 5 additions & 0 deletions optimum/utils/normalized_config.py
@@ -262,6 +262,11 @@ class NormalizedConfigManager:
"whisper": WhisperLikeNormalizedTextConfig,
"xlm-roberta": NormalizedTextConfig,
"yolos": NormalizedVisionConfig,
"mpt": MPTNormalizedTextConfig,
Contributor commented:

mpt is already in the list.

"baichuan": NormalizedTextConfig,
"qwen": NormalizedTextConfig,
"chatglm": NormalizedTextConfig.with_args(num_layers="num_layers"),
Collaborator @echarlaix commented on Oct 31, 2023:

What about the vocab size, shouldn't it be padded_vocab_size for ChatGLM models?
https://huggingface.co/THUDM/chatglm3-6b/blob/main/config.json#L32

Contributor (author) replied:

ChatGLM has three generations: the original chatglm uses vocab_size, while chatglm2 and chatglm3 use padded_vocab_size. Could you help me handle this situation? @echarlaix
https://huggingface.co/THUDM/chatglm-6b/blob/main/config.json#L27
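
One way this could be handled is sketched below: a ChatGLM-specific subclass that resolves padded_vocab_size when the config exposes it and falls back to vocab_size otherwise. This is a minimal sketch, not part of this PR; the class name is hypothetical and it assumes NormalizedConfig keeps the wrapped transformers config on self.config.

```python
from optimum.utils.normalized_config import NormalizedTextConfig


class ChatGLMNormalizedTextConfig(NormalizedTextConfig):
    """Hypothetical config covering chatglm, chatglm2 and chatglm3 (illustration only)."""

    NUM_LAYERS = "num_layers"

    @property
    def vocab_size(self):
        # chatglm2/3 expose `padded_vocab_size`; the original chatglm exposes `vocab_size`.
        # `self.config` is assumed to hold the wrapped transformers config.
        if hasattr(self.config, "padded_vocab_size"):
            return self.config.padded_vocab_size
        return self.config.vocab_size
```

The "chatglm" entry above would then point to such a class instead of NormalizedTextConfig.with_args(num_layers="num_layers").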

Contributor commented on lines +266 to +268:

qwen2 is now available in transformers; baichuan and chatglm are not.


}

@classmethod
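
For reference, a short sketch of how the new entries would be consumed downstream once the mapping is in place; the checkpoint id is only an example, and these remote-code architectures need trust_remote_code.

```python
from transformers import AutoConfig
from optimum.utils.normalized_config import NormalizedConfigManager

# Example checkpoint; any qwen/baichuan/chatglm model id works the same way.
config = AutoConfig.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

# config.model_type ("qwen" here) is the key looked up in the mapping above.
normalized_config_class = NormalizedConfigManager.get_normalized_config_class(config.model_type)
normalized_config = normalized_config_class(config)

print(normalized_config.num_layers, normalized_config.num_attention_heads)
```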