Adding Qwen2.5 #1834

Open · wants to merge 14 commits into main

Conversation

@ysjprojects commented Nov 20, 2024

see #1709

Qwen2.5

0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B

Qwen2.5 Coder

0.5B, 1.5B, 3B, 7B, 14B, 32B

Both base and instruct models.

Motivation:

  • Proven SOTA coding performance, especially from the Qwen2.5-Coder series.
  • One of the more recent open-source LM releases with strong performance on general benchmarks, competitive with proprietary models.
  • Notably strong performance on Chinese benchmarks.
  • A SOTA model family that goes as small as 0.5B, which is rare and will serve many use cases for small LMs.
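
For illustration, here is a rough sketch of how one of the new checkpoints could be used through litgpt's high-level Python API once this PR is merged. The repo ID and the `LLM.load`/`generate` calls below mirror how other checkpoints are already exposed in litgpt; treat them as assumptions rather than part of this PR's diff.

```python
# Minimal usage sketch (assumed repo ID and API, mirroring existing litgpt checkpoints).
from litgpt import LLM

# Downloads and converts the Hugging Face checkpoint on first use.
llm = LLM.load("Qwen/Qwen2.5-0.5B-Instruct")

# Generate a short completion with the instruct model.
print(llm.generate("Write a Python function that reverses a string.", max_new_tokens=100))
```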

@Andrei-Aksionov (Collaborator) left a comment

Hello @ysjprojects 👋

Thanks for the PR.
I believe we've wanted to implement Qwen models for quite a while, but never did.

Overall, it looks wonderful.
I just added a couple of nits.

Review threads on litgpt/prompts.py (outdated, resolved), tests/test_tokenizer.py (outdated, resolved), and litgpt/model.py (resolved).
@Andrei-Aksionov (Collaborator)

One more thing: please update the description of the PR with more info about the model.

@Andrei-Aksionov (Collaborator)

I did a quick check of the 0.5B and 1.5B instruct versions (with the fix for the conversion script).
I don't know about other languages, but in English even 0.5B performs surprisingly well 🙂.

@ysjprojects After you apply the fix that I've mentioned in the comment, I'll be happy to merge the PR.

@ysjprojects (Author)

One more thing: please update the description of the PR with more info about the model.

Are there any specific details that should be included?

@ysjprojects (Author)

I did a quick check of the 0.5B and 1.5B instruct versions (with the fix for the conversion script). I don't know about other languages, but in English even 0.5B performs surprisingly well 🙂.

@ysjprojects After you apply the fix that I've mentioned in the comment, I'll be happy to merge the PR.

fixed

@Andrei-Aksionov (Collaborator)

One more thing: please update the description of the PR with more info about the model.

Are there any specific details that should be included?

It would be nice if you added why you decided to add this exact model.
Coding abilities, excellent support of Chinese, ...
Just a bit of context.
