[MODEL] Add qwen moe support #24
Conversation
@bozheng-hit We have refactored your code to be more generic so that the same …
TEST PASSED: Native PPL: 6.4653. Note we only made a rudimentary quant for testing, not a full-quality quant; the PPL value here is only a regression/sanity check.
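The sanity metric above is just perplexity: the exponential of the mean per-token negative log-likelihood over a held-out text. A minimal sketch of that calculation (the function name and inputs are ours, not the repo's `perplexity.py`):

```python
import math

def perplexity(token_nlls, n_tokens):
    """PPL = exp(mean negative log-likelihood per token).

    `token_nlls` is a list of per-token NLLs (natural log) collected
    from a causal LM over an evaluation corpus; `n_tokens` is the
    total token count they cover.
    """
    return math.exp(sum(token_nlls) / n_tokens)

# If every token had probability 1/2, mean NLL is log(2) and PPL is 2.
ppl = perplexity([math.log(2.0)] * 4, 4)
```

A regression check like the one in this PR compares such a PPL value against a known-good baseline for the same (rudimentary) quant config, flagging large drifts rather than asserting absolute quality.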
* support Qwen2MoE
* add description
* change order
* add type and type hint
* fix wrong args name and order
* fix need Qwen2MoeGPTQ
* update README.md
* add require_true_sequential property
* add dynamic_expert_layer_index property
* add todo
* shorten name
* use getattr()
* reduce torch/triton requirements to torch/triton 2.0.0
* rename inside_layer_modules to layer_modules
* print layers log
* Update perplexity.py
* Update model.py
* rename
* remove unused log

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Bo Zheng <368586905@qq.com>
Co-authored-by: LRL-ModelCloud <lrl@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: diegomontoya <xing@fictionpress.com>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
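The commits above hint at the MoE-specific wrinkle: unlike a dense model, the quantizable projections in a Qwen2MoE block live under a variable number of experts, so module paths must be templated on an expert index and resolved dynamically (hence `dynamic_expert_layer_index` and the `getattr()` commit). A hypothetical sketch of that idea; the path lists, names, and helpers below are illustrative, not GPTQModel's actual definitions:

```python
from types import SimpleNamespace

# Illustrative quantization-order groups for one decoder layer.
# `{i}` marks an expert-indexed path that must be expanded at
# runtime once the expert count is known from the model config.
LAYER_MODULES = [
    ["self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj"],
    ["self_attn.o_proj"],
    ["mlp.experts.{i}.gate_proj", "mlp.experts.{i}.up_proj"],
    ["mlp.experts.{i}.down_proj"],
]

def expand_expert_paths(groups, num_experts):
    """Expand `{i}` templates into one concrete path per expert."""
    expanded = []
    for group in groups:
        paths = []
        for path in group:
            if "{i}" in path:
                paths.extend(path.format(i=i) for i in range(num_experts))
            else:
                paths.append(path)
        expanded.append(paths)
    return expanded

def resolve_module(root, dotted_path):
    """Walk a dotted path with getattr(), indexing sequences by number."""
    obj = root
    for part in dotted_path.split("."):
        obj = obj[int(part)] if part.isdigit() else getattr(obj, part)
    return obj
```

Keeping the templated lists declarative per model class, and expanding them generically, is what lets the same quantization loop serve both dense and MoE architectures.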
No description provided.