support phi 3 #27

Merged · 3 commits merged into main on Jun 20, 2024
Conversation

ZX-ModelCloud (Contributor)

No description provided.

@Qubitium (Collaborator)

@davidgxue We have validated this PR, with modifications, on a real model quant with pre/post PPL. Ready for merge. Thank you for the Phi3 code!

@Qubitium Qubitium merged commit fd27156 into main Jun 20, 2024
2 of 3 checks passed
@Qubitium Qubitium deleted the zx_support_phi_3 branch June 20, 2024 13:10
@Qubitium (Collaborator)
Test Result:

Phi-3-medium-4k-instruct
Native PPL: 4.189669132232666
Format gptq_v2, Quantized PPL: 4.834439754486084
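
For context, PPL figures like the ones above typically come from a fixed-window perplexity evaluation over a held-out corpus. Below is a minimal sketch of how such a number can be reproduced with transformers; the model id, dataset (wikitext-2), context length, and stride are illustrative assumptions, so the result will not match the PR's numbers exactly unless the same settings are used.

```python
# Hedged sketch: fixed-stride perplexity of a causal LM on wikitext-2.
# Swap model_id for the quantized checkpoint path to get the post-quant PPL.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-medium-4k-instruct"  # illustrative; use your own path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

max_len, stride = 4096, 4096  # non-overlapping 4k windows
nlls, n_tokens = [], 0
for start in range(0, input_ids.size(1) - 1, stride):
    chunk = input_ids[:, start : start + max_len].to(model.device)
    if chunk.size(1) < 2:  # nothing left to predict in this window
        break
    with torch.no_grad():
        # labels == input_ids makes the model return the mean next-token NLL
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss * (chunk.size(1) - 1))
    n_tokens += chunk.size(1) - 1

print("PPL:", torch.exp(torch.stack(nlls).sum() / n_tokens).item())
```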

@davidgxue (Contributor)

Awesome work, and thank you for adding me on here haha! The AutoGPTQ library still seems glitchy? Both LLAMA 3 and Phi3 quants from AutoGPTQ have gibberish output issues when used with transformers... Hopefully this repo is better maintained!

@Qubitium (Collaborator)

Qubitium commented Jun 21, 2024

@davidgxue Phi3 appears to contain fused modules and, in our experience, any fused module results in less-than-ideal calibration. That's why the Phi-3 post-quant PPL is much higher (diff vs pre-quant) than for other models. Somewhat worse output is therefore expected, but outright bad/gibberish output does point to a separate issue.
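
To make the "fused modules" point concrete, here is a small sketch that lists the projection layers in one Phi-3 decoder block; it assumes a transformers version with native Phi3 support and uses Phi-3-mini purely as an example. Phi-3 fuses the q/k/v projections into a single qkv_proj and the MLP gate/up projections into gate_up_proj, which is what makes per-projection GPTQ calibration coarser than on models with separate projections.

```python
# Hedged sketch: inspect Phi-3's fused projection modules without downloading
# weights (from_config builds the architecture with random parameters).
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
model = AutoModelForCausalLM.from_config(config)

for name, module in model.model.layers[0].named_modules():
    if name.endswith("_proj"):
        print(name, tuple(module.weight.shape))
# Expected names include self_attn.qkv_proj and mlp.gate_up_proj, the fused modules.
```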

Can you provide me with examples of the gibberish output (was it generated with the native or the gptq-quantized models)? Our team has deep expertise in gptq quants, so we may be able to track down the source of the bug.

I would highly recommend re-quanting them with GPTQModel: compared to AutoGPTQ, our code will in many cases generate higher-quality quants by default due to the optimizations we have made. We have also added lots of protection against setting toggles that produce bad quants.
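
For anyone re-quantizing along these lines, a minimal sketch of a GPTQModel run is below. The class and method names (QuantizeConfig, GPTQModel.load, quantize, save) follow the library's documented usage but may differ between versions, and the calibration texts shown are placeholders only; a real run should use a few hundred representative samples.

```python
# Hedged sketch: re-quantizing Phi-3 with GPTQModel; verify the API against the
# installed version before relying on it.
from gptqmodel import GPTQModel, QuantizeConfig

model_id = "microsoft/Phi-3-medium-4k-instruct"
quant_path = "Phi-3-medium-4k-instruct-gptq-4bit"  # illustrative output dir

# 4-bit with group size 128 is a common GPTQ configuration.
quant_config = QuantizeConfig(bits=4, group_size=128)

# Placeholder calibration data; replace with a proper calibration set
# (e.g. samples drawn from wikitext-2 or c4).
calibration_dataset = [
    "GPTQModel is a fork of AutoGPTQ focused on quant quality and maintenance.",
    "Phi-3-medium is an instruction-tuned model from Microsoft.",
]

model = GPTQModel.load(model_id, quant_config)  # load the fp16 model + quant config
model.quantize(calibration_dataset)             # run GPTQ calibration
model.save(quant_path)                          # write the quantized weights
```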

Hopefully this repo is better maintained!

This is our pledge, mission statement, and core reason why I pushed our team to fork AutoGPTQ.

@davidgxue (Contributor)

Oh yeah, the fused modules part is definitely expected.

What I meant is that LLAMA 3 and Phi 3 (well, I guess they are both llama-family architectures) literally do not work. I have had this issue open for a long time on the AutoGPTQ repo, and I'm not the only one: I released a few GPTQ quants on Hugging Face, and I keep getting DMs and Hugging Face issues from people experiencing the exact same thing.
AutoGPTQ/AutoGPTQ#657

This is our pledge, mission statement, and core reason why I pushed our team to fork AutoGPTQ.

Love it! Will follow you guys and see if there is anything I can help out with!

DeJoker pushed a commit to DeJoker/GPTQModel that referenced this pull request Jul 19, 2024
* initial attempt to add phi 3 (may need unfuse)

* fix Phi3GPTQ base_modules

---------

Co-authored-by: David Xue <xuegdxw@gmail.com>