GPTQModel v0.9.0
What's Changed (First Release since AutoGPTQ fork)
4 New Models plus sym=False asymmetry and lm_head quantized inference support.
- ✨ [FEATURE/BUG] sym=False support by @qwopqwop200, @Qubitium, @fxmarty
- ✨ [FEATURE] lm_head quantization inference by @Qubitium
- 🚀 [MODEL] ChatGLM by @LRL-ModelCloud, @Qubitium
- 🚀 [MODEL] MiniCPM model support by @LDLINGLINGLING, @Qubitium in #18
- 🚀 [MODEL] Phi-3 model support by @davidgxue, @ZX-ModelCloud in #27
- 🚀 [MODEL] QwenMoE model support by @bozheng-hit, @LRL-ModelCloud in #24
- 🚀 [CORE] Faster quantization and better quality (PPL) quant by @Qubitium
- 👾 [BUG] H100 crash by @Qubitium
- 👾 [BUG] Packing perf regression on high core-count systems by @Qubitium
- 🚀 [REFACTOR] Major refactor and code debloat by @Qubitium
- 🤖 [CI] Code quality by @CSY-ModelCloud in #31
- 🤖 [CI] Add Perplexity regression test by @LRL-ModelCloud in #1
- 🤖 [CI] Add Runner by @CSY-ModelCloud in #3
Full Changelog: https://github.com/ModelCloud/GPTQModel/commits/v0.9.0