GPTQModel v0.9.11
What's Changed
Added LG EXAONE 3.0 model support. Added new dynamic per-layer/module quantization, where each layer/module may be assigned different bits/params. Added proper sharding support to backend.BITBLAS. Quantization errors caused by overly small damp values are now auto-healed by retrying with a larger damp.
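To illustrate the dynamic per-layer/module idea, here is a minimal, self-contained sketch of how per-module overrides could be resolved. The regex-keyed override dict and the helper `resolve_params` are illustrative assumptions, not GPTQModel's exact config schema.

```python
import re

# Base quantization params, plus hypothetical per-module overrides.
# Keys are regex patterns matched against module names (illustrative only).
BASE = {"bits": 4, "group_size": 128}
DYNAMIC = {
    r".*\.mlp\..*": {"bits": 8},                    # give MLP projections more bits
    r".*lm_head.*": {"bits": 8, "group_size": 64},  # and a finer group size here
}

def resolve_params(module_name: str) -> dict:
    """Return the effective quant params for one module: the base values
    updated by the first matching dynamic override, if any."""
    params = dict(BASE)
    for pattern, override in DYNAMIC.items():
        if re.fullmatch(pattern, module_name):
            params.update(override)
            break
    return params

print(resolve_params("model.layers.0.mlp.up_proj"))      # {'bits': 8, 'group_size': 128}
print(resolve_params("model.layers.0.self_attn.q_proj")) # {'bits': 4, 'group_size': 128}
```

Each quantized linear layer would then be packed with its own resolved bits/params rather than one global setting.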
- [CORE] add support for pack and shard to bitblas by @LRL-ModelCloud in #316
- Add dynamic bits by @PZS-ModelCloud in #311, #319, #321, #323, #327
- [MISC] Adjust the validate order of QuantLinear when BACKEND is AUTO by @ZX-ModelCloud in #318
- add save_quantized log model total size by @PZS-ModelCloud in #320
- Auto damp recovery by @CSY-ModelCloud in #326
- [FIX] add missing original_infeatures by @CSY-ModelCloud in #337
- Update Transformers to 4.44.0 by @Qubitium in #336
- [MODEL] add exaone model support by @LRL-ModelCloud in #340
- [CI] Upload wheel to local server by @CSY-ModelCloud in #339
- [MISC] Fix assert by @CSY-ModelCloud in #342
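The auto damp recovery mentioned above can be sketched as a retry loop: add a damp term to the Hessian diagonal, and if the factorization still fails, grow the damp and try again. This is a minimal illustration of the general idea under assumed names and an assumed doubling schedule, not GPTQModel's actual implementation.

```python
import math

def cholesky_ok(m):
    """Attempt a Cholesky factorization of a symmetric matrix (list of
    lists); return False if the matrix is not positive definite."""
    n = len(m)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = m[i][i] - s
                if d <= 0:
                    return False  # factorization failed: not positive definite
                L[i][i] = math.sqrt(d)
            else:
                L[i][j] = (m[i][j] - s) / L[j][j]
    return True

def recover_damp(hessian, damp=0.0001, max_tries=16):
    """Add damp * mean(diag) to the Hessian diagonal; if the factorization
    still fails, double damp and retry. Returns the damp that worked
    (hypothetical auto-heal loop, schedule chosen for illustration)."""
    n = len(hessian)
    mean_diag = sum(hessian[i][i] for i in range(n)) / n
    for _ in range(max_tries):
        damped = [row[:] for row in hessian]
        for i in range(n):
            damped[i][i] += damp * mean_diag
        if cholesky_ok(damped):
            return damp
        damp *= 2  # auto-heal: retry with a larger damp value
    raise RuntimeError("Hessian not positive definite even after damping")

# A Hessian with a negative eigenvalue fails at tiny damp values; the
# recovery loop grows damp until the factorization succeeds.
print(recover_damp([[1.0, 2.0], [2.0, 1.0]]))
```

The point of the auto-heal is that a quantization run no longer aborts on a numerically bad Hessian; the damp is grown until the solve goes through.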
Full Changelog: v0.9.10...v0.9.11