GPTQModel v0.9.11

@Qubitium released this 09 Aug 10:33
· 658 commits to main since this release
f2fcdc8

What's Changed

Added support for the LG EXAONE 3.0 model. Added dynamic per-layer/module quantization, in which each layer or module may use different bits and other quantization parameters. Added proper sharding support for backend.BITBLAS. Quantization errors caused by small damp values are now auto-healed.
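To illustrate the idea behind per-layer/module quantization, here is a minimal, library-independent sketch of how per-module overrides could be resolved against a base setting. It does not use the GPTQModel API: the names `BASE`, `OVERRIDES`, and `resolve_config`, the regex-keyed override format, and the module names are all assumptions made for illustration.

```python
import re
from typing import Any, Dict

# Hypothetical base settings applied to every module unless overridden.
BASE: Dict[str, Any] = {"bits": 4, "group_size": 128}

# Hypothetical overrides: regex on the module name -> parameters to change.
OVERRIDES: Dict[str, Dict[str, Any]] = {
    r".*\.18\..*gate.*": {"bits": 8, "group_size": 64},
    r".*\.19\..*": {"bits": 8},
}


def resolve_config(
    module_name: str,
    base: Dict[str, Any],
    overrides: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Return the effective settings for one module.

    The first pattern that matches the module name wins; modules that match
    no pattern keep the base settings unchanged.
    """
    cfg = dict(base)  # copy so the shared base is never mutated
    for pattern, params in overrides.items():
        if re.match(pattern, module_name):
            cfg.update(params)
            break
    return cfg


if __name__ == "__main__":
    for name in (
        "model.layers.0.mlp.up_proj",
        "model.layers.18.mlp.gate_proj",
        "model.layers.19.self_attn.q_proj",
    ):
        print(name, resolve_config(name, BASE, OVERRIDES))
```

In this sketch the first matching pattern wins, so more specific patterns should be listed before broader ones. How the library itself orders or merges overrides is not stated in these notes.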

Full Changelog: v0.9.10...v0.9.11