v0.1.7

mobicham released this 24 Apr 08:59

HQQ v0.1.7

Faster inference with torchao/Marlin 4-bit kernels
Multi-GPU support for model.quantize()
Custom HF generator
Various bug fixes and improvements
