
v0.1.7

@mobicham released this 24 Apr 08:59
· 148 commits to master since this release

HQQ v0.1.7

  • Faster inference with torchao / marlin 4-bit kernels
  • Multi-GPU support for model.quantize()
  • Custom HF generator
  • Various bug fixes/improvements