GPTFast-0.3.1

@MDK8888 released this 22 Aug 04:11

GPTFast 0.3.1 is here 🚀🚀🚀!

  • Stabilized GPTQ for all models, both with and without bias.
  • Custom W4A16 (4-bit weight, 16-bit activation) matmul kernels with tiling that outperform nn.Linear by 30% on an RTX 3050.
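
For readers new to the term, here is a rough pure-Python sketch of what a W4A16 matrix-vector product involves: weights are stored as packed 4-bit codes with a per-row scale and offset, and are dequantized on the fly when multiplied by the activations. This is an illustration only, not GPTFast's kernel. The function names (`quantize_row`, `pack_int4`, `unpack_int4`, `w4a16_matvec`) are hypothetical, the per-row min/max quantizer is an assumed stand-in for GPTQ, Python floats stand in for 16-bit activations, and there is no tiling.

```python
def quantize_row(weights):
    """Quantize one weight row to 4-bit codes (0..15) with a per-row scale and offset."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0  # fall back to 1.0 when the row is constant
    codes = [min(15, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def pack_int4(codes):
    """Pack pairs of 4-bit codes into bytes, low nibble first."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


def unpack_int4(packed, n):
    """Inverse of pack_int4: recover the first n 4-bit codes."""
    out = []
    for b in packed:
        out.append(b & 0xF)
        out.append(b >> 4)
    return out[:n]


def w4a16_matvec(packed_rows, scales, offsets, x):
    """Compute y = W x, dequantizing each packed weight row on the fly."""
    n = len(x)
    y = []
    for packed, scale, offset in zip(packed_rows, scales, offsets):
        codes = unpack_int4(packed, n)
        y.append(sum((c * scale + offset) * xi for c, xi in zip(codes, x)))
    return y


# Small usage example with a 2x4 weight matrix.
W = [[-1.0, 0.0, 2.0, 14.0], [0.5, 0.5, 0.5, 0.5]]
x = [1.0, 2.0, 3.0, 4.0]
quantized = [quantize_row(row) for row in W]
packed = [pack_int4(codes) for codes, _, _ in quantized]
scales = [scale for _, scale, _ in quantized]
offsets = [offset for _, _, offset in quantized]
y = w4a16_matvec(packed, scales, offsets, x)
```

Each packed row takes half a byte per weight plus one scale and one offset, which is where the memory savings over 16-bit weights come from. A real GPU kernel would fuse the unpacking and dequantization into the tiled matmul loop instead of materializing the codes.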