GPTFast 0.2.0

MDK8888 released this 02 Apr 04:16
· 41 commits to master since this release
7653cca
  • Inference is now accelerated by 6-8.5x
  • Static key-value caching is now enabled for all Hugging Face models
  • Added support for generic sampling functions in addition to argmax
  • Fixed bugs in speculative decoding
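
To illustrate what "generic sampling functions in addition to argmax" can mean in practice, here is a minimal sketch in plain Python. It is not GPTFast's API: the names `Sampler`, `argmax_sample`, `make_top_k_sampler`, and `pick_next_token` are hypothetical, and logits are modeled as a plain list of floats rather than a tensor. The idea shown is only that the decoding step accepts any callable mapping logits to a token index, so greedy argmax becomes one option among several.

```python
import math
import random
from typing import Callable, List, Optional

# Hypothetical type: any function that maps a list of logits to a token index.
Sampler = Callable[[List[float]], int]


def argmax_sample(logits: List[float]) -> int:
    """Greedy decoding: return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def make_top_k_sampler(
    k: int, temperature: float = 1.0, rng: Optional[random.Random] = None
) -> Sampler:
    """Build a sampler that draws from the k highest-scoring tokens."""
    if k < 1 or temperature <= 0:
        raise ValueError("k must be >= 1 and temperature must be > 0")
    rng = rng or random.Random()

    def sample(logits: List[float]) -> int:
        # Indices of the k largest logits, highest first.
        top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
        # Temperature-scaled softmax weights, shifted by the max for stability.
        scaled = [logits[i] / temperature for i in top]
        peak = max(scaled)
        weights = [math.exp(s - peak) for s in scaled]
        return rng.choices(top, weights=weights, k=1)[0]

    return sample


def pick_next_token(logits: List[float], sampler: Sampler = argmax_sample) -> int:
    """One decoding step: delegate the choice of token to the supplied sampler."""
    return sampler(logits)
```

Usage under these assumptions: `pick_next_token(logits)` decodes greedily, while `pick_next_token(logits, make_top_k_sampler(k=50, temperature=0.8))` samples from the 50 most likely tokens. Passing the sampler as a callable keeps the decoding loop unchanged when the sampling strategy changes.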