v0.8.6 - support LongLLaMA
## Breaking changes
- Setting the internal `past` attribute of the cache to `None` will now cause an error to be raised if you try to use the cache again. Please use the original model instead.
## New features
- Support LongLLaMA
- `repr` for cached model
- Don't check logits from Llama CPP
## Bug fixes
None