How to add KV cache quantization options? #1220
Replies: 1 comment
I see `llama_context_params`, but how do I pass it to the `Llama` class?
server : add KV cache quantization options
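In recent llama-cpp-python releases, the `Llama` constructor forwards the relevant context parameters directly as keyword arguments, so you do not build a `llama_context_params` struct yourself. A minimal sketch, assuming a build where `type_k`/`type_v` (ggml type ids for the K and V caches) and `flash_attn` are accepted; the model path is a placeholder:

```python
# ggml type ids for the KV cache; with llama_cpp installed, prefer the
# library's own GGML_TYPE_* constants instead of these literals.
GGML_TYPE_F16 = 1
GGML_TYPE_Q8_0 = 8

# Keyword arguments the Llama constructor is assumed to forward into
# llama_context_params (type_k / type_v quantize the K and V caches).
kv_cache_kwargs = {
    "type_k": GGML_TYPE_Q8_0,  # quantize the K cache to Q8_0
    "type_v": GGML_TYPE_Q8_0,  # quantize the V cache to Q8_0
}

# With llama-cpp-python installed (hypothetical model path):
# from llama_cpp import Llama
# llm = Llama(
#     model_path="model.gguf",
#     flash_attn=True,  # upstream llama.cpp requires flash attention for a quantized V cache
#     **kv_cache_kwargs,
# )
```

Note that in upstream llama.cpp, quantizing the V cache is tied to flash attention support, so `type_v` values other than F16 generally require enabling it.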