How to add KV cache quantization options? #1220
Replies: 1 comment
I see `llama_context_params`, but how do I pass it to the `Llama` class?
server : add KV cache quantization options
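In recent llama-cpp-python releases, the `Llama` constructor forwards the relevant context parameters directly as keyword arguments, so you do not build a `llama_context_params` struct yourself. A minimal sketch, assuming a build where `type_k`/`type_v` (ggml type ids for the K and V caches) and `flash_attn` are accepted; the model path is a placeholder:

```python
# ggml type ids for the KV cache; with llama_cpp installed, prefer the
# library's own GGML_TYPE_* constants instead of these literals.
GGML_TYPE_F16 = 1
GGML_TYPE_Q8_0 = 8

# Keyword arguments the Llama constructor is assumed to forward into
# llama_context_params (type_k / type_v quantize the K and V caches).
kv_cache_kwargs = {
    "type_k": GGML_TYPE_Q8_0,  # quantize the K cache to Q8_0
    "type_v": GGML_TYPE_Q8_0,  # quantize the V cache to Q8_0
}

# With llama-cpp-python installed (hypothetical model path):
# from llama_cpp import Llama
# llm = Llama(
#     model_path="model.gguf",
#     flash_attn=True,  # upstream llama.cpp requires flash attention for a quantized V cache
#     **kv_cache_kwargs,
# )
```

Note that in upstream llama.cpp, quantizing the V cache is tied to flash attention support, so `type_v` values other than F16 generally require enabling it.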