Releases: ggerganov/llama.cpp
b4006
ggml : remove ggml_scratch (#10121) ggml-ci
b4005
sync : ggml
b4003
build: fix build error in Windows env with OneAPI setup (#10107)
b4002
llama : improve output buffer type selection (#10098)
b4001
quantize : fix --keep-split (#10114)
b4000
llama : fix buffer checks for mamba and rwkv (#10111)
* llama : fix buffer checks for mamba and rwkv
* llama : fix missing worst case flag during reserve
* cuda : fix supports_op for norm
* disable sched SET_CAUSE
b3999
loader: refactor tensor weights storage (#9935)
* loader: refactor tensor weights storage
* use sorted map, sort weights by layer
Co-authored-by: slaren <slarengh@gmail.com>
b3998
server : include scheme when printing URL (#10106)
b3997
ggml : check tensor name lengths in gguf files (#10100)
b3996
kompute: add mul_mat_q4_k shader (#10097)
This is a more or less direct translation from the Metal implementation to GLSL.
Signed-off-by: Sergio Lopez <slp@redhat.com>