Releases: ggerganov/llama.cpp

b4006

01 Nov 12:00
1804adb
ggml : remove ggml_scratch (#10121)

ggml-ci

b4005

01 Nov 10:28
815fe72
sync : ggml

b4003

01 Nov 04:11
e597e50
build: fix build error in Windows env with OneAPI setup (#10107)

b4002

01 Nov 01:34
85679d3
llama : improve output buffer type selection (#10098)

b4001

01 Nov 01:22
1e9f949
quantize : fix --keep-split (#10114)

b4000

01 Nov 00:50
c02e5ab
llama : fix buffer checks for mamba and rwkv (#10111)

* llama : fix buffer checks for mamba and rwkv

* llama : fix missing worst case flag during reserve

* cuda : fix supports_op for norm

* disable sched SET_CAUSE

b3999

31 Oct 20:27
ab3d71f
loader: refactor tensor weights storage (#9935)

* loader: refactor tensor weights storage

* use sorted map, sort weights by layer

---------

Co-authored-by: slaren <slarengh@gmail.com>

b3998

31 Oct 14:13
0a683e8
server : include scheme when printing URL (#10106)

b3997

31 Oct 11:40
dea5e86
ggml : check tensor name lengths in gguf files (#10100)

b3996

31 Oct 10:05
1329c0a
kompute: add mul_mat_q4_k shader (#10097)

This is a more or less direct translation from the Metal implementation
to GLSL.

Signed-off-by: Sergio Lopez <slp@redhat.com>