Releases · ggerganov/llama.cpp

01 Nov 12:00

1804adb

b4006

ggml : remove ggml_scratch (#10121)

ggml-ci

01 Nov 10:28

github-actions

b4005

815fe72

sync : ggml

01 Nov 04:11

github-actions

b4003

e597e50

build: fix build error in Windows env with OneAPI setup (#10107)

01 Nov 01:34

github-actions

b4002

85679d3

llama : improve output buffer type selection (#10098)

01 Nov 01:22

github-actions

b4001

1e9f949

quantize : fix --keep-split (#10114)

01 Nov 00:50

github-actions

b4000

c02e5ab

llama : fix buffer checks for mamba and rwk (#10111)

* llama : fix buffer checks for mamba and rwk

* llama : fix missing worst case flag during reserve

* cuda : fix supports_op for norm

* disable sched SET_CAUSE

31 Oct 20:27

github-actions

b3999

ab3d71f

loader: refactor tensor weights storage (#9935)

* loader: refactor tensor weights storage

* use sorted map, sort weights by layer

---------

Co-authored-by: slaren <slarengh@gmail.com>

31 Oct 14:13

github-actions

b3998

0a683e8

server : include scheme when printing URL (#10106)

31 Oct 11:40

github-actions

b3997

dea5e86

ggml : check tensor name lengths in gguf files (#10100)

31 Oct 10:05

github-actions

b3996

1329c0a

kompute: add mul_mat_q4_k shader (#10097)

This is a more or less direct translation from the Metal implementation
to GLSL.

Signed-off-by: Sergio Lopez <slp@redhat.com>
