Releases: NeoZhangJianyu/llama.cpp

b2466

20 Mar 03:49
d26e8b6
increase igpu cluster limit (#6159)

b2460

19 Mar 02:06
flake.lock: Update

Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/9df3e30ce24fd28c7b3e2de0d986769db5d6225d' (2024-03-06)
  → 'github:NixOS/nixpkgs/d691274a972b3165335d261cc4671335f5c67de9' (2024-03-14)

b2437

15 Mar 11:54
46acb36
fix set main gpu error (#6073)

b2431

15 Mar 06:37
4755afd
llama : fix integer overflow during quantization (#6063)

b2409

13 Mar 03:03
306d34b
ci : remove tidy-review (#6021)

b2408

12 Mar 13:07
8030da7
ggml : reuse quantum structs across backends (#5943)

* ggml : reuse quant blocks across backends

ggml-ci

* ggml : define helper constants only for CUDA and SYCL

ggml-ci

* ggml : define helper quantum constants for SYCL

ggml-ci

b2407

12 Mar 12:19
184215e
ggml : fix UB in IQ2_S and IQ3_S (#6012)

b2405

12 Mar 04:12
5cdb371
grammar : fix unnecessarily retained pointer to rules (#6003)

b2351

06 Mar 02:44
652ca2b
compare-llama-bench.py : remove mul_mat_q (#5892)

b2343

05 Mar 06:25
29eee40
fix speculative decoding build on windows (#5874)