Releases: NeoZhangJianyu/llama.cpp

update_oneapi-b3789-3ae8374

update_oneapi-b3788-f557ccf

20 Sep 04:18
update oneapi to 2024.2

b3787

20 Sep 04:05
6026da5
server : clean-up completed tasks from waiting list (#9531)

ggml-ci

b3735

12 Sep 03:42
df4b794
cann: Fix error when running a non-exist op (#9424)

b3678

07 Sep 07:35
9b2c24c
server : simplify state machine for slot (#9283)

* server : simplify state machine for slot

* add SLOT_STATE_DONE_PROMPT

* pop_deferred_task

* add missing notify_one

* fix passkey test

* metrics : add n_busy_slots_per_decode

* fix test step

* add test

* maybe fix AddressSanitizer?

* fix deque ?

* missing lock

* pop_deferred_task: also notify

* Update examples/server/server.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
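The individual fixes above (missing lock, missing notify_one, pop_deferred_task also notifying) revolve around one recurring concurrency pattern: a deferred-task queue shared between threads, where every mutation must hold the lock and wake any waiter. The actual llama.cpp server code is C++; as a language-agnostic illustration only, a minimal Python sketch of that pattern:

```python
# Hypothetical sketch of the pattern behind the fixes above: a deferred-task
# queue where every push AND every pop holds the lock and notifies waiters,
# so no thread can sleep forever on a stale view of the queue.
import threading
from collections import deque

class DeferredTasks:
    def __init__(self):
        self._tasks = deque()
        self._cond = threading.Condition()  # lock + condition in one object

    def defer(self, task):
        with self._cond:                 # never touch the deque unguarded
            self._tasks.append(task)
            self._cond.notify()          # wake one waiter (cf. notify_one)

    def pop_deferred_task(self):
        with self._cond:
            if not self._tasks:
                return None
            task = self._tasks.popleft()
            self._cond.notify()          # "pop_deferred_task: also notify"
            return task
```

This mirrors the shape of the fix, not its implementation: the point is that both producers and consumers signal under the same lock.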

b3449

24 Jul 02:48
de28008
examples : Fix `llama-export-lora` example (#8607)

* fix export-lora example

* add more logging

* reject merging subset

* better check

* typo

b3291

04 Jul 01:56
f619024
[SYCL] Remove unneeded semicolons (#8280)

b3145

14 Jun 06:10
172c825
rpc : fix ggml_backend_rpc_supports_buft() (#7918)

b2716

23 Apr 01:25
4e96a81
[SYCL] Windows default build instructions without -DLLAMA_SYCL_F16 flag activated (#6767)

* Fix FP32/FP16 build instructions

* Fix typo

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>

* Recommended build instruction

Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>

* Add comments in Intel GPU linux

---------

Co-authored-by: Anas Ahouzi <112881240+aahouzi-intel@users.noreply.github.com>
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>

b2688

17 Apr 02:08
facb8b5
convert : fix autoawq gemma (#6704)

* fix autoawq quantized gemma model convert error

Using autoawq to quantize a Gemma model produces an lm_head.weight tensor in model-00001-of-00002.safetensors. As a result, convert-hf-to-gguf.py cannot map lm_head.weight; skipping this tensor prevents the error.

* change code to full string match and print necessary message

Change the code to use a full string match and print a short message informing users that lm_head.weight has been skipped.

---------

Co-authored-by: Zheng.Deng <32841220+CUGfred@users.noreply.github.com>
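The fix described above amounts to an exact-name filter applied while iterating the checkpoint's tensors, with a short notice printed instead of a silent skip. A minimal sketch, using a hypothetical `filter_tensors` helper over `(name, tensor)` pairs (the real change lives in convert-hf-to-gguf.py):

```python
# Hypothetical sketch of the skip logic described above: an exact string
# match (not a substring test) filters out lm_head.weight during conversion,
# and a short message tells the user the tensor was not silently dropped.
def filter_tensors(named_tensors):
    kept = []
    for name, tensor in named_tensors:
        if name == "lm_head.weight":  # full string match, per the fix
            print(f"Skipping tensor {name!r}")
            continue
        kept.append((name, tensor))
    return kept
```

A full string match matters here: a substring check could accidentally skip unrelated tensors whose names merely contain the pattern.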