merge from llama.cpp #33
Commits on Aug 8, 2024
- ebd541a  make : clean llamafile objects (ggerganov#8923)
  `ggml/src/llamafile/sgemm.o` was not deleted on `make clean`.
- 85fca8d
- 5b33ea1
- f93d49a
- e44a561
- 366d486
- afd27f0
- 3a14e00  gguf-py : simplify support for quant types (ggerganov#8838)
  * gguf-py : use classes for quants
  * convert_hf : simplify internal quantization type selection
  * gguf-py : fix flake8 lint
  * gguf-py : fix BF16 numpy view type
  * gguf-py : remove LlamaFileTypeMap (too specific to 'llama.cpp', and would be a maintenance burden to keep up to date)
  * gguf-py : add generic quantize and dequantize functions (the quant classes no longer need to be known, only the target or the source type for 'quantize' and 'dequantize', respectively)
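
The generic helpers described in 3a14e00 reduce a quantization round-trip to two calls from Python. A minimal sketch, assuming the `gguf` package (gguf-py) exposes `quantize`/`dequantize` in `gguf.quants` keyed by `GGMLQuantizationType`, as the commit message suggests; the import paths and signatures are my reading of the package, not a guarantee:

```python
# Round-trip a float32 tensor through Q8_0 with the generic helpers.
# Assumes `pip install gguf`; the last dimension must be a multiple of the
# quant block size (32 elements for Q8_0).
import numpy as np

from gguf.constants import GGMLQuantizationType
from gguf.quants import dequantize, quantize

data = np.random.rand(4, 256).astype(np.float32)

packed = quantize(data, GGMLQuantizationType.Q8_0)        # raw quantized bytes
restored = dequantize(packed, GGMLQuantizationType.Q8_0)  # back to float32

print(packed.dtype, packed.shape)            # uint8, one packed row per input row
print(float(np.abs(restored - data).max()))  # small quantization error expected
```

The point of the refactor, per the message, is that callers only name the target or source type; the per-type quant classes stay internal.
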
Commits on Aug 9, 2024
- 345a686  llama : reduce useless copies when saving session (ggerganov#8916)
  * llama : avoid useless copies in dummy session writer
  * llama : avoid double tensor copy when saving session to buffer
  (A sketch of the dummy-writer idea appears after this day's commits.)
- daef3ab
- 6f6496b
- 5b2c04f  embedding : add --pooling option to README.md [no ci] (ggerganov#8934)
  This commit adds the `--pooling` option to the README.md file in the `examples/embedding` directory.
  The motivation for adding this option is that currently, if the model used does not specify a pooling
  type, the embedding example fails with the following error message:

  ```console
  main: error: pooling type NONE not supported
  ```

  This commit also updates the name of the executable in the examples section.
- 70c0ea3  whisper : use vulkan as gpu backend when available (whisper/2302)
  * ggml : use vulkan as gpu backend when available
  * whisper : enable using vk as default buffer type
  Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
- 4305b57
- 3071c0a  llava : support MiniCPM-V-2.5 (ggerganov#7599)
  Squashed history: init; rename; add run-android-for-termux instructions to the readme; add android
  readme; add instructions in readme; change name in readme; update README.md; fix line; add result in
  readme; random pos_embed; add positions index; change for ollama; better pos_embed in clip; support
  ollama; update cmakelist; rename wrapper; clear code; replace and organize code; add link; sync master;
  fix warnings; fix bug in bicubic resize when the image needs to be resized smaller; address review
  comments; put all code into llava dir; fix quality problem in pr code; change n_layer; add space in
  "-1"; imitate reshape bug of python code; fix bug in clip; fix issues for merging; fix
  llama-minicpmv-cli in cmake file; change pr readme; fix code review; remove directory from the main
  /CMakeLists.txt (not the example one); fix cmakefile; add warn; fix KEY_HAS_MINICPMV_PROJ; remove
  load_image_size into clip_ctx; remove the extern "C" MINICPMV_API; fix uhd code for review comment;
  delete minicpmv-wrapper in pr; remove uhd_image_embed; modify 2 notes; clip : style changes; del
  common.h in clip; fix Type-Check errors; fix makefile error; fix ubuntu-make error; try fix clip;
  try fix 1.
  Co-authored-by: Hongji Zhu <fireyoucan@gmail.com>
  Co-authored-by: harvestingmoon <leewenyeong@gmail.com>
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
- 45a55b9  llama : better replace_all (cont) (ggerganov#8926)
  * llama : better replace_all (cont)
  * code : deduplicate replace_all
  (A single-pass replace_all sketch appears after this day's commits.)
- 272e3bd
- 6afd1a9  llama : add support for lora adapters in T5 model (ggerganov#8938)
  Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
- b72942f
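
On 345a686 above: roughly speaking, llama.cpp runs one serialization routine twice when saving a session, first with a "dummy" writer that only measures the output and then with a writer that fills the caller's buffer, so the sizing pass should not copy anything. A hypothetical Python sketch of that pattern (names invented for illustration; this is not the llama.cpp interface):

```python
# Dummy-writer pattern: one write path, two writer implementations.
class DummyWriter:
    """Counts bytes without allocating or copying any data."""
    def __init__(self) -> None:
        self.size = 0

    def write(self, data: bytes) -> None:
        self.size += len(data)  # bookkeeping only


class BufferWriter:
    """Writes into a caller-provided bytearray at an advancing offset."""
    def __init__(self, buf: bytearray) -> None:
        self.buf = buf
        self.offset = 0

    def write(self, data: bytes) -> None:
        end = self.offset + len(data)
        self.buf[self.offset:end] = data  # single copy, straight into place
        self.offset = end


def write_session(writer, tokens: list[int], kv_cache: bytes) -> None:
    # The same routine serves both the sizing pass and the real write.
    writer.write(len(tokens).to_bytes(4, "little"))
    for t in tokens:
        writer.write(t.to_bytes(4, "little"))
    writer.write(kv_cache)


sizer = DummyWriter()
write_session(sizer, [1, 2, 3], b"\x00" * 16)              # pass 1: measure
out = bytearray(sizer.size)
write_session(BufferWriter(out), [1, 2, 3], b"\x00" * 16)  # pass 2: fill buffer
```
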
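
On 45a55b9 above: the classic pitfall in a hand-rolled replace_all is rescanning text that was just substituted, which can loop forever once the replacement contains the search string. A small Python illustration of the single-pass approach, shown as the general technique rather than a transcription of the C++ helper:

```python
def replace_all(s: str, search: str, replace: str) -> str:
    """Replace every occurrence of `search` in one forward pass."""
    if not search:
        return s  # nothing sensible to match; avoids an infinite loop
    out: list[str] = []
    pos = 0
    while True:
        hit = s.find(search, pos)
        if hit == -1:
            out.append(s[pos:])   # tail after the last match
            return "".join(out)
        out.append(s[pos:hit])    # text before the match
        out.append(replace)       # never rescanned, so "a" -> "aa" terminates
        pos = hit + len(search)


assert replace_all("banana", "a", "aa") == "baanaanaa"
assert replace_all("x y z", " ", "_") == "x_y_z"
```
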
Commits on Aug 10, 2024
- 911b437  gguf-py : fix double call to add_architecture() (ggerganov#8952)
  Signed-off-by: tarilabs <matteo.mortari@gmail.com>
- 7c3f55c  Add support for encoder-only T5 models (ggerganov#8900)
  * gguf-py : add T5ENCODER model architecture
  * common : call llama_decode() during warmup only if the model has a decoder
  * convert-hf : add T5EncoderModel
  * llama : add llama_model_has_decoder() API function
  * llama : split build_t5() into build_t5_encoder() and build_t5_decoder()
  * llama : add support for LLM_ARCH_T5ENCODER
  * llama-embedding : add support for LLAMA_POOLING_TYPE_NONE
  * llama-embedding : add support for encoder-only models
  Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
- 7eb2384  llama : default n_swa for phi-3 (ggerganov#8931)
  * default n_swa for phi-3
  * fix
  * double check swa
- 6e02327
Commits on Aug 11, 2024
- 7c5bfd5  Optimize Vulkan backend for better CPU performance and less GPU synchronization overhead (ggerganov#8943)
  * Allocation overhead for the temporary std::vectors was easily detectable with a sampling profiler and simple to remove.
  * ggml_vk_sync_buffer introduces a full pipeline sync, which has a significant cost on the GPU side, sometimes larger than the actual kernel execution. Adding only barriers for shader reads/writes and transfers seems to be sufficient, looking at the code, which either launches compute kernels or copies tensors.
  * Fix small typo
  Co-authored-by: 0cc4m <picard12@live.de>
- 33309f6  llama : check all graph nodes when searching for result_embd_pooled (ggerganov#8956)
  Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
- a21c6fd
- 8cd1bcf
- 4134999  gguf-py : Numpy dequantization for most types (ggerganov#8939)
  * gguf-py : Numpy dequantization for most types
  * gguf-py : Numpy dequantization for grid-based i-quants
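
Combined with the reader side of gguf-py, 4134999 means most quantized tensors in a GGUF file can be expanded to float32 from pure Python. A rough sketch, assuming the usual `GGUFReader` tensor fields (`name`, `tensor_type`, `data`); the attribute names are my reading of the package, and a few exotic quant types may still raise NotImplementedError:

```python
# Dequantize every tensor of a GGUF file to float32 using numpy only.
import sys

import numpy as np

from gguf import GGUFReader
from gguf.quants import dequantize

reader = GGUFReader(sys.argv[1])  # path to a .gguf model

for t in reader.tensors:
    raw = np.asarray(t.data)              # packed bytes (or floats) as stored on disk
    f32 = dequantize(raw, t.tensor_type)  # numpy-side dequantization, no compiled code
    print(f"{t.name}: {t.tensor_type.name} -> float32 {f32.shape}")
```
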
Commits on Aug 12, 2024
- 32335d5