b3306 #208
Merged
Nexesenex (Owner) commented on Jul 4, 2024:
- I have read the contributing guidelines
- Self-reported review complexity:
  - Low
  - Medium
  - High
* ppl : fix n_seq_max for perplexity
* use 1 seq for kl_divergence
* llama : suppress unref var in Windows MSVC

This commit suppresses two warnings that are currently generated for src/llama.cpp when building on Windows MSVC:

```console
C:\llama.cpp\src\llama.cpp(14349,45): warning C4101: 'ex': unreferenced local variable [C:\llama.cpp\build\src\llama.vcxproj]
C:\llama.cpp\src\llama.cpp(19285,44): warning C4101: 'e': unreferenced local variable [C:\llama.cpp\build\src\llama.vcxproj]
```

* Update src/llama.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
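For context, here is a minimal sketch, not necessarily the exact patch applied in this commit, of the two standard ways to silence C4101 for an exception object that is caught but never used: drop the binding name, or explicitly void-cast it.

```cpp
#include <cstdio>
#include <stdexcept>

// Naming the caught exception without using it is what triggers
// warning C4101 under MSVC.
static void parse_or_default_a(const char * s) {
    try {
        if (s == nullptr) throw std::invalid_argument("null input");
    } catch (const std::exception &) { // no binding name -> no C4101
        std::puts("falling back to default");
    }
}

static void parse_or_default_b(const char * s) {
    try {
        if (s == nullptr) throw std::invalid_argument("null input");
    } catch (const std::exception & ex) {
        (void) ex; // explicit "unused" cast also silences the warning
        std::puts("falling back to default");
    }
}

int main() {
    parse_or_default_a(nullptr);
    parse_or_default_b(nullptr);
    return 0;
}
```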
This commit adds the compile definition `_CRT_SECURE_NO_WARNINGS` to the root cmake subproject.

The motivation for this is that currently the following warnings are displayed when compiling the tests and common cmake subprojects:

```console
test-llama-grammar.cpp
C:\llama.cpp\src\.\llama.cpp(1406,77): warning C4996: 'strerror': This function or variable may be unsafe. Consider using strerror_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details. [C:\llama.cpp\build\tests\test-llama-grammar.vcxproj]
...
```

This compile definition is currently set for the `src` subproject, and this change moves it into the root cmake project so that it is applied to all cmake subprojects.
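To illustrate what the warning is about, here is a hypothetical wrapper (not part of the PR): MSVC flags `strerror` as potentially unsafe and suggests `strerror_s`, so code that must stay warning-clean without `_CRT_SECURE_NO_WARNINGS` typically branches on the platform.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// errno_str is a hypothetical helper: use the MSVC "secure" variant on
// Windows, plain strerror() elsewhere. Defining _CRT_SECURE_NO_WARNINGS
// project-wide (as this commit does) makes the plain call warning-free too.
static const char * errno_str(int err, char * buf, std::size_t len) {
#ifdef _WIN32
    strerror_s(buf, len, err); // replacement suggested by warning C4996
    return buf;
#else
    (void) buf;
    (void) len;
    return strerror(err);      // fine on POSIX; not thread-safe
#endif
}

int main() {
    char buf[128];
    std::printf("ENOENT -> %s\n", errno_str(ENOENT, buf, sizeof(buf)));
    return 0;
}
```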
* llama : add inference support and model types for T5 and FLAN-T5 model families
* llama : add new API functions to support encoder-decoder models: llama_encode(), llama_model_has_encoder(), llama_model_decoder_start_token()
* common, llama-cli, llama-batched : add support for encoder-decoder models
* convert-hf : handle shared token embeddings tensors in T5Model
* convert-hf : add support for SentencePiece BPE tokenizer in T5Model (for Pile-T5 models)
* convert-hf : add MT5ForConditionalGeneration and UMT5ForConditionalGeneration to architectures supported by T5Model
* convert : add t5 tokenizer tests, use "slow" HF tokenizer for t5

Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Not namespaced though :(
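A sketch of how a caller might use the new encoder-decoder entry points, closely following the pattern used for T5-style models. Error handling is trimmed and the signatures assumed are those of llama.h at this revision, so treat it as illustrative rather than canonical.

```cpp
#include "llama.h"

#include <cstdint>
#include <vector>

// For encoder-decoder models (T5, FLAN-T5), the prompt is first run
// through the encoder; generation then starts from the model's decoder
// start token instead of the prompt itself.
static bool prepare_generation(llama_context * ctx, const llama_model * model,
                               std::vector<llama_token> & tokens) {
    if (llama_model_has_encoder(model)) {
        const int32_t n_tokens = (int32_t) tokens.size();
        if (llama_encode(ctx, llama_batch_get_one(tokens.data(), n_tokens, 0, 0)) != 0) {
            return false; // encoder failed
        }
        llama_token dec_start = llama_model_decoder_start_token(model);
        if (dec_start == -1) {
            dec_start = llama_token_bos(model); // fall back to BOS
        }
        tokens.clear();
        tokens.push_back(dec_start);
    }
    // decoder-only models fall through unchanged; the usual
    // llama_decode() sampling loop follows in either case
    return true;
}
```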
This commit adds a new option to the tokenize example, `--show-count`. When this is set, the total number of tokens is printed to stdout.

This was added as an option as I was concerned that there might be scripts that use the output from this program, and it might be better not to print this information by default.

The motivation for this is that it can be useful to find out how many tokens a file contains, for example when trying to determine prompt input file sizes for testing.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
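Typical usage might look like the following (the binary name assumes the `llama-` prefix used at this revision, and the output line is illustrative rather than a verbatim capture):

```console
$ ./llama-tokenize --model model.gguf --file prompt.txt --show-count
...
Total number of tokens: 42
```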
* Initial OpenELM support (270M only so far)
* Fill out missing entries in llama_model_type_name
* fixup! Initial OpenELM support (270M only so far)

  Fix formatting

* llama : support all OpenELM models
* llama : add variable GQA and variable FFN sizes

  Some metadata keys can now also be arrays to support setting their value per-layer for models like OpenELM.

* llama : minor spacing changes

  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* llama : use std::array for per-layer hparams
* llama : fix save/load state
* llama : do not print hparams for vocab-only models
* llama : handle n_head == 0
* llama : use const ref for print_f and fix division by zero
* llama : fix t5 uses of n_head and n_ff
* llama : minor comment

Co-authored-by: Francis Couture-Harpin <git@compilade.net>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
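A simplified sketch of the std::array approach and the n_head == 0 / division-by-zero guards mentioned above; the struct and sizes are illustrative, not the real llama.cpp hparams.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Per-layer hyperparameters stored as fixed-size arrays so values can
// differ by layer, as OpenELM's variable GQA and FFN sizes require.
struct hparams_sketch {
    static constexpr std::size_t MAX_LAYERS = 512; // illustrative cap

    uint32_t n_layer = 0;
    std::array<uint32_t, MAX_LAYERS> n_head_arr{};    // attention heads per layer
    std::array<uint32_t, MAX_LAYERS> n_head_kv_arr{}; // KV heads per layer
    std::array<uint32_t, MAX_LAYERS> n_ff_arr{};      // FFN width per layer

    uint32_t n_head   (uint32_t il) const { return n_head_arr[il]; }
    uint32_t n_head_kv(uint32_t il) const { return n_head_kv_arr[il]; }
    uint32_t n_ff     (uint32_t il) const { return n_ff_arr[il]; }

    // grouped-query attention ratio; guard the division so layers with
    // no KV heads (n_head_kv == 0) cannot divide by zero
    uint32_t n_gqa(uint32_t il) const {
        const uint32_t n_kv = n_head_kv(il);
        return n_kv == 0 ? 0 : n_head(il) / n_kv;
    }
};
```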
* main: add need_insert_eot
* do not format system prompt if it is empty
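The empty-system-prompt guard could look like the sketch below; the names are hypothetical, while the real change lives in the main example's chat handling.

```cpp
#include <string>

// Only run chat-template formatting when a system prompt was provided;
// an empty prompt should contribute nothing to the formatted output.
static std::string format_system_prompt(const std::string & sys) {
    if (sys.empty()) {
        return ""; // do not format an empty system prompt
    }
    // stand-in for the real chat-template call
    return "<|system|>\n" + sys + "\n";
}
```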