llama : expose model's rope_freq_scale in the API #3418

grencez · 2023-09-30T21:43:14Z

I think this is necessary for automatic implementations of https://github.com/ggerganov/llama.cpp/tree/master/examples/main#extended-context-size when the model's RoPE scaling factor isn't 1.0. (We want to further scale it rather than overwriting the value, right?)
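The "scale further rather than overwrite" idea above can be sketched as below. This is a minimal illustration, not code from this PR: it assumes linear RoPE scaling, where a frequency scale of 1/k stretches the usable context by a factor of k. `further_scaled_rope_freq` and its parameters are hypothetical names. In real use, the first argument would be the model's own value as read through the API this PR exposes.

```cpp
#include <cassert>

// Hypothetical helper (not part of llama.cpp): combine the model's own
// RoPE frequency scale with an additional context-extension factor.
//
// Assumption: linear RoPE scaling, where a frequency scale of 1/k
// stretches the usable context by a factor of k.
float further_scaled_rope_freq(float model_rope_freq_scale, float extension_factor) {
    assert(model_rope_freq_scale > 0.0f);
    assert(extension_factor > 0.0f);
    // Divide the model's existing scale instead of overwriting it with
    // 1/extension_factor, so scaling already baked into the model is kept.
    return model_rope_freq_scale / extension_factor;
}
```

For example, a model whose stored scale is 0.25 and whose context should be doubled again would get 0.25 / 2 = 0.125. Overwriting the value with 1/2 = 0.5 would discard the model's own scaling.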

so it can be scaled further before creating a context.

ggerganov/llama.cpp#3418

…example
* 'master' of github.com:ggerganov/llama.cpp: (24 commits)
  convert : fix Baichuan2 models by using vocab size in config.json (ggerganov#3299)
  readme : add project status link
  ggml : fix build after ggerganov#3329
  llm : add Refact model (ggerganov#3329)
  sync : ggml (conv 1d + 2d updates, UB fixes) (ggerganov#3468)
  finetune : readme fix typo (ggerganov#3465)
  ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggerganov#3453)
  main : consistent prefix/suffix coloring (ggerganov#3425)
  llama : fix session saving/loading (ggerganov#3400)
  llama : expose model's rope_freq_scale in the API (ggerganov#3418)
  metal : alibi for arbitrary number of heads (ggerganov#3426)
  cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggerganov#3273)
  Work on the BPE tokenizer (ggerganov#3252)
  convert : fix vocab size when not defined in hparams (ggerganov#3421)
  cmake : increase minimum version for add_link_options (ggerganov#3444)
  CLBlast: Add broadcast support for matrix multiplication (ggerganov#3402)
  gguf : add BERT, MPT, and GPT-J arch info (ggerganov#3408)
  gguf : general usability improvements (ggerganov#3409)
  cmake : make CUDA flags more similar to the Makefile (ggerganov#3420)
  finetune : fix ggerganov#3404 (ggerganov#3437)
  ...

grencez force-pushed the model_rope branch from 567051f to 93b8765 on October 2, 2023 08:15

llama : expose model's rope_freq_scale in the API

bb941fc

so it can be scaled further before creating a context.

grencez force-pushed the model_rope branch from 93b8765 to bb941fc on October 2, 2023 11:02

ggerganov approved these changes Oct 3, 2023

ggerganov merged commit 48be797 into ggerganov:master Oct 3, 2023
32 checks passed

grencez deleted the model_rope branch October 3, 2023 18:20

grencez added a commit to rendezqueue/rendezllama that referenced this pull request Oct 4, 2023

update(llama.cpp): with rope_freq_scale in the API

8fd5043

ggerganov/llama.cpp#3418

yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023

llama : expose model's rope_freq_scale in the API (ggerganov#3418)

6487ea7

so it can be scaled further before creating a context.

grencez added a commit to rendezqueue/rendezllama that referenced this pull request Oct 12, 2023

update(llama.cpp): with rope_freq_scale in the API

f96fa97

ggerganov/llama.cpp#3418
