
llama : change yarn_ext_factor placeholder to -1 #3922

Merged: 1 commit, Nov 3, 2023

Conversation

@cebtenzzre (Collaborator) commented Nov 3, 2023

I have been unable to reproduce any strange results from this code on gcc 12.3.0 or clang 16.0.6; cparams.yarn_ext_factor is always correctly set to 0.0f for me.

However, with some compilers and system configurations, the -Ofast option apparently breaks the NaN default, presumably because -Ofast implies -ffast-math and lets the compiler assume NaNs never occur. I was thinking that a good placeholder should be impossible to confuse with a valid value, but apparently the result of std::isnan cannot be relied upon.

ref: #2268 (comment)
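For illustration, here is a minimal sketch of the two sentinel styles. It is not the actual llama.cpp code; the `rope_params` struct and `resolve_ext_factor` helper are hypothetical names, and only the `yarn_ext_factor` field and the -1.0f placeholder come from this PR.

```cpp
// Minimal sketch (hypothetical names, not the real llama.cpp code).
// Under -Ofast the compiler may enable -ffinite-math-only, assume NaNs never
// occur, and fold std::isnan(x) to false, so a NaN "unset" placeholder can
// silently stop being detected. A negative sentinel is a plain finite-value
// comparison that fast-math cannot legally discard.
#include <cstdio>

struct rope_params {
    float yarn_ext_factor = -1.0f; // "unset" sentinel; was NAN before this change
};

// Resolve the user-supplied value, falling back to a model default when unset.
static float resolve_ext_factor(const rope_params & p, float model_default) {
    // Old style: if (std::isnan(p.yarn_ext_factor)) { ... }  (fragile with -Ofast)
    // New style: an ordinary comparison against the finite sentinel.
    return p.yarn_ext_factor < 0.0f ? model_default : p.yarn_ext_factor;
}

int main() {
    rope_params p;                                      // left unset
    std::printf("%g\n", resolve_ext_factor(p, 0.0f));   // prints 0 (model default)
    p.yarn_ext_factor = 0.5f;                           // explicit user value
    std::printf("%g\n", resolve_ext_factor(p, 0.0f));   // prints 0.5
    return 0;
}
```

The comparison stays reliable even when fast-math strips NaN handling, which is presumably why a negative value can serve as the "unset" placeholder here.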


@LostRuins (Collaborator) left a comment


Tested on Windows, seems to work.

@ggerganov merged commit 3fdbe6b into ggerganov:master on Nov 3, 2023
32 checks passed
@vlapra commented Nov 3, 2023

With this change it got stuck on Mac after the prompt was entered. When I revert the change, it works.

Prompt "Hello" got stuck this way:
ggml_metal_init: GPU family: MTLGPUFamilyApple8 (1008)
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 21845.34 MB
ggml_metal_init: maxTransferRate = built-in GPU
llama_new_context_with_model: compute buffer total size = 200.63 MB
llama_new_context_with_model: max tensor size = 128.17 MB
ggml_metal_add_buffer: allocated 'data ' buffer, size = 8802.34 MB, ( 8802.97 / 21845.34)
ggml_metal_add_buffer: allocated 'kv ' buffer, size = 1600.02 MB, (10402.98 / 21845.34)
ggml_metal_add_buffer: allocated 'alloc ' buffer, size = 194.02 MB, (10597.00 / 21845.34)

system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.700
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
generate: n_ctx = 2048, n_batch = 512, n_predict = -1, n_keep = 0

Hello

@ggerganov (Owner)
@cebtenzzre Not sure if you know, but branches in this repo are monitored by https://github.com/ggml-org/ci, which runs extra CI on commits that have the string ggml-ci in the commit message:

[screenshots of the extra ggml-ci runs]

These are more like integration-level tests and require extra resources, so they run on rented infrastructure instead of the default GitHub Actions. Ideally, every PR would go through that CI, but it requires collaborators to create the branches in llama.cpp. Doing so is obviously optional, but it would help catch such issues earlier.

The CI that runs on these instances is implemented in ci/run.sh and can also be run locally by developers to make sure everything is good. It downloads some extra data into a temp folder, but that is a one-time download.

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023