What is the issue?
ggerganov/llama.cpp#6965 was just merged into llama.cpp; it contains important improvements to how tokenization works for Llama 3 and other models. An example of the issue is noted here.
Hopefully ollama can update to the latest llama.cpp quickly and make a new release.
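For anyone trying to confirm whether their ollama build is affected, one rough way to probe tokenization from the outside is to compare the `prompt_eval_count` that ollama's `/api/generate` endpoint reports for the same string across versions: a tokenizer change shows up as a different token count. This is only a sketch; the test string and model name below are placeholders, not the actual repro from the example linked above.

```python
import json
import urllib.request

# Default local ollama endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"

def prompt_token_count(model: str, prompt: str) -> int:
    """Return how many tokens ollama split the prompt into."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        # We only care about prompt evaluation, so generate a single token.
        "options": {"num_predict": 1},
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # prompt_eval_count is part of the final (non-streaming) response.
    return data["prompt_eval_count"]

if __name__ == "__main__":
    # Placeholder probe string; tokenizer regressions often show up around
    # digits, punctuation, and whitespace runs.
    probe = "The quick brown fox... 1234567890 ->  multiple   spaces"
    print(prompt_token_count("llama3", probe))
```

Running this against two ollama versions with the same model should print the same count once the tokenizer fix has landed in both.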
OS
Linux
GPU
Nvidia
CPU
AMD
Ollama version
all versions up to this point
ggerganov/llama.cpp#6965 has been merged now. I'm not sure when this was fixed in ollama, but I just tested with 0.1.35 and can't reproduce it anymore. Closing.
The llama.cpp commit pinned in ollama is dated 4/30, while ggerganov/llama.cpp#6965 was merged into llama.cpp on 5/9 (see the sketch below for one way to check the pin).
So it doesn't look like this merge was included in the latest 0.1.37 ollama release.
Does that mean ollama itself was patched to handle the previous llama.cpp behavior, and that a future llama.cpp sync in ollama will change the behavior again?
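For reference, one way to check which llama.cpp commit an ollama source checkout pins is to read the git submodule state. This sketch assumes llama.cpp is vendored as a submodule at `llm/llama.cpp`, which is how the 0.1.x source tree was laid out; that path is an assumption and may differ between releases.

```python
import subprocess

# Assumed submodule path for llama.cpp inside the ollama repo (0.1.x layout).
SUBMODULE_PATH = "llm/llama.cpp"

def pinned_llama_cpp_commit(ollama_repo: str) -> str:
    """Return the llama.cpp commit hash the ollama checkout pins."""
    # `git submodule status` prints "<hash> <path> (<describe>)" per
    # submodule; a leading '-', '+', or 'U' flags an uninitialized,
    # out-of-sync, or conflicted submodule.
    out = subprocess.check_output(
        ["git", "-C", ollama_repo, "submodule", "status", SUBMODULE_PATH],
        text=True,
    )
    return out.strip().lstrip("-+U").split()[0]

def commit_date(ollama_repo: str, commit: str) -> str:
    """Return the commit date, resolved inside the (initialized) submodule."""
    return subprocess.check_output(
        ["git", "-C", f"{ollama_repo}/{SUBMODULE_PATH}",
         "show", "-s", "--format=%ci", commit],
        text=True,
    ).strip()

if __name__ == "__main__":
    repo = "."  # path to an ollama clone
    commit = pinned_llama_cpp_commit(repo)
    print(commit, commit_date(repo, commit))
```

Checking whether the printed date predates the 5/9 merge of ggerganov/llama.cpp#6965 answers whether a given release could include the fix.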