server: multimodal - fix misreported prompt and num prompt tokens #5896

cjpais · 2024-03-06T04:52:09Z

Should address #5852, #5863

The issue is that all normal token processing is skipped for multimodal requests, so the response reports that no tokens were processed. This is wrong.

To fix this, we set the number of tokens processed to its correct value in ingest_images, where the prompt is tokenized for multimodal.

Additionally, this fixes the prompt being reported as the empty string in multimodal responses. Since the initial prompt was cleared earlier in processing, we iteratively rebuild it.
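A minimal sketch of the two fixes described above, not the actual diff. Every name here (`ImageSegment`, `Slot`, `count_tokens`, `rebuild_prompt_and_count`) is a hypothetical stand-in for the server's real types, the tokenizer is a toy that counts whitespace-separated words, and counting the image embedding positions toward the total is an assumption rather than something this PR states.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one image reference in a multimodal prompt.
struct ImageSegment {
    int id;                     // N from the [img-N] tag
    std::string prefix_text;    // prompt text that precedes the tag
    std::size_t n_embd_tokens;  // positions taken by the image embedding (assumed to count)
};

// Hypothetical stand-in for the per-request state.
struct Slot {
    std::vector<ImageSegment> images;
    std::string suffix_text;          // prompt text after the last image tag
    std::string prompt;               // was cleared earlier in the multimodal path
    std::size_t n_prompt_tokens = 0;  // value reported back in the response
};

// Toy tokenizer (assumption): one token per whitespace-separated word.
std::size_t count_tokens(const std::string & text) {
    std::istringstream in(text);
    std::string word;
    std::size_t n = 0;
    while (in >> word) {
        ++n;
    }
    return n;
}

// Walk the segments once, accumulating the token count and
// rebuilding the prompt string (including the [img-N] tags).
void rebuild_prompt_and_count(Slot & slot) {
    slot.prompt.clear();
    slot.n_prompt_tokens = 0;
    for (const ImageSegment & img : slot.images) {
        assert(img.id >= 0);  // image ids are assumed non-negative
        slot.n_prompt_tokens += count_tokens(img.prefix_text) + img.n_embd_tokens;
        slot.prompt += img.prefix_text + "[img-" + std::to_string(img.id) + "]";
    }
    slot.n_prompt_tokens += count_tokens(slot.suffix_text);
    slot.prompt += slot.suffix_text;
}
```

With one image of id 10, prefix `"USER: "`, 576 embedding positions, and suffix `" Describe the image. ASSISTANT:"`, this sketch rebuilds the prompt as `USER: [img-10] Describe the image. ASSISTANT:` and reports 581 tokens under the toy tokenizer.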

I think it would be best to unify the token processing between multimodal and regular, but this PR does not aim to address that.

chigkim · 2024-03-06T05:14:10Z

Awesome! Confirming it works! Thank you!

ggerganov · 2024-03-06T07:09:45Z

I'm attempting to refactor most of the server code in #5882. The tentative plan is to remove multimodal functionality and potentially reintroduce it at a later stage. In the meantime, llava-cli can serve as a LLaVA example.

Since it touches pretty much the entire code, I don't plan to merge server-related PRs in the meantime, to avoid resolving conflicts.

cjpais · 2024-03-06T18:04:03Z

Wasn't aware of the rewrite, but excited for it. Makes sense to me not to merge.

Happy to close this PR if it makes sense to you. I'll keep my branch up for those who want to patch or cherrypick.

mathpopo · 2024-03-27T09:01:12Z

Wasn't aware of the rewrite, but excited for it. Makes sense to me not to merge.

Happy to close this PR if it makes sense to you. I'll keep my branch up for those who want to patch or cherrypick.

Can you give me the branch name? I am using the latest version, and uploading an image to the server just gives an error.

crashr · 2024-03-29T12:26:36Z

@mathpopo You can fetch from this PR and create a local branch. You could name it "multimodal".

git fetch origin pull/5896/head:multimodal
git checkout multimodal

ggerganov · 2024-05-10T14:22:11Z

This is outdated since we removed the multimodal functionality from the server.

cjpais added 2 commits March 5, 2024 20:37

fix num tokens for multimodal + empty prompt in response

5db4c71

add back the [img-id]

a98a166

This was referenced Mar 6, 2024

Question about llama.cpp and llava-cli when used with llava 1.6 for vision: #5852

Closed

Server always incorrectly reports 1 for prompt_n, tokens_evaluated, and n_prompt_tokens_processed when using Llava 1.6. #5863

Closed

phymbert mentioned this pull request Mar 12, 2024

server : improvements and maintenance #4216

Open

10 tasks

chigkim mentioned this pull request Mar 12, 2024

LLaVA 1.6 Models Unable to Process Specific Image Size and Resolution Locally ollama/ollama#2429

Open

This was referenced Apr 16, 2024

Please update Llama.cpp Server. It now Supports Llava 1.6 Image Embedding Dimension properly. ollama/ollama#2795

Closed

Ollama Reports 0 Prompt Tokens When Using Llava ollama/ollama#3671

Closed

cjpais mentioned this pull request May 2, 2024

server: multimodal - fix misreported prompt and num prompt tokens Mozilla-Ocho/llamafile#392

Merged

mofosyne added the server and Review Complexity : Low labels May 10, 2024

Merge branch 'master' into server-fix-num-token-eval

c25d83d

ggerganov closed this May 10, 2024

mofosyne added the obsolete? label May 10, 2024
