
Llava 1.6: server not decoding images, but works via CLI #5515

Closed
tctrautman opened this issue Feb 15, 2024 · 6 comments

Comments

@tctrautman

tctrautman commented Feb 15, 2024

First, let me say that I really appreciate all the work you guys are putting into llama.cpp -- it's really impressive.

I'm testing out yesterday's release of Llava 1.6 (thanks so much for working on that tricky PR, @cmp-nct). It works well via the CLI, but when I run it via the server, I see the following error when it receives a request:

clip_image_load_from_bytes: failed to decode image bytes
slot 0 - failed to load image [id: 12]
task 1 - error: internal_error

How I'm running via CLI (works)

./llava-cli -m ./models/llava-1-6/mistral-7b-q_5_k.gguf --mmproj ./models/llava-1-6/mmproj-mistral7b-f16.gguf --image ./media/images/ginsberg.png -p "Who is this?" --temp 0.1

How I'm running via Server (doesn't work)

To start the server:

./server -m ./models/llava-1-6/mistral-7b-q_5_k.gguf --mmproj ./models/llava-1-6/mmproj-mistral7b-f16.gguf --host 127.0.0.1 --port 8080

The request I'm sending:

curl --request POST \
  --url http://localhost:8080/completion \
  --header 'Content-Type: application/json' \
  --data '{
	"prompt": "USER:[img-12]Who is this?.\nASSISTANT:",
	"temperature": 0.1,
	"image_data": [
		{
			"data": <BASE64_IMG>,
			"id": 12
		}
	]
}'

BASE64_IMG is the base64 encoding of the image below:

[image: ginsberg]

Details

  • My system: 2021 M1 Max MBP w/ 64 GB of RAM, running Sonoma 14.3
  • Llava 1.6 Model files: both from this HF repo
    • Model: mistral-7b-q_5_k.gguf
    • Mmproj: mmproj-mistral7b-f16.gguf
  • Version of llama.cpp: I'm on the most recent commit as of this issue, commit 4524290e87b8e107cc2b56e1251751546f4b9051
@tctrautman
Author

Seems possibly related to #5514, but that's pure speculation

@cmp-nct
Contributor

cmp-nct commented Feb 15, 2024

Yes, it's the same issue; I responded in that thread with what needs to be done.

@tctrautman
Author

Thanks for the thorough explanation in that issue, @cmp-nct. I'll close this issue out since it's a duplicate of that one.

@tctrautman
Author

Opening this back up.

After watching the conversation develop in the other ticket, it looks like that issue stems from differences in the system prompt between the CLI and the server.

But the error here appears while the server is loading the base64 image into clip. Also, AFAICT this error doesn't touch process_images at all, but instead comes from launch_slot_with_data:

https://github.com/ggerganov/llama.cpp/blob/4524290e87b8e107cc2b56e1251751546f4b9051/examples/server/server.cpp#L686-L698

I haven't made much progress with debugging (unfamiliar with C++), but I'll see if I can dive in a bit deeper over the next day or two.
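
(A minimal sanity check, not from the original thread: decode the base64 payload locally and confirm it is still a valid image before suspecting the server. The img.b64 file name is just a placeholder for wherever the payload string is saved, and on older macOS the decode flag may be -D rather than --decode.)

# Decode the string that goes into "image_data" and inspect the result.
# If `file` doesn't report image data (e.g. "PNG image data"), the payload
# itself is broken and clip_image_load_from_bytes is expected to fail.
base64 --decode < img.b64 > /tmp/decoded.png
file /tmp/decoded.png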

@cjpais
Contributor

cjpais commented Feb 17, 2024

Is the base64 you're putting in valid? At first I used an online converter and got the same result as you. Then I pushed the image through the ./server UI, copied the b64 from there, and sent it from the command line without issue.

This gist is what I used: https://gist.github.com/cjpais/6b7b620d29b4a8e6ca81eb5b87371bb5

@tctrautman
Author

@cjpais that was it -- thank you! A silly mistake on my part -- I was sending the entire URL instead of just the base64 data. In case others stumble upon this issue, this might be helpful reading.
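
(For reference, a minimal sketch of producing a raw base64 string and sending it, built from the commands earlier in this thread; the key point is that "data" must contain only the base64 characters, with no data: URL prefix. The tr call and the exact base64 flags are assumptions and may differ between GNU coreutils and macOS.)

# Encode the image to raw base64 (strip any line wrapping) and embed it
# directly in the JSON body as the "data" field.
IMG_B64=$(base64 < ./media/images/ginsberg.png | tr -d '\n')

curl --request POST \
  --url http://localhost:8080/completion \
  --header 'Content-Type: application/json' \
  --data "{
    \"prompt\": \"USER:[img-12]Who is this?\\nASSISTANT:\",
    \"temperature\": 0.1,
    \"image_data\": [{ \"data\": \"$IMG_B64\", \"id\": 12 }]
  }"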
