
[Bug] LLaVa-next does not work for single image processing #1506

Closed
5 tasks done
ThomasBenzshawel opened this issue Sep 24, 2024 · 1 comment · Fixed by #1592

@ThomasBenzshawel

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submit lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
  • 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.
  • 5. Please use English; otherwise the issue will be closed.

Describe the bug

INFO 09-24 16:44:39 weight_utils.py:236] Using model weights format ['*.safetensors']
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:00<00:01, 2.63it/s]
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:01<00:01, 1.34it/s]
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:02<00:00, 1.19it/s]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:03<00:00, 1.16it/s]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:03<00:00, 1.24it/s]

chat template: llama-3-instruct

========== single ==========

[16:46:44 TP0] Exception in ModelTpServer:
Traceback (most recent call last):
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/tp_worker.py", line 239, in exposed_step
self.forward_step()
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/tp_worker.py", line 259, in forward_step
self.forward_prefill_batch(new_batch)
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/sglang/srt/managers/tp_worker.py", line 560, in forward_prefill_batch
logits_output = self.model_runner.forward(batch)
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/sglang/srt/model_executor/model_runner.py", line 519, in forward
return self.forward_extend_multi_modal(batch)
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/sglang/srt/model_executor/model_runner.py", line 506, in forward_extend_multi_modal
return self.model.forward(
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/benzshawelt/.conda/envs/sglang/lib/python3.10/site-packages/sglang/srt/models/llava.py", line 188, in forward
if modalities_list[image_idx] == "image":
IndexError: list index out of range

When sending a single image to the model, the server crashes with the IndexError above.
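For context, the failing line in llava.py indexes a per-image modalities list by image position. Below is a minimal standalone sketch of the failure mode and a defensive guard, using hypothetical values; this is not sglang's actual code, and the real fix landed in #1592:

# Hypothetical values illustrating the crash: the modalities list comes back
# shorter than the number of image placeholders for a single-image request.
modalities_list = []   # assumed empty in the buggy single-image path
image_offsets = [17]   # hypothetical: one image placeholder found in the prompt

for image_idx, offset in enumerate(image_offsets):
    # modalities_list[image_idx] raises IndexError when the list is too short;
    # a bounds check like this avoids indexing past the end.
    modality = modalities_list[image_idx] if image_idx < len(modalities_list) else "image"
    if modality == "image":
        print(f"processing image at token offset {offset}")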

Reproduction

I installed sglang following the instructions, ran the llava-next quick-start example from the repo, and hit the error above.
https://github.com/sgl-project/sglang/blob/main/examples/frontend_language/quick_start/local_example_llava_next.py
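For reference, here is a minimal sketch of the single-image call path the linked example exercises, written with sglang's frontend language; the model path, image path, and question are assumptions for illustration, not a verbatim copy of the script:

import sglang as sgl

@sgl.function
def image_qa(s, image_path, question):
    # A single image in the prompt is enough to trigger the crash.
    s += sgl.user(sgl.image(image_path) + question)
    s += sgl.assistant(sgl.gen("answer"))

# Assumed model; the quick-start script may use a different checkpoint.
runtime = sgl.Runtime(model_path="lmms-lab/llama3-llava-next-8b")
sgl.set_default_backend(runtime)

state = image_qa.run(image_path="images/cat.jpeg", question="What is in this image?")
print(state["answer"])
runtime.shutdown()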

Environment

The environment is a plain conda env with Python 3.10 and ONLY the installations described in the sglang README. This is on a computer using Slurm; I am on a node with 2 NVIDIA H100 GPUs.

[screenshot: environment info]

@merrymercy
Contributor

merrymercy commented Oct 6, 2024

Fixed by #1592
