[New Model]: LLaVA-NeXT-Video support #5124
Comments
Hi there @AmazDeng! It looks like this model is already supported on …
Yes, the latest version of Transformers now supports the llava-next-video model. However, its inference speed is very slow, so I hope you can add support for this model soon.
I do think that's something we should support (there's indeed an issue for this: #416). This will be another API change, so we need to make sure everything is compatible. At least as a first step, we do plan to support image embeddings as input (instead of …
I am trying to implement Llava-Next-Video support in #6571.
The model to consider.
The llava-next-video project has already been released, and its test results are quite good. Are there any plans to support this model?
https://github.com/LLaVA-VL/LLaVA-NeXT/blob/inference/docs/LLaVA-NeXT-Video.md
Currently, Hugging Face does not support this model.
The closest model that vLLM already supports.
No response
What difficulties do you foresee in supporting the model you want?
No response