Missing LLaVA 1.6 support for handling custom templates with respect to the chosen LLM #1301

Closed · DomainFlag opened this issue Mar 25, 2024 · 8 comments · Fixed by #1147
Labels: enhancement (New feature or request)

DomainFlag commented Mar 25, 2024

I'm using Llava15ChatHandler, but looking at the source code I don't see anything like a Llava16ChatHandler. Moreover, the handler contains hard-coded templating instead of supporting the custom in-model template given by the tokenizer.chat_template metadata field, for example the one shipped with Nous Hermes 2 Yi 34B (Link), which is quite different from the hard-coded one. Any plans for that? Is LLaVA 1.6 really supported, or should I fall back to the parent project?

Update: With the current state of the codebase I can get fairly okay results during inference, but I'm not sure whether there is some regression; I still need to check against the original implementation and compare.
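For reference, a minimal sketch of how the 1.5 handler currently gets wired up via the documented Llava15ChatHandler API (the model/projector paths and the image URL are placeholders); the prompt format it applies is the hard-coded LLaVA 1.5 one, regardless of the model's tokenizer.chat_template:

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The handler bundles the CLIP projector and applies a hard-coded LLaVA 1.5 prompt format.
chat_handler = Llava15ChatHandler(clip_model_path="./mmproj-model-f16.gguf")

llm = Llama(
    model_path="./llava-v1.6-34b.Q4_K_M.gguf",  # placeholder path
    chat_handler=chat_handler,
    n_ctx=4096,       # extra context to leave room for the image embeddings
    logits_all=True,  # the llava handlers have required this in recent releases
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant that describes images."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/image.png"}},
                {"type": "text", "text": "Describe this image in detail."},
            ],
        },
    ]
)
print(response["choices"][0]["message"]["content"])
```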

abetlen added the enhancement (New feature or request) label on Apr 4, 2024
abetlen (Owner) commented Apr 4, 2024

Started here #1147 but got sidetracked.

shelbywhite commented

Would love to see full support for LLaVA 1.6 in this project.

Vinventive commented

> Started here #1147 but got sidetracked.

Definitely. There will be new, ground-breaking LLaVA models coming this month, fine-tuned on Llama 3. It would be great to run them quantized in GGUF using this llama-cpp-python library.

abetlen (Owner) commented Apr 28, 2024

Coming soon in #1147; already added LLaVA 1.6, Obsidian, and Moondream support using the new system.
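Once that lands, usage will presumably mirror the 1.5 handler. A minimal sketch, assuming the new class is exposed as Llava16ChatHandler (the class name and file paths here are assumptions, not confirmed API):

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava16ChatHandler  # assumed name from #1147

chat_handler = Llava16ChatHandler(clip_model_path="./mmproj-model-f16.gguf")  # placeholder path
llm = Llama(
    model_path="./llava-v1.6-mistral-7b.Q4_K_M.gguf",  # placeholder path
    chat_handler=chat_handler,
    n_ctx=4096,
)
```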

Vinventive commented

We appreciate the addition of LLaVA 1.6 34B support. It would be great to also have support for smaller 7B quants and projectors, or at least a single cjpais/llava-1.6-mistral-7b-gguf. That would be truly awesome!

abetlen (Owner) commented Apr 30, 2024

@Vinventive I wasn't aware there were differences in the chat formats. Do you mind sharing a link? I'll add that right away, cheers!

Vinventive commented

> @Vinventive I wasn't aware there were differences in the chat formats. Do you mind sharing a link? I'll add that right away, cheers!

Here is the link: ggml-org/llama.cpp#5267

For Mistral, using the llava-cli binary, add this: -p "\nUSER:\nProvide a full description.\nASSISTANT:\n"
The Mistral template for llava-1.6 seems to use no system prompt and USER/ASSISTANT roles.
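Purely as an illustration of that format (not code from the PR), flattening a message list the way the llava-cli -p string above is built might look like this, assuming no system prompt and plain USER/ASSISTANT turns:

```python
def format_llava16_mistral(messages):
    """Flatten chat messages into the USER/ASSISTANT prompt style described above.

    Illustrative only: llava-1.6 Mistral reportedly uses no system prompt,
    so system messages are simply dropped here.
    """
    parts = []
    for message in messages:
        if message["role"] == "user":
            parts.append(f"\nUSER:\n{message['content']}")
        elif message["role"] == "assistant":
            parts.append(f"\nASSISTANT:\n{message['content']}")
    parts.append("\nASSISTANT:\n")  # trailing generation prompt
    return "".join(parts)

print(repr(format_llava16_mistral([{"role": "user", "content": "Provide a full description."}])))
# '\nUSER:\nProvide a full description.\nASSISTANT:\n'
```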

Vinventive commented

I'm really struggling to run LLaVA with CUDA instead of cuBLAS, and I was wondering whether it's just an isolated issue; I've seen another open issue where people are running into similar problems: #1393.

Maybe we're doing something incorrectly, or there is missing info/a missing step in the README on how to run it on Windows 64-bit?
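One quick sanity check is whether the installed wheel was built with GPU offload at all (assuming a llama-cpp-python version recent enough to expose this binding):

```python
# Prints True if the installed llama.cpp build supports GPU offload (e.g. CUDA), False otherwise.
import llama_cpp
print(llama_cpp.llama_supports_gpu_offload())
```

If this prints False, the wheel was built CPU-only and needs to be reinstalled with the CUDA CMake flag described in the README.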
