
Question on quantized LLAMA3 versions for use with EAGLE #127

Open
jin-eld opened this issue Sep 3, 2024 · 0 comments

jin-eld commented Sep 3, 2024

Hi,

this is a question for anyone who has tried EAGLE with LLAMA3: which LLAMA3 model exactly were you using? I assume a quantized version, since the original one from Meta is huge. Which quantization gave a good ratio of quality to performance in combination with EAGLE? I would also appreciate a pointer to a quantized LLAMA3 model repo that is known to work with EAGLE. So far I have only found GGUF versions, which are not supported, and it seems I cannot quantize the original LLAMA3 myself due to insufficient RAM. Any hints would be greatly appreciated, thank you!
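For the RAM constraint specifically, one route that avoids producing a pre-quantized checkpoint at all is on-the-fly 4-bit loading with `transformers` and `bitsandbytes`. This is only a sketch of that configuration, not something verified with EAGLE: the model id is the gated Meta repo (access must be granted on Hugging Face), and whether EAGLE accepts a bitsandbytes-quantized base model is an open question.

```python
# Sketch: load LLAMA3 with on-the-fly 4-bit quantization (bitsandbytes),
# so no separately quantized checkpoint is needed on disk.
# Assumptions: transformers + bitsandbytes installed, a CUDA GPU available,
# and access granted to the gated meta-llama repo. NOT verified with EAGLE.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # quantize weights to 4 bit at load time
    bnb_4bit_quant_type="nf4",            # NormalFloat4, the usual default choice
    bnb_4bit_compute_dtype=torch.float16  # matmuls still run in fp16
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",                    # shard/place layers automatically
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```

Whether the EAGLE draft-model machinery tolerates a 4-bit base model is exactly the kind of thing this issue is asking about, so treat the snippet as a configuration to experiment with rather than a known-working setup.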
