How much VRAM do you need to run Inference? #9

Open
FFFiend opened this issue Jun 12, 2023 · 0 comments
FFFiend commented Jun 12, 2023

Hey there, I'd just like to fork your repo and test it out on a couple of prompts, which I'm assuming I can do simply by running the pipelines set up in run_test.py (if I'm wrong about this, could you clarify?). I'm wondering how much VRAM I need to do this. The weights themselves are around 27 GB and the LLaMA model is 13 GB, so can I load everything onto a single 40 GB GPU?

Thanks.
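
As a rough, repo-agnostic back-of-the-envelope check: the resident weight footprint is approximately parameter count × bytes per parameter (e.g. a 13B-parameter model in fp16 is about 13e9 × 2 ≈ 26 GB), plus extra headroom for activations and the KV cache. The sketch below is a minimal way to measure this empirically, assuming a Hugging Face-format checkpoint and the transformers/accelerate stack; MODEL_PATH is a placeholder, not the path actually used by run_test.py.

```python
# Minimal VRAM sanity check for loading a causal LM in half precision.
# Assumes: torch, transformers, and accelerate are installed, and the
# checkpoint at MODEL_PATH is in Hugging Face format (placeholder path).

import torch
from transformers import AutoModelForCausalLM

MODEL_PATH = "path/to/checkpoint"  # placeholder

model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,  # halves the footprint relative to fp32
    device_map="auto",          # lets accelerate place layers on available GPUs
)

# Measured size of the loaded weights, in GB.
print(f"weights: {model.get_memory_footprint() / 1e9:.1f} GB")
print(f"allocated on GPU 0: {torch.cuda.memory_allocated(0) / 1e9:.1f} GB")
```

Whether a single 40 GB card suffices then mostly comes down to whether both sets of weights need to be resident at once and in what precision; this is only a generic estimate, not a statement about how this repo loads its weights.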
