[Question] Can LLaVA run inference on CPU? #865
Comments
You need to install the CPU build of torch and set the device map to cpu when loading the model. @wenli135
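A minimal sketch of what that could look like, assuming the Hugging Face port of LLaVA 1.5 (`llava-hf/llava-1.5-7b-hf`) rather than the repo's own loader; the model ID, image path, and prompt format are illustrative assumptions, not code from this thread:

```python
# Sketch: CPU-only LLaVA inference via the Hugging Face port of the model.
# The model ID, prompt format, and image path are assumptions for illustration.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed HF port of LLaVA 1.5

# device_map="cpu" keeps every weight on the CPU, so no CUDA build is required.
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float32,   # fp32 is the safe default on CPU
    device_map="cpu",
    low_cpu_mem_usage=True,
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder image path
prompt = "USER: <image>\nWhat is shown in this picture? ASSISTANT:"

inputs = processor(text=prompt, images=image, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```

Expect CPU inference to be much slower than on a GPU; the point is only that it runs without a CUDA-enabled torch build.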
Is it possible for you to give a complete example of how to run LLaVA_13b_4bit_vanilla_colab without a GPU?
I made some changes in the code to run inference on CPU. The model loads, but I am getting an error:
Was anyone able to run LLaVA inference on CPU without installing the Intel Extension for PyTorch environment? Any pointer would be really helpful.
Hi Ratan, could you tell us why you don't want to use the Intel Extension for PyTorch? Thanks.
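For reference, the Intel Extension for PyTorch route mentioned here typically amounts to optimizing an already-loaded CPU model with `ipex.optimize`. The sketch below uses a small placeholder causal LM (`facebook/opt-125m`) instead of LLaVA, and the bf16 choice is an assumption that depends on the CPU supporting it:

```python
# Sketch of the Intel Extension for PyTorch path (an assumption about how it
# would be wired up, not code from this thread).
import torch
import intel_extension_for_pytorch as ipex  # requires the ipex package
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"  # small placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

# ipex.optimize applies CPU-oriented operator/graph optimizations;
# dtype=torch.bfloat16 is an assumption and requires bf16-capable hardware.
model = ipex.optimize(model, dtype=torch.bfloat16)

inputs = tokenizer("A short test prompt.", return_tensors="pt")
with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```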
Tried some of these paths:
So, natively, from HF:
which I can't explain, because using
=> blocked here. Finally, regarding https://github.com/intel/xFasterTransformer, it's not quite clear whether it replaces or complements intel-extension-for-pytorch [CPU/XPU], or which specific hardware it targets. If anyone could come up with answers/solutions for at least some of these, that'd be great.
Question
I was trying to run LLaVA inference on CPU, but it complains "Torch not compiled with CUDA enabled". I noticed that cuda() is called when loading the model. If I remove all the cuda() invocations, is it possible to run inference on CPU?
Thanks.
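The usual fix is not just deleting the cuda() calls but replacing them with device-agnostic moves. A tiny self-contained illustration of the pattern (a stand-in linear layer, not the actual LLaVA code):

```python
# Choose the device once instead of hard-coding .cuda(), then move modules and
# tensors with .to(device) so the same script also runs on CPU-only machines.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(8, 2).to(device)   # stand-in for the real model; replaces model.cuda()
x = torch.randn(1, 8).to(device)     # replaces tensor.cuda()

with torch.no_grad():
    print(model(x))
```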