
How to run with low vram/ram? #4

Open
Ariiio opened this issue Oct 23, 2024 · 2 comments

Comments


Ariiio commented Oct 23, 2024

I want to run this model locally so I can caption a lot of images. My problem, however, is that I get the following error message every time:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 8.00 GiB of which 0 bytes is free. Of the allocated memory 14.51 GiB is allocated by PyTorch, and 117.23 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I've tried setting 'PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:10' and some other low values, and nothing worked.

I have 8 GB of VRAM and 16 GB of RAM.
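For reference, that variable only takes effect if it is set before PyTorch first touches CUDA, so it has to be exported (or assigned in `os.environ`) before `import torch`. A minimal sketch — the 128 MiB split size is just an illustrative value, and note that this setting only mitigates fragmentation; it cannot make a model that needs ~14 GiB fit in 8 GiB of VRAM:

```python
import os

# Must be set BEFORE torch initializes CUDA, i.e. before "import torch".
# max_split_size_mb only reduces fragmentation of already-allocated memory;
# it does not lower the total memory the model itself requires.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import torch only after the variable is in place
```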

fpgaminer (Owner) commented

You can try setting "device_map" to "auto" and possibly using HuggingFace accelerate to offload the model to RAM. Quantization could also help fit the entire model in VRAM, but it currently isn't working (see issue #3). Hopefully I'll have that fixed soon.
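A minimal sketch of that approach, assuming the transformers/accelerate stack and the `fancyfeast/llama-joycaption-alpha-two-hf-llava` repo id (adjust to whichever checkpoint you are using). `max_memory` caps GPU usage below the card's 8 GiB so accelerate spills the remaining layers to CPU RAM; expect offloaded inference to be slow:

```python
# Assumed repo id; substitute the JoyCaption checkpoint you actually use.
MODEL_ID = "fancyfeast/llama-joycaption-alpha-two-hf-llava"

# Leave headroom below the 8 GiB card; the rest of the weights go to CPU RAM.
MAX_MEMORY = {0: "7GiB", "cpu": "14GiB"}


def load_offloaded():
    # Lazy imports so the sketch stays importable without torch/transformers.
    import torch
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = LlavaForConditionalGeneration.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",      # let accelerate split layers across GPU and CPU
        max_memory=MAX_MEMORY,
    )
    return processor, model
```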

John6666cat commented

Alpha Two has errors in bitsandbytes, but a quantized repo in GPTQ format was released today. I was wondering if it could be used to load the model with less VRAM. Also, I think GPTQ is compatible with CPU. It is an easy-to-use format, although the disadvantage is that it does not allow on-the-fly quantization.
https://huggingface.co/OPEA/llama-joycaption-alpha-two-hf-llava-int4-sym-inc
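If that checkpoint loads through transformers' LLaVA class (an assumption — it also needs a GPTQ backend such as the auto-gptq package installed), usage would look roughly like the sketch below. Since the weights are pre-quantized to int4, an ~8B-parameter model should need far less memory than the fp16 version, with no bitsandbytes involved:

```python
# Assumed to load via transformers' LLaVA class with a GPTQ backend installed.
QUANT_ID = "OPEA/llama-joycaption-alpha-two-hf-llava-int4-sym-inc"


def load_quantized():
    # Lazy import so the sketch stays importable without transformers.
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(QUANT_ID)
    model = LlavaForConditionalGeneration.from_pretrained(
        QUANT_ID,
        device_map="auto",  # weights are already int4; no on-the-fly quantization
    )
    return processor, model
```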


3 participants