I want to run this model locally so I can caption a large number of images, but I get the following error message every time:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 8.00 GiB of which 0 bytes is free. Of the allocated memory 14.51 GiB is allocated by PyTorch, and 117.23 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I've tried setting 'PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:10' and some other lower values, but nothing worked.
I have 8 GB of VRAM and 16 GB of RAM.
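A note on why this tuning can't work here, as a sketch (the value 128 is purely illustrative, not a recommendation from this project):

```shell
# The error above shows ~14.5 GiB already allocated against an 8 GiB card,
# so no max_split_size_mb value can make the full-precision model fit:
# that setting only mitigates fragmentation within memory you actually have.
# For reference, the variable is set like this on Linux/macOS:
export PYTORCH_CUDA_ALLOC_CONF="max_split_size_mb:128"
# On Windows cmd it would be:
#   set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
```

Offloading part of the model to system RAM or loading a quantized checkpoint addresses the actual problem (total model size), which allocator tuning cannot.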
You can try setting "device_map" to "auto" and possibly using HuggingFace accelerate to offload part of the model to RAM. Quantization could also help fit the entire model in VRAM, but it currently isn't working (see issue #3). Hopefully I'll have that fixed soon.
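A minimal sketch of that suggestion, assuming the LLaVA-based JoyCaption checkpoint on the Hub (the model id and the memory caps below are assumptions; adjust them to your setup):

```python
# Sketch: automatic device placement with CPU offload via accelerate.
# Layers that don't fit under the GPU cap are placed in system RAM.
MODEL_ID = "fancyfeast/llama-joycaption-alpha-two-hf-llava"  # assumed repo id

# Cap GPU usage below the full 8 GiB to leave headroom for activations;
# cap CPU usage below 16 GiB to leave room for the OS.
max_memory = {0: "7GiB", "cpu": "14GiB"}

def load_model(model_id: str = MODEL_ID):
    # Heavy imports kept inside the loader so the sketch is readable
    # without torch/transformers installed.
    import torch
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",      # requires the `accelerate` package
        max_memory=max_memory,
    )
    return processor, model
```

Offloaded layers are streamed to the GPU per forward pass, so captioning will be noticeably slower than an all-VRAM load, but it avoids the OOM.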
Alpha Two has errors in bitsandbytes, but a quantized repo in GPTQ format was released today. I was wondering whether it could be used to load the model with less VRAM. I also think GPTQ is compatible with CPU inference. It's an easy-to-use format, though the disadvantage is that it doesn't allow on-the-fly quantization. https://huggingface.co/OPEA/llama-joycaption-alpha-two-hf-llava-int4-sym-inc
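A sketch of loading that int4 checkpoint through the regular transformers API (assumption: a GPTQ backend such as `gptqmodel` or `auto-gptq` is installed; int4 weights are roughly a quarter the size of bf16, which is what may let the model fit in 8 GB):

```python
# Sketch: load the pre-quantized int4 repo linked above.
QUANTIZED_ID = "OPEA/llama-joycaption-alpha-two-hf-llava-int4-sym-inc"

def load_quantized(model_id: str = QUANTIZED_ID):
    # transformers reads the quantization config stored in the repo,
    # so no extra arguments are needed beyond device placement.
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id,
        device_map="auto",
    )
    return processor, model
```

Since the repo ships already-quantized weights, there is no on-the-fly quantization step, which is exactly the trade-off mentioned above: simpler and lighter to load, but tied to the published quantization settings.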