
torch.OutOfMemoryError: Allocation on device #6073

Open
JvavNumOne opened this issue Dec 16, 2024 · 8 comments

Labels
User Support: A user needs help with something, probably not a bug.

Comments

@JvavNumOne
Your question

SamplerCustomAdvanced
Allocation on device

Logs

2024-12-16 17:10:56,767 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-12-16 17:10:56,770 - root - ERROR - Traceback (most recent call last):
  File "/root/ComfyUI-aki-v1.4/execution.py", line 317, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/execution.py", line 192, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/root/ComfyUI-aki-v1.4/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy_extras/nodes_custom_sampler.py", line 612, in sample
    samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/samplers.py", line 716, in sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/samplers.py", line 695, in inner_sample
    samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/samplers.py", line 600, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/comfyui/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/k_diffusion/sampling.py", line 1022, in sample_deis
    denoised = model(x_cur, t_cur * s_in, **extra_args)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/samplers.py", line 299, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/samplers.py", line 682, in __call__
    return self.predict_noise(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/samplers.py", line 685, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/samplers.py", line 279, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/samplers.py", line 228, in calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/custom_nodes/ComfyUI-Advanced-ControlNet/adv_control/utils.py", line 69, in apply_model_uncond_cleanup_wrapper
    return orig_apply_model(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/model_base.py", line 142, in apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/comfyui/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/comfyui/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/ldm/flux/model.py", line 159, in forward
    out = self.forward_orig(img, img_ids, context, txt_ids, timestep, y, guidance, control)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/custom_nodes/ComfyUI-PuLID-Flux/pulidflux.py", line 116, in forward_orig
    img = block(img, vec=vec, pe=pe)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/comfyui/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/comfyui/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/ComfyUI-aki-v1.4/comfy/ldm/flux/layers.py", line 231, in forward
    output = self.linear2(torch.cat((attn, self.mlp_act(mlp)), 2))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.OutOfMemoryError: Allocation on device

Other

128 GB system memory; RTX 4090D with 24 GB VRAM.

• Name: cuda:0 NVIDIA GeForce RTX 4090 D : cudaMallocAsync
• Type: cuda
• VRAM Total: 25269436416
• VRAM Free: 20775045028
• Torch VRAM Total: 2080374784
• Torch VRAM Free: 341772196
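For readability, the raw byte counts above can be converted to GiB with a small stdlib-only Python snippet (the values are copied from the report above):

```python
# Convert the reported VRAM byte counts to GiB (1 GiB = 2**30 bytes).
stats = {
    "VRAM Total": 25269436416,
    "VRAM Free": 20775045028,
    "Torch VRAM Total": 2080374784,
    "Torch VRAM Free": 341772196,
}
for name, nbytes in stats.items():
    print(f"{name}: {nbytes / 2**30:.2f} GiB")
# VRAM Total: 23.53 GiB, VRAM Free: 19.35 GiB,
# Torch VRAM Total: 1.94 GiB, Torch VRAM Free: 0.32 GiB
```

Note the mismatch: roughly 19.3 GiB is reported free on the device overall, but under 0.4 GiB is free inside the pool PyTorch has already reserved, which is consistent with a fragmentation-style OOM rather than the card actually being full.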
@JvavNumOne JvavNumOne added the User Support A user needs help with something, probably not a bug. label Dec 16, 2024
@JvavNumOne (Author)

The first time it runs, it encounters an OOM (Out of Memory) error. However, if you click again, it runs successfully. After that, every subsequent click will result in an OOM error.

@Domi443 commented Dec 16, 2024

The first time it runs, it encounters an OOM (Out of Memory) error. However, if you click again, it runs successfully. After that, every subsequent click will result in an OOM error.

No, I can't generate any image when this pops up, even when I reload and queue again.

@JvavNumOne (Author)

The first time it runs, it encounters an OOM (Out of Memory) error. However, if you click again, it runs successfully. After that, every subsequent click will result in an OOM error.

No, I can't generate any image when this pops up. Even when I reload and queue again

Did you encounter this problem as well? Has it been solved?

@JvavNumOne (Author)

How can I configure PyTorch to use more of the cache? It reports an OOM (out-of-memory) error, but I still have around 10 GB of VRAM free.
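You can't directly tell PyTorch to "use more cache", but the CUDA caching allocator can be tuned through the documented PYTORCH_CUDA_ALLOC_CONF environment variable. A minimal sketch follows; the specific option values are illustrative starting points, not tested recommendations for this workflow:

```python
import os

# Must be set before the first CUDA allocation (ideally before importing torch).
# expandable_segments reduces fragmentation-driven OOMs on recent PyTorch builds;
# max_split_size_mb caps how large a cached block the allocator will split.
# Both options are described in PyTorch's CUDA memory management notes.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
# Alternatively: os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

The same setting can be applied from the shell when launching ComfyUI, e.g. `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python main.py`.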

@Alasundru

I am getting this error on my PC, which has a 4070 card with 16 GB of VRAM, yet on my laptop, which has a 3080 with 8 GB of VRAM, the same workflow, models, etc. runs (slowly) but fine. What's the go? Is there something wrong with the way I installed CUDA/PyTorch on my PC? I don't really want to do a fresh install of ComfyUI; it takes forever to compile everything and redownload the nodes.
I have removed and reinstalled the CUDA/PyTorch stack in the past, which hasn't fixed anything, and the versions are correct for my Python and card. Q_Q please haaalp

@JvavNumOne (Author)

I am getting this error on my PC, which has a 4070 card with 16 GB of VRAM, yet on my laptop, which has a 3080 with 8 GB of VRAM, the same workflow, models, etc. runs (slowly) but fine. What's the go? Is there something wrong with the way I installed CUDA/PyTorch on my PC? I don't really want to do a fresh install of ComfyUI; it takes forever to compile everything and redownload the nodes. I have removed and reinstalled the CUDA/PyTorch stack in the past, which hasn't fixed anything, and the versions are correct for my Python and card. Q_Q please haaalp

I'm not sure either, but I got an alert here saying that the weight file wasn't loaded. I guess it might be the reason. I'm looking for this file to see if I can manually allocate VRAM.
2024-12-16 17:10:14,067 - root - WARNING - Warning torch.load doesn't support weights_only on this pytorch version, loading unsafely.
2024-12-16 17:10:14,454 - root - WARNING - clip missing: ['text_projection.weight']
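That first warning means the installed torch build predates the `weights_only` argument, so checkpoints are unpickled unsafely. On newer PyTorch versions (1.13+), safe checkpoint loading looks roughly like this; a generic sketch, not ComfyUI's actual loading code:

```python
import os
import tempfile

import torch

# Round-trip a checkpoint through torch.save/torch.load.
# weights_only=True restricts unpickling to tensors and primitive
# containers, avoiding arbitrary-code execution when loading
# untrusted checkpoint files.
path = os.path.join(tempfile.mkdtemp(), "ckpt.pt")
torch.save({"w": torch.zeros(2, 2)}, path)
state = torch.load(path, weights_only=True, map_location="cpu")
print(state["w"].shape)  # torch.Size([2, 2])
```

This only addresses the loading warning, not the OOM itself; the `clip missing: ['text_projection.weight']` line is a separate (usually benign) notice about an absent key in the CLIP checkpoint.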

@diogomathe

My problem is always this: "ApplyPulidFlux: Allocation on device".

@PyrateGFXProductions

I am having similar Flux memory issues. I solved the OOM with the Garbage Collector node from ControlFlowUtils.
However, image generation takes at least 10 minutes on my 3060 12 GB, and when it gets to the upscale...
well, just forget it ever completing the generation. I had a generation run for over 9 hours and it was only half done!
So my conclusion is that there is still a model memory issue with Flux that needs to be fixed!

5 participants