
Limit GPU memory usage? #75

Open
sef43 opened this issue Jan 9, 2024 · 4 comments

sef43 commented Jan 9, 2024

Hello,

When running on a GPU that might be doing something else, I sometimes see out-of-memory errors:

```
CUDA Error of GINTint2e_jk_kernel: out of memory
```

Is it possible to specify a hard limit on the amount of memory used by these kernels?

wxj6000 (Collaborator) commented Jan 9, 2024

GPU memory is mostly allocated via CuPy. If you want the GPU to remain available for other work, you can set a memory limit through CuPy: https://docs.cupy.dev/en/stable/user_guide/memory.html#limiting-gpu-memory-usage
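For reference, a minimal sketch of the CuPy limit described in that doc (the 4 GiB value is illustrative; requires a CUDA-capable environment):

```python
import cupy

# Cap the default CuPy memory pool at 4 GiB (value is illustrative)
pool = cupy.get_default_memory_pool()
pool.set_limit(size=4 * 1024**3)

# The same cap can be applied without code changes via the environment:
#   export CUPY_GPU_MEMORY_LIMIT="4294967296"   # bytes
```

Note this only bounds allocations that go through CuPy's pool; memory allocated directly by CUDA kernels is not affected, which is the caveat in the next paragraph.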

Although the GINT* kernels do not allocate global memory explicitly, they allocate a lot of local memory for high angular momenta. That local memory is ultimately backed by global memory, so for high angular momenta you may still hit the 'out of memory' issue.

sef43 (Author) commented Jan 9, 2024

Thank you for the explanation.

sef43 (Author) commented Oct 2, 2024

Hello, I am reopening this issue.

I have found that if I turn on CUDA MPS and limit the number of active threads with `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=50`, a calculation that usually fails with `CUDA Error of GINTint2e_jk_kernel: out of memory` will succeed (taking only about 1.5x longer, not 2x longer).

My understanding is that this reduces the amount of local/shared memory in use at once, which stops the errors at the expense of runtime.
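For anyone reproducing the workaround, a sketch of how it might be set up (the daemon invocation and the script name are assumptions about the local setup; exact steps depend on the driver configuration):

```shell
# Start the CUDA MPS control daemon (requires suitable permissions)
nvidia-cuda-mps-control -d

# Cap each MPS client at ~50% of the SM threads, then run the job
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=50
python run_calculation.py   # hypothetical driver script
```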

Is it possible to make a similar modification at runtime, or at compile time, in the code?

Maybe these values?

```c
// threads for GPU
#define THREADSX 16
#define THREADSY 16
#define THREADS (THREADSX * THREADSY)
#define MAX_STREAMS 16
```
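As a sketch of the kind of compile-time change this suggests (untested; the 8x8 value is illustrative, and other parts of the kernels may assume 16x16 blocks):

```c
// Hypothetical halved block dimensions: an 8x8 block runs 64 threads
// instead of 256, shrinking the per-block local-memory footprint
// roughly 4x, analogous to the MPS 50% active-thread cap
#define THREADSX 8
#define THREADSY 8
#define THREADS (THREADSX * THREADSY)
#define MAX_STREAMS 16
```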

wxj6000 (Collaborator) commented Oct 3, 2024

This is a good suggestion. If some of the threads are turned off, there is no need to allocate local memory for them. We can take it as one of the possible solutions.
