Fix deadlock for CUDA #4044

Merged: 1 commit, Jul 22, 2024

Conversation

@WeiqunZhang (Member) commented Jul 22, 2024

`Tests/GPU/CNS/Exec/RT` hangs when run with `amrex.the_arena_init_size=0 amrex.the_arena_release_threshold=0`. The cause appears to be that CUDA host callback functions do not work well with `cudaFree` calls made in the main host thread. Note that the host callback function itself makes no CUDA API calls. Also note that `cudaMalloc` seems to work, and that using a single GPU stream also works.

As a workaround, `cudaFree` is avoided when there are host callback functions inside an `MFIter` loop.

It has been noticed that Tests/GPU/CNS/Exec/RT hangs with
`amrex.the_arena_init_size=0 amrex.the_arena_release_threshold=0`. The issue
appears to be that CUDA host callback functions do not work well with cudaFree
in the main host thread. Note that we don't have any CUDA API calls in the host
callback function. Also note that cudaMalloc seems to work and using a single
GPU stream also works.

A workaround is implemented to avoid cudaFree when there are host callback
functions inside an MFIter loop.
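
The idea behind the workaround can be pictured as deferring frees while a loop that may enqueue host callbacks is active, then releasing them once the loop has finished. What follows is a minimal C++ sketch under stated assumptions, not the code in this PR: `DeferredFreeArena`, `begin_region`, `end_region`, and `simulate_loop` are hypothetical names, and `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree` so the sketch runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical arena that postpones frees while a "callback region" is open.
// std::malloc/std::free are stand-ins for cudaMalloc/cudaFree (assumption).
class DeferredFreeArena {
public:
    void* alloc(std::size_t n) { return std::malloc(n); }

    // Mark entry to a loop that may enqueue host callbacks.
    void begin_region() { ++depth_; }

    // Mark exit from the loop; release everything queued once no region is open.
    void end_region() {
        assert(depth_ > 0);
        if (--depth_ == 0) { flush(); }
    }

    // Inside a region, queue the pointer; outside, release it immediately.
    void free(void* p) {
        if (depth_ > 0) {
            pending_.push_back(p);
        } else {
            std::free(p);
            ++released_;
        }
    }

    std::size_t pending() const { return pending_.size(); }
    std::size_t released() const { return released_; }

private:
    void flush() {
        for (void* p : pending_) {
            std::free(p);
            ++released_;
        }
        pending_.clear();
    }

    int depth_ = 0;
    std::vector<void*> pending_;
    std::size_t released_ = 0;
};

// Allocate and free n buffers inside one region, and return how many frees
// were still queued just before the region closed.
inline std::size_t simulate_loop(DeferredFreeArena& arena, int n) {
    arena.begin_region();
    for (int i = 0; i < n; ++i) {
        void* p = arena.alloc(64);
        arena.free(p);  // queued, not released, while the region is open
    }
    std::size_t queued = arena.pending();
    arena.end_region();  // all queued frees are released here
    return queued;
}
```

In this sketch no real free happens between `begin_region()` and the matching `end_region()`, which is the property the workaround relies on. Allocation is left untouched, in line with the observation that `cudaMalloc` seems to work.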
@WeiqunZhang WeiqunZhang enabled auto-merge (squash) July 22, 2024 02:02
@WeiqunZhang WeiqunZhang merged commit 4392b19 into AMReX-Codes:development Jul 22, 2024
71 checks passed
2 participants