swiftsimio/visualisation/projection_backends/gpu.py , numba/cuda version issues #187

Open
Will-McD opened this issue Mar 14, 2024 · 9 comments

@Will-McD

Will-McD commented Mar 14, 2024

Using:
Swiftsimio 7.0.0
Numba 0.58.0

from numba import jit
from swiftsimio import mask as sw_mask

This immediately raises an error from numba/cuda and swiftsimio/visualisation/projection_backends/gpu.py.

The final lines of the error messages are:
numba.cuda.cudadrv.driver.CudaAPIError: [2] Call to cuDevicePrimaryCtxRetain results in CUDA_ERROR_OUT_OF_MEMORY
raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [2] Call to cuDevicePrimaryCtxRetain results in CUDA_ERROR_OUT_OF_MEMORY

@MatthieuSchaller
Member

Which machine is this on?

@MatthieuSchaller
Member

@Will-McD ?

@Will-McD
Author

Will-McD commented Apr 30, 2024

> Which machine is this on?

I was able to reproduce this on Cosma8a and 8b; I have not tried any of the other cosma machines. I have not had this error show up on any of the local machines.

@JBorrow
Member

JBorrow commented May 1, 2024

Are you sure this is a swiftsimio problem and not a numba problem?

@Will-McD
Author

Will-McD commented May 1, 2024

> Are you sure this is a swiftsimio problem and not a numba problem?

I think the issue is within swiftsimio. When I keep numba but replace swiftsimio with astropy, or with my own "swiftsimio-like" visualisation functions (essentially the swiftsimio functions without the dependencies on all the packages that swiftsimio has), I have had no problems.

@JBorrow
Member

JBorrow commented May 8, 2024

Can you make appropriate changes to optional_packages.py and submit it as a PR? I would really appreciate it. Looks like we need to guard against some other CUDA errors.

# Numba/CUDA
try:
    from numba.cuda.cudadrv.error import CudaSupportError

    try:
        import numba.cuda.cudadrv.driver as drv
        from numba import cuda
        from numba.cuda import jit as cuda_jit

        try:
            CUDA_AVAILABLE = cuda.is_available()
        except AttributeError:
            # Backwards compatibility with older versions
            # Check for the driver

            d = drv.Driver()
            d.initialize()

            CUDA_AVAILABLE = True

    except CudaSupportError:
        CUDA_AVAILABLE = False

except (ImportError, ModuleNotFoundError):
    # Mock the CudaSupportError so that we can raise it in cases
    # where we don't have numba installed.

    class CudaSupportError(Exception):
        def __init__(self, message):
            self.message = message

    CUDA_AVAILABLE = False
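
One possible shape for that extra guard, as a minimal sketch only and not the actual swiftsimio fix. It assumes the failure surfaces while probing for CUDA; if the CudaAPIError is instead raised later, when a context is first created, this would not catch it. The names `numba_probe` and `safe_cuda_available` are hypothetical.

```python
def numba_probe():
    """Ask numba whether CUDA is usable (hypothetical helper, not swiftsimio code)."""
    from numba import cuda  # an ImportError here is handled by the caller

    return cuda.is_available()


def safe_cuda_available(probe=numba_probe):
    """Return True only if the probe succeeds and reports a usable device."""
    try:
        return bool(probe())
    except Exception:
        # Deliberately broad: covers ImportError, CudaSupportError and
        # driver-level failures such as the CudaAPIError reported above.
        return False


CUDA_AVAILABLE = safe_cuda_available()
```

Catching `Exception` trades precision for robustness: a busy or misconfigured GPU then degrades to the CPU path instead of breaking `import swiftsimio`. A narrower tuple of exception types would be the stricter alternative.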

@MatthieuSchaller
Member

@Will-McD still an issue?

@andreagebek

I received exactly the same error message on the login8 cosma node on multiple occasions yesterday and today, simply by running:
import swiftsimio
So far the issue is not persistent and only pops up occasionally. It has also not occurred on the login7 cosma node so far.
swiftsimio version: 9.0.2
numba version: 0.60.0

@JBorrow
Member

JBorrow commented Nov 27, 2024

export NUMBA_DISABLE_CUDA=1
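
The same switch can also be set from inside a Python script, provided it happens before numba is first imported (swiftsimio imports numba itself). A minimal sketch, where `NUMBA_DISABLE_CUDA` is the variable from the line above and the helper name `disable_numba_cuda` is hypothetical:

```python
import os
import sys


def disable_numba_cuda():
    """Set NUMBA_DISABLE_CUDA=1 for this process (hypothetical helper).

    Returns False if numba was already imported, in which case the
    setting may come too late to have any effect.
    """
    already_imported = "numba" in sys.modules
    os.environ["NUMBA_DISABLE_CUDA"] = "1"
    return not already_imported


in_time = disable_numba_cuda()

# Only import swiftsimio (and therefore numba) after this point, e.g.:
# import swiftsimio
```

Setting it in the shell, as above, is the safer option on shared login nodes because it cannot be defeated by import order.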
