Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cuda array interface #42

Closed

Conversation

LucasGandel
Copy link
Collaborator

@LucasGandel LucasGandel commented Aug 9, 2024

Add support for __cuda_array_interface. Closes #10

TODO:

  • Add vector types in itkCudaImage.wrap
  • Handle Dimension 1, 2
  • Check shape order (i,j,k) (ITK-style) vs (k, j, i) (Numpy style)
  • Add RTK types in itkCudaImageRTK.wrap (see RTK#617)
  • Expose flag to prevent the GPU buffer to be released during GPU->CPU transfer to keep the cupy array valid when transferring the itkCudaImage to the CPU.
    - [ ] Add test
    - [ ] Improve descriptor and check other elements in __cuda_array_interface dict
  • Fix GetGPUBufferPointer being dereferenced in RTK and bump version (see RTK#617)
Code used to test the approach (requires cupy-cuda12x):
from itk import RTK as rtk
import cupy as cp

Dimension = 3
GPUImageType  = rtk.CudaImage[itk.F, Dimension]
cuda_image = GPUImageType.New()

start = itk.Index[Dimension]()
start[0] = 0  # first index on X
start[1] = 0  # first index on Y
start[2] = 0  # first index on Z

size = itk.Size[Dimension]()
size[0] = 2  # size along X
size[1] = 2  # size along Y
size[2] = 2  # size along Z

region = itk.ImageRegion[Dimension]()
region.SetSize(size)
region.SetIndex(start)

cuda_image.SetRegions(region)
cuda_image.Allocate()

cuda_image.FillBuffer(42.3)
cuda_image.GetCudaDataManager().UpdateGPUBuffer()

cupy_image = cp.array(cuda_image, dtype=cp.float32, copy=False, blocking=True)
cp.put(cupy_image, [0, 0, 0], 42.314)
print(f"GPU array as numpy {cp.asnumpy(cupy_image)}")

Prior to this commit CudaDataManager::GetGPUBufferPointer() returned a
pointer to the GPU buffer pointer. Its return type should have been void**.
Every usage of this method across RTK and CudaCommon dereference the
returned pointer pointer, so just return the buffer pointer directly as it
required for future commits adding support for __cuda_array_interface__
Adapt ITK's PyBuffer approach to inject __cuda_array_interface__ in wrapped
code.
Closes RTKConsortium#10
@LucasGandel
Copy link
Collaborator Author

@SimonRit regarding the 3rd point "Check shape order (i,j,k) (ITK-style) vs (k, j, i) (Numpy style)", it looks like cupy can handle both, but the default should probably be the numpy style (C-style). Do you think we should just reverse the shape to always use C-style, or should we introduce a class attribute to control the order expected by users. ITK provides a keep_axes parameter in GetArrayViewFromImage to control this behavior.

FWIW changing the order of the order parameter in cupy.array does not seem to have impact, only the shape matters. https://docs.cupy.dev/en/stable/reference/generated/cupy.array.html#cupy-array

@SimonRit
Copy link
Collaborator

SimonRit commented Sep 4, 2024

My understanding is that I can expect a behavior similar to numpy. The shape in the interface should be computed as in PyBuffer, see here (note the computation for vectors). The behavior will then be as described here. Sounds good to me for consistency with the NumPy interface.

@SimonRit
Copy link
Collaborator

SimonRit commented Sep 4, 2024

Do you intend to add tests? If yes, it would be good to add it to the list...

@LucasGandel
Copy link
Collaborator Author

Do you intend to add tests? If yes, it would be good to add it to the list...

Adding a test requires cupy to be installed. Should we add some CMake/Python magic to automatically install it based on the detected cuda version? Or just import cupy in a try/catch statement in the test file to throw a nice error message if it is not installed?

@LucasGandel
Copy link
Collaborator Author

Do you intend to add tests? If yes, it would be good to add it to the list...

Adding a test requires cupy to be installed. Should we add some CMake/Python magic to automatically install it based on the detected cuda version? Or just import cupy in a try/catch statement in the test file to throw a nice error message if it is not installed?

ITK expects numpy to be installed at runtime and just install it in CI scripts. We should do the same, but it looks like tests are not executed in the python CI workflow. This should be added in another PR, but I will add a test in the meantime

@SimonRit
Copy link
Collaborator

SimonRit commented Sep 6, 2024

If you need cupy to do the test, I think we should wait for a subsequent PR to implement it, when we implement helper functions such as GetCupyArrayFromImage or cupy_array_from_image. I suggest to drop it for now.

@LucasGandel
Copy link
Collaborator Author

If you need cupy to do the test, I think we should wait for a subsequent PR to implement it, when we implement helper functions such as GetCupyArrayFromImage or cupy_array_from_image. I suggest to drop it for now.

Sounds good! I'm attaching the script we used for testing a few types, as it could be useful later to implement the test:
itkCupyCudaArrayInterfaceTest.txt

I'll check how much effort it is to implement GetCupyArrayFromImage and will open a new PR as needed. This one should now be ready, changes have been made on the RTK side too (see RTKConsortium/RTK#617)

Copy link
Collaborator

@SimonRit SimonRit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have minor suggestions. I think we should document the API change and export the version number (CUDACOMMON_VERSION_MAJOR, CUDACOMMON_VERSION_MINOR) to let the user adapt his / her code depending on the version.

CMakeLists.txt Outdated Show resolved Hide resolved
include/itkCudaDataManager.h Outdated Show resolved Hide resolved
This allows modules like RTK to easily reuse files from the CudaCommon
sources, e.g. CudaImage.i.init and CudaImage.i.in
@LucasGandel
Copy link
Collaborator Author

I only have minor suggestions. I think we should document the API change and export the version number (CUDACOMMON_VERSION_MAJOR, CUDACOMMON_VERSION_MINOR) to let the user adapt his / her code depending on the version.

Sounds good. Then it will probably be version 2.0.0, and we can export it juste like CudaCommon_SOURCE_DIR for other modules to use it. Does this sounds good to you?

@SimonRit
Copy link
Collaborator

Sounds good. Then it will probably be version 2.0.0, and we can export it juste like CudaCommon_SOURCE_DIR for other modules to use it. Does this sounds good to you?

Yes, also with a header like rtkConfiguration.h.in to access it from the C++ code please.

@LucasGandel
Copy link
Collaborator Author

Superseded by #43

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add __cuda_array_interface__ to itk::CudaImage
2 participants