cuLaunchCooperativeKernelMultiDevice #478

maximusgrey · 2017-11-08T16:37:50Z

I have been working with CUDA9 / JavaCPP for a few days and got everything up and running very fast. Thank you!

However, I cannot seem to get cuda.cuLaunchCooperativeKernelMultiDevice() working. It takes a CUDA_LAUNCH_PARAMS as the first argument and the array size as the second argument, but what I need to pass is an array of CUDA_LAUNCH_PARAMS. I tried via PointerPointer, but that did not fix things.

Does anyone have a solution on how to call cuLaunchCooperativeKernelMultiDevice for multiple devices?

saudet · 2017-11-09T09:04:29Z

CUDA_LAUNCH_PARAMS is a Pointer, which can point to a native array. To allocate an array of size 10, for example, we can call new CUDA_LAUNCH_PARAMS(10).

maximusgrey · 2017-11-09T10:44:21Z

Thanks for the feedback. I got it to work!

maximusgrey · 2017-11-09T12:16:38Z

@saudet, one more question. I have the array of CUDA_LAUNCH_PARAMS working. I can also set all grid and block variables, and the kernels execute correctly on the GPUs.

Next up is setting the kernel parameters. But each time I set a kernel parameter like launchParams.kernelParams(0, new LongPointer(new long[1])); I instantly get a SIGSEGV crash.

saudet · 2017-11-09T12:18:40Z

That doesn't look right. You're going to need to follow the doc that NVIDIA provides about that...

maximusgrey · 2017-11-09T12:26:37Z

NVIDIA doc says you have to make this struct:

    typedef struct CUDA_LAUNCH_PARAMS_st {
        CUfunction function;         /**< Kernel to launch */
        unsigned int gridDimX;       /**< Width of grid in blocks */
        unsigned int gridDimY;       /**< Height of grid in blocks */
        unsigned int gridDimZ;       /**< Depth of grid in blocks */
        unsigned int blockDimX;      /**< X dimension of each thread block */
        unsigned int blockDimY;      /**< Y dimension of each thread block */
        unsigned int blockDimZ;      /**< Z dimension of each thread block */
        unsigned int sharedMemBytes; /**< Dynamic shared-memory size per thread block in bytes */
        CUstream hStream;            /**< Stream identifier */
        void **kernelParams;         /**< Array of pointers to kernel parameters */
    } CUDA_LAUNCH_PARAMS;

So I want to set the "void **kernelParams;" pointer. However, the cuda.java code only provides these options:

    public native Pointer kernelParams(int i);
    public native CUDA_LAUNCH_PARAMS kernelParams(int i, Pointer kernelParams);
    @MemberGetter public native @cast("void**") PointerPointer kernelParams();

So how should I proceed?

saudet · 2017-11-09T12:33:26Z

You'll need to allocate your own PointerPointer and pass that...

maximusgrey · 2017-11-09T12:41:36Z

Like this? All of these variants give a SIGSEGV:

    launchParams.kernelParams(0, new PointerPointer(new IntPointer(new int[1])));
    launchParams.kernelParams(0, new PointerPointer(new Pointer()));
    launchParams.kernelParams(0, new PointerPointer(new Pointer[] { new Pointer() }));
    launchParams.kernelParams(0, new PointerPointer(new Pointer[] { new IntPointer(new int[1]) }));

I would also have expected kernelParams(0, pointer) to take a normal pointer, and kernelParams() with no arguments to return the entire array as a PointerPointer?

saudet · 2017-11-09T13:37:39Z

That is indeed an issue. We'll have to fix this.

saudet · 2017-11-09T23:23:34Z

In the meantime, we can work around that by using Loader.sizeof(CUDA_LAUNCH_PARAMS.class) and Loader.offsetof(CUDA_LAUNCH_PARAMS.class, "kernelParams") with new BytePointer(launchParams).putPointer(..., kernelParams).
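To make the arithmetic behind that workaround concrete, here is a rough plain-Java model of it, not JavaCPP code. A direct ByteBuffer stands in for the native CUDA_LAUNCH_PARAMS array, and the class name, the helper methods, and the constants 56 and 48 are assumptions for illustration only (they match the struct layout on a typical 64-bit platform with 8-byte pointers). In real code the two numbers would come from Loader.sizeof and Loader.offsetof as described above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LaunchParamsLayout {
    // ASSUMPTION: struct size and field offset for a typical 64-bit platform.
    // With JavaCPP these would come from Loader.sizeof(...) and Loader.offsetof(...).
    static final int SIZEOF = 56;
    static final int OFFSET_KERNEL_PARAMS = 48;

    // Byte position of the kernelParams field of the struct at the given array index.
    static int fieldPosition(int index) {
        return index * SIZEOF + OFFSET_KERNEL_PARAMS;
    }

    // Store a pointer value (modelled as a long) into the kernelParams field.
    static void putKernelParams(ByteBuffer structs, int index, long address) {
        structs.putLong(fieldPosition(index), address);
    }

    // Read the pointer value back from the kernelParams field.
    static long getKernelParams(ByteBuffer structs, int index) {
        return structs.getLong(fieldPosition(index));
    }

    public static void main(String[] args) {
        int deviceCount = 2; // one struct per device
        ByteBuffer structs = ByteBuffer.allocateDirect(deviceCount * SIZEOF)
                                       .order(ByteOrder.nativeOrder());
        // Made-up addresses standing in for each device's void** argument array.
        putKernelParams(structs, 0, 0x1000L);
        putKernelParams(structs, 1, 0x2000L);
        System.out.println(Long.toHexString(getKernelParams(structs, 1))); // prints 2000
    }
}
```

The point of the model is only that the member itself is overwritten at base + index * sizeof + offsetof, rather than an element of the array it points to.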

maximusgrey · 2017-11-10T13:45:35Z

Thanks for the feedback and yes it works!


saudet · 2018-01-17T12:03:05Z

The fix is included in version 1.4, which now provides wrappers for CUDA 9.1, though:
http://search.maven.org/#search%7Cga%7C1%7Cbytedeco%20cuda
Thanks for reporting and testing this out!

saudet added the question label Nov 9, 2017

saudet closed this as completed Nov 9, 2017

saudet added the bug label Nov 9, 2017

saudet reopened this Nov 9, 2017

saudet added a commit to bytedeco/javacpp that referenced this issue Nov 13, 2017

* Fix a few issues with Parser, including missing PointerPointer member setters (issue bytedeco/javacpp-presets#478)

8f713ea

saudet closed this as completed Jan 17, 2018
