sycl-for-cuda documentation is inaccurate with the updated plugin system ? #1161

fwyzard · 2020-02-22T08:09:00Z

I am trying the cuda branch (currently at 38ec8bf8f2c) and I can compile and run the simple application from the "getting started" guide.

However, the behaviour of the plugin system does not seem to match what is documented.

According to the guide,

the CUDA backend must be selected at runtime using the SYCL_BE environment variable.
SYCL_BE=PI_CUDA ./simple-sycl-app-cuda.exe

However, setting the SYCL_BE variable does not seem to make any difference.

Let's build the simple application (I've added a print out of the device being used):

$ build/bin/clang++ -fsycl -fsycl-targets=spir64-unknown-linux-sycldevice -Wno-unknown-cuda-version simple-sycl-app.cpp -o simple-sycl-app-opencl

Check the available devices:

$ clinfo -l
Platform #0: Intel(R) OpenCL HD Graphics
 `-- Device #0: Intel(R) Gen9 HD Graphics NEO
Platform #1: Intel(R) OpenCL
 `-- Device #0: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz

(there's a Tesla K40 as well, but the NVIDIA ICD has been removed to avoid even more confusion)

And run with no SYCL_BE variable:

$ ./simple-sycl-app-opencl
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
The results are correct!

Set the OpenCL backend:

$ SYCL_BE=PI_OPENCL ./simple-sycl-app-opencl
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
The results are correct!

OK, so far so good. Now with the CUDA backend:

$ SYCL_BE=PI_CUDA ./simple-sycl-app-opencl 
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
The results are correct!

Ehm, what ?

Something similar happens with the CUDA backend.

Let's rebuild the application with CUDA support:

$ build/bin/clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -Wno-unknown-cuda-version simple-sycl-app.cpp -o simple-sycl-app-cuda

Try with the default selector:

$ ./simple-sycl-app-cuda
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  OpenCL API failed. OpenCL API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted (core dumped)

That didn't work... maybe setting the plugin explicitly ?

$ SYCL_BE=PI_CUDA ./simple-sycl-app-cuda   
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  OpenCL API failed. OpenCL API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted (core dumped)

Nope.
OK, remove all ICDs and try again:

$ OCL_ICD_VENDORS= SYCL_BE=PI_CUDA ./simple-sycl-app-cuda
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

So, it does work.
And without the SYCL_BE variable ?

$ OCL_ICD_VENDORS= ./simple-sycl-app-cuda
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

Still works.
What if I force the OpenCL plugin ?

$ OCL_ICD_VENDORS= SYCL_BE=PI_OPENCL ./simple-sycl-app-cuda
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

It still works !

So, it seems that the SYCL_BE variable is not needed any more, and in fact it is mostly ignored:

$ OCL_ICD_VENDORS= SYCL_BE=PI_IS_314 ./simple-sycl-app-cuda
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

The only value with any effect seems to be PI_OTHER:

$ OCL_ICD_VENDORS= SYCL_BE=PI_OTHER ./simple-sycl-app-cuda
pi_die: Unknown SYCL_BE
terminate called without an active exception
Aborted (core dumped)

An other interesting and possibly undocumented result: it looks like it is possible to build a binary that supports both OpenCL and CUDA backends:

$ build/bin/clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice,spir64-unknown-linux-sycldevice -Wno-unknown-cuda-version simple-sycl-app.cpp -o simple-sycl-app

$ ./simple-sycl-app
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
The results are correct!

$ OCL_ICD_VENDORS= ./simple-sycl-app
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

Magic !

The text was updated successfully, but these errors were encountered:

bader · 2020-02-22T10:20:03Z

@fwyzard, thank you for trying it out and sharing the results of your experiments.
I noticed that the sample used in this experiment uses default device selector and it seems we have to improve how it behaves.
Also, although it's not documented in CUDA back-end documentation, multiple back-ends support was enabled by Plug-in system design, so it's expected to work.

fwyzard · 2020-02-22T16:58:32Z

@bader thanks for your comments.

Since it looks like both OpenCL and CUDA plugins are always available (if built), what is the purpose of the SYCL_BE variable ?

One aspect that could be improved is to avoid listing platforms or devices for which there is no support in the binary. That is, an application compiled with -fsycl-targets=nvptx64-nvidia-cuda-sycldevice should list only the CUDA devices (and the host device), and so on.

bader · 2020-02-23T15:31:43Z

Since it looks like both OpenCL and CUDA plugins are always available (if built), what is the purpose of the SYCL_BE variable ?

I think someone from Codeplay team should reply. According to my understanding it might be just an artifact of old implementation version. We added runtime library support for multiple plug-ins not long ago.

One aspect that could be improved is to avoid listing platforms or devices for which there is no support in the binary. That is, an application compiled with -fsycl-targets=nvptx64-nvidia-cuda-sycldevice should list only the CUDA devices (and the host device), and so on.

I was my first thought too, but this might cause problems for applications, which don't have device code (e.g. use libraries with SYCL API). The application might just create a command queue, buffers and call some library, which has device code.

fwyzard · 2020-02-23T23:10:23Z

... this might cause problems for applications, which don't have device code (e.g. use libraries with SYCL API). The application might just create a command queue, buffers and call some library, which has device code.

I see your point - the part of the application that queries the devices and the part that has the device code could have been built with support for different devices.

I think it would be useful to have some way for an application to know if a given device (or platform) is usable or not, for a given kernel.

Ruyk · 2020-02-24T15:56:26Z

@bader thanks for your comments.

Since it looks like both OpenCL and CUDA plugins are always available (if built), what is the purpose of the SYCL_BE variable ?

One aspect that could be improved is to avoid listing platforms or devices for which there is no support in the binary. That is, an application compiled with -fsycl-targets=nvptx64-nvidia-cuda-sycldevice should list only the CUDA devices (and the host device), and so on.

The PI interface added support for multiple backends while the MR for the CUDA backend was up. Rebasing it was more trouble than expected, @Alexander-Johnston and @bader (and others) have been working on reviewing and rebasing the MR.
All this work is in the cuda-dev branch. The cuda branch will be updated once the cuda-dev is a bit more stable.

The SYCL_BE env var was originally used to select the backend, when multiple backend support was not available. The SYCL_BE now is only needed when the default_selector is used. The default selector can decide to pick any device in the system, but it doesnt have access to the information of which device images are available (PTX and/or SPIRV). The env variable helps forcing to decide a device coming from a specific backend.
Ideally in the future we will come up with a better mechanism for the default selector that doesn't require the environment variable.

fwyzard · 2020-02-24T16:00:12Z

Thanks for the details @Ruyk .

I can move my tests to the sycl + cuda-dev branches, and check again the behaviour.

fwyzard · 2020-02-25T11:51:15Z

OK, with the recently merged cuda-dev branch (and so with the current sycl branch) the behaviour is different:

-fsycl-targets=... specifies which backends to buld for (spir64-unknown-linux-sycldevice and/or nvptx64-nvidia-cuda-sycldevice);
SYCL_BE specifies which type of device should be selected by the default device selector (CUDA or OpenCL);
neither have any effect on the list of available devices.

Ruyk · 2020-03-20T21:10:31Z

@fwyzard , is this still relevant?

fwyzard · 2020-03-20T21:59:23Z

I think #1293 updated the getting started guide to describe the current status; however it should be updated again once #1252 is merged.

If you prefer, we can close this issue, and review the documentation after #1252 is merged.

Ruyk · 2020-06-08T10:38:47Z

I think this can be closed now, @fwyzard feel free to open a new issue if something is missing!

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@65b5d1b

bader added bug Something isn't working cuda CUDA back-end labels Feb 22, 2020

bader assigned Ruyk Mar 17, 2020

Ruyk closed this as completed Jun 8, 2020

vmaksimo pushed a commit to vmaksimo/llvm that referenced this issue Aug 23, 2021

Up max SPIR-V version to 1.4 (intel#1161)

b3e77af

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@65b5d1b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sycl-for-cuda documentation is inaccurate with the updated plugin system ? #1161

sycl-for-cuda documentation is inaccurate with the updated plugin system ? #1161

fwyzard commented Feb 22, 2020

bader commented Feb 22, 2020

fwyzard commented Feb 22, 2020

bader commented Feb 23, 2020

fwyzard commented Feb 23, 2020

Ruyk commented Feb 24, 2020

fwyzard commented Feb 24, 2020

fwyzard commented Feb 25, 2020

Ruyk commented Mar 20, 2020

fwyzard commented Mar 20, 2020

Ruyk commented Jun 8, 2020

sycl-for-cuda documentation is inaccurate with the updated plugin system ? #1161

sycl-for-cuda documentation is inaccurate with the updated plugin system ? #1161

Comments

fwyzard commented Feb 22, 2020

bader commented Feb 22, 2020

fwyzard commented Feb 22, 2020

bader commented Feb 23, 2020

fwyzard commented Feb 23, 2020

Ruyk commented Feb 24, 2020

fwyzard commented Feb 24, 2020

fwyzard commented Feb 25, 2020

Ruyk commented Mar 20, 2020

fwyzard commented Mar 20, 2020

Ruyk commented Jun 8, 2020