Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sycl-for-cuda documentation is inaccurate with the updated plugin system ? #1161

Closed
fwyzard opened this issue Feb 22, 2020 · 10 comments
Closed
Assignees
Labels
bug Something isn't working cuda CUDA back-end

Comments

@fwyzard
Copy link
Contributor

fwyzard commented Feb 22, 2020

I am trying the cuda branch (currently at 38ec8bf8f2c) and I can compile and run the simple application from the "getting started" guide.

However, the behaviour of the plugin system does not seem to match what is documented.

According to the guide,

the CUDA backend must be selected at runtime using the SYCL_BE environment variable.

SYCL_BE=PI_CUDA ./simple-sycl-app-cuda.exe

However, setting the SYCL_BE variable does not seem to make any difference.

Let's build the simple application (I've added a print out of the device being used):

$ build/bin/clang++ -fsycl -fsycl-targets=spir64-unknown-linux-sycldevice -Wno-unknown-cuda-version simple-sycl-app.cpp -o simple-sycl-app-opencl

Check the available devices:

$ clinfo -l
Platform #0: Intel(R) OpenCL HD Graphics
 `-- Device #0: Intel(R) Gen9 HD Graphics NEO
Platform #1: Intel(R) OpenCL
 `-- Device #0: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz

(there's a Tesla K40 as well, but the NVIDIA ICD has been removed to avoid even more confusion)

And run with no SYCL_BE variable:

$ ./simple-sycl-app-opencl
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
The results are correct!

Set the OpenCL backend:

$ SYCL_BE=PI_OPENCL ./simple-sycl-app-opencl
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
The results are correct!

OK, so far so good. Now with the CUDA backend:

$ SYCL_BE=PI_CUDA ./simple-sycl-app-opencl 
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
The results are correct!

Ehm, what ?

Something similar happens with the CUDA backend.

Let's rebuild the application with CUDA support:

$ build/bin/clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -Wno-unknown-cuda-version simple-sycl-app.cpp -o simple-sycl-app-cuda

Try with the default selector:

$ ./simple-sycl-app-cuda
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  OpenCL API failed. OpenCL API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted (core dumped)

That didn't work... maybe setting the plugin explicitly ?

$ SYCL_BE=PI_CUDA ./simple-sycl-app-cuda   
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  OpenCL API failed. OpenCL API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted (core dumped)

Nope.
OK, remove all ICDs and try again:

$ OCL_ICD_VENDORS= SYCL_BE=PI_CUDA ./simple-sycl-app-cuda
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

So, it does work.
And without the SYCL_BE variable ?

$ OCL_ICD_VENDORS= ./simple-sycl-app-cuda
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

Still works.
What if I force the OpenCL plugin ?

$ OCL_ICD_VENDORS= SYCL_BE=PI_OPENCL ./simple-sycl-app-cuda
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

It still works !

So, it seems that the SYCL_BE variable is not needed any more, and in fact it is mostly ignored:

$ OCL_ICD_VENDORS= SYCL_BE=PI_IS_314 ./simple-sycl-app-cuda
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

The only value with any effect seems to be PI_OTHER:

$ OCL_ICD_VENDORS= SYCL_BE=PI_OTHER ./simple-sycl-app-cuda
pi_die: Unknown SYCL_BE
terminate called without an active exception
Aborted (core dumped)

An other interesting and possibly undocumented result: it looks like it is possible to build a binary that supports both OpenCL and CUDA backends:

$ build/bin/clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice,spir64-unknown-linux-sycldevice -Wno-unknown-cuda-version simple-sycl-app.cpp -o simple-sycl-app

$ ./simple-sycl-app
Running on SYCL device Intel(R) Gen9 HD Graphics NEO, driver version 20.06.15619
The results are correct!

$ OCL_ICD_VENDORS= ./simple-sycl-app
Running on SYCL device Tesla K40c, driver version CUDA 10.20
The results are correct!

Magic !

@bader bader added bug Something isn't working cuda CUDA back-end labels Feb 22, 2020
@bader
Copy link
Contributor

bader commented Feb 22, 2020

@fwyzard, thank you for trying it out and sharing the results of your experiments.
I noticed that the sample used in this experiment uses default device selector and it seems we have to improve how it behaves.
Also, although it's not documented in CUDA back-end documentation, multiple back-ends support was enabled by Plug-in system design, so it's expected to work.

@fwyzard
Copy link
Contributor Author

fwyzard commented Feb 22, 2020

@bader thanks for your comments.

Since it looks like both OpenCL and CUDA plugins are always available (if built), what is the purpose of the SYCL_BE variable ?

One aspect that could be improved is to avoid listing platforms or devices for which there is no support in the binary. That is, an application compiled with -fsycl-targets=nvptx64-nvidia-cuda-sycldevice should list only the CUDA devices (and the host device), and so on.

@bader
Copy link
Contributor

bader commented Feb 23, 2020

Since it looks like both OpenCL and CUDA plugins are always available (if built), what is the purpose of the SYCL_BE variable ?

I think someone from Codeplay team should reply. According to my understanding it might be just an artifact of old implementation version. We added runtime library support for multiple plug-ins not long ago.

One aspect that could be improved is to avoid listing platforms or devices for which there is no support in the binary. That is, an application compiled with -fsycl-targets=nvptx64-nvidia-cuda-sycldevice should list only the CUDA devices (and the host device), and so on.

I was my first thought too, but this might cause problems for applications, which don't have device code (e.g. use libraries with SYCL API). The application might just create a command queue, buffers and call some library, which has device code.

@fwyzard
Copy link
Contributor Author

fwyzard commented Feb 23, 2020

... this might cause problems for applications, which don't have device code (e.g. use libraries with SYCL API). The application might just create a command queue, buffers and call some library, which has device code.

I see your point - the part of the application that queries the devices and the part that has the device code could have been built with support for different devices.

I think it would be useful to have some way for an application to know if a given device (or platform) is usable or not, for a given kernel.

@Ruyk
Copy link
Contributor

Ruyk commented Feb 24, 2020

@bader thanks for your comments.

Since it looks like both OpenCL and CUDA plugins are always available (if built), what is the purpose of the SYCL_BE variable ?

One aspect that could be improved is to avoid listing platforms or devices for which there is no support in the binary. That is, an application compiled with -fsycl-targets=nvptx64-nvidia-cuda-sycldevice should list only the CUDA devices (and the host device), and so on.

The PI interface added support for multiple backends while the MR for the CUDA backend was up. Rebasing it was more trouble than expected, @Alexander-Johnston and @bader (and others) have been working on reviewing and rebasing the MR.
All this work is in the cuda-dev branch. The cuda branch will be updated once the cuda-dev is a bit more stable.

The SYCL_BE env var was originally used to select the backend, when multiple backend support was not available. The SYCL_BE now is only needed when the default_selector is used. The default selector can decide to pick any device in the system, but it doesnt have access to the information of which device images are available (PTX and/or SPIRV). The env variable helps forcing to decide a device coming from a specific backend.
Ideally in the future we will come up with a better mechanism for the default selector that doesn't require the environment variable.

@fwyzard
Copy link
Contributor Author

fwyzard commented Feb 24, 2020

Thanks for the details @Ruyk .

I can move my tests to the sycl + cuda-dev branches, and check again the behaviour.

@fwyzard
Copy link
Contributor Author

fwyzard commented Feb 25, 2020

OK, with the recently merged cuda-dev branch (and so with the current sycl branch) the behaviour is different:

  • -fsycl-targets=... specifies which backends to buld for (spir64-unknown-linux-sycldevice and/or nvptx64-nvidia-cuda-sycldevice);
  • SYCL_BE specifies which type of device should be selected by the default device selector (CUDA or OpenCL);
  • neither have any effect on the list of available devices.

@Ruyk
Copy link
Contributor

Ruyk commented Mar 20, 2020

@fwyzard , is this still relevant?

@fwyzard
Copy link
Contributor Author

fwyzard commented Mar 20, 2020

I think #1293 updated the getting started guide to describe the current status; however it should be updated again once #1252 is merged.

If you prefer, we can close this issue, and review the documentation after #1252 is merged.

@Ruyk
Copy link
Contributor

Ruyk commented Jun 8, 2020

I think this can be closed now, @fwyzard feel free to open a new issue if something is missing!

@Ruyk Ruyk closed this as completed Jun 8, 2020
vmaksimo pushed a commit to vmaksimo/llvm that referenced this issue Aug 23, 2021
Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@65b5d1b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cuda CUDA back-end
Projects
None yet
Development

No branches or pull requests

3 participants