OPENCL Couldn't create sub-devices. Error #34

Open
maolun opened this issue Mar 21, 2018 · 4 comments

Comments

@maolun

maolun commented Mar 21, 2018

Hello,

I was trying NGM at the Texas Advanced Computing Center (https://portal.tacc.utexas.edu/user-guides/stampede2). However, an error occurs every time. I compiled NGM with CMake. Does anyone have insight into how to solve this issue? Thanks. Any suggestion is greatly appreciated.

[OPENCL] Available platforms: 1
[OPENCL] AMD Accelerated Parallel Processing
[OPENCL] Selecting OpenCl platform: AMD Accelerated Parallel Processing
[OPENCL] Platform: OpenCL 1.2 AMD-APP (1214.3)
[OPENCL] 1 CPU device found.
[OPENCL] Device 0: Intel(R) Xeon Phi(TM) CPU 7250 @ 1.40GHz (Driver: 1214.3 (sse2,avx))
[OPENCL] Couldn't create sub-devices. Error:
Error: Invalid value (-30)
terminate called without an active exception

Best,
Mao-Lun

@maolun
Author

maolun commented Mar 23, 2018

Hello,

I changed line 117 of the file OclHost.cpp from "if (ciErrNum == -18)" to "if (ciErrNum == -30)", and then compiled the sources using cmake. Now NGM works.

Mao-Lun
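
For readers following along: -18 and -30 are the numeric values of CL_DEVICE_PARTITION_FAILED and CL_INVALID_VALUE in the OpenCL headers. Judging from the diff quoted later in this thread, the `-18` branch sets `ciDeviceCount = 1`, so the change above presumably makes NGM run on the un-partitioned device. A minimal sketch of a variant that accepts both codes instead of swapping one for the other follows; the helper names are hypothetical and not part of NGM, and the constants are redefined locally so the snippet compiles without the OpenCL headers.

```cpp
#include <string>

// Numeric values as defined in the OpenCL headers (CL/cl.h).
constexpr int kClSuccess = 0;                  // CL_SUCCESS
constexpr int kClDevicePartitionFailed = -18;  // CL_DEVICE_PARTITION_FAILED
constexpr int kClInvalidValue = -30;           // CL_INVALID_VALUE

// Hypothetical helper: turn the codes seen in this thread into readable names.
std::string describePartitionError(int err) {
    switch (err) {
        case kClSuccess:               return "CL_SUCCESS";
        case kClDevicePartitionFailed: return "CL_DEVICE_PARTITION_FAILED";
        case kClInvalidValue:          return "CL_INVALID_VALUE";
        default:                       return "unhandled OpenCL error " + std::to_string(err);
    }
}

// Hypothetical predicate: treat both codes as "fall back to the whole device".
bool shouldUseWholeDevice(int err) {
    return err == kClDevicePartitionFailed || err == kClInvalidValue;
}
```

In OclHost.cpp the condition would then read something like `if (shouldUseWholeDevice(ciErrNum))`. Note that this only selects the fallback path; it does not make the partitioning itself succeed.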

@fritzsedlazeck
Member

Thanks
Fritz

@hermannschwaerzlerUIBK

hermannschwaerzlerUIBK commented May 14, 2019

Hi @maolun,
hi @fritzsedlazeck,

I had the very same problem on this machine: https://www3.risc.jku.at/projects/mach2/.
As far as I can tell the problem occurs as soon as a computer has more than 256 cores/hardware threads. MACH2 has more than 1700 cores and the Knights Landing (KNL) compute nodes of Stampede 2 have 272 hardware threads (if I read the documentation correctly).

My solution for the problem was this change to the code:

--- lib/mason/opencl/OclHost.cpp.orig   2019-05-14 16:33:09.313712490 +0200
+++ lib/mason/opencl/OclHost.cpp        2019-05-14 16:30:00.601698181 +0200
@@ -111,8 +111,8 @@
                        props[1] = 1; // 4 compute units per sub-device
                        props[2] = 0;

-                       devices = (cl_device_id *) malloc(256 * sizeof(cl_device_id));
-                       ciErrNum = clCreateSubDevices(device_id, props, 256, devices,
+                       devices = (cl_device_id *) malloc(2560 * sizeof(cl_device_id));
+                       ciErrNum = clCreateSubDevices(device_id, props, 2560, devices,
                                        &ciDeviceCount);
                        if (ciErrNum == -18) {
                                ciDeviceCount = 1;

This works for me but will fail as soon as there is a machine with more than 2560 cores (per node).
A better solution might be to first find the core count (maybe like this: https://stackoverflow.com/questions/150355/programmatically-find-the-number-of-cores-on-a-machine) and use this number in the malloc and the clCreateSubDevices calls.

What do you think?

BTW: that comment "4 compute units per sub-device" you see above is most probably wrong, isn't it?

Greetings
Hermann
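
One way to avoid any hard-coded limit, sketched here under assumptions: as I read the OpenCL 1.2 specification, clCreateSubDevices ignores `out_devices` when it is NULL and still reports the number of sub-devices through `num_devices_ret`, and it returns CL_INVALID_VALUE when a non-NULL buffer is smaller than that number. That would allow a two-pass call: first ask for the count, then allocate exactly that many entries. The sketch below abstracts the OpenCL call behind a hypothetical `PartitionFn` callback (and uses `void*` in place of `cl_device_id`) so it compiles without the OpenCL headers; `makeFakePartition` is a stand-in that only imitates the behaviour described in this issue.

```cpp
#include <cassert>
#include <functional>
#include <vector>

constexpr int kClSuccess = 0;         // CL_SUCCESS
constexpr int kClInvalidValue = -30;  // CL_INVALID_VALUE

using DeviceHandle = void*;  // stand-in for cl_device_id in this sketch

// Hypothetical adapter with the shape of the last three clCreateSubDevices
// arguments: (num_devices, out_devices, num_devices_ret) -> error code.
using PartitionFn = std::function<int(unsigned, DeviceHandle*, unsigned*)>;

// Two-pass pattern: query the sub-device count, then allocate exactly that many.
int partitionAll(const PartitionFn& partition, std::vector<DeviceHandle>& out) {
    assert(partition && "partition callback must be set");
    unsigned count = 0;
    int err = partition(0, nullptr, &count);  // pass 1: count only
    if (err != kClSuccess) return err;
    out.assign(count, nullptr);
    if (count == 0) return kClSuccess;
    return partition(count, out.data(), nullptr);  // pass 2: fill the buffer
}

// Fake partition call for demonstration: behaves like a device with `units`
// sub-devices and rejects buffers that are too small, as seen in this issue.
PartitionFn makeFakePartition(unsigned units) {
    return [units](unsigned numEntries, DeviceHandle* out, unsigned* numRet) -> int {
        if (out != nullptr && numEntries < units) return kClInvalidValue;
        if (out != nullptr) {
            for (unsigned i = 0; i < units; ++i) out[i] = &out[i];  // dummy non-null handles
        }
        if (numRet != nullptr) *numRet = units;
        return kClSuccess;
    };
}
```

In the real code the callback would wrap `clCreateSubDevices(device_id, props, n, devices, &count)`. With the fake, `partitionAll(makeFakePartition(272), devs)` succeeds and sizes `devs` to 272 entries, whereas a fixed 256-entry buffer is rejected with -30.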

@fritzsedlazeck
Member

Hi Hermann,
thanks for digging in. I think that comment was left over from the GPU code....
Thanks for looking at this.
Fritz
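
On the "4 compute units per sub-device" comment: the diff does not show `props[0]`, but assuming it is CL_DEVICE_PARTITION_EQUALLY, `props[1]` is the number of compute units per sub-device, so `props[1] = 1` asks for one sub-device per compute unit. A small sketch of the arithmetic (the helper name is hypothetical):

```cpp
// Hypothetical helper: number of sub-devices an equal partition would yield.
// Assumes the CL_DEVICE_PARTITION_EQUALLY scheme, where leftover compute
// units that do not fill a whole sub-device are not used.
unsigned subDeviceCount(unsigned computeUnits, unsigned unitsPerSubDevice) {
    if (unitsPerSubDevice == 0) return 0;     // guard against division by zero
    return computeUnits / unitsPerSubDevice;  // integer division drops the remainder
}
```

For the 272 hardware threads mentioned above, `subDeviceCount(272, 1)` gives 272, which does not fit into the original 256-entry buffer, while `subDeviceCount(272, 4)` would give 68.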
