OpenCL clCreateCommandQueue error -30 on MacOS 13.4 intel #996

Open
warkcod opened this issue Jun 8, 2023 · 3 comments
Comments

warkcod commented Jun 8, 2023

./main -f ../../../../whiper/test1.wav -m ../../models/ggml-small.bin
whisper_init_from_file_no_state: loading model from '../../models/ggml-small.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3
whisper_model_load: mem required = 743.00 MB (+ 16.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 464.68 MB

Initializing CLBlast (First Run)...
Attempting to use: Platform=0, Device=0 (If invalid, program will crash)
Using Platform: Apple Device: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
OpenCL clCreateCommandQueue error -30 at whisper.cpp/ggml-opencl.c:215

Any ideas on how to solve this? Thank you very much.
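
For anyone decoding the number in the log: in the Khronos OpenCL headers, -30 is CL_INVALID_VALUE, and for clCreateCommandQueue that status generally points at the properties argument. Below is a minimal sketch of a lookup helper; the function names cl_error_name and print_cl_error are made up for illustration and are not part of OpenCL or whisper.cpp, and only a handful of codes are covered.

```c
#include <stdio.h>

/* Map a few OpenCL status codes to the names used in the Khronos cl.h header.
   cl_error_name is a hypothetical helper written for this sketch. */
static const char *cl_error_name(int code) {
    switch (code) {
    case   0: return "CL_SUCCESS";
    case  -1: return "CL_DEVICE_NOT_FOUND";
    case  -5: return "CL_OUT_OF_RESOURCES";
    case  -6: return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -33: return "CL_INVALID_DEVICE";
    case -34: return "CL_INVALID_CONTEXT";
    case -35: return "CL_INVALID_QUEUE_PROPERTIES";
    default:  return "unknown OpenCL error";
    }
}

/* Print a status code in the style of the log above, with its name appended.
   Also a hypothetical helper. */
static void print_cl_error(const char *call, int code) {
    printf("OpenCL %s error %d (%s)\n", call, code, cl_error_name(code));
}
```

Calling print_cl_error("clCreateCommandQueue", -30) would label the failure from the log as CL_INVALID_VALUE.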

warkcod commented Jun 8, 2023

make test
Running tests...
Test project CLBlast/build
Start 1: clblast_test_xswap
1/53 Test #1: clblast_test_xswap ................. Passed 0.36 sec
Start 2: clblast_test_xscal
2/53 Test #2: clblast_test_xscal ................. Passed 0.24 sec
Start 3: clblast_test_xcopy
3/53 Test #3: clblast_test_xcopy ................. Passed 0.33 sec
Start 4: clblast_test_xaxpy
4/53 Test #4: clblast_test_xaxpy ................. Passed 0.33 sec
Start 5: clblast_test_xdot
5/53 Test #5: clblast_test_xdot .................. Passed 0.21 sec
Start 6: clblast_test_xdotu
6/53 Test #6: clblast_test_xdotu ................. Passed 0.22 sec
Start 7: clblast_test_xdotc
7/53 Test #7: clblast_test_xdotc ................. Passed 0.22 sec
Start 8: clblast_test_xnrm2
8/53 Test #8: clblast_test_xnrm2 ................. Passed 0.24 sec
Start 9: clblast_test_xasum
9/53 Test #9: clblast_test_xasum ................. Passed 0.24 sec
Start 10: clblast_test_xamax
10/53 Test #10: clblast_test_xamax ................. Passed 0.24 sec
Start 11: clblast_test_xgemv
11/53 Test #11: clblast_test_xgemv ................. Passed 1.41 sec
Start 12: clblast_test_xgbmv
12/53 Test #12: clblast_test_xgbmv ................. Passed 5.49 sec
Start 13: clblast_test_xhemv
13/53 Test #13: clblast_test_xhemv ................. Passed 0.42 sec
Start 14: clblast_test_xhbmv
14/53 Test #14: clblast_test_xhbmv ................. Passed 0.75 sec
Start 15: clblast_test_xhpmv
15/53 Test #15: clblast_test_xhpmv ................. Passed 0.31 sec
Start 16: clblast_test_xsymv
16/53 Test #16: clblast_test_xsymv ................. Passed 0.37 sec
Start 17: clblast_test_xsbmv
17/53 Test #17: clblast_test_xsbmv ................. Passed 0.64 sec
Start 18: clblast_test_xspmv
18/53 Test #18: clblast_test_xspmv ................. Passed 0.29 sec
Start 19: clblast_test_xtrmv
19/53 Test #19: clblast_test_xtrmv ................. Passed 1.18 sec
Start 20: clblast_test_xtbmv
20/53 Test #20: clblast_test_xtbmv ................. Passed 2.06 sec
Start 21: clblast_test_xtpmv
21/53 Test #21: clblast_test_xtpmv ................. Passed 0.65 sec
Start 22: clblast_test_xtrsv
22/53 Test #22: clblast_test_xtrsv ................. Passed 7.18 sec
Start 23: clblast_test_xger
23/53 Test #23: clblast_test_xger .................. Passed 0.50 sec
Start 24: clblast_test_xgeru
24/53 Test #24: clblast_test_xgeru ................. Passed 0.58 sec
Start 25: clblast_test_xgerc
25/53 Test #25: clblast_test_xgerc ................. Passed 0.59 sec
Start 26: clblast_test_xher
26/53 Test #26: clblast_test_xher .................. Passed 0.29 sec
Start 27: clblast_test_xhpr
27/53 Test #27: clblast_test_xhpr .................. Passed 0.24 sec
Start 28: clblast_test_xher2
28/53 Test #28: clblast_test_xher2 ................. Passed 0.60 sec
Start 29: clblast_test_xhpr2
29/53 Test #29: clblast_test_xhpr2 ................. Passed 0.45 sec
Start 30: clblast_test_xsyr
30/53 Test #30: clblast_test_xsyr .................. Passed 0.33 sec
Start 31: clblast_test_xspr
31/53 Test #31: clblast_test_xspr .................. Passed 0.25 sec
Start 32: clblast_test_xsyr2
32/53 Test #32: clblast_test_xsyr2 ................. Passed 0.50 sec
Start 33: clblast_test_xspr2
33/53 Test #33: clblast_test_xspr2 ................. Passed 0.36 sec
Start 34: clblast_test_xgemm
34/53 Test #34: clblast_test_xgemm ................. Passed 14.61 sec
Start 35: clblast_test_xsymm
35/53 Test #35: clblast_test_xsymm ................. Passed 0.87 sec
Start 36: clblast_test_xhemm
36/53 Test #36: clblast_test_xhemm ................. Passed 0.49 sec
Start 37: clblast_test_xsyrk
37/53 Test #37: clblast_test_xsyrk ................. Passed 0.55 sec
Start 38: clblast_test_xherk
38/53 Test #38: clblast_test_xherk ................. Passed 0.35 sec
Start 39: clblast_test_xsyr2k
39/53 Test #39: clblast_test_xsyr2k ................ Passed 0.94 sec
Start 40: clblast_test_xher2k
40/53 Test #40: clblast_test_xher2k ................ Passed 0.56 sec
Start 41: clblast_test_xtrmm
41/53 Test #41: clblast_test_xtrmm ................. Passed 1.82 sec
Start 42: clblast_test_xtrsm
42/53 Test #42: clblast_test_xtrsm ................. Passed 1.16 sec
Start 43: clblast_test_xhad
43/53 Test #43: clblast_test_xhad .................. Passed 0.25 sec
Start 44: clblast_test_xomatcopy
44/53 Test #44: clblast_test_xomatcopy ............. Passed 0.38 sec
Start 45: clblast_test_xim2col
45/53 Test #45: clblast_test_xim2col ............... Passed 13.21 sec
Start 46: clblast_test_xcol2im
46/53 Test #46: clblast_test_xcol2im ............... Passed 12.53 sec
Start 47: clblast_test_xconvgemm
47/53 Test #47: clblast_test_xconvgemm ............. Passed 33.38 sec
Start 48: clblast_test_xaxpybatched
48/53 Test #48: clblast_test_xaxpybatched .......... Passed 0.49 sec
Start 49: clblast_test_xgemmbatched
49/53 Test #49: clblast_test_xgemmbatched .......... Passed 4.40 sec
Start 50: clblast_test_xgemmstridedbatched
50/53 Test #50: clblast_test_xgemmstridedbatched ... Passed 3.99 sec
Start 51: clblast_test_override_parameters
51/53 Test #51: clblast_test_override_parameters ... Passed 1.05 sec
Start 52: clblast_test_retrieve_parameters
52/53 Test #52: clblast_test_retrieve_parameters ... Passed 0.40 sec
Start 53: clblast_test_preprocessor
53/53 Test #53: clblast_test_preprocessor .......... Passed 2.82 sec

100% tests passed, 0 tests failed out of 53

Total Test time (real) = 122.35 sec

warkcod commented Jun 8, 2023

I just fixed it by following ggerganov/llama.cpp#1429.
I changed the properties argument of clCreateCommandQueue in ggml-opencl.c to pass no flags:

queue = clCreateCommandQueue(context, device, 0, &err);

And it should compile and run fine!
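
A gentler variant of the same idea is to keep asking for the original flags and retry with 0 only when the driver rejects them. The sketch below shows just that retry logic, under stated assumptions: the flags used by the original call are not shown in this thread, so an out-of-order queue request is assumed; the callback type queue_create_fn and both function names are hypothetical; and the numeric constants are copied from the OpenCL headers so the block builds without an OpenCL SDK. In real code the callback would wrap clCreateCommandQueue.

```c
#include <stdint.h>

/* Values copied from the Khronos cl.h header, renamed to avoid clashes. */
#define SKETCH_CL_SUCCESS                    0
#define SKETCH_CL_INVALID_VALUE            (-30)
#define SKETCH_CL_INVALID_QUEUE_PROPERTIES (-35)
#define SKETCH_QUEUE_OUT_OF_ORDER          ((uint64_t)1 << 0)

/* Hypothetical callback: try to create a queue with `props` and return the
   OpenCL status code. A real implementation would call clCreateCommandQueue. */
typedef int (*queue_create_fn)(uint64_t props, void *user);

/* Ask for an out-of-order queue first (assumed original behavior); if the
   driver rejects the properties, retry once with no flags. The flags that
   were finally used are written to *used_props. */
static int create_queue_with_fallback(queue_create_fn create, void *user,
                                      uint64_t *used_props) {
    uint64_t props = SKETCH_QUEUE_OUT_OF_ORDER;
    int err = create(props, user);
    if (err == SKETCH_CL_INVALID_VALUE ||
        err == SKETCH_CL_INVALID_QUEUE_PROPERTIES) {
        props = 0;
        err = create(props, user);
    }
    if (used_props) {
        *used_props = props;
    }
    return err;
}

/* Stand-in for a driver that behaves like the one in this issue:
   any non-zero properties value fails with -30. `user` counts the calls. */
static int stub_rejects_flags(uint64_t props, void *user) {
    int *calls = (int *)user;
    ++*calls;
    return props == 0 ? SKETCH_CL_SUCCESS : SKETCH_CL_INVALID_VALUE;
}
```

With the stub, the first attempt fails with -30 and the second succeeds with props set to 0, which is the same end state as hard-coding 0 but leaves other platforms free to use the flag.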

warkcod commented Jun 8, 2023

After that change, a different error appeared:
<program source>:1:131: error: fields must have a constant size: 'variable length array in structure' extension will never be supported
typedef uchar uint8_t; typedef int int32_t; typedef uint uint32_t;
constant uint QK4_0 = 32; struct block_q4_0 { float d; uint8_t qs[QK4_0 / 2]; };
constant uint QK4_1 = 32; struct block_q4_1 { float d; float m; uint8_t qs[QK4_1 / 2]; };
constant uint QK5_0 = 32; struct __attribute__ ((packed)) block_q5_0 { half d; uint32_t qh; uint8_t qs[QK5_0 / 2]; };
constant uint QK5_1 = 32; struct block_q5_1 { half d; half m; uint32_t qh; uint8_t qs[QK5_1 / 2]; };
constant uint QK8_0 = 32; struct block_q8_0 { float d; uint8_t qs[QK8_0]; };
__kernel void dequantize_row_q4_0(__global struct block_q4_0* x, __global float* y) { constant uint qk = QK4_0; const uint i = get_global_id(0) / qk; const uint j = get_local_id(0); const float d = x[i].d; const int x0 = (x[i].qs[j] & 0xf) - 8; const int x1 = (x[i].qs[j] >> 4) - 8; y[i*qk + j + 0 ] = x0*d; y[i*qk + j + qk/2] = x1*d; }
__kernel void dequantize_row_q4_1(__global struct block_q4_1* x, __global float* y) { constant uint qk = QK4_1; const uint i = get_global_id(0) / qk; const uint j = get_local_id(0); const float d = x[i].d; const float m = x[i].m; const int x0 = (x[i].qs[j] & 0xf); const int x1 = (x[i].qs[j] >> 4); y[i*qk + j + 0 ] = x0*d + m; y[i*qk + j + qk/2] = x1*d + m; }
__kernel void dequantize_row_q5_0(__global struct block_q5_0* x, __global float* y) { constant uint qk = QK5_0; const uint i = get_global_id(0) / qk; const uint j = get_local_id(0); const float d = vload_half(0, (__global half*) &x[i].d); uint32_t qh = x[i].qh; const uint8_t xh_0 = ((qh >> (j + 0)) << 4) & 0x10; const uint8_t xh_1 = ((qh >> (j + 12)) ) & 0x10; const int32_t x0 = ((x[i].qs[j] & 0xf) | xh_0) - 16; const int32_t x1 = ((x[i].qs[j] >> 4) | xh_1) - 16; y[i*qk + j + 0 ] = x0*d; y[i*qk + j + qk/2] = x1*d; }
__kernel void dequantize_row_q5_1(__global struct block_q5_1* x, __global float* y) { constant uint qk = QK5_1; const uint i = get_global_id(0) / qk; const uint j = get_local_id(0); const float d = vload_half(0, (__global half*) &x[i].d); const float m = vload_half(0, (__global half*) &x[i].m); uint32_t qh = x[i].qh; const uint8_t xh_0 = ((qh >> (j + 0)) << 4) & 0x10; const uint8_t xh_1 = ((qh >> (j + 12)) ) & 0x10; const int x0 = (x[i].qs[j] & 0xf) | xh_0; const int x1 = (x[i].qs[j] >> 4) | xh_1; y[i*qk + j + 0 ] = x0*d + m; y[i*qk + j + qk/2] = x1*d + m; }
__kernel void dequantize_row_q8_0(__global struct block_q8_0* x, __global float* y) { constant uint qk = QK8_0; const uint i = get_global_id(0) / qk; const uint j = get_local_id(0); const float d = x[i].d; y[i*qk + j] = x[i].qs[j]*d; }

After investigating, I found the solution in ggerganov/llama.cpp#1435.
I replaced ggml-opencl.c with the fixed version of the same file from llama.cpp:
https://raw.githubusercontent.com/ggerganov/llama.cpp/9ecb30f9594f222d8318fb1e803a4c363b0c39e5/ggml-opencl.c
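
The compiler message itself hints at the cause: the array sizes in the structs are written in terms of `constant uint` variables such as QK4_0, and a variable, even a read-only one, is not an integer constant expression, so the compiler treats `qs[QK4_0 / 2]` as a variable-length array inside a struct. The usual remedy is a preprocessor macro for the size. Whether the linked file does exactly this is an assumption here. The sketch below is plain host-side C rather than OpenCL C, shows the macro form of one struct, and adds a reference loop mirroring the arithmetic of the dequantize_row_q4_0 kernel from the dump; the name dequantize_row_q4_0_ref is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* A macro expands to a literal, so it is a valid array size in a struct. */
#define QK4_0 32

struct block_q4_0 {
    float   d;              /* per-block multiplier, as used by the kernel */
    uint8_t qs[QK4_0 / 2];  /* two 4-bit values packed into each byte */
};

/* Host-side reference (hypothetical helper) that repeats the arithmetic of
   the dequantize_row_q4_0 kernel for n_blocks blocks. y must hold
   n_blocks * QK4_0 floats. */
static void dequantize_row_q4_0_ref(const struct block_q4_0 *x, float *y,
                                    size_t n_blocks) {
    for (size_t i = 0; i < n_blocks; ++i) {
        const float d = x[i].d;
        for (size_t j = 0; j < QK4_0 / 2; ++j) {
            const int x0 = (x[i].qs[j] & 0xf) - 8;  /* low nibble  */
            const int x1 = (x[i].qs[j] >> 4) - 8;   /* high nibble */
            y[i * QK4_0 + j]             = x0 * d;
            y[i * QK4_0 + j + QK4_0 / 2] = x1 * d;
        }
    }
}
```

The same substitution (one `#define` per QK constant) would apply to the other block structs in the dump.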

With that, everything works.
I hope this helps anyone who encounters this issue.

Also, @ggerganov, please fix this issue. Thank you very much.
