-
Notifications
You must be signed in to change notification settings - Fork 751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Allow gcc asm statements in kernel code. #1341
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This likely needs a codegen test to show what actually happens. Have you done an end-to-end test to show that this still works?
I have run them by hand (only), but there is someone further downstream who has a whole bunch of these tests and have validated the results. |
add1443
to
a6ee31b
Compare
I added a test recreated from a supplied example to check the asm part. |
c775ac7
to
819ada7
Compare
template <typename name, typename Func> | ||
__attribute__((sycl_kernel)) void kernel_single_task(Func kernelFunc) { | ||
// CHECK: %[[ARRAY_A:[0-9a-z]+]] = alloca [100 x i32], align 4 | ||
// CHECK-NEXT: %[[NUM:[0-9]+]] = bitcast [100 x i32]* %[[ARRAY_A]] to i8* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only there for the lifetime marker, so no need for htis check line either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again, Erich!
819ada7
to
d3553c3
Compare
Interesting extension! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I would just remove some unnecessary code from the test.
Signed-off-by: Premanand M Rao <premanand.m.rao@intel.com>
d3553c3
to
13599d9
Compare
@bader, if you get a chance, please merge this. Thank you! |
* sycl: (1209 commits) [SYCL] Check exit status get_device_count_by_type [SYCL][Doc] Update sub-group extension docs (intel#1330) [SYCL][Doc] Add leader to GroupAlgorithms (intel#1297) [SYCL] Add SYCL headers search path to default compilation options (intel#1347) [SYCL][PI] Add interoperability with generic handles to device and program classes (intel#1244) Move SPIR devicelib to top level (intel#1276) [SYCL][Driver] Improve fat static library support (intel#1319) [SYCL] Remove image_api LIT (intel#1349) [SYCL] Fix headers location for check-sycl-deploy target [SYCL] Allow gcc asm statements in kernel code (intel#1341) [SYCL] Add Intel FPGA force_pow2_depth attribute (intel#1284) [SPIR-V][NFC] Fix for building llvm-spirv with -DLLVM_LINK_LLVM_DYLIB=ON (intel#1323) [SYCL][NFC] Fix execution graph dump (intel#1331) [SYCL][Doc] Release SYCL_INTEL_enqueue_barrier extension document (intel#1199) [SYCL][USM] Fix USM malloc_shared and free to handle zero byte (intel#1273) [SYCL] Fix undefined symbols in async_work_group_copy (intel#1243) [SYCL] Mark calls to barrier and work-item functions as convergent [SYCL][CUDA] Fix CUDA plug-in build with enabled assertions (intel#1325) [SYCL][Test] Add OpenCL requirement to test/ordered_queue/prop.cpp (intel#1335) [SYCL][CUDA] Improve CUDA backend documentation (intel#1293) ...
Signed-off-by: Premanand M Rao premanand.m.rao@intel.com