[SYCL] Allow gcc asm statements in kernel code. #1341

Merged: 1 commit, Mar 19, 2020

premanandrao
Contributor

Signed-off-by: Premanand M Rao <premanand.m.rao@intel.com>

Contributor

erichkeane left a comment

This likely needs a codegen test to show what actually happens. Have you done an end-to-end test to show that this still works?

@premanandrao
Contributor Author

> This likely needs a codegen test to show what actually happens. Have you done an end-to-end test to show that this still works?

I have only run them by hand, but someone further downstream has a whole bunch of these tests and has validated the results.

@premanandrao
Contributor Author

> This likely needs a codegen test to show what actually happens.

I added a test recreated from a supplied example to check the asm part.
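
For readers unfamiliar with the feature under review, a GCC-style asm statement inside a kernel body looks roughly like the sketch below. This is a host-only illustration, not the test added in this PR: the wrapper drops `__attribute__((sycl_kernel))` so it builds with any GCC or Clang, and `pass_through_asm`, `asm_kernel`, and `run_asm_kernel` are hypothetical names. The asm template is deliberately empty, so no target-specific instruction is assumed.

```cpp
#include <cassert>

// Host-only stand-in for the kernel wrapper quoted in the test. The real one
// carries __attribute__((sycl_kernel)); it is omitted here (assumption) so the
// sketch compiles without a SYCL-enabled compiler.
template <typename Name, typename Func>
void kernel_single_task(Func kernelFunc) {
  kernelFunc();
}

// Hypothetical helper: passes a value through a GCC extended asm statement.
// The template string is empty, so no instruction is emitted; "+r" declares
// `value` as a read-write register operand the compiler must materialize.
inline int pass_through_asm(int value) {
  __asm__ volatile("" : "+r"(value));
  return value;
}

class asm_kernel; // hypothetical kernel name tag

// Runs the "kernel" on the host and returns the value it produced.
int run_asm_kernel() {
  int result = 0;
  kernel_single_task<asm_kernel>([&result]() { result = pass_through_asm(42); });
  return result;
}
```

Calling `run_asm_kernel()` returns 42 on the host. A real device test would put target-specific instructions in the asm template and inspect the emitted IR instead.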

premanandrao force-pushed the remote_inline_asm branch 2 times, most recently from c775ac7 to 819ada7 (March 17, 2020 21:21)
template <typename name, typename Func>
__attribute__((sycl_kernel)) void kernel_single_task(Func kernelFunc) {
// CHECK: %[[ARRAY_A:[0-9a-z]+]] = alloca [100 x i32], align 4
// CHECK-NEXT: %[[NUM:[0-9]+]] = bitcast [100 x i32]* %[[ARRAY_A]] to i8*
Contributor

This is only there for the lifetime marker, so no need for this check line either.

Contributor Author

Thanks again, Erich!

erichkeane previously approved these changes Mar 17, 2020
@keryell
Contributor

keryell commented Mar 17, 2020

Interesting extension!

Contributor

bader left a comment

LGTM.
I would just remove some unnecessary code from the test.

clang/test/CodeGenSYCL/inline_asm.cpp (3 outdated review threads, resolved)
Signed-off-by: Premanand M Rao <premanand.m.rao@intel.com>
@premanandrao
Contributor Author

@bader, if you get a chance, please merge this. Thank you!

bader merged commit 6f4e007 into intel:sycl on Mar 19, 2020
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Mar 20, 2020
* sycl: (1209 commits)
  [SYCL] Check exit status get_device_count_by_type
  [SYCL][Doc] Update sub-group extension docs (intel#1330)
  [SYCL][Doc] Add leader to GroupAlgorithms (intel#1297)
  [SYCL] Add SYCL headers search path to default compilation options (intel#1347)
  [SYCL][PI] Add interoperability with generic handles to device and program classes (intel#1244)
  Move SPIR devicelib to top level (intel#1276)
  [SYCL][Driver] Improve fat static library support (intel#1319)
  [SYCL] Remove image_api LIT (intel#1349)
  [SYCL] Fix headers location for check-sycl-deploy target
  [SYCL] Allow gcc asm statements in kernel code (intel#1341)
  [SYCL] Add Intel FPGA force_pow2_depth attribute (intel#1284)
  [SPIR-V][NFC] Fix for building llvm-spirv with -DLLVM_LINK_LLVM_DYLIB=ON (intel#1323)
  [SYCL][NFC] Fix execution graph dump (intel#1331)
  [SYCL][Doc] Release SYCL_INTEL_enqueue_barrier extension document (intel#1199)
  [SYCL][USM] Fix USM malloc_shared and free to handle zero byte (intel#1273)
  [SYCL] Fix undefined symbols in async_work_group_copy (intel#1243)
  [SYCL] Mark calls to barrier and work-item functions as convergent
  [SYCL][CUDA] Fix CUDA plug-in build with enabled assertions (intel#1325)
  [SYCL][Test] Add OpenCL requirement to test/ordered_queue/prop.cpp (intel#1335)
  [SYCL][CUDA] Improve CUDA backend documentation (intel#1293)
  ...