[SYCL] Add device_ptr and host_ptr #1864

Merged
merged 7 commits into from
Jul 3, 2020

@MrSidims MrSidims commented Jun 10, 2020

Currently, a device backend cannot tell where a USM-allocated pointer comes from: it may have been allocated on the host or on the device (it is just a pointer in the OpenCL global address space). On FPGAs at least, we can generate more efficient hardware code if the user tells us where the pointer can point. With this change, users can create a multi_ptr with a specialized address space, global_host or global_device, which gives the compiler additional information for load-store optimizations. Accessor pointers should also be moved to the global_device address space; otherwise the backend would assume that a pointer in the global address space can access both host and device memory.

The global_device and global_host address spaces were previously added
for OpenCL/SYCL in clang. This patch adds device_space and host_space
to the SYCL headers, maps them to the new address spaces, and adds
aliases for multi_ptr instantiated with those spaces: device_ptr and
host_ptr.

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
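
A minimal sketch of how the new aliases relate to multi_ptr, assuming a stand-in `sketch` namespace rather than the real SYCL headers. The `multi_ptr` class here is a simplified mock, and `sum_first_elements` is a hypothetical helper added only for illustration; the names `device_space`, `host_space`, `device_ptr` and `host_ptr` follow the description above.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the SYCL headers. Names mirror the PR description,
// but this is a simplified mock, not the real API.
namespace sketch {
namespace access {
enum class address_space { global_space, device_space, host_space };
} // namespace access

// Simplified multi_ptr: a raw pointer tagged with an address space.
template <typename T, access::address_space Space>
class multi_ptr {
public:
  static constexpr access::address_space space = Space;
  explicit multi_ptr(T *p) : ptr_(p) {}
  T &operator*() const { return *ptr_; }
  T *get() const { return ptr_; }

private:
  T *ptr_;
};

// Aliases in the spirit of the PR: multi_ptr instantiated with the new spaces.
template <typename T>
using global_ptr = multi_ptr<T, access::address_space::global_space>;
template <typename T>
using device_ptr = multi_ptr<T, access::address_space::device_space>;
template <typename T>
using host_ptr = multi_ptr<T, access::address_space::host_space>;
} // namespace sketch

// Hypothetical helper: the signature tells the compiler (and the reader)
// which pointer refers to device memory and which to host memory.
inline int sum_first_elements(sketch::device_ptr<int> d,
                              sketch::host_ptr<int> h) {
  return *d + *h;
}
```

Because the address space is part of the type, `device_ptr<int>` and `host_ptr<int>` are distinct types, which is what lets a backend specialize loads and stores per pointer.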

@MrSidims MrSidims force-pushed the private/MrSidims/DeviceHost branch from 072c149 to 4b4428c Compare June 17, 2020 13:05
@MrSidims MrSidims marked this pull request as ready for review June 17, 2020 13:08
@MrSidims
Contributor Author

Please review only the clang and SYCL headers parts. The translator part was added only for testing (otherwise the LIT tests are expected to fail).

@romanovvlad
Contributor

@MrSidims
Could you please share more information why we need a new address space? Some docs?

@MrSidims
Contributor Author

> @MrSidims
> Could you please share more information why we need a new address space? Some docs?

I've updated the description for now. I was also promised that the specification will be published today.

@Fznamznon Fznamznon left a comment

This may slow down our first upstreaming milestone https://github.com/intel/llvm/milestone/1 since global_device and global_host address space attributes are not upstreamed.

@MrSidims
Contributor Author

> This may slow down our first upstreaming milestone https://github.com/intel/llvm/milestone/1 since global_device and global_host address space attributes are not upstreamed.

I have just created a PR to llvm.org: https://reviews.llvm.org/D82174

sycl/include/CL/sycl/access/access.hpp
@@ -104,6 +115,16 @@ struct PtrValueType<ElementType, access::address_space::global_space> {
   using type = __OPENCL_GLOBAL_AS__ ElementType;
 };

+template <typename ElementType>
+struct PtrValueType<ElementType, access::address_space::device_space> {
+  using type = __OPENCL_GLOBAL_DEVICE_AS__ ElementType;
Contributor

This is unlikely to be portable and will probably cause an ICE (internal compiler error) rather than a clear error. Since this overlaps with global, the address space should fall back to __OPENCL_GLOBAL_AS__ if the backend does not handle this address space.
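
The fallback suggested here could look roughly like the sketch below. It is a host-compilable model, not the real header: the tag types `global_as`, `global_device_as` and `global_host_as` stand in for the `__OPENCL_*_AS__` attribute macros, `in_space` is a hypothetical wrapper, and the `TargetHasUsmSpaces` parameter is an assumed stand-in for whatever mechanism tells the headers that the backend handles the new address spaces.

```cpp
#include <type_traits>

namespace sketch {
enum class address_space { global_space, device_space, host_space };

// Tags standing in for the __OPENCL_*_AS__ attribute macros (hypothetical).
struct global_as {};
struct global_device_as {};
struct global_host_as {};

// Hypothetical wrapper modelling "ElementType qualified with an address space".
template <typename ElementType, typename Space>
struct in_space {};

// TargetHasUsmSpaces models "the backend handles global_device/global_host".
template <bool TargetHasUsmSpaces, typename ElementType, address_space Space>
struct PtrValueType;

template <bool TargetHasUsmSpaces, typename ElementType>
struct PtrValueType<TargetHasUsmSpaces, ElementType,
                    address_space::global_space> {
  using type = in_space<ElementType, global_as>;
};

// device_space falls back to plain global when the target lacks support.
template <bool TargetHasUsmSpaces, typename ElementType>
struct PtrValueType<TargetHasUsmSpaces, ElementType,
                    address_space::device_space> {
  using type = in_space<
      ElementType, typename std::conditional<TargetHasUsmSpaces,
                                             global_device_as, global_as>::type>;
};

// host_space falls back the same way.
template <bool TargetHasUsmSpaces, typename ElementType>
struct PtrValueType<TargetHasUsmSpaces, ElementType,
                    address_space::host_space> {
  using type = in_space<
      ElementType, typename std::conditional<TargetHasUsmSpaces,
                                             global_host_as, global_as>::type>;
};
} // namespace sketch
```

With this shape, code written against `device_space` still compiles on a target without the extension; it simply loses the extra information and behaves like a global pointer.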

@MrSidims MrSidims Jun 19, 2020

As the DPCPP compiler generates SPIR-V code, this mechanism currently lives in the SPIR-V translator. An option was added for the reverse translation from SPIR-V to LLVM IR: unless that option is passed, the translator generates the global address space instead of the global_device / global_host address spaces. So if someone would like to support these address spaces in their backend, they need to add this option in the backend's driver.
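
The opt-in behaviour described above can be modelled with a small sketch. Everything in it is hypothetical: the enum names, the `ReverseTranslationOptions` struct and its `keep_usm_address_spaces` flag are illustrative stand-ins, not the translator's real types, option name, or address-space numbering.

```cpp
#include <cassert>

namespace sketch {
// Hypothetical stand-ins; not the translator's real types or numbering.
enum class StorageClass { CrossWorkgroup, DeviceOnly, HostOnly };
enum class AddrSpace { Global, GlobalDevice, GlobalHost };

// Hypothetical option set by a backend driver that supports the new spaces.
struct ReverseTranslationOptions {
  bool keep_usm_address_spaces = false;
};

// Without the option, everything collapses to the global address space.
inline AddrSpace to_llvm_address_space(StorageClass sc,
                                       const ReverseTranslationOptions &opts) {
  if (!opts.keep_usm_address_spaces)
    return AddrSpace::Global;
  switch (sc) {
  case StorageClass::DeviceOnly:
    return AddrSpace::GlobalDevice;
  case StorageClass::HostOnly:
    return AddrSpace::GlobalHost;
  case StorageClass::CrossWorkgroup:
    return AddrSpace::Global;
  }
  return AddrSpace::Global;
}
} // namespace sketch
```

The default is the conservative one: a backend that knows nothing about the new address spaces only ever sees global pointers.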

Contributor

> As DPCPP compiler generated SPIR-V code

or PTX via the NVPTX backend without going through SPIR-V

Contributor Author

Got it, thanks!

Contributor Author

Actually, no, I don't really get it. What target is used for NVPTX? I mean that in the clang part of the feature we have added definitions for these new address spaces like this:
`--- a/clang/lib/Basic/Targets/NVPTX.h
+++ b/clang/lib/Basic/Targets/NVPTX.h
@@ -30,6 +30,8 @@ static const unsigned NVPTXAddrSpaceMap[] = {
     0, // opencl_private
     // FIXME: generic has to be added to the target
     0, // opencl_generic
+    1, // opencl_global_device
+    1, // opencl_global_host
     1, // cuda_device
`

If we compile for NVPTX with the spir-unknown-unknown triple, then the code above is indeed a problem. But if not, I don't see any issues.

Contributor

> I don't really get it. What target is used for NVPTX?

nvptx64-nvidia-cuda-sycldevice

> I don't see any issues

The issue is in the mangler, given the current definition of the address space mapping:

void foo(global_ptr<int>::pointer_t p) { [...] }
void foo(device_ptr<int>::pointer_t p) { [...] }

This will cause the compiler to mangle the two foo overloads in the same way.

There are two solutions to it:

  • Having a new mangling scheme, but I'm not sure how it should be done (@bader ping for this)
  • a SYCL solution: make the address space available if and only if the target actually supports it.

Note: this is kind of a corner case for now; I'm pointing it out so you are aware of it. I'm more concerned about the naming here.
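
The collision described above can be illustrated with a deliberately simplified model of the mangler. This is a sketch, not clang's implementation: it assumes the address space is mangled as a vendor qualifier of the form `AS<n>`, where `n` comes from the target's address space map, and it only handles a function taking one pointer-to-int parameter. The `nvptx_like_map` values follow the diff quoted earlier in this thread, with `opencl_global` assumed to map to 1 as well; the numbers in `distinct_map` are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {
enum class LangAS { opencl_global, opencl_global_device, opencl_global_host };

// Language address space -> target address space number.
using AddrSpaceMap = std::map<LangAS, unsigned>;

// Simplified model of mangling `void fn(AS int *)`:
// the qualifier "AS<n>" is emitted as a vendor qualifier U<len><text>.
inline std::string mangle_int_ptr_param(const std::string &fn, LangAS as,
                                        const AddrSpaceMap &map) {
  const std::string qual = "AS" + std::to_string(map.at(as));
  return "_Z" + std::to_string(fn.size()) + fn + "PU" +
         std::to_string(qual.size()) + qual + "i";
}

// All three spaces map to 1 (opencl_global == 1 is an assumption here).
inline AddrSpaceMap nvptx_like_map() {
  return {{LangAS::opencl_global, 1u},
          {LangAS::opencl_global_device, 1u},
          {LangAS::opencl_global_host, 1u}};
}

// Hypothetical target that gives each space its own number.
inline AddrSpaceMap distinct_map() {
  return {{LangAS::opencl_global, 1u},
          {LangAS::opencl_global_device, 5u},
          {LangAS::opencl_global_host, 6u}};
}
} // namespace sketch
```

Under the collapsed map, the `global_ptr` and `device_ptr` overloads of `foo` produce the same symbol in this model, while a map with distinct numbers keeps them apart, which is why either a new mangling scheme or target-conditional availability is needed.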

@MrSidims MrSidims Jun 19, 2020

Thank you very much for your feedback. I'll think about these options.

Contributor Author

I would like to leave this corner case unresolved for now. One possible solution is to extend the scope of the sycl_enable_usm_address_spaces option added in #1986, but in that case this option (which was originally considered a temporary solution) would stay in the compiler.

Contributor

@MrSidims, could you open a GitHub issue to track/discuss a solution to this problem, please?

@MrSidims MrSidims force-pushed the private/MrSidims/DeviceHost branch 3 times, most recently from e087594 to 0d7b5a3 Compare June 23, 2020 15:22
@MrSidims
Contributor Author

Sorry for force-pushing. I'm updating the patches that the SYCL header part is based on (clang driver, the translator).

@mlychkov mlychkov left a comment

LGTM.
My only doubt is the extra spaces around the arrow operator in https://github.com/intel/llvm/pull/1864/files#diff-57bb9a63b756ac60f068b8632adc20e7R155. I think a bug should be filed against the clang-format tool if it doesn't accept the variant without spaces.

@s-kanaev
Contributor

s-kanaev commented Jul 1, 2020

> On FPGAs at least we can generate more efficient hardware if the user tells us where the pointer can point.

There's a typo, isn't there? I believe it should be "hardware code" or something like that. Am I correct?

Otherwise it breaks atomics.

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
@MrSidims MrSidims force-pushed the private/MrSidims/DeviceHost branch from 428d2ff to 1a1237c Compare July 1, 2020 14:47
@MrSidims
Contributor Author

MrSidims commented Jul 1, 2020

> On FPGAs at least we can generate more efficient hardware if the user tells us where the pointer can point.
>
> There's a typo, isn't there? I believe it should be "hardware code" or something like that. Am I correct?

Fixed

@MrSidims MrSidims requested a review from s-kanaev July 1, 2020 16:07
@MrSidims MrSidims requested a review from s-kanaev July 2, 2020 10:22
@MrSidims
Contributor Author

MrSidims commented Jul 2, 2020

@bader could you please advise me how to pull this PR from the purgatory?

@s-kanaev s-kanaev left a comment

Looks good to me.
FPGA experts should approve, I reckon.

@bader
Contributor

bader commented Jul 2, 2020

> @bader could you please advise me how to pull this PR from the purgatory?

Fix clang-format check and get approvals from code owners.

@MrSidims
Contributor Author

MrSidims commented Jul 2, 2020

@bader clang-format is firing incorrectly. Regarding code owners: for FPGA features it's me and Mikhail, who wrote LGTM. For the SYCL RT, Sergey is covering for Vlad while Vlad is on vacation (and Sergey approved as well).

@mlychkov mlychkov left a comment

@bader Regarding the clang-format error: it suggests reformatting the code line auto x4 = ptr_4->x; as auto x4 = ptr_4 -> x;, which does not match our code style. So I think an exception should be made for this review and it should be merged with the clang-format check failing. Additionally, an issue should be raised against the clang-format tool to fix such false negative cases.

@bader
Contributor

bader commented Jul 3, 2020

> @bader Regarding the clang-format error: it suggests reformatting the code line auto x4 = ptr_4->x; as auto x4 = ptr_4 -> x;, which does not match our code style. So I think an exception should be made for this review and it should be merged with the clang-format check failing.

Okay.

> Additionally, an issue should be raised against the clang-format tool to fix such false negative cases.

+1

@bader bader merged commit 94b36ac into intel:sycl Jul 3, 2020
// ret i8 addrspace(4)* %[[DEVCAST]]
//
// CHECK-LABEL: define {{.*}} spir_func i8 addrspace(4)* @{{.*}}multi_ptr{{.*}}
// CHECK: %m_Pointer = getelementptr inbounds %[[HOSTPTR_T]]
Contributor

This test fails on builds with assertions disabled. I suppose we should not check value names such as %m_Pointer, since those are stripped in such builds.
https://github.com/intel/llvm/runs/834501651
Please fix ASAP.

Contributor Author

Ok

FreddyLeaf pushed a commit to FreddyLeaf/llvm that referenced this pull request Mar 22, 2023
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@32721e8