[SYCL] Add device_ptr and host_ptr #1864
Force-pushed from 072c149 to 4b4428c.
Please review only the clang and SYCL headers parts. The translator part was added only for testing (otherwise the LIT tests are expected to fail).
@MrSidims
I have updated the description for now. Also, I was promised that the specification will be published today.
Force-pushed from 7f82a3b to a1de059.
This may slow down our first upstreaming milestone https://github.com/intel/llvm/milestone/1 since global_device and global_host address space attributes are not upstreamed.
I have just created a PR to llvm.org: https://reviews.llvm.org/D82174
@@ -104,6 +115,16 @@ struct PtrValueType<ElementType, access::address_space::global_space> {
  using type = __OPENCL_GLOBAL_AS__ ElementType;
};

template <typename ElementType>
struct PtrValueType<ElementType, access::address_space::device_space> {
  using type = __OPENCL_GLOBAL_DEVICE_AS__ ElementType;
This is unlikely to be portable and will probably cause an ICE rather than a clear error. As this overlaps with global, the address space should fall back to __OPENCL_GLOBAL_AS__ if the backend does not handle this address space.
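A minimal sketch of the header-level fallback suggested here, written as standalone C++ so it compiles without a SYCL toolchain. The tag templates `GlobalAS` and `GlobalDeviceAS` and the macro `HYPOTHETICAL_TARGET_HAS_USM_ADDRESS_SPACES` are assumptions made for illustration; they are not part of the patch or of any real header.

```cpp
#include <type_traits>

// Hypothetical stand-ins for address-space-qualified types. The real headers
// use qualifiers such as `__OPENCL_GLOBAL_AS__ ElementType` instead.
template <typename T> struct GlobalAS {};       // models __OPENCL_GLOBAL_AS__ T
template <typename T> struct GlobalDeviceAS {}; // models __OPENCL_GLOBAL_DEVICE_AS__ T

// Assumed feature switch; a real build would derive it from the target.
#ifndef HYPOTHETICAL_TARGET_HAS_USM_ADDRESS_SPACES
#define HYPOTHETICAL_TARGET_HAS_USM_ADDRESS_SPACES 0
#endif

enum class address_space { global_space, device_space };

template <typename ElementType, address_space Space> struct PtrValueType;

template <typename ElementType>
struct PtrValueType<ElementType, address_space::global_space> {
  using type = GlobalAS<ElementType>;
};

template <typename ElementType>
struct PtrValueType<ElementType, address_space::device_space> {
#if HYPOTHETICAL_TARGET_HAS_USM_ADDRESS_SPACES
  using type = GlobalDeviceAS<ElementType>;
#else
  // Fallback: device_space degrades to plain global on this target.
  using type = GlobalAS<ElementType>;
#endif
};
```

With the switch off, `device_space` and `global_space` resolve to the same type, so code written against `device_space` still compiles on a backend that knows only the global address space.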
Since the DPCPP compiler generates SPIR-V code, this mechanism currently lives in the SPIR-V translator. During reverse translation from SPIR-V to LLVM IR there is a new option: if it is not passed, the translator generates the global address space instead of the global_device / global_host address spaces. So anyone who wants to support these address spaces in their backend needs to add this option in the backend's driver.
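The opt-in behaviour described above can be modelled with a small toy function. This is only an illustration of the idea and not the translator's code: the `StorageClass` names, the address-space numbers, and the boolean flag are all assumptions chosen for the example.

```cpp
// Illustrative storage classes on the SPIR-V side (names are hypothetical).
enum class StorageClass { CrossWorkgroup, DeviceOnly, HostOnly };

// Illustrative LLVM address-space numbers; real values are target-defined.
constexpr unsigned kGlobal = 1;
constexpr unsigned kGlobalDevice = 5;
constexpr unsigned kGlobalHost = 6;

// Without the opt-in flag, every global-like storage class collapses to the
// plain global address space, so backends unaware of the new spaces are safe.
constexpr unsigned to_llvm_address_space(StorageClass sc, bool usm_spaces_enabled) {
  return !usm_spaces_enabled               ? kGlobal
         : sc == StorageClass::DeviceOnly  ? kGlobalDevice
         : sc == StorageClass::HostOnly    ? kGlobalHost
                                           : kGlobal;
}
```

A backend that understands the specialized spaces would pass `true`; every other consumer gets ordinary global pointers.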
> As DPCPP compiler generated SPIR-V code

or PTX via the NVPTX backend, without going through SPIR-V.
Got it, thanks!
Actually, no, I don't really get it. What target is used for NVPTX? I mean that in the clang part of the feature we added definitions for these new address spaces like this:
```diff
--- a/clang/lib/Basic/Targets/NVPTX.h
+++ b/clang/lib/Basic/Targets/NVPTX.h
@@ -30,6 +30,8 @@ static const unsigned NVPTXAddrSpaceMap[] = {
     0, // opencl_private
     // FIXME: generic has to be added to the target
     0, // opencl_generic
+    1, // opencl_global_device
+    1, // opencl_global_host
     1, // cuda_device
```
If we compile for NVPTX with the spir-unknown-unknown triple, then the code above is indeed a problem. But if not, I don't see any issues.
> I don't really get it. What target is used for NVPTX?

nvptx64-nvidia-cuda-sycldevice

> I don't see any issues

The issue is in the mangler. Given the current definition of the address space mapping, take these two overloads:
void foo(global_ptr<int>::pointer_t p) { [...] }
void foo(device_ptr<int>::pointer_t p) { [...] }
The compiler will mangle the two foo overloads in the same way.
There are two solutions to it:
- Having a new mangling scheme, but I'm not sure how it should be done (@bader ping for this)
- A SYCL solution: make the address space available if and only if the target actually supports it.
Note: this is kind of a corner case for now; I am pointing it out so you are aware of it. I'm more concerned about the naming here.
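The collision can be illustrated with a toy model, assuming the mangled name encodes only the target address-space number. `LangAS`, `nvptx_target_as`, `toy_mangle_foo`, and `overloads_collide` are hypothetical names for this sketch; the mapping mirrors the NVPTX table quoted in the diff earlier in this thread, and the name format only imitates Itanium-style output rather than reproducing clang's mangler.

```cpp
#include <string>

// Language-level address spaces relevant to this discussion.
enum class LangAS { opencl_global, opencl_global_device, opencl_global_host };

// Simplified mapping: all three language address spaces land on target
// address space 1, as in the NVPTX table quoted in the thread.
constexpr unsigned nvptx_target_as(LangAS as) {
  return as == LangAS::opencl_global          ? 1u
         : as == LangAS::opencl_global_device ? 1u
                                              : 1u;
}

// Toy mangler for `void foo(int <AS> *)`: only the target address-space
// number reaches the name, so the language-level distinction is lost.
inline std::string toy_mangle_foo(LangAS as) {
  return "_Z3fooPU3AS" + std::to_string(nvptx_target_as(as)) + "i";
}

// Two overloads collide when their parameters map to the same target space.
constexpr bool overloads_collide(LangAS a, LangAS b) {
  return nvptx_target_as(a) == nvptx_target_as(b);
}
```

Under this model the `global_ptr` and `device_ptr` overloads of `foo` produce the same symbol, which is why either a new mangling scheme or target-conditional availability of the address space is needed.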
Thank you very much for your feedback. I'll think about these options.
I would like to leave this corner case unresolved for now. One possible solution is to expand the scope of the sycl_enable_usm_address_spaces option added in #1986, but in that case this option (which was originally considered a temporary solution) would stay in the compiler.
@MrSidims, could you open a GitHub issue to track and discuss a solution to this problem, please?
Force-pushed from e087594 to 0d7b5a3.
Sorry for force-pushing. I'm updating the patches on which the SYCL header part is based (clang driver, the translator).
Force-pushed from 0d7b5a3 to ded22f2.
LGTM.
My only doubt is the extra spaces around the arrow operator in https://github.com/intel/llvm/pull/1864/files#diff-57bb9a63b756ac60f068b8632adc20e7R155. I think a bug should be filed against the clang-format tool if it doesn't accept the variant without spaces.
There's a typo, isn't there? I believe it should be "hardware code" or something like this. Am I correct?
Otherwise it breaks atomics. Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
Force-pushed from 428d2ff to 1a1237c.
Fixed
@bader could you please advise me how to pull this PR from the purgatory?
Looks good to me.
FPGA experts should approve, I reckon.
Fix the clang-format check and get approvals from code owners.
@bader clang-format is firing incorrectly. Regarding code owners: for FPGA features it's me and Mikhail, who wrote LGTM. For SYCL RT, Sergey is replacing Vlad while he's on vacation (and he approved as well).
@bader Regarding the clang-format error: it suggests reformatting the line auto x4 = ptr_4->x; into auto x4 = ptr_4 -> x;, which is not how our code style writes it. So I think an exception should be made for this review and it should be merged with the failed clang-format check. Additionally, an issue should be raised against the clang-format tool to fix this false positive.
Okay.
+1
// ret i8 addrspace(4)* %[[DEVCAST]]
//
// CHECK-LABEL: define {{.*}} spir_func i8 addrspace(4)* @{{.*}}multi_ptr{{.*}}
// CHECK: %m_Pointer = getelementptr inbounds %[[HOSTPTR_T]]
This test fails on builds with assertions disabled. I suppose we should not check value names such as %m_Pointer, since those are stripped in such builds.
https://github.com/intel/llvm/runs/834501651
Please, fix ASAP.
Ok
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@32721e8
Currently a device backend can't tell where a pointer allocated by USM comes from: it may have been allocated on the host or on the device (it is just a pointer in the OpenCL global address space). On FPGAs at least, we can generate more efficient hardware code if the user tells us where the pointer can point. With this change users can create a multi_ptr with the specialized address space global_host or global_device, which provides the compiler with additional information for load-store optimizations. Accessor pointers shall also be moved to the global_device address space; otherwise the backend would assume that a pointer in the global address space can access both host and device memory.
Previously, the global_device and global_host address spaces were added
for OpenCL/SYCL in clang. This patch adds device_space and host_space
to the SYCL headers, mapped onto the new address spaces, together with
aliases for multi_ptr instantiated with those spaces: device_ptr and
host_ptr.
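As a rough illustration of the alias structure described above, here is a standalone mock-up that compiles without SYCL. The `sketch` namespace and the stripped-down `multi_ptr` are assumptions for the example; the real class in the SYCL headers has a much larger interface and uses address-space-qualified pointer types.

```cpp
#include <type_traits>

namespace sketch {

enum class address_space { global_space, device_space, host_space };

// Heavily simplified stand-in for multi_ptr: it only records the address
// space in its type and wraps a raw pointer.
template <typename ElementType, address_space Space>
class multi_ptr {
public:
  static constexpr address_space space = Space;

  explicit multi_ptr(ElementType *p) : m_Pointer(p) {}
  ElementType *get() const { return m_Pointer; }
  ElementType *operator->() const { return m_Pointer; }

private:
  ElementType *m_Pointer;
};

// The two aliases: multi_ptr instantiated with the new address spaces.
template <typename ElementType>
using device_ptr = multi_ptr<ElementType, address_space::device_space>;

template <typename ElementType>
using host_ptr = multi_ptr<ElementType, address_space::host_space>;

} // namespace sketch
```

Because the address space is part of the type, a function taking `sketch::device_ptr<int>` tells the compiler, and the reader, that the pointee lives in device memory.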
Signed-off-by: Dmitry Sidorov dmitry.sidorov@intel.com