Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Add device_ptr and host_ptr #1864

Merged
merged 7 commits into from
Jul 3, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 27 additions & 1 deletion sycl/include/CL/sycl/access/access.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,9 @@ enum class address_space : int {
private_space = 0,
global_space,
constant_space,
local_space
local_space,
global_device_space,
global_host_space
s-kanaev marked this conversation as resolved.
Show resolved Hide resolved
};

} // namespace access
Expand Down Expand Up @@ -103,11 +105,15 @@ constexpr bool modeWritesNewData(access::mode m) {

#ifdef __SYCL_DEVICE_ONLY__
#define __OPENCL_GLOBAL_AS__ __attribute__((opencl_global))
#define __OPENCL_GLOBAL_DEVICE_AS__ __attribute__((opencl_global_device))
#define __OPENCL_GLOBAL_HOST_AS__ __attribute__((opencl_global_host))
#define __OPENCL_LOCAL_AS__ __attribute__((opencl_local))
#define __OPENCL_CONSTANT_AS__ __attribute__((opencl_constant))
#define __OPENCL_PRIVATE_AS__ __attribute__((opencl_private))
#else
#define __OPENCL_GLOBAL_AS__
#define __OPENCL_GLOBAL_DEVICE_AS__
#define __OPENCL_GLOBAL_HOST_AS__
#define __OPENCL_LOCAL_AS__
#define __OPENCL_CONSTANT_AS__
#define __OPENCL_PRIVATE_AS__
Expand Down Expand Up @@ -141,6 +147,16 @@ struct PtrValueType<ElementType, access::address_space::global_space> {
using type = __OPENCL_GLOBAL_AS__ ElementType;
};

template <typename ElementType>
struct PtrValueType<ElementType, access::address_space::global_device_space> {
using type = __OPENCL_GLOBAL_DEVICE_AS__ ElementType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely to not be portable and cause ICE rather than a clear error. As this overlaps with global, the address space should fallback to __OPENCL_GLOBAL_AS__ if the backend does not handles this address space.

Copy link
Contributor Author

@MrSidims MrSidims Jun 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As DPCPP compiler generated SPIR-V code - this mechanism is currently moved to the SPIR-V translator (basically during reversed translation from SPIR-V to LLVM IR there is an option added - without this option passed, the translator will generate global address space instead of global_device / global_host address space. So if someone would like to support these address spaces in their backend - it's needed to add this option in the backend's driver.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As DPCPP compiler generated SPIR-V code

or PTX via the NVPTX backend without going through SPIR-V

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, not, I don't really get it. What target is used for NVPTX? I mean, that in clang part of the feature we have added definitions for these new address spaces like this:
`--- a/clang/lib/Basic/Targets/NVPTX.h
+++ b/clang/lib/Basic/Targets/NVPTX.h
@@ -30,6 +30,8 @@ static const unsigned NVPTXAddrSpaceMap[] = {
0, // opencl_private
// FIXME: generic has to be added to the target
0, // opencl_generic

  • 1, // opencl_global_device
  • 1, // opencl_global_host
    1, // cuda_device
    `

If for NVPTX we compile with spir-unknown-unknown triple, than the code above is indeed a problem. But if not - I don't see any issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really get it. What target is used for NVPTX?

nvptx64-nvidia-cuda-sycldevice

I don't see any issues

The issue is in the mangler, given the current definition of the address space mapping

void foo(global_ptr<int>::pointer_t p) { [...] }
void foo(device_ptr<int>::pointer_t p) { [...] }

This will cause the compiler to mangle the 2 foo overloads in the same way.

There is 2 solutions to it:

  • Having a new mangling scheme, but I'm not sure how it should be done (@bader ping for this)
  • a SYCL solution: make the address space available if and only if the target actually supports it.

Note: this is kind of a corner case for now, I pointing this out so you are aware of it. I'm more concerned about the naming here.

Copy link
Contributor Author

@MrSidims MrSidims Jun 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your feedback. I'll think about these options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to leave this corner case unresolved for now. One of the possible solutions is to expand authority of sycl_enable_usm_address_spaces option added in #1986 , but in this case this option (which originally was considered as a temporary solution) will stay in the compiler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MrSidims, could you open a GitHub issue to track/discuss solution this problem, please?

};

template <typename ElementType>
struct PtrValueType<ElementType, access::address_space::global_host_space> {
using type = __OPENCL_GLOBAL_HOST_AS__ ElementType;
};

template <typename ElementType>
struct PtrValueType<ElementType, access::address_space::constant_space> {
// Current implementation of address spaces handling leads to possibility
Expand Down Expand Up @@ -171,6 +187,14 @@ struct remove_AS<__OPENCL_GLOBAL_AS__ T> {
typedef T type;
};

template <class T> struct remove_AS<__OPENCL_GLOBAL_DEVICE_AS__ T> {
typedef T type;
};

template <class T> struct remove_AS<__OPENCL_GLOBAL_HOST_AS__ T> {
typedef T type;
};

template <class T>
struct remove_AS<__OPENCL_PRIVATE_AS__ T> {
typedef T type;
Expand All @@ -188,6 +212,8 @@ struct remove_AS<__OPENCL_CONSTANT_AS__ T> {
#endif

#undef __OPENCL_GLOBAL_AS__
#undef __OPENCL_GLOBAL_DEVICE_AS__
#undef __OPENCL_GLOBAL_HOST_AS__
#undef __OPENCL_LOCAL_AS__
#undef __OPENCL_CONSTANT_AS__
#undef __OPENCL_PRIVATE_AS__
Expand Down
18 changes: 12 additions & 6 deletions sycl/include/CL/sycl/atomic.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -46,8 +46,10 @@ template <typename T> struct IsValidAtomicType {
};

template <cl::sycl::access::address_space AS> struct IsValidAtomicAddressSpace {
static constexpr bool value = (AS == access::address_space::global_space ||
AS == access::address_space::local_space);
static constexpr bool value =
(AS == access::address_space::global_space ||
AS == access::address_space::local_space ||
AS == access::address_space::global_device_space);
};

// Type trait to translate a cl::sycl::access::address_space to
Expand All @@ -56,6 +58,10 @@ template <access::address_space AS> struct GetSpirvMemoryScope {};
template <> struct GetSpirvMemoryScope<access::address_space::global_space> {
static constexpr auto scope = __spv::Scope::Device;
};
template <>
struct GetSpirvMemoryScope<access::address_space::global_device_space> {
static constexpr auto scope = __spv::Scope::Device;
};
template <> struct GetSpirvMemoryScope<access::address_space::local_space> {
static constexpr auto scope = __spv::Scope::Workgroup;
};
Expand Down Expand Up @@ -168,12 +174,12 @@ template <typename T, access::address_space addressSpace =
access::address_space::global_space>
class atomic {
static_assert(detail::IsValidAtomicType<T>::value,
"Invalid SYCL atomic type. Valid types are: int, "
"unsigned int, long, unsigned long, long long, unsigned "
"Invalid SYCL atomic type. Valid types are: int, "
"unsigned int, long, unsigned long, long long, unsigned "
"long long, float");
static_assert(detail::IsValidAtomicAddressSpace<addressSpace>::value,
"Invalid SYCL atomic address_space. Valid address spaces are: "
"global_space, local_space");
"Invalid SYCL atomic address_space. Valid address spaces are: "
"global_space, local_space, global_device_space");
static constexpr auto SpirvScope =
detail::GetSpirvMemoryScope<addressSpace>::scope;

Expand Down
18 changes: 11 additions & 7 deletions sycl/include/CL/sycl/detail/generic_type_lists.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -361,21 +361,25 @@ using nan_list = type_list<gtl::unsigned_short_list, gtl::unsigned_int_list,
} // namespace gtl
namespace gvl {
// address spaces
using all_address_space_list =
address_space_list<access::address_space::local_space,
access::address_space::global_space,
access::address_space::private_space,
access::address_space::constant_space>;
using all_address_space_list = address_space_list<
access::address_space::local_space, access::address_space::global_space,
access::address_space::private_space, access::address_space::constant_space,
access::address_space::global_device_space,
access::address_space::global_host_space>;

using nonconst_address_space_list =
address_space_list<access::address_space::local_space,
access::address_space::global_space,
access::address_space::private_space>;
access::address_space::private_space,
access::address_space::global_device_space,
access::address_space::global_host_space>;

using nonlocal_address_space_list =
address_space_list<access::address_space::global_space,
access::address_space::private_space,
access::address_space::constant_space>;
access::address_space::constant_space,
access::address_space::global_device_space,
access::address_space::global_host_space>;
} // namespace gvl
} // namespace detail
} // namespace sycl
Expand Down
36 changes: 21 additions & 15 deletions sycl/include/CL/sycl/multi_ptr.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -108,17 +108,18 @@ template <typename ElementType, access::address_space Space> class multi_ptr {
return reinterpret_cast<ReturnPtr>(m_Pointer)[index];
}

// Only if Space == global_space
// Only if Space == global_space || global_device_space
template <int dimensions, access::mode Mode,
access::placeholder isPlaceholder,
access::address_space _Space = Space,
typename = typename std::enable_if<
_Space == Space &&
Space == access::address_space::global_space>::type>
(Space == access::address_space::global_space ||
Space == access::address_space::global_device_space)>::type>
multi_ptr(accessor<ElementType, dimensions, Mode,
access::target::global_buffer, isPlaceholder>
Accessor) {
m_Pointer = (pointer_t)(Accessor.get_pointer().m_Pointer);
m_Pointer = (pointer_t)(Accessor.get_pointer().get());
}

// Only if Space == local_space
Expand Down Expand Up @@ -152,14 +153,17 @@ template <typename ElementType, access::address_space Space> class multi_ptr {
// 2. from multi_ptr<ElementType, Space> to multi_ptr<const ElementType,
// Space>

// Only if Space == global_space and element type is const
template <
int dimensions, access::mode Mode, access::placeholder isPlaceholder,
access::address_space _Space = Space, typename ET = ElementType,
typename = typename std::enable_if<
_Space == Space && Space == access::address_space::global_space &&
std::is_const<ET>::value &&
std::is_same<ET, ElementType>::value>::type>
// Only if Space == global_space || global_device_space and element type is
// const
template <int dimensions, access::mode Mode,
access::placeholder isPlaceholder,
access::address_space _Space = Space, typename ET = ElementType,
typename = typename std::enable_if<
_Space == Space &&
(Space == access::address_space::global_space ||
Space == access::address_space::global_device_space) &&
std::is_const<ET>::value &&
std::is_same<ET, ElementType>::value>::type>
multi_ptr(accessor<typename std::remove_const<ET>::type, dimensions, Mode,
access::target::global_buffer, isPlaceholder>
Accessor)
Expand Down Expand Up @@ -345,12 +349,13 @@ template <access::address_space Space> class multi_ptr<void, Space> {
return *this;
}

// Only if Space == global_space
// Only if Space == global_space || global_device_space
template <typename ElementType, int dimensions, access::mode Mode,
access::address_space _Space = Space,
typename = typename std::enable_if<
_Space == Space &&
Space == access::address_space::global_space>::type>
(Space == access::address_space::global_space ||
Space == access::address_space::global_device_space)>::type>
multi_ptr(
accessor<ElementType, dimensions, Mode, access::target::global_buffer,
access::placeholder::false_t>
Expand Down Expand Up @@ -466,12 +471,13 @@ class multi_ptr<const void, Space> {
return *this;
}

// Only if Space == global_space
// Only if Space == global_space || global_device_space
template <typename ElementType, int dimensions, access::mode Mode,
access::address_space _Space = Space,
typename = typename std::enable_if<
_Space == Space &&
Space == access::address_space::global_space>::type>
(Space == access::address_space::global_space ||
Space == access::address_space::global_device_space)>::type>
multi_ptr(
accessor<ElementType, dimensions, Mode, access::target::global_buffer,
access::placeholder::false_t>
Expand Down
8 changes: 8 additions & 0 deletions sycl/include/CL/sycl/pointers.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,14 @@ template <typename ElementType, access::address_space Space> class multi_ptr;
template <typename ElementType>
using global_ptr = multi_ptr<ElementType, access::address_space::global_space>;

template <typename ElementType>
using device_ptr =
multi_ptr<ElementType, access::address_space::global_device_space>;

template <typename ElementType>
using host_ptr =
multi_ptr<ElementType, access::address_space::global_host_space>;

template <typename ElementType>
using local_ptr = multi_ptr<ElementType, access::address_space::local_space>;

Expand Down
41 changes: 41 additions & 0 deletions sycl/test/check_device_code/usm_pointers.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
// RUN: %clangxx -fsycl-device-only -Xclang -fsycl-is-device -emit-llvm %s -S -o %t.ll -I %sycl_include -Wno-sycl-strict -Xclang -verify-ignore-unexpected=note,warning
// RUN: FileCheck %s --input-file %t.ll
//
// Check the address space of the pointer in multi_ptr class
//
// CHECK: %[[DEVPTR_T:.*]] = type { i8 addrspace(5)* }
// CHECK: %[[HOSTPTR_T:.*]] = type { i8 addrspace(6)* }
//
// CHECK-LABEL: define {{.*}} spir_func i8 addrspace(4)* @{{.*}}multi_ptr{{.*}}
// CHECK: %m_Pointer = getelementptr inbounds %[[DEVPTR_T]]
// CHECK-NEXT: %[[DEVLOAD:[0-9]+]] = load i8 addrspace(5)*, i8 addrspace(5)* addrspace(4)* %m_Pointer
// CHECK-NEXT: %[[DEVCAST:[0-9]+]] = addrspacecast i8 addrspace(5)* %[[DEVLOAD]] to i8 addrspace(4)*
// ret i8 addrspace(4)* %[[DEVCAST]]
//
// CHECK-LABEL: define {{.*}} spir_func i8 addrspace(4)* @{{.*}}multi_ptr{{.*}}
// CHECK: %m_Pointer = getelementptr inbounds %[[HOSTPTR_T]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails on the builds with disabled assertions. I suppose we should not check variable names - those are stripped.
https://github.com/intel/llvm/runs/834501651
Please, fix ASAP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

// CHECK-NEXT: %[[HOSTLOAD:[0-9]+]] = load i8 addrspace(6)*, i8 addrspace(6)* addrspace(4)* %m_Pointer
// CHECK-NEXT: %[[HOSTCAST:[0-9]+]] = addrspacecast i8 addrspace(6)* %[[HOSTLOAD]] to i8 addrspace(4)*
// ret i8 addrspace(4)* %[[HOSTCAST]]

#include <CL/sycl.hpp>

using namespace cl::sycl;

int main() {
cl::sycl::queue queue;
{
queue.submit([&](cl::sycl::handler &cgh) {
cgh.single_task<class check_adress_space>([=]() {
void *Ptr = nullptr;
device_ptr<void> DevPtr(Ptr);
host_ptr<void> HostPtr(Ptr);
global_ptr<void> GlobPtr = global_ptr<void>(DevPtr);
GlobPtr = global_ptr<void>(HostPtr);
});
});
queue.wait();
}

return 0;
}
17 changes: 17 additions & 0 deletions sycl/test/multi_ptr/multi_ptr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,7 @@ template <typename T> void testMultPtr() {
auto local_ptr = make_ptr<T, access::address_space::local_space>(
localAccessor.get_pointer());

// General conversions in multi_ptr class
T *RawPtr = nullptr;
global_ptr<T> ptr_4(RawPtr);
ptr_4 = RawPtr;
Expand All @@ -92,6 +93,12 @@ template <typename T> void testMultPtr() {

ptr_6 = (void *)RawPtr;

// Explicit conversions for device_ptr/host_ptr to global_ptr
device_ptr<void> ptr_7((void *)RawPtr);
global_ptr<void> ptr_8 = global_ptr<void>(ptr_7);
host_ptr<void> ptr_9((void *)RawPtr);
global_ptr<void> ptr_10 = global_ptr<void>(ptr_9);

innerFunc<T>(wiID.get(0), ptr_1, ptr_2, local_ptr);
});
});
Expand All @@ -109,12 +116,14 @@ void testMultPtrArrowOperator() {
point<T> data_1[1] = {1};
point<T> data_2[1] = {2};
point<T> data_3[1] = {3};
point<T> data_4[1] = {4};

{
range<1> numOfItems{1};
buffer<point<T>, 1> bufferData_1(data_1, numOfItems);
buffer<point<T>, 1> bufferData_2(data_2, numOfItems);
buffer<point<T>, 1> bufferData_3(data_3, numOfItems);
buffer<point<T>, 1> bufferData_4(data_4, numOfItems);
queue myQueue;
myQueue.submit([&](handler &cgh) {
accessor<point<T>, 1, access::mode::read, access::target::global_buffer,
Expand All @@ -126,6 +135,9 @@ void testMultPtrArrowOperator() {
accessor<point<T>, 1, access::mode::read_write, access::target::local,
access::placeholder::false_t>
accessorData_3(1, cgh);
accessor<point<T>, 1, access::mode::read, access::target::global_buffer,
access::placeholder::false_t>
accessorData_4(bufferData_4, cgh);

cgh.single_task<class testMultPtrArrowOperatorKernel<T>>([=]() {
auto ptr_1 = make_ptr<point<T>, access::address_space::global_space>(
Expand All @@ -134,17 +146,22 @@ void testMultPtrArrowOperator() {
accessorData_2.get_pointer());
auto ptr_3 = make_ptr<point<T>, access::address_space::local_space>(
accessorData_3.get_pointer());
auto ptr_4 = make_ptr<point<T>, access::address_space::global_device_space>(
accessorData_4.get_pointer());

auto x1 = ptr_1->x;
auto x2 = ptr_2->x;
auto x3 = ptr_3->x;
auto x4 = ptr_4->x;

static_assert(std::is_same<decltype(x1), T>::value,
"Expected decltype(ptr_1->x) == T");
static_assert(std::is_same<decltype(x2), T>::value,
"Expected decltype(ptr_2->x) == T");
static_assert(std::is_same<decltype(x3), T>::value,
"Expected decltype(ptr_3->x) == T");
static_assert(std::is_same<decltype(x4), T>::value,
"Expected decltype(ptr_4->x) == T");
});
});
}
Expand Down