Skip to content

Commit

Permalink
[SYCL] Implement sycl_ext_oneapi_device_architecture on host for Leve…
Browse files Browse the repository at this point in the history
…l Zero and OpenCL (#9843)

This patch introduces new host API for
sycl_ext_oneapi_device_architecture extension and implements it,
currently only for Level Zero and OpenCL

Depends on 

- oneapi-src/unified-runtime#573
- #9873
- #9979
- #10054
  • Loading branch information
dm-vodopyanov authored Jun 30, 2023
1 parent e98280e commit d0b01b2
Show file tree
Hide file tree
Showing 17 changed files with 205 additions and 25 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -523,23 +523,12 @@ call to `if_architecture_is` or `else_if_architecture_is` whose condition is
architectures in the `Archs` parameter pack.


== Limitations with the experimental version

The {dpcpp} implementation of this extension currently has some important
limitations. The application must be compiled in ahead-of-time (AOT) mode
using `-fsycl-targets=<special-target>` where `<special-target>` is one of the
"special target values" listed in the link:../../UsersManual.md[users manual]
description of the `-fsycl-targets` option. These are the target names of the
form "intel_gpu_*", "nvidia_gpu_*", or "amd_gpu_*".


== Future direction
=== New member function of `device` class

This experimental extension is still evolving. We expect that future versions
will include the following:
This extension adds the following new member function to the `device` class,
which returns a Boolean telling whether the device has the specified
architecture.

* An extended member function like:
+
--
```
namespace sycl {
Expand All @@ -551,31 +540,56 @@ class device {

// namespace sycl
```

This provides a way to query a device's architecture from host code.
--

* An extended device information descriptor named
`sycl::ext::oneapi::experimental::info::device::architecture`, which returns
the architecture of the device. This allows host code such as:
+
=== New device descriptor

[%header,cols="5,1,5"]
|===
|Device descriptor
|Return type
|Description

|`ext::oneapi::experimental::info::device::architecture`
|`ext::oneapi::experimental::architecture`
|Returns the architecture of the device

|===

This device descriptor allows host code such as:

--
```
using namespace sycl::ext::oneapi::experimental;
namespace syclex = sycl::ext::oneapi::experimental;

architecture arch = dev.get_info<info::device::architecture>();
syclex::architecture arch = dev.get_info<syclex::info::device::architecture>();
switch (arch) {
case architecture::x86_64:
case syclex::architecture::x86_64:
/* ... */
break;
case architecture::intel_gpu_bdw:
case syclex::architecture::intel_gpu_bdw:
/* ... */
break;
/* etc. */
}
```
--

== Limitations with the experimental version

The {dpcpp} implementation of this extension currently has some important
limitations. The application must be compiled in ahead-of-time (AOT) mode
using `-fsycl-targets=<special-target>` where `<special-target>` is one of the
"special target values" listed in the link:../../UsersManual.md[users manual]
description of the `-fsycl-targets` option. These are the target names of the
form "intel_gpu_*", "nvidia_gpu_*", or "amd_gpu_*".


== Future direction

This experimental extension is still evolving. We expect that future versions
will include the following:

* A compile-time constant property that can be used to decorate kernels and
non-kernel device functions:
+
Expand Down
1 change: 1 addition & 0 deletions sycl/include/sycl/detail/pi.h
Original file line number Diff line number Diff line change
Expand Up @@ -319,6 +319,7 @@ typedef enum {
// Intel UUID extension.
PI_DEVICE_INFO_UUID = 0x106A,
// These are Intel-specific extensions.
PI_EXT_ONEAPI_DEVICE_INFO_IP_VERSION = 0x4250,
PI_DEVICE_INFO_DEVICE_ID = 0x4251,
PI_DEVICE_INFO_PCI_ADDRESS = 0x10020,
PI_DEVICE_INFO_GPU_EU_COUNT = 0x10021,
Expand Down
11 changes: 11 additions & 0 deletions sycl/include/sycl/device.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
#include <sycl/detail/export.hpp>
#include <sycl/detail/info_desc_helpers.hpp>
#include <sycl/detail/owner_less_base.hpp>
#include <sycl/ext/oneapi/experimental/device_architecture.hpp>
#include <sycl/ext/oneapi/weak_object_base.hpp>
#include <sycl/info/info_desc.hpp>
#include <sycl/platform.hpp>
Expand Down Expand Up @@ -240,6 +241,16 @@ class __SYCL_EXPORT device : public detail::OwnerLessBase<device> {
/// \return true if the SYCL device has the given feature.
bool has(aspect Aspect) const __SYCL_WARN_IMAGE_ASPECT(Aspect);

/// Indicates if the SYCL device architecture equals to the one passed to
/// the function.
///
/// \param arch is one of the architectures from architecture enum described
/// in sycl_ext_oneapi_device_architecture specification.
///
/// \return true if the SYCL device architecture equals to the one passed to
/// the function.
bool ext_oneapi_architecture_is(ext::oneapi::experimental::architecture arch);

// TODO: Remove this diagnostics when __SYCL_WARN_IMAGE_ASPECT is removed.
#if defined(__clang__)
#pragma clang diagnostic pop
Expand Down
3 changes: 3 additions & 0 deletions sycl/include/sycl/info/ext_oneapi_device_traits.def
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@ __SYCL_PARAM_TRAITS_SPEC(ext::oneapi::experimental,device, max_global_work_group
__SYCL_PARAM_TRAITS_TEMPLATE_SPEC(ext::oneapi::experimental,device, max_work_groups<1>, id<1>, PI_EXT_ONEAPI_DEVICE_INFO_MAX_WORK_GROUPS_1D)
__SYCL_PARAM_TRAITS_TEMPLATE_SPEC(ext::oneapi::experimental,device, max_work_groups<2>, id<2>, PI_EXT_ONEAPI_DEVICE_INFO_MAX_WORK_GROUPS_2D)
__SYCL_PARAM_TRAITS_TEMPLATE_SPEC(ext::oneapi::experimental,device, max_work_groups<3>, id<3>, PI_EXT_ONEAPI_DEVICE_INFO_MAX_WORK_GROUPS_3D)
__SYCL_PARAM_TRAITS_SPEC(ext::oneapi::experimental, device, architecture,
ext::oneapi::experimental::architecture,
PI_EXT_ONEAPI_DEVICE_INFO_IP_VERSION)
#ifdef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC_NEEDS_UNDEF
#undef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC
#undef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC_NEEDS_UNDEF
Expand Down
1 change: 1 addition & 0 deletions sycl/include/sycl/info/info_desc.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
#include <sycl/aspects.hpp>
#include <sycl/detail/common.hpp>
#include <sycl/detail/pi.hpp>
#include <sycl/ext/oneapi/experimental/device_architecture.hpp>
#include <sycl/id.hpp>

namespace sycl {
Expand Down
3 changes: 3 additions & 0 deletions sycl/plugins/unified_runtime/pi2ur.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1047,6 +1047,9 @@ inline pi_result piDeviceGetInfo(pi_device Device, pi_device_info ParamName,
case PI_DEVICE_INFO_GPU_SUBSLICES_PER_SLICE:
InfoType = UR_DEVICE_INFO_GPU_SUBSLICES_PER_SLICE;
break;
case PI_EXT_ONEAPI_DEVICE_INFO_IP_VERSION:
InfoType = UR_DEVICE_INFO_IP_VERSION;
break;
case PI_DEVICE_INFO_BUILD_ON_SUBDEVICE:
InfoType = UR_DEVICE_INFO_BUILD_ON_SUBDEVICE;
break;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -239,6 +239,10 @@ ze_structure_type_t getZeStructureType<ze_device_memory_ext_properties_t>() {
return ZE_STRUCTURE_TYPE_DEVICE_MEMORY_EXT_PROPERTIES;
}
template <>
ze_structure_type_t getZeStructureType<ze_device_ip_version_ext_t>() {
return ZE_STRUCTURE_TYPE_DEVICE_IP_VERSION_EXT;
}
template <>
ze_structure_type_t getZeStructureType<ze_device_memory_access_properties_t>() {
return ZE_STRUCTURE_TYPE_DEVICE_MEMORY_ACCESS_PROPERTIES;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -396,6 +396,8 @@ UR_APIEXPORT ur_result_t UR_APICALL urDeviceGetInfo(
uint32_t{1});
case UR_DEVICE_INFO_GLOBAL_MEM_CACHE_SIZE:
return ReturnValue(uint64_t{Device->ZeDeviceCacheProperties->cacheSize});
case UR_DEVICE_INFO_IP_VERSION:
return ReturnValue(uint32_t{Device->ZeDeviceIpVersionExt->ipVersion});
case UR_DEVICE_INFO_MAX_PARAMETER_SIZE:
return ReturnValue(
size_t{Device->ZeDeviceModuleProperties->maxArgumentsSize});
Expand Down Expand Up @@ -908,6 +910,14 @@ ur_result_t ur_device_handle_t_::initialize(int SubSubDeviceOrdinal,
ZE_CALL_NOCHECK(zeDeviceGetComputeProperties, (ZeDevice, &Properties));
};

ZeDeviceIpVersionExt.Compute =
[ZeDevice](ze_device_ip_version_ext_t &Properties) {
ze_device_properties_t P;
P.stype = ZE_STRUCTURE_TYPE_DEVICE_PROPERTIES;
P.pNext = (void *)&Properties;
ZE_CALL_NOCHECK(zeDeviceGetProperties, (ZeDevice, &P));
};

ZeDeviceImageProperties.Compute =
[ZeDevice](ze_device_image_properties_t &Properties) {
ZE_CALL_NOCHECK(zeDeviceGetImageProperties, (ZeDevice, &Properties));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -168,4 +168,5 @@ struct ur_device_handle_t_ : _ur_object {
ZeCache<ZeStruct<ze_device_memory_access_properties_t>>
ZeDeviceMemoryAccessProperties;
ZeCache<ZeStruct<ze_device_cache_properties_t>> ZeDeviceCacheProperties;
ZeCache<ZeStruct<ze_device_ip_version_ext_t>> ZeDeviceIpVersionExt;
};
9 changes: 9 additions & 0 deletions sycl/source/detail/device_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -480,6 +480,15 @@ std::string device_impl::getDeviceName() const {
return MDeviceName;
}

ext::oneapi::experimental::architecture device_impl::getDeviceArch() const {
std::call_once(MDeviceArchFlag, [this]() {
MDeviceArch =
get_info<ext::oneapi::experimental::info::device::architecture>();
});

return MDeviceArch;
}

// On first call this function queries for device timestamp
// along with host synchronized timestamp and stores it in memeber varaible
// MDeviceHostBaseTime. Subsequent calls to this function would just retrieve
Expand Down
7 changes: 7 additions & 0 deletions sycl/source/detail/device_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -234,6 +234,10 @@ class device_impl {

std::string getDeviceName() const;

bool extOneapiArchitectureIs(ext::oneapi::experimental::architecture Arch) {
return Arch == getDeviceArch();
}

/// Gets the current device timestamp
/// @throw sycl::feature_not_supported if feature is not supported on device
uint64_t getCurrentDeviceTime();
Expand All @@ -253,6 +257,7 @@ class device_impl {
explicit device_impl(pi_native_handle InteropDevice,
sycl::detail::pi::PiDevice Device,
PlatformImplPtr Platform, const PluginPtr &Plugin);
ext::oneapi::experimental::architecture getDeviceArch() const;
sycl::detail::pi::PiDevice MDevice = 0;
sycl::detail::pi::PiDeviceType MType;
sycl::detail::pi::PiDevice MRootDevice = nullptr;
Expand All @@ -261,6 +266,8 @@ class device_impl {
bool MIsAssertFailSupported = false;
mutable std::string MDeviceName;
mutable std::once_flag MDeviceNameFlag;
mutable ext::oneapi::experimental::architecture MDeviceArch;
mutable std::once_flag MDeviceArchFlag;
std::pair<uint64_t, uint64_t> MDeviceHostBaseTime;
}; // class device_impl

Expand Down
84 changes: 84 additions & 0 deletions sycl/source/detail/device_info.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
#include <sycl/detail/os_util.hpp>
#include <sycl/detail/pi.hpp>
#include <sycl/device.hpp>
#include <sycl/ext/oneapi/experimental/device_architecture.hpp>
#include <sycl/feature_test.hpp>
#include <sycl/info/info_desc.hpp>
#include <sycl/memory_enums.hpp>
Expand Down Expand Up @@ -567,6 +568,83 @@ struct get_device_info_impl<range<Dimensions>,
}
};

template <>
struct get_device_info_impl<
ext::oneapi::experimental::architecture,
ext::oneapi::experimental::info::device::architecture> {
static ext::oneapi::experimental::architecture get(const DeviceImplPtr &Dev) {
using oneapi_exp_arch = sycl::ext::oneapi::experimental::architecture;
auto ReturnHelper = [](auto MapDeviceIpToArch, auto DeviceIp) {
// TODO: use std::map::contains instead of try-catch when SYCL RT be moved
// to C++20
try {
oneapi_exp_arch Result = MapDeviceIpToArch.at(DeviceIp);
return Result;
} catch (std::out_of_range &) {
throw sycl::exception(
make_error_code(errc::runtime),
"The current device architecture is not supported by "
"sycl_ext_oneapi_device_architecture.");
}
};
backend CurrentBackend = Dev->getBackend();
if (Dev->is_gpu() && (backend::ext_oneapi_level_zero == CurrentBackend ||
backend::opencl == CurrentBackend)) {
std::map<uint32_t, oneapi_exp_arch> MapDeviceIpToArch = {
{0x02000000, oneapi_exp_arch::intel_gpu_bdw},
{0x02400009, oneapi_exp_arch::intel_gpu_skl},
{0x02404009, oneapi_exp_arch::intel_gpu_kbl},
{0x02408009, oneapi_exp_arch::intel_gpu_cfl},
{0x0240c000, oneapi_exp_arch::intel_gpu_apl},
{0x02410000, oneapi_exp_arch::intel_gpu_glk},
{0x02414000, oneapi_exp_arch::intel_gpu_whl},
{0x02418000, oneapi_exp_arch::intel_gpu_aml},
{0x0241c000, oneapi_exp_arch::intel_gpu_cml},
{0x02c00000, oneapi_exp_arch::intel_gpu_icllp},
{0x03000000, oneapi_exp_arch::intel_gpu_tgllp},
{0x03004000, oneapi_exp_arch::intel_gpu_rkl},
{0x03008000, oneapi_exp_arch::intel_gpu_adl_s},
{0x03008000, oneapi_exp_arch::intel_gpu_rpl_s},
{0x0300c000, oneapi_exp_arch::intel_gpu_adl_p},
{0x03010000, oneapi_exp_arch::intel_gpu_adl_n},
{0x03028000, oneapi_exp_arch::intel_gpu_dg1},
{0x030dc008, oneapi_exp_arch::intel_gpu_acm_g10},
{0x030e0005, oneapi_exp_arch::intel_gpu_acm_g11},
{0x030e4000, oneapi_exp_arch::intel_gpu_acm_g12},
{0x030f0007, oneapi_exp_arch::intel_gpu_pvc},
};
uint32_t DeviceIp;
Dev->getPlugin()->call<PiApiKind::piDeviceGetInfo>(
Dev->getHandleRef(),
PiInfoCode<
ext::oneapi::experimental::info::device::architecture>::value,
sizeof(DeviceIp), &DeviceIp, nullptr);
return ReturnHelper(MapDeviceIpToArch, DeviceIp);
} else if (Dev->is_cpu() && backend::opencl == CurrentBackend) {
// TODO: add support of different CPU architectures to
// sycl_ext_oneapi_device_architecture
return sycl::ext::oneapi::experimental::architecture::x86_64;
} // else is not needed
// TODO: add support of other arhitectures by extending with else if

// Generating a user-friendly error message
std::string DeviceStr;
if (Dev->is_gpu())
DeviceStr = "GPU";
else if (Dev->is_cpu())
DeviceStr = "CPU";
else if (Dev->is_accelerator())
DeviceStr = "accelerator";
// else if not needed
std::stringstream ErrorMessage;
ErrorMessage
<< "sycl_ext_oneapi_device_architecture feature is not supported on "
<< DeviceStr << " device with sycl::backend::" << CurrentBackend
<< " backend.";
throw sycl::exception(make_error_code(errc::runtime), ErrorMessage.str());
}
};

template <>
struct get_device_info_impl<
size_t, ext::oneapi::experimental::info::device::max_global_work_groups> {
Expand Down Expand Up @@ -826,6 +904,12 @@ inline std::vector<sycl::aspect> get_device_info_host<info::device::aspects>() {
return std::vector<sycl::aspect>();
}

template <>
inline ext::oneapi::experimental::architecture
get_device_info_host<ext::oneapi::experimental::info::device::architecture>() {
return ext::oneapi::experimental::architecture::x86_64;
}

template <>
inline info::device_type get_device_info_host<info::device::device_type>() {
return info::device_type::host;
Expand Down
5 changes: 5 additions & 0 deletions sycl/source/device.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -208,5 +208,10 @@ pi_native_handle device::getNative() const { return impl->getNative(); }

bool device::has(aspect Aspect) const { return impl->has(Aspect); }

bool device::ext_oneapi_architecture_is(
ext::oneapi::experimental::architecture arch) {
return impl->extOneapiArchitectureIs(arch);
}

} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl
21 changes: 21 additions & 0 deletions sycl/test-e2e/DeviceArchitecture/device_architecture_on_host.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
// UNSUPPORTED: cuda, hip, esimd_emulator

// Enable this test, when GPU driver on Windows CI machines will be updated
// XFAIL: windows

// RUN: %{build} -o %t.out
// RUN: %{run} %t.out

#include <sycl/sycl.hpp>

int main() {
sycl::queue q;
sycl::device dev = q.get_device();

sycl::ext::oneapi::experimental::architecture arch = dev.get_info<
sycl::ext::oneapi::experimental::info::device::architecture>();

assert(dev.ext_oneapi_architecture_is(arch));

return 0;
}
Loading

0 comments on commit d0b01b2

Please sign in to comment.