Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unified Runtime v0.8.2 changes combine #12192

Closed
wants to merge 18 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
ef7ab54
[SYCL][UR] Combined CTS fixes for CL adapter. (#11794)
aarongreig Nov 14, 2023
7a60482
[SYCL][NATIVECPU] Move Native CPU adapter to UR. (#11804)
martygrant Nov 14, 2023
4ab38d2
[UR] Bump tag to 534071e52f84bad1dd7fb210a360414507f3b3ae (#11880)
fabiomestre Nov 17, 2023
4ccdef8
[UR] Bump UR tag to 04799e7 to get OpenCL adapter fixes (#11806)
callumfare Nov 21, 2023
6759d4c
[UR][L0] Add support for passing device list to urProgramBuild/Link/C…
nrspruit Nov 22, 2023
2b9edaf
[UR] Update how UR_OPENCL_ICD_LOADER_LIBRARY is set (#11997)
kbenzie Nov 24, 2023
96ff073
[UR][L0] Add UR_L0_LEAKS_DEBUG key (#11820)
Nov 27, 2023
795871c
[SYCL][OpenCL] Enable graph extension on OpenCL backend (#11718)
martygrant Nov 28, 2023
a24a4a4
[UR] Run CI for Revert HIP prefetch (#11893)
hdelan Nov 28, 2023
bf888f9
[UR] Bump tag to v0.8.1 (#12004)
kbenzie Nov 29, 2023
d4850b5
[UR] Bump tag to ce4acbc4 (#12101)
kbenzie Dec 8, 2023
aff10b2
[UR][L0] Correctly wait on barrier on urEnqueueEventsWaitWithBarrier …
Dec 8, 2023
0788771
[UR] Pull in fix that allows handling location properties in USM allo…
aarongreig Dec 11, 2023
1492699
[UR][L0] L0 adapter coverity fixes (#12152)
pbalcer Dec 13, 2023
68d7e70
[UR][L0] Add several fixes to L0 adapter for test-usm (#11980)
Dec 14, 2023
63b8693
[UR][L0] Upgrade L0 loader to v1.15.1 (#11844)
Dec 15, 2023
00189d6
[UR] Pull in patch for coverity issues in CL, hip and cuda adapters. …
aarongreig Dec 15, 2023
6ca2cd3
[UR] Bump tag to v0.8.2
kbenzie Dec 15, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
105 changes: 103 additions & 2 deletions sycl/doc/design/CommandGraph.md
Original file line number Diff line number Diff line change
Expand Up @@ -149,8 +149,8 @@ yet been implemented.
Implementation of UR command-buffers
for each of the supported SYCL 2020 backends.

Currently Level Zero and CUDA backends are implemented.
More sub-sections will be added here as other backends are supported.
Backends which are implemented currently are: [Level Zero](#level-zero),
[CUDA](#cuda), and partial support for [OpenCL](#opencl).

### Level Zero

Expand Down Expand Up @@ -246,3 +246,104 @@ the executable CUDA Graph that represent this series of operations.
An executable CUDA Graph, which contains all commands and synchronization
information, is saved in the UR command-buffer to allow for efficient
graph resubmission.

### OpenCL

SYCL-Graph is only enabled for an OpenCL backend when the
[cl_khr_command_buffer](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer)
extension is available, however this information isn't available until runtime
due to OpenCL implementations being loaded through an ICD.

The `ur_exp_command_buffer` string is conditionally returned from the OpenCL
command-buffer UR backend at runtime based on `cl_khr_command_buffer` support
to indicate that the graph extension should be enabled. This is information
is propagated to the SYCL user via the
`device.get_info<info::device::graph_support>()` query for graph extension
support.

#### Limitations

Due to the API mapping gaps documented in the following section, OpenCL as a
SYCL backend cannot fully support the graph API. Instead, there are
limitations in the types of nodes which a user can add to a graph, using
an unsupported node type will cause a sycl exception to be thrown in graph
finalization with error code `sycl::errc::feature_not_supported` and a message
mentioning the unsupported command. For example,

```
terminate called after throwing an instance of 'sycl::_V1::exception'
what(): USM copy command not supported by graph backend
```

The types of commands which are unsupported, and lead to this exception are:
* `handler::copy(src, dest)` - Where `src` is an accessor and `dest` is a pointer.
This corresponds to a memory buffer read command.
* `handler::copy(src, dest)` - Where `src` is an pointer and `dest` is an accessor.
This corresponds to a memory buffer write command.
* `handler::copy(src, dest)` or `handler::memcpy(dest, src)` - Where both `src` and
`dest` are USM pointers. This corresponds to a USM copy command.

Note that `handler::copy(src, dest)` where both `src` and `dest` are an accessor
is supported, as a memory buffer copy command exists in the OpenCL extension.

#### UR API Mapping

There are some gaps in both the OpenCL and UR specifications for Command
Buffers shown in the list below. There are implementations in the UR OpenCL
adapter where there is matching support for each function in the list.

| UR | OpenCL | Supported |
| --- | --- | --- |
| urCommandBufferCreateExp | clCreateCommandBufferKHR | Yes |
| urCommandBufferRetainExp | clRetainCommandBufferKHR | Yes |
| urCommandBufferReleaseExp | clReleaseCommandBufferKHR | Yes |
| urCommandBufferFinalizeExp | clFinalizeCommandBufferKHR | Yes |
| urCommandBufferAppendKernelLaunchExp | clCommandNDRangeKernelKHR | Yes |
| urCommandBufferAppendUSMMemcpyExp | | No |
| urCommandBufferAppendUSMFillExp | | No |
| urCommandBufferAppendMembufferCopyExp | clCommandCopyBufferKHR | Yes |
| urCommandBufferAppendMemBufferWriteExp | | No |
| urCommandBufferAppendMemBufferReadExp | | No |
| urCommandBufferAppendMembufferCopyRectExp | clCommandCopyBufferRectKHR | Yes |
| urCommandBufferAppendMemBufferWriteRectExp | | No |
| urCommandBufferAppendMemBufferReadRectExp | | No |
| urCommandBufferAppendMemBufferFillExp | clCommandFillBufferKHR | Yes |
| urCommandBufferEnqueueExp | clEnqueueCommandBufferKHR | Yes |
| | clCommandBarrierWithWaitListKHR | No |
| | clCommandCopyImageKHR | No |
| | clCommandCopyImageToBufferKHR | No |
| | clCommandFillImageKHR | No |
| | clGetCommandBufferInfoKHR | No |
| | clCommandSVMMemcpyKHR | No |
| | clCommandSVMMemFillKHR | No |

We are looking to address these gaps in the future so that SYCL-Graph can be
fully supported on a `cl_khr_command_buffer` backend.

#### UR Command-Buffer Implementation

Many of the OpenCL functions take a `cl_command_queue` parameter which is not
present in most of the UR functions. Instead, when a new command buffer is
created in `urCommandBufferCreateExp` we also create and maintain a new
internal `ur_queue_handle_t` with a reference stored inside of the
`ur_exp_command_buffer_handle_t_` struct. The internal queue is retained and
released whenever the owning command buffer is retained or released.

With command buffers being an OpenCL extension, each function is accessed by
loading a function pointer to its implementation. These are defined in a common
header file in the UR OpenCL adapter. The symbols for the functions are however
defined in [OpenCL-Headers](https://github.com/KhronosGroup/OpenCL-Headers/blob/main/CL/cl_ext.h)
but it is not known at this time what version of the headers will be used in
the UR GitHub CI configuration, so loading the function pointers will be used
until this can be verified. A future piece of work would be replacing the
custom defined symbols with the ones from OpenCL-Headers.

#### Available OpenCL Command-Buffer Implementations

Publicly available implementations of `cl_khr_command_buffer` that can be used
to enable the graph extension in OpenCL:

- [OneAPI Construction Kit](https://github.com/codeplaysoftware/oneapi-construction-kit) (must enable `OCL_EXTENSION_cl_khr_command_buffer` when building)
- [PoCL](http://portablecl.org/)
- [Command-Buffer Emulation Layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/efeae73139ddf064fafce565cc39640af10d900f/layers/10_cmdbufemu)

2 changes: 1 addition & 1 deletion sycl/doc/design/images/SYCL-Graph-Architecture.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
47 changes: 18 additions & 29 deletions sycl/plugins/native_cpu/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,38 +1,27 @@
# Plugin for SYCL Native CPU
# Create shared library for libpi_nativecpu.so

# Get the Native CPU adapter sources so they can be shared with the Native CPU PI plugin
get_target_property(UR_NATIVE_CPU_ADAPTER_SOURCES ur_adapter_native_cpu SOURCES)

add_sycl_plugin(native_cpu
SOURCES
"pi_native_cpu.cpp"
${UR_NATIVE_CPU_ADAPTER_SOURCES}
# Some code is shared with the UR adapter
"../unified_runtime/pi2ur.hpp"
"../unified_runtime/pi2ur.cpp"
"../unified_runtime/ur/ur.hpp"
"../unified_runtime/ur/ur.cpp"
"../unified_runtime/ur/adapters/native_cpu/adapter.cpp"
"../unified_runtime/ur/adapters/native_cpu/command_buffer.cpp"
"../unified_runtime/ur/adapters/native_cpu/common.cpp"
"../unified_runtime/ur/adapters/native_cpu/common.hpp"
"../unified_runtime/ur/adapters/native_cpu/context.cpp"
"../unified_runtime/ur/adapters/native_cpu/context.hpp"
"../unified_runtime/ur/adapters/native_cpu/device.cpp"
"../unified_runtime/ur/adapters/native_cpu/device.hpp"
"../unified_runtime/ur/adapters/native_cpu/enqueue.cpp"
"../unified_runtime/ur/adapters/native_cpu/event.cpp"
"../unified_runtime/ur/adapters/native_cpu/image.cpp"
"../unified_runtime/ur/adapters/native_cpu/kernel.cpp"
"../unified_runtime/ur/adapters/native_cpu/kernel.hpp"
"../unified_runtime/ur/adapters/native_cpu/memory.cpp"
"../unified_runtime/ur/adapters/native_cpu/memory.hpp"
"../unified_runtime/ur/adapters/native_cpu/platform.cpp"
"../unified_runtime/ur/adapters/native_cpu/platform.hpp"
"../unified_runtime/ur/adapters/native_cpu/program.cpp"
"../unified_runtime/ur/adapters/native_cpu/program.hpp"
"../unified_runtime/ur/adapters/native_cpu/queue.cpp"
"../unified_runtime/ur/adapters/native_cpu/queue.hpp"
"../unified_runtime/ur/adapters/native_cpu/sampler.cpp"
"../unified_runtime/ur/adapters/native_cpu/ur_interface_loader.cpp"
"../unified_runtime/ur/adapters/native_cpu/usm.cpp"
"../unified_runtime/ur/adapters/native_cpu/usm_p2p.cpp"
"${sycl_inc_dir}/sycl/detail/pi.h"
"${sycl_inc_dir}/sycl/detail/pi.hpp"
"pi_native_cpu.cpp"
"pi_native_cpu.hpp"
INCLUDE_DIRS
${CMAKE_CURRENT_SOURCE_DIR}/../unified_runtime
${sycl_inc_dir}
${CMAKE_CURRENT_SOURCE_DIR}/../unified_runtime # for Unified Runtime
${UNIFIED_RUNTIME_SOURCE_DIR}/source/ # for adapters/native_cpu
LIBRARIES
sycl
UnifiedRuntime-Headers
UnifiedRuntimeCommon
)

set_target_properties(pi_native_cpu PROPERTIES LINKER_LANGUAGE CXX)
14 changes: 7 additions & 7 deletions sycl/plugins/native_cpu/pi_native_cpu.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -8,13 +8,13 @@

#include <pi2ur.hpp>

#include <ur/adapters/native_cpu/context.hpp>
#include <ur/adapters/native_cpu/device.hpp>
#include <ur/adapters/native_cpu/kernel.hpp>
#include <ur/adapters/native_cpu/memory.hpp>
#include <ur/adapters/native_cpu/platform.hpp>
#include <ur/adapters/native_cpu/program.hpp>
#include <ur/adapters/native_cpu/queue.hpp>
#include <adapters/native_cpu/context.hpp>
#include <adapters/native_cpu/device.hpp>
#include <adapters/native_cpu/kernel.hpp>
#include <adapters/native_cpu/memory.hpp>
#include <adapters/native_cpu/platform.hpp>
#include <adapters/native_cpu/program.hpp>
#include <adapters/native_cpu/queue.hpp>

struct _pi_context : ur_context_handle_t_ {
using ur_context_handle_t_::ur_context_handle_t_;
Expand Down
64 changes: 13 additions & 51 deletions sycl/plugins/unified_runtime/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,12 @@ if("hip" IN_LIST SYCL_ENABLE_PLUGINS)
endif()
if("opencl" IN_LIST SYCL_ENABLE_PLUGINS)
set(UR_BUILD_ADAPTER_OPENCL ON)
set(UR_OPENCL_ICD_LOADER_LIBRARY OpenCL-ICD)
set(UR_OPENCL_ICD_LOADER_LIBRARY OpenCL-ICD CACHE FILEPATH
"Path of the OpenCL ICD Loader library" FORCE)
endif()
if("native_cpu" IN_LIST SYCL_ENABLE_PLUGINS)
set(UR_BUILD_ADAPTER_NATIVE_CPU ON)
endif()
# TODO: Set UR_BUILD_ADAPTER_NATIVE_CPU once adapter moved

# Disable errors from warnings while building the UR.
# And remember origin flags before doing that.
Expand All @@ -54,13 +57,11 @@ if(SYCL_PI_UR_USE_FETCH_CONTENT)
include(FetchContent)

set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git")
# commit ec7982bac6cb3a6b9ed610cd6b7cb41fcbc780dc
# Merge: 62e6d2f9 5fb82924
# commit aaa4661f5c32e6dcb43248ed7575de6971852cc3
# Author: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
# Date: Wed Nov 8 13:32:46 2023 +0000
# Merge pull request #1022 from 0x12CC/l0_usm_error_checking_2
# [UR][L0] Propagate OOM errors from `USMAllocationMakeResident`
set(UNIFIED_RUNTIME_TAG ec7982bac6cb3a6b9ed610cd6b7cb41fcbc780dc)
# Date: Fri Dec 15 16:05:36 2023 +0000
# Set version to v0.8.2
set(UNIFIED_RUNTIME_TAG aaa4661f5c32e6dcb43248ed7575de6971852cc3)

if(SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO)
set(UNIFIED_RUNTIME_REPO "${SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO}")
Expand Down Expand Up @@ -159,49 +160,6 @@ endif()

add_sycl_plugin(unified_runtime ${UNIFIED_RUNTIME_PLUGIN_ARGS})

if("native_cpu" IN_LIST SYCL_ENABLE_PLUGINS)
add_sycl_library("ur_adapter_native_cpu" SHARED
SOURCES
"ur/ur.cpp"
"ur/ur.hpp"
"ur/adapters/native_cpu/adapter.cpp"
"ur/adapters/native_cpu/command_buffer.cpp"
"ur/adapters/native_cpu/common.cpp"
"ur/adapters/native_cpu/common.hpp"
"ur/adapters/native_cpu/context.cpp"
"ur/adapters/native_cpu/context.hpp"
"ur/adapters/native_cpu/device.cpp"
"ur/adapters/native_cpu/device.hpp"
"ur/adapters/native_cpu/enqueue.cpp"
"ur/adapters/native_cpu/event.cpp"
"ur/adapters/native_cpu/image.cpp"
"ur/adapters/native_cpu/kernel.cpp"
"ur/adapters/native_cpu/kernel.hpp"
"ur/adapters/native_cpu/memory.cpp"
"ur/adapters/native_cpu/memory.hpp"
"ur/adapters/native_cpu/platform.cpp"
"ur/adapters/native_cpu/platform.hpp"
"ur/adapters/native_cpu/program.cpp"
"ur/adapters/native_cpu/program.hpp"
"ur/adapters/native_cpu/queue.cpp"
"ur/adapters/native_cpu/queue.hpp"
"ur/adapters/native_cpu/sampler.cpp"
"ur/adapters/native_cpu/ur_interface_loader.cpp"
"ur/adapters/native_cpu/usm.cpp"
"ur/adapters/native_cpu/usm_p2p.cpp"
LIBRARIES
UnifiedRuntime-Headers
Threads::Threads
OpenCL-Headers
)

set_target_properties("ur_adapter_native_cpu" PROPERTIES
VERSION "0.0.0"
SOVERSION "0"
)
endif()


if(TARGET UnifiedRuntimeLoader)
set_target_properties(hello_world PROPERTIES EXCLUDE_FROM_ALL 1 EXCLUDE_FROM_DEFAULT_BUILD 1)
# Install the UR loader.
Expand Down Expand Up @@ -238,3 +196,7 @@ endif()
if ("opencl" IN_LIST SYCL_ENABLE_PLUGINS)
add_dependencies(sycl-runtime-libraries ur_adapter_opencl)
endif()

if ("native_cpu" IN_LIST SYCL_ENABLE_PLUGINS)
add_dependencies(sycl-runtime-libraries ur_adapter_native_cpu)
endif()
Loading