From e22cb798f8363f8e2a95a7e6df9a294b34c52fc4 Mon Sep 17 00:00:00 2001 From: Wu Yingcong Date: Thu, 20 Jun 2024 02:38:31 -0700 Subject: [PATCH] Fix Basic/image/srgba-read.cpp failure under SYCL_PREFER_UR with ONEAPI_DEVICE_SELECTOR=opencl:cpu (#14233) ### Problems: - Basic/image/srgba-read.cpp will fail with SYCL_PREFER_UR=1 and ONEAPI_DEVICE_SELECTOR=opencl:cpu, the piKernelSetArg returns PI_ERROR_INVALID_MEM_OBJECT. ``` ---> piKernelSetArg( : 0x3339020 : 0 : 8 : 0x7ffc5b159bb0 ) ---> pi_result : -38 [out]pi_mem * : 0x7ffc5b159bb0[ 0x214cd00 ... ] ``` - Many device sanitizer tests and other sycl e2e tests with device sanitizer will fail the same way. ### Root cause: Without SYCL_PREFER_UR=1, SYCL will pick the opencl plugin, which is actually a thin wrapper for UR implementation now, but there still differences. For opencl plugin, the `MemArg` in the function will be a cl_mem object, then it can be simply passed to `clSetKernelArg()`, which is what essentially `piKernelSetArg()` (which for UR is `urKernelSetArgValue()`) does. However, with SYCL_PREFER_UR=1, the `MemArg` would become a ur_mem_object_t which wraps a cl_mem, so attamping to call without extracting the handle first will result in the CL_INVALID_MEM_OBJECT. In such case, we should call `piextKernelSetArgMemObj()` which calls `urKernelSetArgMemObj()` to do the unwrapping first then pass to `clSetKernelArg()`. - https://github.com/oneapi-src/unified-runtime/blob/ded4b88acc091011ed7b24362ee3e61ee332955d/source/loader/ur_ldrddi.cpp#L3336-L3339 --- sycl/source/detail/scheduler/commands.cpp | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/sycl/source/detail/scheduler/commands.cpp b/sycl/source/detail/scheduler/commands.cpp index e34ef9e0699e..a164c455fed5 100644 --- a/sycl/source/detail/scheduler/commands.cpp +++ b/sycl/source/detail/scheduler/commands.cpp @@ -2303,7 +2303,18 @@ void SetArgBasedOnType( getMemAllocationFunc ? (sycl::detail::pi::PiMem)getMemAllocationFunc(Req) : nullptr; - if (Context.get_backend() == backend::opencl) { + // Only call piKernelSetArg for opencl plugin. Although for now opencl + // plugin is a thin wrapper for UR plugin, but they still produce different + // MemArg. For opencl plugin, the MemArg is a straight-forward cl_mem, so it + // will be fine using piKernelSetArg, which will call urKernelSetArgValue to + // pass the cl_mem object directly to clSetKernelArg. But when in + // SYCL_PREFER_UR=1, the MemArg is a cl_mem wrapped by ur_mem_object_t, + // which will need to unpack by calling piextKernelSetArgMemObj, which calls + // urKernelSetArgMemObj. If we call piKernelSetArg in such case, the + // clSetKernelArg will report CL_INVALID_MEM_OBJECT since the arg_value is + // not a valid cl_mem object but a ur_mem_object_t object. + if (Context.get_backend() == backend::opencl && + !Plugin->hasBackend(backend::all)) { // clSetKernelArg (corresponding to piKernelSetArg) returns an error // when MemArg is null, which is the case when zero-sized buffers are // handled. Below assignment provides later call to clSetKernelArg with