SYCL building system
The main parts of this PR include:

1. Find the SYCL compiler toolkit in CMake. (cmake/Module/FindSYCLToolkit.cmake)
2. Set up CMake variables for SYCL compilation flags, includes, and the runtime library. (cmake/public/xpu.cmake, cmake/Dependencies.cmake)
3. Implement two CMake custom-target building helpers, sycl_add_library and sycl_add_executable. (cmake/Module/FindSYCL.cmake, cmake/Module/FindSYCL/run_sycl.cmake)
4. Provide unit test cases to verify the SYCL building system, since no SYCL kernel has been added to PyTorch yet. (test/cpp/sycl/CMakeLists.txt)
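
A consumer's configure step might locate the toolkit roughly as follows. This is a minimal sketch only: the exact variable names exported by FindSYCLToolkit.cmake (SYCL_FOUND, SYCL_INCLUDE_DIR, SYCL_LIBRARY below) are assumptions based on common find-module conventions, not confirmed by this PR.

```cmake
# Make the bundled find module visible, then locate the SYCL toolkit.
# Variable names below are illustrative, not confirmed by this PR.
list(APPEND CMAKE_MODULE_PATH "${PROJECT_SOURCE_DIR}/cmake/Module")
find_package(SYCLToolkit REQUIRED)

if(SYCL_FOUND)
  message(STATUS "SYCL include dir: ${SYCL_INCLUDE_DIR}")
  message(STATUS "SYCL runtime library: ${SYCL_LIBRARY}")
endif()
```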

Building a SYCL program is similar to building CUDA and HIP programs: it requires custom compilation and an additional device-object linkage step. We implement sycl_add_library and sycl_add_executable to introduce the SYCL compiler:

1. Use the SYCL compiler to compile .cpp files that contain SYCL kernels. For "host only" .cpp files, we use the CXX compiler. So far, we only enable the GCC compiler in SYCL separate compilation; as a next step we will enable the CLANG compiler to align with PyTorch's build requirements.
2. Use the SYCL compiler to link device objects and produce a host-relocatable object that can subsequently be linked by the host linker.
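
The helpers described above might be used like this. A minimal sketch, assuming a signature that mirrors the classic cuda_add_library convention; the STATIC keyword, source layout, and target names are illustrative only.

```cmake
# Build a static library whose .cpp sources may contain SYCL kernels.
# sycl_add_library routes kernel-bearing sources through the SYCL
# compiler and host-only sources through the regular CXX compiler,
# then performs the device-object link step before the host link.
include(FindSYCL)
sycl_add_library(my_sycl_kernels STATIC
  kernels/add_kernel.cpp     # contains SYCL kernels
  src/host_utils.cpp)        # host-only code

# An executable target works the same way via sycl_add_executable.
sycl_add_executable(my_sycl_test test/main.cpp)
target_link_libraries(my_sycl_test my_sycl_kernels)
```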

Signed-off-by: Feng Yuan <feng1.yuan@intel.com>

[ghstack-poisoned]
fengyuan14 committed Feb 24, 2024
1 parent 90fecb7 commit 225d597
Showing 11 changed files with 1,166 additions and 0 deletions.
33 changes: 33 additions & 0 deletions CMakeLists.txt
@@ -0,0 +1,33 @@
# torch-xpu-ops: XPU implementation for PyTorch ATen

# outputs:
#
# PYTORCH_FOUND_XPU
# -- Flag indicating whether the XPU backend stack was set up successfully.
#
# libtorch_xpu_ops
# -- Static archive library target

cmake_minimum_required(VERSION 3.13 FATAL_ERROR)
project(${TORCH_XPU_OPS_PROJ_NAME} VERSION ${CMAKE_PROJECT_VERSION})

set(PYTORCH_FOUND_XPU FALSE)

if(NOT CMAKE_SYSTEM_NAME MATCHES "Linux")
message("torch-xpu-ops only supports Linux systems so far. More systems will be supported in the future.")
return()
endif()

set(TORCH_XPU_OPS_ROOT ${PROJECT_SOURCE_DIR})
list(APPEND CMAKE_MODULE_PATH ${TORCH_XPU_OPS_ROOT}/cmake/Modules)

include(${TORCH_XPU_OPS_ROOT}/cmake/SYCL.cmake)
include(${TORCH_XPU_OPS_ROOT}/cmake/BuildFlags.cmake)

if(BUILD_TEST)
add_subdirectory(${TORCH_XPU_OPS_ROOT}/test/sycl ${CMAKE_BINARY_DIR}/test_sycl)
endif()

set(PYTORCH_FOUND_XPU TRUE)

message(STATUS "XPU found")
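
When vendored into PyTorch, the parent build could consume this project roughly as below. A sketch only: the subdirectory path and the torch_xpu target wiring are assumptions, while PYTORCH_FOUND_XPU and libtorch_xpu_ops are the outputs documented at the top of this CMakeLists.txt.

```cmake
# In the parent (PyTorch) build: pull in torch-xpu-ops and check its output flag.
add_subdirectory(third_party/torch-xpu-ops)  # path is illustrative
if(PYTORCH_FOUND_XPU)
  # libtorch_xpu_ops is the static archive target this project exports.
  target_link_libraries(torch_xpu PRIVATE libtorch_xpu_ops)
endif()
```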
41 changes: 41 additions & 0 deletions cmake/BuildFlags.cmake
@@ -0,0 +1,41 @@
# Setup building flags for SYCL device and host codes.

# Support GCC only at the moment.
if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
# -- Host flags (SYCL_HOST_FLAGS)
list(APPEND SYCL_HOST_FLAGS -fPIC)
list(APPEND SYCL_HOST_FLAGS -std=c++17)
# SYCL headers warnings
list(APPEND SYCL_HOST_FLAGS -Wno-deprecated-declarations)
list(APPEND SYCL_HOST_FLAGS -Wno-attributes)

if(CMAKE_BUILD_TYPE MATCHES Debug)
list(APPEND SYCL_HOST_FLAGS -g)
list(APPEND SYCL_HOST_FLAGS -O0)
endif(CMAKE_BUILD_TYPE MATCHES Debug)

# -- Kernel flags (SYCL_KERNEL_OPTIONS)
# Fast-math is enabled by default in the SYCL compiler.
# Refer to [https://clang.llvm.org/docs/UsersManual.html#cmdoption-fno-fast-math]
# 1. We enable the flags below to honor NaN and Infinity,
#    which fast-math would otherwise assume away.
# 2. The associative-math part of fast-math allows floating-point
#    operations to be reassociated, which leads to non-deterministic
#    results compared with the CUDA backend.
# 3. The approx-func part allows certain math calls (such as log, sqrt, pow)
#    to be replaced with an approximately equivalent set of instructions or
#    alternative math functions, which may have larger errors.
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fno-sycl-unnamed-lambda)
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -sycl-std=2020)
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fhonor-nans)
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fhonor-infinities)
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fno-associative-math)
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fno-approx-func)
# TODO: Align with PyTorch and switch to ABI=0 eventually, after
# resolving incompatible implementation in SYCL runtime.
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -D_GLIBCXX_USE_CXX11_ABI=1)
set(SYCL_FLAGS ${SYCL_FLAGS} ${SYCL_KERNEL_OPTIONS})
else()
message("Not compiling with XPU. Only the GCC compiler is supported as the CXX compiler.")
return()
endif()
