Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][CUDA] Initial CUDA backend support #1091

Merged
merged 24 commits into from
Feb 24, 2020
Merged

Conversation

Alexander-Johnston
Copy link
Contributor

Initial support of CUDA backend bringing roughly 40% of SYCL 1.2.1 CTS conformance on devices that support CUDA 10.2.

@@ -61,7 +61,7 @@ static CudaVersion ParseCudaVersionFile(llvm::StringRef V) {
return CudaVersion::CUDA_92;
if (Major == 10 && Minor == 0)
return CudaVersion::CUDA_100;
if (Major == 10 && Minor == 1)
if (Major == 10 && (Minor == 1 || Minor == 2))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please put some comment with explanations why it is okay to map 10.2 to CUDA_101?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToT Clang lags behind on released CUDA versions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are hoping upstream clang catches up with 10.2 support, this change allows to use it on the meantime. Their release notes and our testing shows 10.2 can work as if 10.1 and there are no regressions. Otherwise, the default installation of many people simply fails to compile any SYCL for CUDA application.

Suggested change
if (Major == 10 && (Minor == 1 || Minor == 2))
// Enable CUDA 10.2 toolkit acting as 10.1 until tip clang catches up
if (Major == 10 && (Minor == 1 || Minor == 2))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, upstream llvm patch llvm/llvm-project@12fefee (https://reviews.llvm.org/D73231) is a better fix, so we won't need once this lands into the intel/llvm fork

Comment on lines 1109 to 1108
if (!getenv("DISABLE_INFER_AS"))
Builder.defineMacro("__SYCL_ENABLE_INFER_AS__", "1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In presence of #893, this change is not needed anymore

@@ -462,6 +462,10 @@ if( LLVM_USE_PERF )
endif( NOT CMAKE_SYSTEM_NAME MATCHES "Linux" )
endif( LLVM_USE_PERF )

option(SYCL_BUILD_PI_CUDA
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that this should be in llvm/sycl/CMakeLists.txt. Is there any reason it cannot be put in there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We ifdef in/out SYCL NVPTX support in the clang Driver based on this variable, which is why we put it in llvm/CMakeLists.txt rather than in llvm/sycl/CMakeLists.txt

Copy link
Contributor

@bader bader Feb 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the option is confusing in this case. PI has nothing to do with the driver.
Why do we need ifdef out SYCL NVPTX support in the clang driver?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to, but it helps prevent users accidentally compiling kernels to the wrong target for the plugins they have available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you able to compile kernels for PTX w/o CUDA toolchain?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I so, we could use this for testing SYCL for CUDA compiler using cross-compilation on Intel HW.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe clang will fail to produce PTX kernels without a CUDA toolchain installed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could enable support for the SYCL cuda toolchain if the NVPTX target is added, so it has the same requirements as the CUDA support in llvm? I suggest we do that on a separate patch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, the code added under ifdef is not covered by CI system, so it might be broken.
It seems to me that in order to enable compilation for NVPTX target we need just a few changes. Am I wrong?

@@ -4,6 +4,7 @@
#cmake_policy(SET CMP0057 NEW)
#include(AddLLVM)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

}

} // sycl
} // sycl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} // sycl
} // cl

//
//===----------------------------------------------------------------------===//

namespace cl {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
namespace cl {
#include <CL/sycl/detail/defines.hpp>
__SYCL_INLINE namespace cl {

should fix the build error

[  2%] Building CXX object tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/pi_cuda.cpp.o
In file included from /data/user/fwyzard/sycl/llvm/sycl/include/CL/sycl/detail/pi.hpp:13,
                 from /data/user/fwyzard/sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:10:
/data/user/fwyzard/sycl/llvm/sycl/include/CL/sycl/detail/common.hpp:25:25: error: inline namespace must be specified at initial definition
 __SYCL_INLINE namespace cl {
                         ^~
In file included from /data/user/fwyzard/sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:9:
/data/user/fwyzard/sycl/llvm/sycl/include/CL/sycl/backend/cuda.hpp:9:11: note: 'cl' defined here
 namespace cl {
           ^~

@@ -0,0 +1,479 @@
//===-- pi_cuda.hpp - CUDA Plugin -----------------------------------------===//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this file to plugins/cuda/ ? It is the better place for it.

@@ -23,25 +23,31 @@

__SYCL_INLINE namespace cl {
namespace sycl {
context::context(const async_handler &AsyncHandler)
: context(default_selector().select_device(), AsyncHandler) {}
context::context(const async_handler &AsyncHandler, bool usePrimaryContext)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
context::context(const async_handler &AsyncHandler, bool usePrimaryContext)
context::context(const async_handler &AsyncHandler, bool UsePrimaryContext)

Same applies to the rest of the file.

@@ -190,7 +193,6 @@ add_subdirectory( source )
# SYCL toolchain builds all components: compiler, libraries, headers, etc.
add_custom_target( sycl-toolchain
DEPENDS ${SYCL_RT_LIBS}
pi_opencl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing OpenCL support?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pi_opencl dependency is added to sycl-toolchain in the opencl plugin cmake sycl/plugins/opencl/CMakeLists.txt in 884459b#diff-edef34e019c51a57812a69f7a111afdcR22

std::function<void(cl::sycl::interop_handler)> MFunc;

public:
InteropTask(std::function<void(cl::sycl::interop_handler)> Func)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
InteropTask(std::function<void(cl::sycl::interop_handler)> Func)
InteropTask(function_class<void(cl::sycl::interop_handler)> Func)

@@ -48,7 +48,7 @@ class queue_impl {
QueueOrder Order, const property_list &PropList)
: queue_impl(Device,
detail::getSyclObjImpl(
context(createSyclObjFromImpl<device>(Device))),
context(createSyclObjFromImpl<device>(Device), {}, true)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: {} and true seem to me some "magic" values. Can you please add comments here?

@@ -77,7 +77,7 @@ class Scheduler {
// releaseHostAccessor is called.
// Returns an event which indicates when these nodes are completed and host
// accessor is ready for using.
EventImplPtr addHostAccessor(Requirement *Req);
EventImplPtr addHostAccessor(Requirement *Req, const bool destructor = false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
EventImplPtr addHostAccessor(Requirement *Req, const bool destructor = false);
EventImplPtr addHostAccessor(Requirement *Req, const bool Destructor = false);

sycl/test/basic_tests/access_to_subset.cpp Show resolved Hide resolved
@@ -771,6 +771,15 @@ class handler {
#endif
}

// Similar to single_task, but passed lambda will be executed on host
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can you please make doxygen comment out of this?

@@ -80,6 +81,7 @@ set(SYCL_SOURCES
"detail/usm/usm_dispatch.cpp"
"detail/usm/usm_impl.cpp"
"detail/util.cpp"
"cg.cpp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The corresponding header file resides in CL/sycl/detail/. Have you considered putting cg.cpp to source/detail?

@@ -14,8 +14,9 @@ namespace sycl {
namespace detail {

AccessorImplHost::~AccessorImplHost() {
if (MBlockedCmd)
if (MBlockedCmd) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, it's common practice in LLVM not to wrap single line statement in curly brackets.

/// @param useCUDAPrimaryContext is a bool determining whether to use the
/// primary context in the CUDA backend.
explicit context(const async_handler &AsyncHandler = {},
bool useCUDAPrimaryContext = false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
bool useCUDAPrimaryContext = false);
bool UseCUDAPrimaryContext = false);

Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please also split this PR into several commits? Even if you created just two commits (with libclc changes and the rest), it would be easier to review

@Ruyk
Copy link
Contributor

Ruyk commented Feb 6, 2020

Thanks for the feedback, we are working in fixing some lit-testing regressions and rebasing on top of the latest changes.

@@ -80,6 +80,9 @@
#cmakedefine01 CLANG_ENABLE_OBJC_REWRITER
#cmakedefine01 CLANG_ENABLE_STATIC_ANALYZER

/* Define if we have SYCL PI CUDA support */
#cmakedefine SYCL_HAVE_PI_CUDA ${SYCL_HAVE_PI_CUDA}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#cmakedefine SYCL_HAVE_PI_CUDA ${SYCL_HAVE_PI_CUDA}
#cmakedefine01 SYCL_HAVE_PI_CUDA

According to the docs it should do the same

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this define? Can we have "SYCL PI CUDA support" unconditionally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to have PI CUDA support unconditionally the cuda toolchain will always be required for compilation. We decided to make it optional to allow people who only use the OpenCL plugin to compile the project without a cuda toolchain on their system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to have PI CUDA support unconditionally the cuda toolchain will always be required for compilation. We decided to make it optional to allow people who only use the OpenCL plugin to compile the project without a cuda toolchain on their system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to my understanding we need CUDA toolchain to build CUDA plugin only.
Could you clarify why we should require CUDA toolchain to build the driver?
https://llvm.org/docs/CompileCudaWithLLVM.html - doesn't seem to require some custom driver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, we only need CUDA toolchain for building of the plugin, but we limit the valid SYCL triples in the clang driver based if PI CUDA support is available or not.

https://github.com/intel/llvm/pull/1091/files#diff-beaf25b0cdf8830dd4ea165404b00671R618

static bool isValidSYCLTriple(llvm::Triple T) {
#ifdef SYCL_HAVE_PI_CUDA
  // NVPTX is valid for SYCL.
  if (T.isNVPTX())
    return true;
#endif
  // Check for invalid SYCL device triple values.
  // Non-SPIR arch.
  if (!T.isSPIR())
    return false;
  // SPIR arch, but has invalid SubArch for AOT.
  StringRef A(T.getArchName());
  if (T.getSubArch() == llvm::Triple::NoSubArch &&
      ((T.getArch() == llvm::Triple::spir && !A.equals("spir")) ||
       (T.getArch() == llvm::Triple::spir64 && !A.equals("spir64"))))
    return false;
  return true;
}

We can remove this limitation though and always allow nvptx triples for compilation, regardless of it the CUDA plugin is available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for removing.

Comment on lines 242 to 249
// The translator tries to convert these function to the same name as the
// SPIR-V built-in for
// another function but with a different signature which results in an
// attempt to re-define a
// function.
"read_imagei", // conflicts with `read_imagef`
"read_imageui", // conflicts with `read_imagef`
"read_imageh", // conflicts with `read_imagef`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, this should be fixed by KhronosGroup/SPIRV-LLVM-Translator#408

@@ -288,11 +288,31 @@ class OCL20ToSPIRV : public ModulePass, public InstVisitor<OCL20ToSPIRV> {
Module *M;
LLVMContext *Ctx;
unsigned CLVer; /// OpenCL version as major*10+minor
unsigned CLLang; /// OpenCL language, see `spv::SourceLanguage`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that this repo contains a copy of KrhonosGroup/SPIRV-LLVM-Translator which is updated from time to time and usually, by simple copy-pasting the whole directory. So, there is a huge chance that these changes will be overwritten occasionally

Please at least extract these changes to a separate commit, so it can be re-applied at each sync, or (which is much better) commit them directly to KhronosGroup/SPIRV-LLVM-Translator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will try to commit these directly to the Khronos repository, but will need the changes here until it is merged on the Khronos repository and pulled back down for the CUDA backend to work.

For now the changes to llvm-spirv are separated into 09b3b2e

@fwyzard
Copy link
Contributor

fwyzard commented Feb 8, 2020

The codeplaysoftware:cuda-dev branch fails to build if configured with -DBUILD_SHARED_LIBS=ON:

Scanning dependencies of target prepare_builtins
[ 64%] Building CXX object tools/libclc/utils/CMakeFiles/prepare_builtins.dir/prepare-builtins.cpp.o
[ 64%] Linking CXX executable ../../../bin/prepare_builtins
CMakeFiles/prepare_builtins.dir/prepare-builtins.cpp.o: In function `main':
prepare-builtins.cpp:(.text.startup.main+0x100): undefined reference to `llvm::parseBitcodeFile(llvm::MemoryBufferRef, llvm::LLVMContext&)'
prepare-builtins.cpp:(.text.startup.main+0x46c): undefined reference to `llvm::errorToErrorCodeAndEmitErrors(llvm::LLVMContext&, llvm::Error)'
collect2: error: ld returned 1 exit status
make[3]: *** [tools/libclc/utils/CMakeFiles/prepare_builtins.dir/build.make:90: bin/prepare_builtins] Error 1
make[2]: *** [CMakeFiles/Makefile2:98983: tools/libclc/utils/CMakeFiles/prepare_builtins.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:99597: tools/sycl/CMakeFiles/sycl-toolchain.dir/rule] Error 2
make: *** [Makefile:25159: sycl-toolchain] Error 2

The intel:sycl branch does build fine.

"-DCMAKE_BUILD_TYPE={}".format(args.build_type),
"-DLLVM_ENABLE_ASSERTIONS={}".format(llvm_enable_assertions),
"-DLLVM_TARGETS_TO_BUILD={}".format(llvm_targets_to_build),
"-DLLVM_EXTERNAL_PROJECTS=sycl;llvm-spirv",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opencl-aot project is missed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed for the overall project as well or just for the testing? Its not mentioned on the Getting Started Guide

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ruyk opencl-aot is an optional tool to enable AOT compilation for SYCL. It is not a requirement for end users, but it is being tested in CI.

icd_loader_lib = ''
icd_loader_lib = os.path.join(args.obj_dir, "OpenCL-ICD-Loader", "build")
llvm_targets_to_build = 'X86'
llvm_enable_projects = 'clang;llvm-spirv;sycl'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opencl-aot project is missed

//
//===----------------------------------------------------------------------===//

#define __SPIRV_FUNCTION __spirv_ocl_fmod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see definition of __spirv_ocl_fmod function. Is it a bug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may not be a definition for __spirv_ocl_fmod yet. We haven't finished implementing all of the builtins yet, but have some internal patches with additional builtins in progress to raise later.

@Ruyk
Copy link
Contributor

Ruyk commented Feb 14, 2020

We are currently working on fixing the default device selection. The current implementation does not take into account the multiple backends, and causes several lit tests to fail. We should have a fix soon.

@bader bader added the cuda CUDA back-end label Feb 18, 2020
@Alexander-Johnston Alexander-Johnston force-pushed the cuda-dev branch 2 times, most recently from ea6e850 to 683faf0 Compare February 19, 2020 12:08
@Alexander-Johnston
Copy link
Contributor Author

@bader @smaslov-intel

We've pushed some more fixes for the CUDA backend but are still finding issues relating to the new plugin interface introduced in #1030.

We've found that the new plugin interface is unstable when multiple plugins are available. Though we've fixed some issues related to mismatching devices/platforms/plugins when doing info queries and device selection, we are still finding device selection and some of the plugin internals unstable. As a result we are still seeing some test failures. The most notable is basic_tests/queue.cpp, which visibly fails for us when trying to use the OpenCL or CUDA plugins.

We also found a number of tests failing silently by catching exceptions from the sycl runtime, but not erroring after they occur.

We've added a new target, check-sycl-cuda, for specifically attempting to test with CUDA devices by passing SYCL_BE=PI_CUDA. We've also added a matching flag to check-sycl to ensure a path to test OpenCL devices. Without SYCL_BE it is nondeterministic which device is chosen by the default_selector used by many tests when multiple plugins are available.

We have also been testing without SYCL_BUILD_PI_CUDA enabled or CUDA available on the system with success.

For now we can either

  • merge and not enable testing for the CUDA backend while we work together on the new plugin interfaces stability with multiple plugins
  • merge and enable testing for the CUDA backend, but mark the failures as warnings instead of failures until the new plugin interface is stable with multiple plugins

To enable building/testing of the CUDA backend you will need to add SYCL_BUILD_PI_CUDA=ON to the buildbot CMake invocation and check-sycl-cuda as a test target to the buildbots.

@smaslov-intel
Copy link
Contributor

We've also added a matching flag to check-sycl to ensure a path to test OpenCL devices. Without SYCL_BE it is nondeterministic which device is chosen by the default_selector...

That sounds weird. Isn't the useBackend deterministic in preferring OpenCL when SYCL_BE is unset?

merge and enable testing for the CUDA backend, but mark the failures as warnings...

I'd prefer it this way, and work further to stabilize the plugins mechanism (tagging @garimagu)

bader
bader previously approved these changes Feb 21, 2020
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's test for regressions.

@bader bader dismissed their stale review February 21, 2020 14:40

Remove approve to avoid unintentional merge.

buildbot/configure.py Outdated Show resolved Hide resolved
Alexander Johnston and others added 8 commits February 24, 2020 11:43
Signed-off-by: Alexander Johnston <alexander@codeplay.com>
Synchronise the CUDA backend with the general SYCL changes from #1121.

Signed-off-by: Andrea Bocci <andrea.bocci@cern.ch>
Signed-off-by: Alexander Johnston <alexander@codeplay.com>
To ensure that the check-sycl targets test OpenCL devices, pass
SYCL_BE=PI_OPENCL. This mirrors the check-sycl-cuda target which
passes SYCL_BE=PI_CUDA. Without this it is nondeterministic which
device is tested by check-sycl.

Signed-off-by: Alexander Johnston <alexander@codeplay.com>
Removes PI_CUDA specific code paths and tests from clang, opting to
always enable them.

Signed-off-by: Alexander Johnston <alexander@codeplay.com>
Signed-off-by: Alexander Johnston <alexander@codeplay.com>
Fix platform string comparison for CUDA platform detection.
Fix device info platform query so that it uses the device's plugin,
rather than the GlobalPlugin.

Signed-off-by: Alexander Johnston <alexander@codeplay.com>
Signed-off-by: Alexander Johnston <alexander@codeplay.com>
bader
bader previously approved these changes Feb 24, 2020
@bader bader dismissed their stale review February 24, 2020 12:42

Dismiss review.

Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unexpected Passing Tests (3):
    SYCL :: basic_tests/image.cpp
    SYCL :: basic_tests/stream/stream.cpp
    SYCL :: hier_par/hier_par_basic.cpp

Let's remove XFAIL from these tests.
Are these failing on NV HW only?

sycl/test/basic_tests/image.cpp Outdated Show resolved Hide resolved
sycl/test/basic_tests/image.cpp Outdated Show resolved Hide resolved
sycl/test/basic_tests/stream/stream.cpp Outdated Show resolved Hide resolved
sycl/test/hier_par/hier_par_basic.cpp Outdated Show resolved Hide resolved
sycl/CMakeLists.txt Outdated Show resolved Hide resolved
sycl/CMakeLists.txt Outdated Show resolved Hide resolved
Fix minor test and build configuration issues introduced in the
development of the CUDA backend.

Signed-off-by: Alexander Johnston <alexander@codeplay.com>
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Start testing.

@bader
Copy link
Contributor

bader commented Feb 24, 2020

@Alexander-Johnston, I'm going to do more thorough code review post-commit. This PR is too big to review in one shot.

@bader bader merged commit 7a9a425 into intel:sycl Feb 24, 2020
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Feb 25, 2020
…ages_docs

* origin/sycl: (1092 commits)
  [CI] Add clang-format checker to pre-commit checks (intel#1163)
  [SYCL][CUDA] Initial CUDA backend support (intel#1091)
  [USM] Align OpenCL USM extension header with the specification (intel#1162)
  [SYCL][NFC] Fix unreferenced variable warning (intel#1158)
  [SYCL] Fix __spirv_GroupBroadcast overloads (intel#1152)
  [SYCL] Add llvm/Demangle link dependency for llvm-no-spir-kernel (intel#1156)
  [SYCL] LowerWGScope pass should not be skipped when -O0 is used
  [SYCL][Doc][USM] Add refactored pointer and device queries to USM spec (intel#1118)
  [SYCL] Update the kernel parameter rule to is-trivially-copy-construc… (intel#1144)
  [SYCL] Move internal headers to source dir (intel#1136)
  [SYCL] Forbid declaration of non-const static variables inside kernels (intel#1141)
  [SYCL][NFC] Remove idle space (intel#1148)
  [SYCL] Improve the error mechanism of llvm-no-spir-kernel (intel#1068)
  [SYCL] Added CTS test config (intel#1063)
  [SYCL] Implement check-sycl-deploy target (intel#1142)
  [SYCL] Preserve original message and code of kernel/program build result (intel#1108)
  [SYCL] Fix LIT after LLVM change in community
  Translate LLVM's cmpxchg instruction to SPIR-V
  Add volatile qualifier for atom_ builtins
  Fix -Wunused-variable warnings
  ...
cl_mem MemArg = (cl_mem)AllocaCmd->getMemAllocation();
Plugin.call<PiApiKind::piKernelSetArg>(Kernel, Arg.MIndex,
sizeof(cl_mem), &MemArg);
Plugin.call<PiApiKind::piKernelSetArg>(Kernel, Arg.MIndex,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bader @Alexander-Johnston Was this extra API added by mistake?
Is there a reason for adding it twice?
FYI @rbegam .

MKernelProgramCache.setContextPtr(this);
}

context_impl::context_impl(const vector_class<cl::sycl::device> Devices,
async_handler AsyncHandler)
async_handler AsyncHandler, bool UseCUDAPrimaryContext)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this variable? It is unused.

vladimirlaz pushed a commit to vladimirlaz/llvm that referenced this pull request Jun 1, 2020
The flag -nostdlib added in "cf4d4e366a2 libclc: Compile with -nostdlib"
was moved to the file AddLibclc.cmake, which was added in:
"7a9a4251f57 [SYCL][CUDA] Initial CUDA backend support (intel#1091)"

  CONFLICT (content): Merge conflict in libclc/CMakeLists.txt
iclsrc pushed a commit that referenced this pull request Feb 15, 2023
PR #1091 has refactor the logic of prepare_builtins into utils/CMakeLists.txt.
We don't need to duplicate logic in CMakeLists.txt now.

Before we upstream the utils/CMakeLists.txt, we should just remove the duplicate code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda CUDA back-end
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants