Skip to content

Commit

Permalink
Merge remote-tracking branch 'upstream/sycl' into xpti_sycl_version_b…
Browse files Browse the repository at this point in the history
…ackend_plugin

* upstream/sycl: (26 commits)
  [SPIR-V][NFC] Move non-upstreamed FuncParam decorations into internal:: (intel#4138)
  [SYCL] Move free function queries to experimental namespace (intel#4090)
  [SYCL][XPTI] Enable PI calls notifications with arguments (intel#4148)
  [SYCL] Revert queue::wait() to its old behaviour with Level Zero (intel#4153)
  [SYCL] Add missing <cstring> header to spirv.hpp (intel#4157)
  [SYCL] Adds info query for atomic_memory_order_capabilities on device and context (intel#4105)
  [SYCL] Improve performance of generic shuffles (intel#3815)
  [SYCL] Fix the error with namespaces caused during rebase of intel#4014 (intel#4151)
  [ESIMD] Fix 'ambiguous operator' error with length 1 simd operands (intel#4149)
  [libdevice][NFC] Fix libdevice dependencies list (intel#4130)
  [SPIR-V] Reland Encode debug info producer in SPIR-V (intel#4082)
  [SYCL][ROCm] Add ROCm support to get_device_count_by_type (intel#4113)
  [SYCL] Fix sRGB device info (intel#4145)
  [SYCL][ROCm] Fix kernel launch with multiple dimensions (intel#4063)
  [SYCL][ROCm] Fix compilation for AMD GPU with -fsycl-dead-args-optimization (intel#4126)
  [SYCL][Level Zero] Enable multi-CCS support. (intel#4038)
  [SYCL] Pass bound arch to unbundler (intel#4112)
  [ESIMD][doc] Added documentation for some ESIMD math APIs (intel#3995)
  [ESIMD] rename gather4/scatter4 to gather_rgba/scatter_rgba (intel#4120)
  [SYCL][NFC] Remove unused variable. (intel#4131)
  ...
  • Loading branch information
alexbatashev committed Jul 22, 2021
2 parents a2d7ea5 + 66ef4eb commit 2e7f05d
Show file tree
Hide file tree
Showing 252 changed files with 10,593 additions and 7,435 deletions.
18 changes: 9 additions & 9 deletions .github/CODEOWNERS
Validating CODEOWNERS rules …
Original file line number Diff line number Diff line change
Expand Up @@ -28,9 +28,9 @@ sycl/doc/extensions/ @intel/dpcpp-specification-reviewers

# Sub-groups
sycl/include/CL/sycl/detail/spirv.hpp @Pennycook @AlexeySachkov
sycl/include/CL/sycl/intel/group_algorithm.hpp @Pennycook @AlexeySachkov
sycl/include/CL/sycl/intel/sub_group.hpp @Pennycook @AlexeySachkov
sycl/include/CL/sycl/intel/sub_group_host.hpp @Pennycook @AlexeySachkov
sycl/include/sycl/ext/intel/group_algorithm.hpp @Pennycook @AlexeySachkov
sycl/include/sycl/ext/intel/sub_group.hpp @Pennycook @AlexeySachkov
sycl/include/sycl/ext/intel/sub_group_host.hpp @Pennycook @AlexeySachkov

# PI API
sycl/include/CL/sycl/detail/pi.def @smaslov-intel
Expand All @@ -53,17 +53,17 @@ sycl/source/detail/stream_impl.cpp @againull
sycl/source/stream.cpp @againull

# FPGA extensions
sycl/include/CL/sycl/intel/fpga_device_selector.hpp @MrSidims
sycl/include/CL/sycl/intel/fpga_extensions.hpp @MrSidims
sycl/include/CL/sycl/intel/fpga_reg.hpp @MrSidims
sycl/include/CL/sycl/intel/pipes.hpp @MrSidims
sycl/include/sycl/ext/intel/fpga_device_selector.hpp @MrSidims
sycl/include/sycl/ext/intel/fpga_extensions.hpp @MrSidims
sycl/include/sycl/ext/intel/fpga_reg.hpp @MrSidims
sycl/include/sycl/ext/intel/pipes.hpp @MrSidims
sycl/include/CL/sycl/pipes.hpp @MrSidims

# Reduction extension
sycl/include/CL/sycl/intel/reduction.hpp @v-klochkov
sycl/include/sycl/ext/intel/reduction.hpp @v-klochkov

# Function pointers
sycl/include/CL/sycl/intel/function_pointer.hpp @AlexeySachkov
sycl/include/sycl/ext/intel/function_pointer.hpp @AlexeySachkov
sycl/source/function_pointer.cpp @AlexeySachkov

# Half Type
Expand Down
8 changes: 8 additions & 0 deletions clang/include/clang/Basic/Attr.td
Original file line number Diff line number Diff line change
Expand Up @@ -1261,6 +1261,14 @@ def SYCLSimdAccessorPtr : InheritableAttr {
let Documentation = [Undocumented];
}

// Used to mark readonly accessors. It is not to be used directly in the source.
def SYCLAccessorReadonly : Attr {
// This attribute has no spellings as it is only ever created implicitly.
let Spellings = [];
let SemaHandler = 0;
let Documentation = [Undocumented];
}

// The attribute denotes that it is a function written in a scalar fashion, which
// is used in ESIMD context and needs to be vectorized by a vector backend compiler.
// For now, this attribute will be used only in internal implementation of
Expand Down
44 changes: 36 additions & 8 deletions clang/include/clang/Basic/AttrDocs.td
Original file line number Diff line number Diff line change
Expand Up @@ -415,6 +415,10 @@ def SYCLSimdDocs : Documentation {
The compiler may decide to compile such functions using different optimization
and code generation pipeline. Also, this attribute is used to distinguish
ESIMD private globals from regular SYCL global variables.

In SYCL 1.2.1 mode, the ``intel::sycl_explicit_simd`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.
}];
}

Expand Down Expand Up @@ -2443,8 +2447,9 @@ lambda capture, or function object member, of the callable to which the
attribute was applied. This effect is equivalent to annotating restrict on
**all** kernel pointer arguments in an OpenCL or SPIR-V kernel.

If ``intel::kernel_args_restrict`` is applied to a function called from a device
kernel, the attribute is not ignored and it is propagated to the kernel.
In SYCL 1.2.1 mode, the ``intel::kernel_args_restrict`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

The attribute forms an unchecked assertion, in that implementations
do not need to check/confirm the pre-condition in any way. If a user applies
Expand Down Expand Up @@ -2482,8 +2487,10 @@ def SYCLIntelNumSimdWorkItemsAttrDocs : Documentation {
let Content = [{
Applies to a device function/lambda function. Indicates the number of work
items that should be processed in parallel. Valid values are positive integers.
If ``intel::num_simd_work_items`` is applied to a function called from a
device kernel, the attribute is not ignored and it is propagated to the kernel.

In SYCL 1.2.1 mode, the ``intel::num_simd_work_items`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

Expand Down Expand Up @@ -2656,6 +2663,11 @@ allows the Y and Z arguments to be optional. If not provided by the user, the
value of Y and Z defaults to 1. See section 5.8.1 Kernel Attributes for more
details.

In SYCL 1.2.1 mode, the ``intel::reqd_work_group_size``,
``cl::reqd_work_group_size``, and ``sycl::reqd_work_group_size`` attributes are
propagated from the function they are applied to onto the kernel which calls the
function. In SYCL 2020 mode, the attributes are not propagated to the kernel.

.. code-block:: c++

[[sycl::reqd_work_group_size(4, 4, 4)]] void foo() {}
Expand Down Expand Up @@ -2800,8 +2812,10 @@ Applies to a device function/lambda function. Indicates the maximum dimensions
of a work group. Values must be positive integers. This is similar to
reqd_work_group_size, but allows work groups that are smaller or equal to the
specified sizes.
If ``intel::max_work_group_size`` is applied to a function called from a
device kernel, the attribute is not ignored and it is propagated to the kernel.

In SYCL 1.2.1 mode, the ``intel::max_work_group_size`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

Expand Down Expand Up @@ -2832,8 +2846,10 @@ Applies to a device function/lambda function or function call operator (of a
function object). Indicates the largest valid global work dimension that will be
accepted when running the kernel on a device. Valid values are integers in a
range of [0, 3].
If ``intel::max_global_work_dim`` is applied to a function called from a
device kernel, the attribute is not ignored and it is propagated to the kernel.

In SYCL 1.2.1 mode, the ``intel::max_global_work_dim`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

Expand Down Expand Up @@ -2890,6 +2906,10 @@ device operation, guiding the FPGA backend to insert the appropriate number of
registers to break-up the combinational logic circuit, and thereby controlling
the length of the longest combinational path.

In SYCL 1.2.1 mode, the ``intel::scheduler_target_fmax_mhz`` attribute is
propagated from the function it is applied to onto the kernel which calls the
function. In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

[[intel::scheduler_target_fmax_mhz(4)]] void foo() {}
Expand Down Expand Up @@ -2920,6 +2940,10 @@ function object). If 1, compiler doesn't use the global work offset values for
the device function. Valid values are 0 and 1. If used without argument, value
of 1 is set implicitly.

In SYCL 1.2.1 mode, the ``intel::no_global_work_offset`` attribute is
propagated from the function it is applied to onto the kernel which calls the
function. In SYCL 2020 mode, the attribute is not propagated to the kernel.

.. code-block:: c++

[[intel::no_global_work_offset]]
Expand Down Expand Up @@ -4607,6 +4631,10 @@ the ``[[intel::named_sub_group_size(NAME)]]`` documentation for clarification.
This attribute is mutually exclusive with ``[[intel::named_sub_group_size(NAME)]]``
and ``[[intel::sycl_explicit_simd]]``.

In SYCL 1.2.1 mode, the ``intel::reqd_sub_group_size`` attribute is propagated
from the function it is applied to onto the kernel which calls the function.
In SYCL 2020 mode, the attribute is not propagated to the kernel.

In addition to device functions, the required sub-group size attribute may also
be specified in the definition of a named functor object and lambda functions,
as in the examples below:
Expand Down
3 changes: 3 additions & 0 deletions clang/lib/CodeGen/CGCall.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2672,6 +2672,9 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo &FI,
unsigned FirstIRArg, NumIRArgs;
std::tie(FirstIRArg, NumIRArgs) = IRFunctionArgs.getIRArgs(ArgNo);

if (Arg->hasAttr<SYCLAccessorReadonlyAttr>())
Fn->getArg(FirstIRArg)->addAttr(llvm::Attribute::ReadOnly);

switch (ArgI.getKind()) {
case ABIArgInfo::InAlloca: {
assert(NumIRArgs == 0);
Expand Down
6 changes: 4 additions & 2 deletions clang/lib/Driver/Driver.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4151,8 +4151,10 @@ class OffloadingActionBuilder final {
}
for (unsigned I = 0; I < ToolChains.size(); ++I) {
SYCLDeviceActions.push_back(UA);
UA->registerDependentActionInfo(
ToolChains[I], /*BoundArch=*/StringRef(), Action::OFK_SYCL);
withBoundArchForToolChain(ToolChains[I], [&](const char *BoundArch) {
UA->registerDependentActionInfo(ToolChains[I], BoundArch,
Action::OFK_SYCL);
});
}
return ABRT_Success;
}
Expand Down
5 changes: 3 additions & 2 deletions clang/lib/Driver/ToolChains/Clang.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4624,7 +4624,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
}

// Turn on Dead Parameter Elimination Optimization with early optimizations
if (!RawTriple.isNVPTX() &&
if (!(RawTriple.isNVPTX() || RawTriple.isAMDGCN()) &&
Args.hasFlag(options::OPT_fsycl_dead_args_optimization,
options::OPT_fno_sycl_dead_args_optimization, false))
CmdArgs.push_back("-fenable-sycl-dae");
Expand Down Expand Up @@ -8856,7 +8856,8 @@ void SYCLPostLink::ConstructJob(Compilation &C, const JobAction &JA,
// -fsycl-device-code-split=auto

// Turn on Dead Parameter Elimination Optimization with early optimizations
if (!getToolChain().getTriple().isNVPTX() &&
if (!(getToolChain().getTriple().isNVPTX() ||
getToolChain().getTriple().isAMDGCN()) &&
TCArgs.hasFlag(options::OPT_fsycl_dead_args_optimization,
options::OPT_fno_sycl_dead_args_optimization, false))
addArgs(CmdArgs, TCArgs, {"-emit-param-info"});
Expand Down
2 changes: 0 additions & 2 deletions clang/lib/Driver/ToolChains/HIP.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -250,8 +250,6 @@ void HIPToolChain::addClangTargetOptions(
DeviceOffloadingKind == Action::OFK_SYCL) &&
"Only HIP and SYCL offloading kinds are supported for GPUs.");

StringRef GpuArch = getGPUArch(DriverArgs);

CC1Args.push_back("-fcuda-is-device");

if (DriverArgs.hasFlag(options::OPT_fcuda_approx_transcendentals,
Expand Down
Loading

0 comments on commit 2e7f05d

Please sign in to comment.