-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New submdspan CPP 17 compatible #227
Conversation
This uses ADL customization point design
for strided_index_range.
This is the equivalent to #223 just with C++17 support |
@youyu3 check this out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EXCELLENT
Just one typo in a comment.
return get<1>(i); | ||
} | ||
|
||
// Suppress spurious warning with NVCC about no return statement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen this with NVCC + GCC 10. The GCC intrinsic function __builtin_unreachable();
has helped me get rid of the warnings in this case. C++23 adds std::unreachable
, which hopefully will accomplish the same goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that GCC intrinsic also work with nvc++, clang, msvc as host compiler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a GCC-only thing, so when I used it, I defined it as a macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GCC 10 isn't quite as smart as GCC 11 about constexpr auto
functions that have exhaustive if constexpr
branches in them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for figuring this out! : - )
tests/test_submdspan.cpp
Outdated
MDSPAN_INLINE_FUNCTION | ||
static int create_slice_arg(int) { | ||
return 2; | ||
} | ||
MDSPAN_INLINE_FUNCTION | ||
static std::pair<int,int> create_slice_arg(std::pair<int,int>) { | ||
return std::pair<int,int>(1,3); | ||
} | ||
MDSPAN_INLINE_FUNCTION | ||
static stdex::strided_index_range<int,int,int> create_slice_arg(stdex::strided_index_range<int,int,int>) { | ||
return stdex::strided_index_range<int,int,int>{1,3,2}; | ||
} | ||
MDSPAN_INLINE_FUNCTION | ||
static stdex::full_extent_t create_slice_arg(stdex::full_extent_t) { | ||
return stdex::full_extent; | ||
} | ||
|
||
template<class SrcExtents, class SubExtents, class ... SliceArgs> | ||
MDSPAN_INLINE_FUNCTION | ||
static bool match_expected_extents(int src_idx, int sub_idx, SrcExtents src_ext, SubExtents sub_ext, int, SliceArgs ... slices) { | ||
return match_expected_extents(++src_idx, sub_idx, src_ext, sub_ext, slices...); | ||
} | ||
template<class SrcExtents, class SubExtents, class ... SliceArgs> | ||
MDSPAN_INLINE_FUNCTION | ||
static bool match_expected_extents(int src_idx, int sub_idx, SrcExtents src_ext, SubExtents sub_ext, std::pair<int,int> p, SliceArgs ... slices) { | ||
return (sub_ext.extent(sub_idx)==p.second-p.first) && match_expected_extents(++src_idx, ++sub_idx, src_ext, sub_ext, slices...); | ||
} | ||
template<class SrcExtents, class SubExtents, class ... SliceArgs> | ||
MDSPAN_INLINE_FUNCTION | ||
static bool match_expected_extents(int src_idx, int sub_idx, SrcExtents src_ext, SubExtents sub_ext, stdex::strided_index_range<int,int,int> p, SliceArgs ... slices) { | ||
return (sub_ext.extent(sub_idx)==(p.extent+p.stride-1)/p.stride) && match_expected_extents(++src_idx, ++sub_idx, src_ext, sub_ext, slices...); | ||
} | ||
template<class SrcExtents, class SubExtents, class ... SliceArgs> | ||
MDSPAN_INLINE_FUNCTION | ||
static bool match_expected_extents(int src_idx, int sub_idx, SrcExtents src_ext, SubExtents sub_ext, stdex::full_extent_t, SliceArgs ... slices) { | ||
return (sub_ext.extent(sub_idx)==src_ext.extent(src_idx)) && match_expected_extents(++src_idx, ++sub_idx, src_ext, sub_ext, slices...); | ||
} | ||
template<class SrcExtents, class SubExtents> | ||
MDSPAN_INLINE_FUNCTION | ||
static bool match_expected_extents(int, int, SrcExtents, SubExtents) { return true; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering whether we are we being to clever. (I don't have a better idea.)
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
//@HEADER | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include headers as appropriate
extents_constructor<K - 1, Extents, NewExtents..., dynamic_extent>; | ||
return next_t::next_extent( | ||
ext, slices..., | ||
r.extent > 0 ? 1 + divide<index_t>(r.extent - 1, r.stride) : 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code as it stands won't achieve the goal of divide
, which is to return integral_constant
if both inputs are integral_constant
.
- Adding 1 to
integral_constant<size_t, Value>
results insize_t
, notintegral_constant<size_t, Value+1>
. - Even if adding 1 to
integral_constant
resulted inintegral_constant
, the result of the ternary operator would becommon_type_t<integral_constant<size_t, Value>, int>
, which isint
.
https://godbolt.org/z/qWEdn5xM6
#include <utility>
#include <type_traits>
template<size_t Val1, size_t Val2>
constexpr auto divide(std::integral_constant<size_t, Val1> x,
std::integral_constant<size_t, Val2> y) {
return std::integral_constant<size_t, Val1 / Val2>{};
}
template<size_t Value>
using Size = std::integral_constant<size_t, Value>;
int main() {
constexpr auto z = divide(Size<8>{}, Size<2>{});
static_assert(std::is_same_v<std::remove_const_t<decltype(z)>, Size<4>>);
constexpr auto w = 1 + divide(Size<8>{}, Size<2>{});
static_assert(not std::is_same_v<std::remove_const_t<decltype(w)>, Size<4>>);
static_assert(std::is_same_v<std::remove_const_t<decltype(w)>, size_t>);
static_assert(not std::is_same_v<std::common_type_t<Size<4>, int>, Size<4>>);
static_assert(not std::is_same_v<std::common_type_t<Size<4>, int>, size_t>);
static_assert(std::is_same_v<std::common_type_t<Size<4>, int>, int>);
return 0;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't matter since it doesn't determine the type. Note the values I am generating here are used as constructor arguments for an extents type which is independently determined. In case static extents are generated or preserved, these values will at best be checked against the static_extents in the extents constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
integral_constant<T, Value>
implicitly converts to T
. This means that 1 + integral_constant<size_t, X>{} / integral_constant<size_t, Y>{}
just returns 1 + X / Y
, so you don't even need divide
here.
3d40324
to
15899dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, just a few minor changes
) | ||
MDSPAN_INLINE_FUNCTION | ||
constexpr static auto next_extent(const Extents &ext, const Slice &sl, | ||
SliceSpecifiers... slices) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More of a code readability issue -- the slices and new extent parameters here are kind of mixed up into the same parameter pack so it's difficult to understand the code. Might be worth separating out the new extents into a separate parameter.
const integral_constant<T1, v1> &) { | ||
// cutting short division by zero | ||
// this is used for strided_index_range with zero extent/stride | ||
return integral_constant<IndexT, v0 == 0 ? 0 : v0 / v1>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand this -- should stride be allowed to be zero? I guess if it is not specified it will be zero but then I think it would make sense to transform the value to nonzero before we get here.
I checked and this works now over my test suite of compilers/standards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine for now. We can fix the strided_slice
name and add Mandates checks later. Thanks!
|
||
// Slice Specifier allowing for strides and compile time extent | ||
template <class OffsetType, class ExtentType, class StrideType> | ||
struct strided_index_range { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is called strided_slice
now in the paper, but I'm OK with doing the rename as a separate PR.
OffsetType offset; | ||
ExtentType extent; | ||
StrideType stride; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider enforcing the Mandates here?
}; | |
static_assert(is_integral_v<OffsetType> || is_integral_constant_v<OffsetType>); | |
static_assert(is_integral_v<ExtentType> || is_integral_constant_v<ExtentType>); | |
static_assert(is_integral_v<StrideType> || is_integral_constant_v<StrideType>); | |
}; |
In the change above, is_integral_constant_v
is defined as follows.
template<class T>
struct is_integral_constant : std::false_type {};
template<std::integral T, T Value>
struct is_integral_constant<std::integral_constant<T, Value>> : std::true_type {};
template<class T>
constexpr inline bool is_integral_constant_v = is_integral_constant<T>::value;
return get<1>(i); | ||
} | ||
|
||
// Suppress spurious warning with NVCC about no return statement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for figuring this out! : - )
…gles kokkos: fix reversed triangles in `trmm`, `trsm` and `trsv`
No description provided.