[SYCL][COMPAT] Add math extend_v*4 to SYCLCompat #14078
Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com>
Co-authored-by: Joe Todd <joe.todd@codeplay.com>
Hey @OuadiElfarouki nice work! Just a couple of comments for now: replace `enable_if` with `static_assert`. Apologies again for sending you the wrong way!
I notice the test `math_extend_v.cpp` is very slow (284 seconds when I run it locally). I'll investigate that further.
LGTM
@intel/llvm-gatekeepers PR is ready for merge 🙏🏻
@joeatodd has requested a change to this PR and as such will need to approve before we can go ahead.
Co-authored-by: Yihan Wang <yihan.wang@intel.com>
sycl/include/syclcompat/math.hpp
Outdated
typename BinaryOperation>
inline constexpr RetT extend_vbinary4(AT a, BT b, RetT c,
BinaryOperation binary_op) {
static_assert(std::is_integral_v<AT> && std::is_integral_v<BT> &&
I think we can simplify this to check whether AT and BT are `int32_t`/`uint32_t`, WDYT?
template <class T>
constexpr inline bool is_i32_or_u32 = std::is_same_v<std::decay_t<T>, int32_t> ||
std::is_same_v<std::decay_t<T>, uint32_t>;
Yes, that should look better @yihanwg. We don't want to put it inside the `detail` namespace though, any suggestion?
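For reference, the two review suggestions in this thread (the `is_i32_or_u32` trait and `static_assert` in place of `enable_if`) could be combined roughly as in the sketch below. The function `add_bytes4` is a hypothetical stand-in invented for illustration and is not the `extend_vbinary4` implementation from `math.hpp`; it only shows where the trait and the assertion would sit.

```cpp
#include <cstdint>
#include <type_traits>

// Trait from the review suggestion: true only for int32_t or uint32_t
// (after stripping cv-qualifiers and references via std::decay_t).
template <class T>
inline constexpr bool is_i32_or_u32 =
    std::is_same_v<std::decay_t<T>, std::int32_t> ||
    std::is_same_v<std::decay_t<T>, std::uint32_t>;

// Hypothetical example function (NOT the syclcompat implementation):
// treats a and b as four packed 8-bit lanes and adds them lane by lane,
// wrapping each lane modulo 256.
template <typename AT, typename BT>
constexpr std::uint32_t add_bytes4(AT a, BT b) {
  // static_assert gives a readable diagnostic at the call site, whereas
  // enable_if would just remove the overload from consideration.
  static_assert(is_i32_or_u32<AT> && is_i32_or_u32<BT>,
                "AT and BT must be int32_t or uint32_t");
  const std::uint32_t ua = static_cast<std::uint32_t>(a);
  const std::uint32_t ub = static_cast<std::uint32_t>(b);
  std::uint32_t out = 0;
  for (int lane = 0; lane < 4; ++lane) {
    const std::uint32_t x = (ua >> (8 * lane)) & 0xFFu;
    const std::uint32_t y = (ub >> (8 * lane)) & 0xFFu;
    out |= ((x + y) & 0xFFu) << (8 * lane);
  }
  return out;
}
```

With this shape, a call such as `add_bytes4(std::int64_t{1}, std::uint32_t{1})` fails to compile with the message from the `static_assert`, rather than with an overload-resolution error.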
sycl/include/syclcompat/math.hpp
Outdated
int16_t min_val = 0, max_val = 0;
min_val = std::numeric_limits<Tint>::min();
max_val = std::numeric_limits<Tint>::max();
temp = detail::clamp(temp, {min_val, min_val, min_val, min_val},
We can use the following code to construct a 4-element `min_val`/`max_val` vector.
https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_vec_interface
explicit constexpr vec(const DataT& arg);
sycl::vec<int16_t, 4>(min_val), sycl::vec<int16_t, 4>(max_val)
Will apply for the v2 variant as well (just to keep it consistent, as the v2 extend got merged already).
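As a host-only illustration of the suggestion above, the sketch below uses a small hypothetical `Vec4` struct in place of `sycl::vec<int16_t, 4>` so that it builds without a SYCL toolchain. Its single-argument constructor mirrors the broadcast constructor `explicit constexpr vec(const DataT& arg)` quoted from the SYCL 2020 specification. The names `Vec4`, `clamp_lanes` and `saturate_to_i8` are assumptions made for this example, and `Tint` is assumed to be `int8_t`.

```cpp
#include <array>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for sycl::vec<int16_t, 4>.
struct Vec4 {
  std::array<std::int16_t, 4> lanes;

  // Broadcast constructor: every lane receives the same value.
  explicit constexpr Vec4(const std::int16_t &v) : lanes{{v, v, v, v}} {}

  // Element-wise constructor, used here only to build test inputs.
  constexpr Vec4(std::int16_t a, std::int16_t b, std::int16_t c,
                 std::int16_t d)
      : lanes{{a, b, c, d}} {}
};

// Per-lane clamp, standing in for the detail::clamp call in the diff.
constexpr Vec4 clamp_lanes(Vec4 x, const Vec4 &lo, const Vec4 &hi) {
  for (std::size_t i = 0; i < 4; ++i) {
    if (x.lanes[i] < lo.lanes[i]) x.lanes[i] = lo.lanes[i];
    if (x.lanes[i] > hi.lanes[i]) x.lanes[i] = hi.lanes[i];
  }
  return x;
}

// Saturate each 16-bit lane to the int8_t range. The bounds are built with
// the broadcast constructor instead of {min_val, min_val, min_val, min_val}.
constexpr Vec4 saturate_to_i8(Vec4 x) {
  const std::int16_t min_val = std::numeric_limits<std::int8_t>::min();
  const std::int16_t max_val = std::numeric_limits<std::int8_t>::max();
  return clamp_lanes(x, Vec4(min_val), Vec4(max_val));
}
```

The broadcast form states the intent (one bound for all lanes) once and stays the same if the lane count changes, which is why the same change was proposed for the v2 variant.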
LGTM
@intel/llvm-gatekeepers PR is ready! Thank you.
This PR adds math `extend_v*4` operators (18 in total) along with unit-tests for signed and unsigned int32 cases. *Some changes overlap with the previous `extend_v*2` PR intel#13953 and thus should be reviewed/merged first. --------- Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com> Co-authored-by: Joe Todd <joe.todd@codeplay.com> Co-authored-by: Yihan Wang <yihan.wang@intel.com>
This PR adds math `extend_vcompare[2/4]` operators (4 in total) along with unit tests for signed and unsigned int32 cases. Also, the unit tests from the previous `extend_v*4` #14078 and `extend_v*2` #13953 PRs are moved to two different files. --------- Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@intel.com> Co-authored-by: Joe Todd <joe.todd@codeplay.com> Co-authored-by: Yihan Wang <yihan.wang@intel.com>
This PR adds math `extend_v*4` operators (18 in total) along with unit-tests for signed and unsigned int32 cases.
*Some changes overlap with the previous `extend_v*2` PR #13953 and thus should be reviewed/merged first.