[SYCL] Fix `sycl::vec::convert<>` to allow conversion to and from `sycl::vec` of `bfloat16` type to that of other data types #14105

uditagarwal97 · 2024-06-09T23:15:31Z

Follow-up of and blocked by: #14085

After this change:
On host, conversion between vec<bfloat16> and vec<float> happens element-by-element. On device, we use the SPIR-V intrinsics OpConvertFToBF16INTEL and OpConvertBF16ToFINTEL (https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_bfloat16_conversion.asciidoc) for vector conversion.
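To make the host path concrete, here is a minimal sketch of element-by-element float -> bfloat16 conversion with round-to-nearest-even. It is not the code from this PR: `bf16_bits`, `float_bits_to_bf16_rte`, `bf16_to_float_bits`, and `convert_each` are hypothetical names, and bfloat16 is modelled as a raw 16-bit pattern rather than `sycl::ext::oneapi::bfloat16`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in: a bfloat16 value stored as the upper 16 bits of an
// IEEE-754 binary32 pattern. Not the sycl::ext::oneapi::bfloat16 class.
using bf16_bits = std::uint16_t;

// Narrow a binary32 bit pattern to bfloat16 with round-to-nearest-even.
constexpr bf16_bits float_bits_to_bf16_rte(std::uint32_t bits) {
  // NaN (exponent all ones, non-zero mantissa): keep it a quiet NaN.
  if ((bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0u)
    return static_cast<bf16_bits>((bits >> 16) | 0x0040u);
  // Adding 0x7FFF plus the lowest kept bit makes exact ties round to even.
  const std::uint32_t bias = 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<bf16_bits>((bits + bias) >> 16);
}

// Widen bfloat16 to a binary32 bit pattern by appending 16 zero bits.
constexpr std::uint32_t bf16_to_float_bits(bf16_bits v) {
  return static_cast<std::uint32_t>(v) << 16;
}

inline bf16_bits float_to_bf16(float f) {
  std::uint32_t bits = 0;
  std::memcpy(&bits, &f, sizeof(bits)); // well-defined type pun
  return float_bits_to_bf16_rte(bits);
}

inline float bf16_to_float(bf16_bits v) {
  const std::uint32_t bits = bf16_to_float_bits(v);
  float f = 0.0f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Element-by-element conversion, as the host path is described above.
template <std::size_t N>
std::array<bf16_bits, N> convert_each(const std::array<float, N> &in) {
  std::array<bf16_bits, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = float_to_bf16(in[i]);
  return out;
}
```

On device the loop would be replaced by a single vector intrinsic call; the sketch only covers the scalar fallback.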

uditagarwal97 · 2024-06-12T20:48:59Z

Converted this PR back to draft to:
(1) Accommodate changes in vec::convert after #14130 gets merged.
(2) Further simplify the changes in this PR after adding BF16 to uint16 conversion to detail::convertToOpenCLType

uditagarwal97 · 2024-06-16T15:51:32Z

sycl/include/sycl/vector.hpp

-                   detail::convertImpl<T, R, roundingMode, 1, OpenCLT, OpenCLR>(
-                       vec_data<DataT>::get(getValue(I)))));
+        // For float -> bf16.
+        if constexpr (isFloatToBF16Conv) {


detail::convertImpl<> expects an OpenCL type as input and returns the OpenCL type corresponding to convertT. For BF16, that OpenCL type is uint16 on device and bfloat16 on host.
However, vec_data<bfloat16>::get() currently returns a bfloat16 value on both device and host.

As a workaround, I've added an explicit if constexpr for BF16 <--> float conversion. A proper fix would require more if conditions/#ifdefs, which IMO is not worth it since we will be replacing vector.hpp with vector_preview.hpp soon anyway.

Alternatively, we can just refactor the entire convertImpl, if you have a good plan/picture for that.

AlexeySachkov · 2024-06-17T11:47:29Z

sycl/include/sycl/vector_preview.hpp

+          std::is_same_v<DataT, bfloat16> && std::is_same_v<convertT, float>;
+      if constexpr (isFloatToBF16Conv || isBF16ToFloatConv) {
+        static_assert(roundingMode == rounding_mode::automatic ||
+                      roundingMode == rounding_mode::rte);


Should we add a message to this static assert to explicitly say that not all rounding modes are supported for bfloat16?

Sure. Fixed in 8a6caf1
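For illustration, a static_assert with a message could look roughly like the sketch below. The enum is a local stand-in that mirrors the enumerators of `sycl::rounding_mode`, `bf16_rounding_supported` is a hypothetical helper, and the message wording is an assumption (the actual text is in 8a6caf1).

```cpp
// Local stand-in mirroring the enumerators of sycl::rounding_mode.
enum class rounding_mode { automatic, rte, rtz, rtp, rtn };

// Hypothetical helper: instantiating it with an unsupported mode fails to
// compile, and the message tells the user why.
template <rounding_mode Mode>
constexpr bool bf16_rounding_supported() {
  static_assert(Mode == rounding_mode::automatic || Mode == rounding_mode::rte,
                "Only automatic/RTE rounding modes are supported for "
                "bfloat16 <-> float conversion.");
  return true;
}
```

Instantiating `bf16_rounding_supported<rounding_mode::rtz>()` would then stop compilation with the message above instead of a bare assertion failure.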

aelovikov-intel · 2024-06-17T16:23:15Z

sycl/include/sycl/detail/vector_convert.hpp

+template <typename NativeBFT, typename NativeFloatT, int VecSize>
+inline NativeFloatT ConvertBF16ToF(NativeBFT val) {


Can NativeFloatT be anything other than float?

On host, no. NativeFloatT is always float.

aelovikov-intel · 2024-06-17T16:23:27Z

sycl/include/sycl/detail/vector_convert.hpp

+  // On host, ensure that we don't convert BF16 to uint16 for conversion.
+  static_assert(std::is_same_v<NativeBFT, sycl::ext::oneapi::bfloat16>);
+
+  return (NativeFloatT)val;


Please don't use C-style casts.

aelovikov-intel · 2024-06-17T16:25:42Z

sycl/include/sycl/detail/vector_convert.hpp

@@ -498,6 +528,51 @@ __SYCL_FLOAT_FLOAT_CONVERT_FOR_TYPE(double)
 #undef __SYCL_FLOAT_FLOAT_CONVERT
 #undef __SYCL_FLOAT_FLOAT_CONVERT_FOR_TYPE

+template <typename NativeBFT, typename NativeFloatT, int VecSize>
+inline NativeFloatT ConvertBF16ToF(NativeBFT vec) {


For the scalar case, are we going vec<bf16,1> -> operator[] -> cast_to_ushort->cast back to bf16 -> convert to float here + in the caller? Do you think it still makes sense after we changed storage type in vec?

The problem is that convertImpl accepts the native OpenCL type on device, which is uint16 for vec<bfloat, 1> and uint16 ext_vector_type() for vec<bfloat, N>.
I had to add the casts to provide a unified interface for vec::convert (so that it can use convertImpl), and I expect the compiler to eliminate these extra casts.

A long-term solution would be to refactor convertImpl entirely, but that is tangential to this PR.

aelovikov-intel · 2024-06-17T16:28:38Z

sycl/include/sycl/vector.hpp

+    // Currently, for BF16 <--> float conversion, we only support
+    // Round-to-even rounding mode.


I'd expect that bfloat maps precisely onto floats, so that direction should "support" all the rounding modes. Am I wrong here?

IIUC, there cannot be a 1:1 mapping between float and bfloat, as bfloat has only an 8-bit mantissa while float has a 24-bit mantissa. The default rounding mode is RTE (https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Env.html#_rounding_modes_for_conversions) for floating-point to floating-point conversion.
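The two directions in this exchange behave differently, which a small bit-level sketch can show. It assumes the usual layouts (bfloat16: 1 sign, 8 exponent, 7 stored mantissa bits; float: 1 sign, 8 exponent, 23 stored mantissa bits) and uses hypothetical helper names; narrowing here simply truncates, to keep the example short.

```cpp
#include <cstdint>

// Widening appends 16 zero bits, so every bfloat16 pattern maps to exactly
// one float pattern and no rounding decision is involved.
constexpr std::uint32_t widen_bf16(std::uint16_t b) {
  return static_cast<std::uint32_t>(b) << 16;
}

// Narrowing drops the low 16 mantissa bits (plain truncation in this sketch).
constexpr std::uint16_t truncate_to_bf16(std::uint32_t f) {
  return static_cast<std::uint16_t>(f >> 16);
}

// True only when the float pattern has no information in its low 16 bits.
constexpr bool survives_round_trip(std::uint32_t f) {
  return widen_bf16(truncate_to_bf16(f)) == f;
}
```

For example, 0x3F800000 (1.0f) survives the round trip, while 0x3F800001 (the next float above 1.0f) does not. So bfloat16 -> float is exact for every input, whereas float -> bfloat16 generally needs a rounding mode.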

aelovikov-intel · 2024-06-17T16:30:00Z

sycl/include/sycl/vector.hpp

-                   detail::convertImpl<T, R, roundingMode, 1, OpenCLT, OpenCLR>(
-                       vec_data<DataT>::get(getValue(I)))));
+        // For float -> bf16.
+        if constexpr (isFloatToBF16Conv) {


Alternatively, we can just refactor the entire convertImpl, if you have a good plan/picture for that.

uditagarwal97 · 2024-06-20T19:28:34Z

@cperkinsintel Since @aelovikov-intel is OOO, could you help review this PR?
I have made the following changes since @aelovikov-intel's last review:

Except for double, we can now convert a sycl::vec of any type to sycl::vec<bfloat16> and vice versa, with all rounding modes supported. For double, only the RTE rounding mode is supported.
On Intel hardware, I used the __imf_ builtins for element-by-element conversion between bfloat16 and other data types. For non-Intel hardware and host, I've added a helper class, ConvertToBfloat16, to ext/oneapi/bfloat16.hpp to facilitate conversion to/from bfloat16 with different rounding modes.
For conversion between sycl::vec<float> and sycl::vec<bfloat16> with the RTE rounding mode, I used OpConvertFToBF16INTEL and OpConvertBF16ToFINTEL for optimized vectorized conversion. For other rounding modes, we fall back to the __imf_ builtins.
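As a rough picture of what supporting several rounding modes means for float -> bfloat16, the sketch below narrows a binary32 bit pattern under each mode. It is not the ConvertToBfloat16 helper from this PR: the enum is a local stand-in for `sycl::rounding_mode`, `narrow_to_bf16` is a hypothetical name, and treating `automatic` as RTE is an assumption based on the default mentioned earlier in the thread.

```cpp
#include <cstdint>

// Local stand-in mirroring the enumerators of sycl::rounding_mode.
enum class rounding_mode { automatic, rte, rtz, rtp, rtn };

// Narrow a binary32 bit pattern to a bfloat16 bit pattern (hypothetical).
constexpr std::uint16_t narrow_to_bf16(std::uint32_t bits, rounding_mode mode) {
  // NaN: keep it a quiet NaN regardless of the rounding mode.
  const bool is_nan =
      (bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0u;
  if (is_nan)
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);

  const std::uint32_t truncated = bits >> 16;       // rounds toward zero
  const bool inexact = (bits & 0xFFFFu) != 0u;      // dropped bits non-zero
  const bool negative = (bits >> 31) != 0u;

  switch (mode) {
  case rounding_mode::rtz: // toward zero: plain truncation
    return static_cast<std::uint16_t>(truncated);
  case rounding_mode::rtp: // toward +inf: grow magnitude only if positive
    return static_cast<std::uint16_t>(truncated +
                                      ((inexact && !negative) ? 1u : 0u));
  case rounding_mode::rtn: // toward -inf: grow magnitude only if negative
    return static_cast<std::uint16_t>(truncated +
                                      ((inexact && negative) ? 1u : 0u));
  case rounding_mode::automatic: // assumed to mean RTE in this sketch
  case rounding_mode::rte:
    break;
  }
  // Round to nearest, ties to even.
  return static_cast<std::uint16_t>((bits + 0x7FFFu + (truncated & 1u)) >> 16);
}
```

Taking the float just above 1.0f (0x3F800001) as input, rtz, rtn, and rte all give 0x3F80, while rtp gives 0x3F81. For the mirrored negative value, rtn is the mode that steps away from zero.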

cperkinsintel · 2024-06-21T16:58:40Z

sycl/include/sycl/ext/oneapi/bfloat16.hpp

+              roundingMode == SYCLRoundingMode::rte,
+          "Only automatic/RTE rounding mode is supported for double type.");
+      return getBFloat16FromDoubleWithRoundingMode(a, roundingMode);
+    }


Is there a possibility of other floating-point types besides float and double? Half? Should there be a std::is_floating_point<T> clause for the future?

Nice catch. I've added the clause for half as well.
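One way to picture the per-type clauses is a compile-time predicate like the sketch below. It is only an illustration of the rule stated in this thread (double is limited to automatic/RTE, other source types accept all rounding modes); `can_convert_to_bf16` and `half_like` are hypothetical, and the enum is a local stand-in for `sycl::rounding_mode`.

```cpp
#include <type_traits>

// Local stand-in mirroring the enumerators of sycl::rounding_mode.
enum class rounding_mode { automatic, rte, rtz, rtp, rtn };

// Hypothetical stand-in for sycl::half, which is a class type rather than a
// built-in floating-point type.
struct half_like {};

// Hypothetical predicate: can a value of type T be converted to bfloat16
// under the given rounding mode?
template <typename T, rounding_mode Mode>
constexpr bool can_convert_to_bf16() {
  if constexpr (std::is_same_v<T, double>) {
    // Mirrors "Only automatic/RTE rounding mode is supported for double type."
    return Mode == rounding_mode::automatic || Mode == rounding_mode::rte;
  } else {
    // float, half, and integral sources accept every rounding mode here.
    return std::is_same_v<T, float> || std::is_same_v<T, half_like> ||
           std::is_integral_v<T>;
  }
}
```

A class-type half needs its own `std::is_same_v` clause in this sketch because a generic built-in floating-point check would not match it.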

cperkinsintel

looks good, had one question.

uditagarwal97 · 2024-06-21T21:19:29Z

@intel/llvm-gatekeepers the PR is ready to be merged!

Follow-up of and blocked by: #14105. Currently, `vec<bfloat>` math builtins do element-by-element operations. This PR optimizes `vec<bfloat>` math builtins by: (1) converting `vec<bfloat>` to `vec<float>`; (2) doing the operation on `vec<float>` (which uses SPIR-V built-ins underneath for optimized vector operations); (3) converting the return value back to `vec<bfloat>`. Look at the beautiful diff in `check_device_code/vector/vector_bf16_builtins.cpp` to visualize the device code generated before and after this optimization.

uditagarwal97 added 9 commits April 18, 2024 16:28

Add copy constructor

0aa7a9a

Merge branch 'sycl' of https://github.com/uditagarwal97/llvm into sycl

f2a1dc2

Merge branch 'sycl' of https://github.com/uditagarwal97/llvm into sycl

361eea7

Add vector overloads on ConvertBFloat16ToFINTEL and ConvertFToBFloat1…

48a8574

…6INTEL

Fix test case

6bce35d

Fix tests; Address reviews.

8d8295e

Fix formatting

91cd730

Merge branch 'bf16tof' into opt_math_builtins

20580de

Fix conversion between vec<float> <--> vec<bfloat16>.

d76bc66

uditagarwal97 temporarily deployed to WindowsCILock June 9, 2024 23:15 — with GitHub Actions Inactive

uditagarwal97 changed the title ~~[SYCL] Fix sycl::vec::convert<> to allow conversion between sycl::vec of float and bfloat16 type~~ [SYCL] Fix sycl::vec::convert<> to allow conversion between sycl::vec of float and bfloat16 types Jun 9, 2024

uditagarwal97 self-assigned this Jun 9, 2024

uditagarwal97 mentioned this pull request Jun 9, 2024

[SYCL] Optimize vec<bfloat> math builtins #14106

Merged

uditagarwal97 had a problem deploying to WindowsCILock June 10, 2024 00:07 — with GitHub Actions Failure

Call libdevice primitives instead of sprirv ones

3b7826e

uditagarwal97 had a problem deploying to WindowsCILock June 11, 2024 16:09 — with GitHub Actions Failure

Merge remote-tracking branch 'upstream/sycl' into opt_math_builtins

5608861

uditagarwal97 had a problem deploying to WindowsCILock June 12, 2024 19:49 — with GitHub Actions Error

Fix formatting

402073d

uditagarwal97 marked this pull request as ready for review June 12, 2024 20:07

uditagarwal97 requested review from a team as code owners June 12, 2024 20:07

uditagarwal97 requested a review from aelovikov-intel June 12, 2024 20:07

uditagarwal97 temporarily deployed to WindowsCILock June 12, 2024 20:07 — with GitHub Actions Inactive

uditagarwal97 temporarily deployed to WindowsCILock June 12, 2024 20:39 — with GitHub Actions Inactive

uditagarwal97 marked this pull request as draft June 12, 2024 20:46

uditagarwal97 added 2 commits June 12, 2024 14:16

Simplify convert for float and BF16

fc6aa6a

Merge remote-tracking branch 'upstream/sycl' into opt_math_builtins

c60ec69

uditagarwal97 temporarily deployed to WindowsCILock June 14, 2024 19:54 — with GitHub Actions Inactive

uditagarwal97 temporarily deployed to WindowsCILock June 16, 2024 15:40 — with GitHub Actions Inactive

uditagarwal97 commented Jun 16, 2024

uditagarwal97 temporarily deployed to WindowsCILock June 16, 2024 16:12 — with GitHub Actions Inactive

AlexeySachkov approved these changes Jun 17, 2024

Add comment in static assert.

8a6caf1

uditagarwal97 had a problem deploying to WindowsCILock June 17, 2024 14:01 — with GitHub Actions Failure

Fix build error

85a33f8

uditagarwal97 temporarily deployed to WindowsCILock June 17, 2024 15:53 — with GitHub Actions Inactive

aelovikov-intel reviewed Jun 17, 2024

uditagarwal97 temporarily deployed to WindowsCILock June 17, 2024 17:29 — with GitHub Actions Inactive

Support all rounding modes for BF16 <--> float conversion.

e5ef19a

uditagarwal97 temporarily deployed to WindowsCILock June 18, 2024 06:42 — with GitHub Actions Inactive

uditagarwal97 temporarily deployed to WindowsCILock June 18, 2024 07:14 — with GitHub Actions Inactive

Fix formatting

fbeb2db

uditagarwal97 temporarily deployed to WindowsCILock June 18, 2024 16:20 — with GitHub Actions Inactive

uditagarwal97 temporarily deployed to WindowsCILock June 18, 2024 17:08 — with GitHub Actions Inactive

Don't emit __imf_ functions on non-intel hardwares

a9df444

uditagarwal97 had a problem deploying to WindowsCILock June 20, 2024 19:18 — with GitHub Actions Failure

uditagarwal97 requested a review from cperkinsintel June 20, 2024 19:28

uditagarwal97 changed the title ~~[SYCL] Fix sycl::vec::convert<> to allow conversion between sycl::vec of float and bfloat16 types~~ [SYCL] Fix sycl::vec::convert<> to allow conversion to and from sycl::vec of bfloat16 type to that of other data types Jun 20, 2024

uditagarwal97 temporarily deployed to WindowsCILock June 20, 2024 21:19 — with GitHub Actions Inactive

cperkinsintel reviewed Jun 21, 2024

cperkinsintel approved these changes Jun 21, 2024

Address review. Fix build error.

cc288cb

uditagarwal97 temporarily deployed to WindowsCILock June 21, 2024 18:23 — with GitHub Actions Inactive

uditagarwal97 temporarily deployed to WindowsCILock June 21, 2024 19:33 — with GitHub Actions Inactive

againull merged commit 02c6bba into intel:sycl Jun 21, 2024
14 checks passed
