[SYCL][libdevice] Add fast_* in imf libdevice #10004

jinge90 · 2023-06-21T02:00:43Z

No description provided.

Signed-off-by: jinge90 <ge.jin@intel.com>

jinge90 · 2023-06-21T02:19:12Z

Hi, @xtian-github @akolesov-intel @zettai-reido
This PR adds "fast_" math functions to imf libdevice, corresponding to the NV libdevice "__nv_fast_" functions. Currently, the mapping is as follows:
__nv_fast_exp10f(x)       ---> sycl::native::exp(log10 * x)
__nv_fast_expf(x)         ---> sycl::native::exp(x)
__nv_fast_fdividef(x, y)  ---> sycl::native::divide(x, y)
__nv_fast_log10f(x)       ---> sycl::native::log10(x)
__nv_fast_log2f(x)        ---> sycl::native::log2(x)
__nv_fast_logf(x)         ---> sycl::native::log(x)
__nv_fast_powf(x, y)      ---> sycl::native::powr(x, y)

Could you help review?
Thanks very much.
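
For illustration, the exp10 entry in the mapping can be sketched in plain C++. This is a minimal sketch, not the code from the patch: it assumes that `log10` in the mapping stands for the constant ln(10), it uses `std::exp` as a stand-in for `sycl::native::exp` so it builds without a SYCL compiler, and the names `native_exp_standin` and `fast_exp10f_sketch` are hypothetical.

```cpp
#include <cmath>

// Stand-in for sycl::native::exp so the sketch builds with any C++ compiler.
// (Assumption: the real implementation calls the SYCL native builtin instead.)
static inline float native_exp_standin(float x) { return std::exp(x); }

// Hypothetical name. Uses the identity 10^x = e^(x * ln(10)).
static inline float fast_exp10f_sketch(float x) {
  const float kLn10 = 2.30258509299404568402f; // ln(10), rounded to float
  return native_exp_standin(kLn10 * x);
}
```

With this sketch, `fast_exp10f_sketch(2.0f)` evaluates to approximately 100. The accuracy of a real native implementation is implementation-defined, so the result should not be expected to match a correctly rounded `exp10f`.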

jinge90 · 2023-06-21T02:22:15Z

libdevice/device_imf.hpp

+}
+
+static inline float __fast_fdividef(float x, float y) {
+  unsigned ybits = __builtin_bit_cast(unsigned, y);


Hi, @akolesov-intel and @zettai-reido
For __nv_fast_fdividef: https://docs.nvidia.com/cuda/libdevice-users-guide/__nv_fast_fdividef.html#__nv_fast_fdividef
NV specifies special behavior for 2^126 < y < 2^128, which SYCL native math does not; the code below handles this range.

Looks good to me. We might optimize it in future updates: simplify the range check and use a fast approximation, since fdividef allows 2 ulp of error rather than requiring a correctly rounded x/y.
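
A minimal sketch of the range handling discussed above, under stated assumptions: NVIDIA's documentation describes `__nv_fast_fdividef(x, y)` as returning 0 for 2^126 < y < 2^128 and NaN when x is infinite in that range. The sketch applies the check to the exponent and mantissa fields (which ignores the sign of y), treats a NaN x like an infinite x, and uses plain `x / y` as a stand-in for `sycl::native::divide`. The names `float_bits` and `fast_fdividef_sketch` are hypothetical and this is not the code from the patch.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: portable stand-in for __builtin_bit_cast(unsigned, v).
static inline std::uint32_t float_bits(float v) {
  std::uint32_t bits;
  std::memcpy(&bits, &v, sizeof bits);
  return bits;
}

// Hypothetical name. Plain x / y stands in for sycl::native::divide.
static inline float fast_fdividef_sketch(float x, float y) {
  const std::uint32_t ybits = float_bits(y);
  const std::uint32_t yexp = (ybits >> 23) & 0xFFu; // biased exponent
  const std::uint32_t yman = ybits & 0x7FFFFFu;     // mantissa field

  // 2^126 < |y| < 2^128: exponent 0xFD with a non-zero mantissa,
  // or exponent 0xFE with any mantissa.
  const bool y_in_range = (yexp == 0xFDu && yman != 0u) || yexp == 0xFEu;
  if (y_in_range) {
    // Assumption: infinite (or NaN) x yields NaN, any finite x yields 0.
    if (std::isinf(x) || std::isnan(x))
      return std::numeric_limits<float>::quiet_NaN();
    return 0.0f;
  }
  return x / y;
}
```

For example, dividing 1.0f by 2^127 returns 0 in this sketch, while dividing by exactly 2^126 falls outside the special range and returns the ordinary quotient 2^-126.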

xtian-github

LGTM

zettai-reido

I approve, but I see two possible changes.

Constants are usually spelled in hexadecimal,
so there is no variation in how the compiler treats them.

Also, floating-point bit patterns are ordered the same way as the values they encode, especially with the sign removed.
So the > 2^126 check can be replaced with a single comparison against a hex constant, too.

zettai-reido · 2023-06-21T06:32:21Z

libdevice/device_imf.hpp

+  unsigned xexp_bits = (xbits >> 23) & 0xFF;
+  unsigned yman_bits = ybits & 0x7F'FFFF;
+  unsigned xman_bits = xbits & 0x7F'FFFF;
+  if ((yexp_bits == 0xFD && yman_bits != 0) || (yexp_bits == 0xFE)) {


(ybits > 0x7e80'0000) should do.
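
The suggestion relies on the fact that, for non-negative IEEE 754 binary32 values, comparing the raw bit patterns as unsigned integers orders them the same way as comparing the floats, and that 2^126 is encoded as 0x7E800000. A minimal sketch comparing the two forms follows. The function names are hypothetical. Clearing the sign bit first and adding an upper bound to exclude infinity and NaN are assumptions made here so that both predicates agree exactly; the single comparison on its own would also accept infinity and NaN.

```cpp
#include <cstdint>

// Field-based predicate, in the style of the diff above:
// true when 2^126 < |y| < 2^128.
static inline bool in_range_fields(std::uint32_t ybits) {
  const std::uint32_t yexp = (ybits >> 23) & 0xFFu;
  const std::uint32_t yman = ybits & 0x7FFFFFu;
  return (yexp == 0xFDu && yman != 0u) || yexp == 0xFEu;
}

// Comparison-based predicate on the sign-cleared bits.
// 0x7E800000 encodes 2^126; 0x7F800000 encodes +infinity.
static inline bool in_range_compare(std::uint32_t ybits) {
  const std::uint32_t abits = ybits & 0x7FFFFFFFu; // drop the sign bit
  return abits > 0x7E800000u && abits < 0x7F800000u;
}
```

Both predicates reject 0x7E800000 (exactly 2^126) and accept 0x7E800001, the next representable value above it.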

akolesov-intel

Looks good, thank you!

AlexeySachkov

sycl-post-link part LGTM

jinge90 · 2023-06-27T05:54:19Z

Hi, @intel/llvm-reviewers-runtime
Could you help review this patch?
Thanks very much.

steffenlarsen

Runtime changes LGTM!

jinge90 · 2023-06-28T02:40:04Z

Hi, @intel/llvm-gatekeepers
Could you help review and merge this patch?
Thanks very much.

steffenlarsen · 2023-06-28T05:40:20Z

Taking @AlexeySachkov's review as an approval.

jinge90 requested review from a team as code owners June 21, 2023 02:00

jinge90 requested review from cperkinsintel, zettai-reido and xtian-github June 21, 2023 02:00

Add fast_* in imf libdevice

3b414dc

Signed-off-by: jinge90 <ge.jin@intel.com>

xtian-github approved these changes Jun 21, 2023

zettai-reido approved these changes Jun 21, 2023

akolesov-intel approved these changes Jun 21, 2023

AlexeySachkov reviewed Jun 22, 2023

Use hex floating point constant

64e170a

Signed-off-by: jinge90 <ge.jin@intel.com>

Merge remote-tracking branch 'upstream/sycl' into imf_fast_math

e53d73b

akolesov-intel approved these changes Jun 26, 2023

steffenlarsen approved these changes Jun 27, 2023

Merge remote-tracking branch 'upstream/sycl' into imf_fast_math

b114f84

jinge90 requested a review from a team June 28, 2023 02:39

steffenlarsen merged commit d96e507 into intel:sycl Jun 28, 2023

Chenyang-L mentioned this pull request Jul 11, 2023

LLVM and SPIRV-LLVM-Translator pulldown (WW25) #10311

Closed
