Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][CUDA] Add no-fast-math to tests that rely on it. #9889

Merged
merged 4 commits into from
Jul 10, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion sycl/test-e2e/BFloat16/bfloat16_builtins.cpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
// DEFINE: %{mathflags} = %if cl_options %{/clang:-fno-fast-math%} %else %{-fno-fast-math%}
// REQUIRES: aspect-ext_oneapi_bfloat16_math_functions
// RUN: %clangxx -fsycl -fsycl-targets=%{sycl_triple} %if any-device-is-cuda %{ -Xsycl-target-backend --cuda-gpu-arch=sm_80 %} %s -o %t.out
// RUN: %clangxx -fsycl -fsycl-targets=%{sycl_triple} %if any-device-is-cuda %{ -Xsycl-target-backend --cuda-gpu-arch=sm_80 %} %s -o %t.out %{mathflags}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JackAKirk - I'm working with this test and on our slightly older shared CUDA dev machines sm_80 gets rejected. But with sm_75 the test both compiles and behaves as expected.

Can I just switch this to sm_75 ? Or, better yet, given that we have a REQUIRES: aspect-ext_oneapi_bfloat16_math_functions in this test, can the whole %if any-device-is-cuda ... %} block be removed?

Copy link
Contributor Author

@JackAKirk JackAKirk Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test behaves differently depending on whether it is compiled for sm_xx>=sm_80 or not:

  • sm_80 and above uses some native bfloat16 math instructions
  • below sm_80 always uses generic impls

So I set it to compile with sm_80 flag because this is what the CI has, and allows testing of the native impls.

It is possible to remove the arch flag, and it is probably the best thing to do now that bfloat16 is generically supported, to avoid confusion. Unfortunately this means that the native impls won't be tested automatically via the CI. But I suppose we could test this in release testing.

ext_oneapi_bfloat16_math_functions is really an artifact of earlier times when bfloat16 was not generically implemented for all devices: I think it should be removed.

Thanks

Copy link
Contributor

@cperkinsintel cperkinsintel Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that, I think I'll have it test both and add a comment summarizing what you just said. Or, at minimum, the comment.

// RUN: %{run} %t.out
// Currently the feature isn't supported on FPGA.
// UNSUPPORTED: accelerator
Expand Down
3 changes: 2 additions & 1 deletion sycl/test-e2e/DeviceLib/built-ins/scalar_relational.cpp
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
// RUN: %{build} -fsycl-device-code-split=per_kernel -o %t.out
// DEFINE: %{mathflags} = %if cl_options %{/clang:-fno-fast-math%} %else %{-fno-fast-math%}
// RUN: %{build} -fsycl-device-code-split=per_kernel -o %t.out %{mathflags}
// RUN: %{run} %t.out

#include <sycl/sycl.hpp>
Expand Down
3 changes: 2 additions & 1 deletion sycl/test-e2e/DeviceLib/built-ins/vector_relational.cpp
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
// FIXME unsupported on windows (opencl and level-zero) until fix of libdevice
// UNSUPPORTED: windows && (opencl || level_zero)
// RUN: %{build} -o %t.out
// DEFINE: %{mathflags} = %if cl_options %{/clang:-fno-fast-math%} %else %{-fno-fast-math%}
// RUN: %{build} -o %t.out %{mathflags}
// RUN: %{run} %t.out

#include <sycl/sycl.hpp>
Expand Down
Loading