[fix] Added fp64 kind for radians operations in PTX compiler #488

Merged
stratika merged 1 commit into beehive-lab:develop from fix/radians-fp64/ptx on Jul 9, 2024

Conversation

stratika
Collaborator

@stratika stratika commented Jul 8, 2024

Description

This PR provides a fix for radians operations with double input types when PTX is used.

Problem description

PTX has no dedicated radians instruction, so the compiler lowers the operation to (pi / 180) * degrees. The compiler also contained a check that restricted this lowering to fp32 operands, so double inputs were rejected.

In this PR, I added support for fp64 types.
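
As a rough illustration of the lowering described above, the sketch below applies the same (pi / 180) * degrees formula to both precisions in plain Java. The class and method names are hypothetical and are not part of TornadoVM; the point is only that the fp64 path is the same multiplication carried out on a double operand.

```java
public class RadiansSketch {
    // Hypothetical helper, not a TornadoVM API: fp32 form of the lowering.
    static float toRadiansF32(float degrees) {
        // The conversion factor is narrowed to float before the multiply.
        return (float) (Math.PI / 180.0) * degrees;
    }

    // Hypothetical helper, not a TornadoVM API: fp64 form of the lowering.
    static double toRadiansF64(double degrees) {
        // Same formula, with the factor and the product kept in double precision.
        return (Math.PI / 180.0) * degrees;
    }

    public static void main(String[] args) {
        System.out.println(toRadiansF32(180.0f));
        System.out.println(toRadiansF64(180.0));
    }
}
```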

Backend/s tested

Mark the backends affected by this PR.

  • OpenCL
  • PTX
  • SPIRV

OS tested

Mark the OS where this PR is tested.

  • Linux
  • OSx
  • Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

  • Yes
  • No

How to test the new patch?

Build TornadoVM with PTX:

make BACKEND=ptx

Run:

tornado-test -V --printKernel --fast uk.ac.manchester.tornado.unittests.math.TestTornadoMathCollection#testTornadoMathRadiansDouble

Expected output:

WARNING: Using incubator modules: jdk.incubator.vector
.version 7.6 
.target sm_86 
.address_size 64 

.visible .entry s0_t0_testtornadoradians_arrays_doublearray(.param .u64 .ptr .global .align 8 kernel_context, .param .u64 .ptr .global .align 8 a) {
	.reg .s32 rsi<6>;
	.reg .f64 rfd<3>;
	.reg .pred rpb<2>;
	.reg .u32 rui<5>;
	.reg .s64 rsd<3>;
	.reg .u64 rud<5>;

BLOCK_0:
	ld.param.u64	rud0, [kernel_context];
	ld.param.u64	rud1, [a];
	mov.u32	rui0, %nctaid.x;
	mov.u32	rui1, %ntid.x;
	mul.wide.u32	rud2, rui0, rui1;
	cvt.s32.u64	rsi0, rud2;
	mov.u32	rui2, %tid.x;
	mov.u32	rui3, %ctaid.x;
	mad.lo.s32	rsi1, rui3, rui1, rui2;

BLOCK_1:
	mov.s32	rsi2, rsi1;
LOOP_COND_1:
	setp.lt.s32	rpb0, rsi2, 128;
	@!rpb0 bra	BLOCK_3;

BLOCK_2:
	add.s32	rsi3, rsi2, 3;
	cvt.s64.s32	rsd0, rsi3;
	shl.b64	rsd1, rsd0, 3;
	add.u64	rud3, rud1, rsd1;
	ld.global.f64	rfd0, [rud3];
	mul.rn.f64	rfd1, 0D3F91DF4720000000, rfd0;
	st.global.f64	[rud3], rfd1;
	add.s32	rsi4, rsi0, rsi2;
	mov.s32	rsi2, rsi4;
	bra.uni	LOOP_COND_1;

BLOCK_3:
	ret;
}

Test: class uk.ac.manchester.tornado.unittests.math.TestTornadoMathCollection#testTornadoMathRadiansDouble
	Running test: testTornadoMathRadiansDouble ................  [PASS] 
Test ran: 1, Failed: 0, Unsupported: 0
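
In the kernel above, the `mul.rn.f64` instruction multiplies by the immediate `0D3F91DF4720000000`. In PTX, the `0D` prefix introduces a double-precision constant written as its 16-digit IEEE 754 bit pattern. The small Java sketch below (the class name is hypothetical) decodes that bit pattern and prints it next to `Math.PI / 180.0`, so a reader can check how close the emitted factor is to the conversion constant.

```java
public class PtxImmediateSketch {
    // Bit pattern copied from the mul.rn.f64 immediate (the digits after the 0D prefix).
    static final long EMITTED_BITS = 0x3F91DF4720000000L;

    // Reinterpret the 64-bit pattern as an IEEE 754 double.
    static double emittedFactor() {
        return Double.longBitsToDouble(EMITTED_BITS);
    }

    public static void main(String[] args) {
        double emitted = emittedFactor();
        double reference = Math.PI / 180.0;
        System.out.printf("emitted   = %.17g%n", emitted);
        System.out.printf("reference = %.17g%n", reference);
        System.out.printf("abs diff  = %.3e%n", Math.abs(emitted - reference));
    }
}
```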

@stratika stratika self-assigned this Jul 8, 2024
@stratika stratika added bug Something isn't working PTX labels Jul 8, 2024
@stratika stratika requested a review from mikepapadim July 9, 2024 06:58
Member

@mikepapadim mikepapadim left a comment

LGTM

@stratika stratika merged commit dafba62 into beehive-lab:develop Jul 9, 2024
2 checks passed
@stratika stratika deleted the fix/radians-fp64/ptx branch July 9, 2024 08:13
jjfumero added a commit to jjfumero/TornadoVM that referenced this pull request Aug 30, 2024
Improvements
~~~~~~~~~~~~~~~~~~

- beehive-lab#468: Cleanup Abstract Metadata Class.
- beehive-lab#473: Add maven plugin to build TornadoVM source for the releases.
- beehive-lab#474: Refactor `<X>TornadoDevice` to place common methods in the `TornadoXPUInterface`.
- beehive-lab#482: Help messages improved when an out-of-memory exception is raised.
- beehive-lab#484: Double-type for the trigonometric functions added in the `TornadoMath` class.
- beehive-lab#487: Prebuilt API simplified.
- beehive-lab#494: Add test to trigger unsupported features related to direct use of Memory Segments.
- beehive-lab#509: Add a quick pass configuration to skip the heavy tests during active development.
- beehive-lab#532: Improve thread scheduler to support RISC-V Accelerators from Codeplay.
- beehive-lab#533: Support for scalar values to be passed via lambda expressions as tasks.
- beehive-lab#538: `README` file updated.
- beehive-lab#539: Refactor core classes and add new API methods to pass compilation flags to the low-level driver compilers (OpenCL, PTX and Level Zero).
- beehive-lab#542: Tagged LevelZero JNI and Beehive Toolkit dependencies added in the build and installer.

Compatibility
~~~~~~~~~~~~~~~~~~

- beehive-lab#465: Support for JDK 22 and GraalVM 24.0.2.
- beehive-lab#486: Temurin for Windows added in the list of supported JDKs.
- beehive-lab#525: Revert usage of String Templates in preparation for JDK 23.
- beehive-lab#527: SPIR-V version parameter added. TornadoVM may run previous SPIR-V versions (e.g., ComputeAorta from Codeplay).
- beehive-lab#513: LevelZero JNI Library updated to v0.1.4.

Bug Fixes
~~~~~~~~~~~~~~~~~~

- beehive-lab#470: README documentation fixed.
- beehive-lab#478: Fix the test names that are present in the white list.
- beehive-lab#488: FP64 kind for radians operations in the PTX backend fixed.
- beehive-lab#493: Tests Whitelist for PTX backend fixed.
- beehive-lab#502: Fix barrier type in the documentation regarding programmability of reductions.
- beehive-lab#514: Installer script fixed.
- beehive-lab#540: Fix issue with clean-up execution IDs function.
- beehive-lab#541: Fix Data Accessors for the prebuilt API.
- beehive-lab#543: Fix checkstyle condition and FP16 error message improved.
Labels
bug Something isn't working PTX
3 participants