Support vectors of float-16 values #372
…plemented functionality
…sult of an operation or a copy of the fields of an existing vector
…alf float vectors
… for each loadindexedvector if it is for a half float vector instead of assuming that all of them are if one is
Some testing:
A) OpenCL on the Intel HD Graphics:
tornado-test --threadInfo -V --jvm="-Dtornado.unittests.device=0:1" uk.ac.manchester.tornado.unittests.vectortypes.TestHalfFloats
WARNING: Using incubator modules: jdk.incubator.vector
Task info: s0.t0
Backend : OPENCL
Device : Intel(R) UHD Graphics 770 CL_DEVICE_TYPE_GPU (available)
Dims : 1
Global work offset: [0]
Global work size : [16]
Local work size : [16, 1, 1]
Number of workgroups : [1]
Test: class uk.ac.manchester.tornado.unittests.vectortypes.TestHalfFloats
Running test: vectorPhiTest ................ [PASS]
Running test: testSimpleDotProductHalf2 ................ [PASS]
Running test: testSimpleDotProductHalf3 ................ [PASS]
Running test: testSimpleDotProductHalf4 ................ [PASS]
Running test: testSimpleDotProductHalf8 ................ [PASS]
Running test: testSimpleDotProductHalf16 ................ [PASS]
Running test: testSimpleVectorAddition ................ [PASS]
Running test: testVectorHalf2 ................ [PASS]
Running test: testVectorHalf3 ................ [PASS]
Running test: testVectorFloat3toString ................ [PASS]
Running test: testVectorHalf4 ................ [PASS]
Running test: testVectorHalf16 ................ [PASS]
Running test: testVectorHalf8 ................ [PASS]
Running test: testVectorHalf8_Storage ................ [PASS]
Running test: testDotProduct ................ [PASS]
Running test: privateVectorHalf2 ................ [PASS]
Running test: privateVectorHalf4 ................ [PASS]
Running test: privateVectorHalf8 ................ [PASS]
Running test: testVectorHalf4_Unary ................ [PASS]
Running test: testInternalSetMethod01 ................ [PASS]
Running test: testInternalSetMethod02 ................ [PASS]
Running test: testInternalSetMethod03 ................ [PASS]
Running test: testInternalSetMethod04 ................ [PASS]
Running test: testAllocationIssue ................ [PASS]
B) SPIR-V Backend:
Task info: s0.t0
Backend : SPIRV
Device : SPIRV LevelZero - Intel(R) UHD Graphics 770 GPU
Dims : 1
Global work offset: [0]
Global work size : [16]
Local work size : [16, 1, 1]
Number of workgroups : [1]
Test: class uk.ac.manchester.tornado.unittests.vectortypes.TestHalfFloats
Running test: vectorPhiTest ................ [FAILED]
\_[REASON] expected:<8.0> but was:<1.0>
Running test: testSimpleDotProductHalf2 ................ [PASS]
Running test: testSimpleDotProductHalf3 ................ [PASS]
Running test: testSimpleDotProductHalf4 ................ [PASS]
Running test: testSimpleDotProductHalf8 ................ [PASS]
Running test: testSimpleDotProductHalf16 ................ [PASS]
Running test: testSimpleVectorAddition ................ [FAILED]
\_[REASON] expected:<4.0> but was:<1.0>
Running test: testVectorHalf2 ................ [FAILED]
\_[REASON] expected:<16.0> but was:<1.0>
Running test: testVectorHalf3 ................ [FAILED]
\_[REASON] expected:<8.0> but was:<1.0>
Running test: testVectorFloat3toString ................ [PASS]
Running test: testVectorHalf4 ................ [FAILED]
\_[REASON] expected:<8.0> but was:<1.0>
Running test: testVectorHalf16 ................ [FAILED]
\_[REASON] expected:<16.0> but was:<1.0>
Running test: testVectorHalf8 ................ [FAILED]
\_[REASON] expected:<8.0> but was:<1.0>
Running test: testVectorHalf8_Storage ................ [PASS]
Running test: testDotProduct ................ [PASS]
Running test: privateVectorHalf2 ................ [FAILED]
\_[REASON] expected:<120.0> but was:<1.0>
Running test: privateVectorHalf4 ................ [FAILED]
\_[REASON] expected:<120.0> but was:<1.0>
Running test: privateVectorHalf8 ................ [FAILED]
\_[REASON] expected:<120.0> but was:<1.0>
Running test: testVectorHalf4_Unary ................ [PASS]
Running test: testInternalSetMethod01 ................ [PASS]
Running test: testInternalSetMethod02 ................ [PASS]
Running test: testInternalSetMethod03 ................ [PASS]
Running test: testInternalSetMethod04 ................ [PASS]
Running test: testAllocationIssue ................ [PASS]
Test ran: 24, Failed: 10, Unsupported: 0
C) For the PTX backend:
tornado-test --threadInfo -V --jvm="-Dtornado.unittests.device=0:1" uk.ac.manchester.tornado.unittests.vectortypes.TestHalfFloats
WARNING: Using incubator modules: jdk.incubator.vector
Test: class uk.ac.manchester.tornado.unittests.vectortypes.TestHalfFloats
Running test: vectorPhiTest ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testSimpleDotProductHalf2 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testSimpleDotProductHalf3 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testSimpleDotProductHalf4 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testSimpleDotProductHalf8 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testSimpleDotProductHalf16 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testSimpleVectorAddition ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testVectorHalf2 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testVectorHalf3 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testVectorFloat3toString ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testVectorHalf4 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testVectorHalf16 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testVectorHalf8 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testVectorHalf8_Storage ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testDotProduct ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: privateVectorHalf2 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: privateVectorHalf4 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: privateVectorHalf8 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testVectorHalf4_Unary ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testInternalSetMethod01 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testInternalSetMethod02 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testInternalSetMethod03 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testInternalSetMethod04 ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Running test: testAllocationIssue ................ [FAILED]
\_[REASON] Index 1 out of bounds for length 1
Test ran: 24, Failed: 24, Unsupported: 0
Commit point: #24c971a95
Let's work on it together. We can start with the SPIR-V Backend.
I cannot reproduce these errors for some reason. These are the test results for the SPIR-V backend for me:
For PTX:
OK, let me check with an older CPU. I detected that some of the tests are not passing using > Intel 12th gen HD Graphics.
My mistake. The PTX tests are passing. The command I used was wrong. Let me work on the SPIR-V and see what I can spot.
It still fails with an older CPU. I am using the Intel compute runtime.
This did the trick for SPIR-V Half2 vectors:
diff --git a/tornado-drivers/spirv/src/main/java/uk/ac/manchester/tornado/drivers/spirv/graal/nodes/vector/VectorAddNode.java b/tornado-drivers/spirv/src/main/java/uk/ac/manchester/tornado/drivers/spirv/graal/nodes/vector/VectorAddNode.java
index 761e060ce..01e2c8ae5 100644
--- a/tornado-drivers/spirv/src/main/java/uk/ac/manchester/tornado/drivers/spirv/graal/nodes/vector/VectorAddNode.java
+++ b/tornado-drivers/spirv/src/main/java/uk/ac/manchester/tornado/drivers/spirv/graal/nodes/vector/VectorAddNode.java
@@ -13,7 +13,7 @@
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
@@ -95,6 +95,8 @@ public class VectorAddNode extends BinaryNode implements LIRLowerable, VectorOp
if (kind.getElementKind().isFloatingPoint()) {
binaryOp = SPIRVAssembler.SPIRVBinaryOp.ADD_FLOAT;
+ } else if (kind.isHalf()) {
+ binaryOp = SPIRVAssembler.SPIRVBinaryOp.ADD_FLOAT;
}
Let's replicate this change for all vector types. It might be a driver fix after all with new versions of the Intel compute runtime.
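One way to read the diff above is that the half vector kind was not caught by the floating-point check, so it never received the float add opcode. The toy model below illustrates that reading only. Every name in it is hypothetical and none of it is TornadoVM code; in particular, the integer default and the idea that the half kind is not flagged as floating point are assumptions the diff does not confirm.

```java
// Toy model only: these enums are hypothetical stand-ins, not TornadoVM classes.
final class VectorAddOpSketch {

    enum ElementKind {
        INT(false), FLOAT(true), HALF(false); // assumption: HALF is not flagged as floating point

        private final boolean floatingPoint;

        ElementKind(boolean floatingPoint) {
            this.floatingPoint = floatingPoint;
        }

        boolean isFloatingPoint() {
            return floatingPoint;
        }

        boolean isHalf() {
            return this == HALF;
        }
    }

    enum BinaryOp { ADD_INTEGER, ADD_FLOAT }

    // Modelled pre-patch selection: only the floating-point check exists.
    static BinaryOp selectBefore(ElementKind kind) {
        BinaryOp op = BinaryOp.ADD_INTEGER; // assumed default, not shown in the diff
        if (kind.isFloatingPoint()) {
            op = BinaryOp.ADD_FLOAT;
        }
        return op;
    }

    // Modelled post-patch selection: half kinds also take the float add.
    static BinaryOp selectAfter(ElementKind kind) {
        BinaryOp op = BinaryOp.ADD_INTEGER; // assumed default, not shown in the diff
        if (kind.isFloatingPoint()) {
            op = BinaryOp.ADD_FLOAT;
        } else if (kind.isHalf()) {
            op = BinaryOp.ADD_FLOAT;
        }
        return op;
    }

    public static void main(String[] args) {
        System.out.println(selectBefore(ElementKind.HALF)); // ADD_INTEGER in this model
        System.out.println(selectAfter(ElementKind.HALF));  // ADD_FLOAT in this model
    }
}
```

If the other vector nodes follow the same pattern, replicating the change would mean adding the equivalent half branch wherever the opcode is chosen.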
Cool, now it passes all new tests regarding FP16 with SPIR-V:
public static boolean isEqual(HalfFloatArray a, HalfFloatArray b) {
    boolean result = true;
    for (int i = 0; i < a.getSize() && result; i++) {
        result = compareBits(a.get(i).getHalfFloatValue(), b.get(i).getHalfFloatValue());
Shouldn't it be something like:
result = result & compareBits(a.get(i).getHalfFloatValue(), b.get(i).getHalfFloatValue());
This is a copy from the other isEqual methods we have in this class, just for HalfFloatArray data. If I change this one, should I change all the others as well?
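To compare the two loop shapes discussed in this thread, here is a minimal sketch. It uses plain `short[]` arrays of raw binary16 bits instead of `HalfFloatArray`, and a hypothetical `sameBits` helper instead of the PR's `compareBits`; it also assumes both arrays have the same length. Under those assumptions the two forms return the same answer, because the `&& result` guard stops the first loop as soon as a mismatch is seen.

```java
// Stand-in for the reviewed loop: short[] holds raw half-float bits.
// HalfFloatArray and compareBits from the PR are NOT used here.
final class HalfEqualitySketch {

    // Hypothetical replacement for compareBits: exact bit equality.
    static boolean sameBits(short x, short y) {
        return x == y;
    }

    // Shape of the loop in the PR: the "&& result" guard exits on the first mismatch.
    static boolean isEqualEarlyExit(short[] a, short[] b) {
        boolean result = true;
        for (int i = 0; i < a.length && result; i++) {
            result = sameBits(a[i], b[i]);
        }
        return result;
    }

    // Shape of the reviewer's suggestion: accumulate with & and scan every element.
    static boolean isEqualAccumulate(short[] a, short[] b) {
        boolean result = true;
        for (int i = 0; i < a.length; i++) {
            result = result & sameBits(a[i], b[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        short[] a = {0x3C00, 0x4000, 0x4200}; // 1.0, 2.0, 3.0 as binary16 bits
        short[] b = {0x3C00, 0x4400, 0x4200}; // middle element is 4.0 instead of 2.0
        System.out.println(isEqualEarlyExit(a, b));  // false
        System.out.println(isEqualAccumulate(a, b)); // false
    }
}
```

The difference is only in how much work is done after the first mismatch: the guarded loop stops, the accumulating loop keeps scanning.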
public final class VectorHalf implements TornadoCollectionInterface<ShortBuffer> {

    private static final int ELEMENT_SIZE = 1;
The element size is 2 bytes, correct?
tornado-api/src/main/java/uk/ac/manchester/tornado/api/types/collections/VectorHalf.java
    public static final Class<VectorHalf16> TYPE = VectorHalf16.class;

    private static final int ELEMENT_SIZE = 16;
So, I am confused now. Element size then indicates the number of Half elements, not the half size.
In this case, I suggest renaming this constant: ELEMENT_VECTOR_SIZE
Yes, makes sense. I kept it like that for consistency, because this is how this field is named in all the other vector collection classes. I was thinking of having a separate PR for refactoring all the vector classes at some point, but I can just rename this field for the new classes.
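The two meanings of "size" in this thread (lanes per vector versus bytes per half value) can be kept apart with a small sketch. The helper names below are hypothetical and not part of the TornadoVM API, and the layout assumes vectors are packed back to back with no padding, which may not hold for every vector width or backend.

```java
// Hypothetical helpers separating lane count from byte size; not TornadoVM API.
final class HalfVectorSizeSketch {

    // A binary16 value occupies 2 bytes, the same as a Java short.
    static final int HALF_BYTES = Short.BYTES;

    // Storage in bytes for one vector with the given number of half lanes.
    static int vectorBytes(int lanes) {
        return lanes * HALF_BYTES;
    }

    // Byte offset of vector number vectorIndex, assuming a packed layout without padding.
    static long vectorByteOffset(int vectorIndex, int lanes) {
        return (long) vectorIndex * vectorBytes(lanes);
    }

    public static void main(String[] args) {
        System.out.println(vectorBytes(16));        // 32: sixteen lanes of 2 bytes each
        System.out.println(vectorByteOffset(3, 8)); // 48: three 16-byte vectors precede it
    }
}
```

A constant set to 16 for a sixteen-lane vector is therefore a lane count, which is why a name such as ELEMENT_VECTOR_SIZE reads less ambiguously than ELEMENT_SIZE.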
tornado-api/src/main/java/uk/ac/manchester/tornado/api/types/vectors/Half2.java
.../java/uk/ac/manchester/tornado/drivers/opencl/graal/phases/TornadoHalfFloatVectorOffset.java
.../java/uk/ac/manchester/tornado/drivers/opencl/graal/phases/TornadoHalfFloatVectorOffset.java
...ain/java/uk/ac/manchester/tornado/drivers/ptx/graal/phases/TornadoHalfFloatVectorOffset.java
for (Node vectorElement : vectorValueNode.inputs()) {
    if (vectorElement instanceof VectorLoadElementNode) {
        VectorLoadElementNode vectorLoad = (VectorLoadElementNode) vectorElement;
        VectorLoadElementNode vectorLoadShort = new VectorLoadElementNode(SPIRVKind.OP_TYPE_FLOAT_16, vectorLoad.getVector(), vectorLoad.getLaneId());
In this case, FLOAT16 is used instead of SHORT, which is what we saw in the OpenCL backend.
...n/java/uk/ac/manchester/tornado/drivers/spirv/graal/phases/TornadoHalfFloatVectorOffset.java
… apply minor compiler fixes for the new unittest
…n and remove div function
On a second review, LGTM. I do not have access to the SPIR-V backend on macOS. I will check the latest changes by Monday on my other laptop.
@@ -0,0 +1,11 @@
package uk.ac.manchester.tornado.api.internal.annotations;
Add License Header
LeftShiftNode leftShiftNode = index.inputs().filter(LeftShiftNode.class).first();
ConstantNode currentOffset = leftShiftNode.inputs().filter(ConstantNode.class).first();
// if the shifting is by 3 (for float values)
if (currentOffset.getValue().toValueString().equals("3")) {
Why is the shift by 3 for a float value? Can we generalize this?
The comment above is wrong. The shift is not because of float types; it is because the JavaKind for half is Object (8 bytes). This was done because otherwise we were having issues with the stamp. I will update the comment to reflect that.
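The arithmetic behind this exchange is that a left shift by k multiplies an index by 2^k, so a shift by 3 corresponds to 8-byte elements, while a 2-byte half value would need a shift by 1 (and a 4-byte float a shift by 2). The sketch below shows one possible generalisation that derives the shift from the element width; the helper names are hypothetical and this is not how the TornadoVM phase is written.

```java
// Hypothetical helpers relating element width to index shift; not TornadoVM code.
final class IndexShiftSketch {

    // Shift amount for a power-of-two element width in bytes.
    static int shiftFor(int elementBytes) {
        if (elementBytes <= 0 || Integer.bitCount(elementBytes) != 1) {
            throw new IllegalArgumentException("element width must be a positive power of two");
        }
        return Integer.numberOfTrailingZeros(elementBytes);
    }

    // Byte offset of element number index for the given element width.
    static long byteOffset(long index, int elementBytes) {
        return index << shiftFor(elementBytes);
    }

    public static void main(String[] args) {
        System.out.println(shiftFor(8));      // 3: the Object-sized (8-byte) case from the review
        System.out.println(shiftFor(2));      // 1: a 2-byte half-float value
        System.out.println(byteOffset(5, 8)); // 40
        System.out.println(byteOffset(5, 2)); // 10
    }
}
```

Deriving the shift this way avoids comparing against the literal string "3", although whether that fits the compiler phase depends on details the thread does not show.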
Minor comments
LGTM
Improvements
~~~~~~~~~~~~~~~~~~

- [beehive-lab#369](beehive-lab#369): Introduction of Tensor types in TornadoVM API and interoperability with ONNX Runtime.
- [beehive-lab#370](beehive-lab#370): Array concatenation operation for TornadoVM native arrays.
- [beehive-lab#371](beehive-lab#371): TornadoVM installer script ported for Windows 10/11.
- [beehive-lab#372](beehive-lab#372): Add support for ``HalfFloat`` (``Float16``) in vector types.
- [beehive-lab#374](beehive-lab#374): Support for TornadoVM array concatenations from the constructor-level.
- [beehive-lab#375](beehive-lab#375): Support for TornadoVM native arrays using slices from the Panama API.
- [beehive-lab#376](beehive-lab#376): Support for lazy copy-outs in the batch processing mode.
- [beehive-lab#377](beehive-lab#377): Expand the TornadoVM profiler with power metrics for NVIDIA GPUs (OpenCL and PTX backends).
- [beehive-lab#384](beehive-lab#384): Auto-closable Execution Plans for automatic memory management.

Compatibility
~~~~~~~~~~~~~~~~~~

- [beehive-lab#386](beehive-lab#386): OpenJDK 17 support removed.
- [beehive-lab#390](beehive-lab#390): SapMachine OpenJDK 21 supported.
- [beehive-lab#395](beehive-lab#395): OpenJDK 22 and GraalVM 22.0.1 supported.
- TornadoVM tested with Apple M3 chips.

Bug Fixes
~~~~~~~~~~~~~~~~~~

- [beehive-lab#367](beehive-lab#367): Fix for Graal/Truffle languages in which some Java modules were not visible.
- [beehive-lab#373](beehive-lab#373): Fix for data copies of the ``HalfFloat`` types for all backends.
- [beehive-lab#378](beehive-lab#378): Fix free memory markers when running multi-thread execution plans.
- [beehive-lab#379](beehive-lab#379): Refactoring package of vector api unit-tests.
- [beehive-lab#380](beehive-lab#380): Fix event list sizes to accommodate profiling of large applications.
- [beehive-lab#385](beehive-lab#385): Fix code check style.
- [beehive-lab#387](beehive-lab#387): Fix TornadoVM internal events in OpenCL, SPIR-V and PTX for running multi-threaded execution plans.
- [beehive-lab#388](beehive-lab#388): Fix of expected and actual values of tests.
- [beehive-lab#392](beehive-lab#392): Fix installer for using existing JDKs.
- [beehive-lab#389](beehive-lab#389): Fix ``DataObjectState`` for multi-thread execution plans.
- [beehive-lab#396](beehive-lab#396): Fix JNI code for the CUDA NVML library access with OpenCL.
Description
This PR provides support for vectors containing half-float values.
Mark the backends affected by this PR.
OS tested
Mark the OS where this PR is tested.
Did you check on FPGAs?
If it is applicable, check your changes on FPGAs.
How to test the new patch?
Run
tornado-test -V uk.ac.manchester.tornado.unittests.vectortypes.TestHalfFloats