Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support conditional copy-by-reference in local memory #462

Merged
merged 23 commits into from
Jun 26, 2024

Conversation

mairooni
Copy link
Collaborator

@mairooni mairooni commented Jun 25, 2024

Description

This PR adds support for conditional copy-by-reference in local memory.

Problem description

If a plain copy-by-reference occurs in local memory, graal updates the references automatically. For example, the code below is legal and functionally correct:

public static void intPrivateCopyNoCondition(IntArray a, IntArray b) {
        for (@Parallel int i = 0; i < a.getSize(); i++) {
            int[] arrayA = new int[128];
            int[] arrayB = new int[128];
            for (int j = 0; j < 128; j++) {
                arrayA[j] = j;
                arrayB[j] = 2;
            }

            // copy arrayA to arrayB
            arrayB = arrayA;

            b.set(i, arrayB[i]);
        }
}

However, if the copy takes place depending on a condition, e.g.:

 if ((a.get(i) % 2) == 0 ) {
       // if condition is true, copy arrayA to arrayB
        arrayB = arrayA;
 }

Tornado produces a compiler error.
This PR adds the necessary functionality to make this conditional copy-by-reference legal. This patch is for OpenCL and PTX, SPIR-V needs further investigation.

Backend/s tested

Mark the backends affected by this PR.

  • OpenCL
  • PTX
  • SPIRV

OS tested

Mark the OS where this PR is tested.

  • Linux
  • OSx
  • Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

  • Yes
  • No

How to test the new patch?

Run
tornado-test -V --fast uk.ac.manchester.tornado.unittests.arrays.TestArrayCopies

@mairooni mairooni requested review from jjfumero and mikepapadim June 25, 2024 13:44
@mairooni mairooni self-assigned this Jun 25, 2024
@jjfumero jjfumero added feature New feature proposal and removed fix Provides a fix labels Jun 26, 2024
@jjfumero
Copy link
Member

[Review in progress] Test are passing for OpenCL and PTX.


protected void run(StructuredGraph graph) {
for (ValuePhiNode phiNode : graph.getNodes().filter(ValuePhiNode.class)) {
if (phiNode.usages().filter(OffsetAddressNode.class).isNotEmpty() && phiNode.values().filter(FixedArrayNode.class).isNotEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a method that returns true or false using these condition.

// finally, since we know that the data accessed is a fixed array, fix the offset
ValuePhiNode privateIndex = getPrivateArrayIndex(offsetAddressNode.getOffset());
offsetAddressNode.setOffset(privateIndex);
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the break, we want to explore all copies within the kernel.

offsetAddressNode.replaceFirstInput(phiNode, fixedArrayCopyNode);
// finally, since we know that the data accessed is a fixed array, fix the offset
ValuePhiNode privateIndex = getPrivateArrayIndex(offsetAddressNode.getOffset());
offsetAddressNode.setOffset(privateIndex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can privateIndex be null?

result = tool.newVariable(lirKind);
tool.append(new PTXLIRStmt.AssignStmt(result, new PTXUnary.Expr(PTXAssembler.PTXUnaryOp.CVT_FLOAT, lirKind, baseAddress)));
declarationPtr = new PTXBinary.Expr(pointerCopyTemplate, lirKind, result, baseAddress);
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the formatter

offsetAddressNode.safeDelete();
}
phiNodeToReplace = phiNode;
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the break to explore all copies.

deleteFixed(readNode);
offsetAddressNode.safeDelete();
}
phiNodeToReplace = phiNode;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace this with a LinkedList of phoNodes to be replaced after the for loop

PTXArchitecture.PTXMemoryBase oclMemoryBase = fixedArrayNode.getMemoryRegister();
OffsetAddressNode offsetAddressNode = phiNode.usages().filter(OffsetAddressNode.class).first();
ValuePhiNode privateIndex = getPrivateArrayIndex(offsetAddressNode.getOffset());
FixedArrayCopyNode fixedArrayCopyNode = new FixedArrayCopyNode(phiNode, resolvedJavaType, oclMemoryBase, privateIndex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check privateIndex is not null.

@@ -0,0 +1,276 @@
package uk.ac.manchester.tornado.unittests.arrays;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add license

@jjfumero
Copy link
Member

Throw exception when this type of code is executed for the SPIR-V backend. This can be done within the a compilation phase.

@mairooni
Copy link
Collaborator Author

The comments have been addressed. @jjfumero please have another look.

@jjfumero
Copy link
Member

Great. Thanks Mary!

Copy link
Member

@mikepapadim mikepapadim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @mairooni

Some minor refactoring to improve code with pattern matching if you have time

Comment on lines 259 to 276
public static OCLAssembler.OCLBinaryTemplate resolvePrivatePointerTemplate(JavaKind type) {
if (type == JavaKind.Int) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_INT_ARRAY_PTR;
} else if (type == JavaKind.Double) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_DOUBLE_ARRAY_PTR;
} else if (type == JavaKind.Float) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_FLOAT_ARRAY_PTR;
} else if (type == JavaKind.Short) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_SHORT_ARRAY_PTR;
} else if (type == JavaKind.Long) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_LONG_ARRAY_PTR;
} else if (type == JavaKind.Char) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_CHAR_ARRAY_PTR;
} else if (type == JavaKind.Byte) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_BYTE_ARRAY_PTR;
}
return null;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public static OCLAssembler.OCLBinaryTemplate resolvePrivatePointerTemplate(JavaKind type) {
if (type == JavaKind.Int) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_INT_ARRAY_PTR;
} else if (type == JavaKind.Double) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_DOUBLE_ARRAY_PTR;
} else if (type == JavaKind.Float) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_FLOAT_ARRAY_PTR;
} else if (type == JavaKind.Short) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_SHORT_ARRAY_PTR;
} else if (type == JavaKind.Long) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_LONG_ARRAY_PTR;
} else if (type == JavaKind.Char) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_CHAR_ARRAY_PTR;
} else if (type == JavaKind.Byte) {
return OCLAssembler.OCLBinaryTemplate.PRIVATE_BYTE_ARRAY_PTR;
}
return null;
}
public static OCLAssembler.OCLBinaryTemplate resolvePrivatePointerTemplate(JavaKind type) {
return switch (type) {
case Int -> OCLAssembler.OCLBinaryTemplate.PRIVATE_INT_ARRAY_PTR;
case Double -> OCLAssembler.OCLBinaryTemplate.PRIVATE_DOUBLE_ARRAY_PTR;
case Float -> OCLAssembler.OCLBinaryTemplate.PRIVATE_FLOAT_ARRAY_PTR;
case Short -> OCLAssembler.OCLBinaryTemplate.PRIVATE_SHORT_ARRAY_PTR;
case Long -> OCLAssembler.OCLBinaryTemplate.PRIVATE_LONG_ARRAY_PTR;
case Char -> OCLAssembler.OCLBinaryTemplate.PRIVATE_CHAR_ARRAY_PTR;
case Byte -> OCLAssembler.OCLBinaryTemplate.PRIVATE_BYTE_ARRAY_PTR;
default -> null;
};
}
I think can be done with pattern matching for simplicity

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this suggestion is correct.

return null;
}

public static OCLAssembler.OCLBinaryTemplate resolvePrivatePointerCopyTemplate(JavaKind type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

return resolvePrivatePointerCopyTemplate(type.getJavaKind());
}

public static PTXAssembler.PTXBinaryTemplate resolvePrivatePointerCopyTemplate(JavaKind type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here maybe also you can use pattern matching

return memoryRegister;
}

private int getOffset() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private int getOffset() {
private int getOffset() {
PTXKind kind = PTXKind.fromResolvedJavaType(elementType);
return switch (kind) {
case F64, S64 -> 8;
case F32, S32 -> 4;
case S16 -> 2;
case U16, S8 -> 1;
default -> -1;
};
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@mairooni
Copy link
Collaborator Author

The refactoring has been applied, thanks @mikepapadim!

@jjfumero
Copy link
Member

Great, so we merge this.

@jjfumero jjfumero merged commit 1be5451 into beehive-lab:develop Jun 26, 2024
2 checks passed
@mairooni mairooni deleted the feat/pointer_copies branch June 26, 2024 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging this pull request may close these issues.

3 participants