Arm64/Sve: Predicated Abs, Predicated/UnPredicated Add, Conditional Select #100743
Note regarding the
@tannergooding - can you take another look? I have added a bunch of scenarios that use `ConditionalSelect()` on unary/binary operations, including:
I also tested the newly added test cases using https://github.com/a74nh/runtime/blob/api_github/sve_api/stress_tester.py and they all pass.
{
    assert(numArgs > 0);
    GenTree* op1 = retNode->AsHWIntrinsic()->Op(1);
    if (intrinsic == NI_Sve_ConditionalSelect)
Not important for this PR, but this is potentially something that should be handled in `gtNewSimdCndSelNode` instead, and then more generally as part of morph, to capture values that don't materialize as constants until later.
Will do it in a follow-up PR.
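For context, here is a minimal sketch of the kind of constant-mask folding the comment above seems to be about. It is not the JIT's code: `conditionalSelect`, `foldConstantMask`, and `Fold` are hypothetical names, and plain `std::vector` values stand in for SVE vectors and predicates. The only fact it relies on is the per-lane meaning of a select: lane `i` of the result is `trueVal[i]` when `mask[i]` is set and `falseVal[i]` otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference semantics of a per-lane select (hypothetical helper, not a JIT API).
std::vector<int> conditionalSelect(const std::vector<bool>& mask,
                                   const std::vector<int>&  trueVal,
                                   const std::vector<int>&  falseVal)
{
    assert(mask.size() == trueVal.size() && trueVal.size() == falseVal.size());
    std::vector<int> result(mask.size());
    for (std::size_t i = 0; i < mask.size(); i++)
    {
        result[i] = mask[i] ? trueVal[i] : falseVal[i];
    }
    return result;
}

// What a select with a *constant* mask can be folded to (illustrative only).
enum class Fold { TrueValue, FalseValue, None };

Fold foldConstantMask(const std::vector<bool>& mask)
{
    bool allSet = true;
    bool noneSet = true;
    for (bool lane : mask)
    {
        allSet  = allSet && lane;
        noneSet = noneSet && !lane;
    }
    if (allSet)  return Fold::TrueValue;  // select(allTrue, a, b)  == a
    if (noneSet) return Fold::FalseValue; // select(allFalse, a, b) == b
    return Fold::None;                    // mixed mask: keep the select
}
```

Doing such a fold only at import time misses masks that become constant after later optimizations, which is presumably why a later phase such as morph was suggested.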
I'm getting a failure with the
Backtrace:
Looks like
Also some (but not all) of the conditional tests are failing:
Thanks @a74nh - I will take a look.
@a74nh - The
Yes, it looks like the ConditionalSelect failures are 256-bit issues. The tests pass on a 256-bit machine when I restrict the vector length to 128 bits.
…elect (dotnet#100743)
* JIT ARM64-SVE: Add Sve.Abs() and Sve.Add()
  Change-Id: Ie8cfe828595da9a87adbc0857c0c44c0ce12f5b2
* Fix sve scaling in enitIns_R_S/S_R
* Revert "Fix sve scaling in enitIns_R_S/S_R"
  This reverts commit e9fa735.
* Fix sve scaling in enitIns_R_S/S_R
* Restore testing
* Use NaturalScale_helper for vector load/stores
* wip
* Add ConditionalSelect() APIs
* Handle ConditionalSelect in JIT
* Add test coverage
* Update the test cases
* jit format
* fix merge conflicts
* Make predicated/unpredicated work with ConditionalSelect
  Still some handling around RMW is needed, but this basically works
* Misc. changes
* jit format
* jit format
* Handle all the conditions correctly
* jit format
* fix some spacing
* Removed the assert
* fix the largest vector size to 64 to fix dotnet#100366
* review feedback
* wip
* Add SVE feature detection for Windows
* fix the check for invalid alignment
* Revert "Add SVE feature detection for Windows"
  This reverts commit ed7c781.
* Handle case where Abs() is wrapped in another conditionalSelect
* jit format
* fix the size comparison
* HW_Flag_MaskedPredicatedOnlyOperation
* Revert the change in emitarm64.cpp around INS_sve_ldr_mask/INS_sve_str_mask
* Fix the condition for lowering
* address review feedback for movprfx
* Move the special handling of Vector<>.Zero from lowerer to importer
* Rename IsEmbeddedMaskedOperation/IsOptionalEmbeddedMaskedOperation
* Add more test coverage for conditionalSelect
* Rename test method name
* Add more test coverage for conditionalSelect:Abs
* jit format
* Add logging on test methods
* Add the missing movprfx for abs
* Add few more scenarios where falseVal is zero
* Make sure LoadVector is marked as explicit needing mask
* revisit the codegen logic
* Remove commented code and add some other comments
* jit format
---------
Co-authored-by: Alan Hayward <alan.hayward@arm.com>
Based on the feedback from @tannergooding in #100134 (comment), I reworked #100134 a little bit and also included the `ConditionalSelect` I implemented in #100718.

The change in design is that we do not touch the HWIntrinsic node until the lowerer. In the lowerer, we wrap the node in `ConditionalSelect(mask, original_operation, falseVal)` and contain the original operation. This lets us easily determine which API should map to the predicated vs. unpredicated version of the instruction.

* For `Abs`, which only has a predicated version, the lowerer wraps it in `ConditionalSelect(ptrueAll, Abs, zero)`. During containment analysis for `ConditionalSelect`, we check whether the 2nd operand is a scalable SVE intrinsic and, if so, mark it as contained. During codegen, a `ConditionalSelect` whose `op2` is contained goes through the "predicated" version of the instruction.
* `Add`, which has both predicated and unpredicated versions, is handled differently. An unpredicated `Add` is not marked as contained, and codegen generates the unpredicated version. To generate the predicated version of `add`, the user has to write code in which `Add` is wrapped inside a conditional, e.g. `ConditionalSelect(mask, Add(x,y), b)`. In that case it follows the same path as `Abs` and generates the predicated version of the instruction.

Some handling still needs to be done for RMW, but I want to get feedback on the design before I move forward. The sample test case along with the output can be found in https://gist.github.com/kunalspathak/bc4e917ced68bef793d11fcbd050162c https://gist.github.com/kunalspathak/1fb0b17f0908ba26e46f0cd146ab05b8 .

I still need to handle cases where the user writes `ConditionalSelect(mask, Abs(x), y)`. Currently we end up expanding `Abs` with another `ConditionalSelect`.

Thanks @tannergooding for the design discussion and helping me understand the concepts.
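To make the flow in the description easier to follow, here is a toy model of the three steps (lowering wrap, containment check, codegen choice). It is only a sketch under stated assumptions: `Node`, `Kind`, `lower`, `checkContainment`, `codegenForm`, and `compile` are hypothetical names, not the JIT's `GenTree` APIs, and the mask and false-value operands are omitted for brevity.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for HWIntrinsic nodes; not the real GenTree types.
enum class Kind { Abs, Add, ConditionalSelect };

struct Node
{
    Kind                  kind;
    std::shared_ptr<Node> trueVal;           // op2 of a ConditionalSelect, otherwise null
    bool                  contained = false; // set by containment analysis
};
using NodePtr = std::shared_ptr<Node>;

NodePtr makeOp(Kind kind)
{
    NodePtr n = std::make_shared<Node>();
    n->kind = kind;
    return n;
}

NodePtr makeSelect(NodePtr trueVal)
{
    NodePtr n = makeOp(Kind::ConditionalSelect);
    n->trueVal = trueVal;
    return n;
}

// Assumption taken from the description: Abs is predicated-only, Add has both forms.
bool hasUnpredicatedForm(Kind kind) { return kind == Kind::Add; }

// Step 1 (lowering): a predicated-only operation gets wrapped; the wrapper
// stands in for ConditionalSelect(ptrueAll, op, zero).
NodePtr lower(NodePtr n)
{
    assert(n != nullptr);
    if (n->kind != Kind::ConditionalSelect && !hasUnpredicatedForm(n->kind))
    {
        return makeSelect(n);
    }
    return n;
}

// Step 2 (containment): mark op2 of a ConditionalSelect as contained when it
// is an SVE operation (here: anything that is not itself a select).
void checkContainment(const NodePtr& n)
{
    if (n->kind == Kind::ConditionalSelect && n->trueVal &&
        n->trueVal->kind != Kind::ConditionalSelect)
    {
        n->trueVal->contained = true;
    }
}

// Step 3 (codegen): a select with a contained op2 uses the predicated form.
std::string codegenForm(const NodePtr& n)
{
    if (n->kind == Kind::ConditionalSelect)
    {
        return (n->trueVal && n->trueVal->contained) ? "predicated" : "select";
    }
    return "unpredicated";
}

std::string compile(NodePtr root)
{
    NodePtr lowered = lower(root);
    checkContainment(lowered);
    return codegenForm(lowered);
}
```

In this model a bare `Abs` and a user-written `ConditionalSelect(mask, Add(x,y), b)` both take the predicated path, while a bare `Add` takes the unpredicated one. Because the model only lowers the root node, it does not reproduce the double-wrapping of `ConditionalSelect(mask, Abs(x), y)` mentioned above.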