-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARM64-SVE: Add checks for MOVPRFX sequencing (#105514) #106184
Conversation
@dotnet/arm64-contrib @kunalspathak |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested some changes. Also, can you also confirm all stress tests are passing?
src/coreclr/jit/emitarm64.cpp
Outdated
|
||
case IF_SVE_CC_2A: // <Zdn>.<T>, <V><m> | ||
case IF_SVE_FU_2A: // <Zda>.<T>, <Zn>.<T>, #<const> | ||
assert(secondId->idReg1() != secondId->idReg2()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was hoping to see lot more validation for registers of firstId
as listed in #100743 (comment) or something similar:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was hoping to see lot more validation for registers of firstId
That should be all at the bottom of the function. I'll double check that I have everything, because the text you have above is a much more verbose version than what I was using.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like you have most of the things covered. We need to add validation for:
- MOVPRFX cannot be the last instruction in a sequence
- The predicated instruction at PC+4 must use the merging predicate. (not sure if this is taken care by the format checks)
- MOVPRFX destination register MUST be used as the destination register in the instruction at PC+4, "and is not allowed to be used in any other position other than destructive input"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- MOVPRFX cannot be the last instruction in a sequence
Done.
- The predicated instruction at PC+4 must use the merging predicate. (not sure if this is taken care by the format checks)
Looking at all the formats, they are all /M or unpredicated. I've added some extra checking to make sure they correspond to the correct movprfx instruction.
- MOVPRFX destination register MUST be used as the destination register in the instruction at PC+4, "and is not allowed to be used in any other position other than destructive input"
That's done in the switch. Eg:
assert(secondId->idReg1() != secondId->idReg3());
I had to do it there because you need the format to know what types the registers are.
Change-Id: Ib78085c631ade5e923cddfda4def7efb362b4587
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some more comments on validation.
assert(firstId->idReg2() == secondId->idReg2()); | ||
|
||
// "and have the same maximum element size (ignoring a fixed 64-bit "wide vector" operand)," | ||
assert(firstId->idInsOpt() == secondId->idInsOpt()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be true for unpredicated version as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unpredicated movprfx
doesn't care about element sizes, it is just MOVPRFX <Zd>, <Zn>
.
Change-Id: I0a421eacf4895cfb2a30aaa6c483f4a7a3f17dd1
@a74nh I'm assuming you're getting a clean run now, right? |
Normal testing passes with tiered and untiered. Need to check stress testing
I'm not - but it's not due to this PR.
I'll raise a bug and take a look at it - I have a reasonable idea of what it might be. This PR shouldn't need to be blocked on it though. |
@@ -2560,7 +2571,7 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node) | |||
|
|||
HWIntrinsicImmOpHelper helper(this, intrin.op3, node); | |||
|
|||
for (helper.EmitBegin(); !helper.Done(); helper.EmitCaseEnd()) | |||
for (helper.EmitBegin(); !helper.Done(); helper.EmitCaseEnd(), (targetReg != op1Reg) ? 2 : 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this syntax doing? It looks like the result of the ternary operator isn't being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That change should have been on the line above - the HWIntrinsicImmOpHelper
constructor needs to take in the number of instructions that will be generated in each stage of the loop.
Got it, thanks! |
This was due to changes in #106125. I've fixed in this PR because it's related and simple enough - that PR changed the type of Also fixed up review comments. Just checking now that all stress testing passes now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if stress tests pass. Thanks!
Full stress testing for all APIs passing now. |
Change request is out-of-date
As per the Arm Arm,
MOVPRFX
instructions must be followed by an SVE instruction of specific type, otherwise behaviour is undefined.Add a function which checks that two instructions in sequence are valid. For now this only cares when the first instruction is a
MOVPRFX
, but the function could be expanded to check for any pairs, should we need to.Function is called after code generation in a loop which cycles through all
IG
s and all instructions.This ensures the stream is valid after peephole type optimisations.
Alternatively, this function could be called from
emitInsSanityCheck()
.EMIT_BACKWARDS_NAVIGATION
can then be used to get the previous instruction. However, defining this adds an extra variable toinstrDesc
, reducing the small immediate size from 7bits to 2bits.Alternatively, this function could be inside every
emitIns_
function, allowingemitLastIns
to be used for the previous instruction. However, that is a lot of call sites and some could easily be missed. In addition a peephole optimisation could still occur after this point.Currently requires #106125 for all SVE tests to pass.
AddSequentialAcross tests disabled until #106180 is fixed.
Resolves: #105514