-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Thumb] Resolve FIXME: Use 'mov hi, $src; mov $dst, hi' #81908
Conversation
@llvm/pr-subscribers-backend-arm Author: AtariDreams (AtariDreams) ChangesFull diff: https://github.com/llvm/llvm-project/pull/81908.diff 1 Files Affected:
diff --git a/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp b/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
index 85eabdb17ad190..01b8c76674d207 100644
--- a/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
+++ b/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
@@ -53,9 +53,6 @@ void Thumb1InstrInfo::copyPhysReg(MachineBasicBlock &MBB,
.addReg(SrcReg, getKillRegState(KillSrc))
.add(predOps(ARMCC::AL));
else {
- // FIXME: Can also use 'mov hi, $src; mov $dst, hi',
- // with hi as either r10 or r11.
-
const TargetRegisterInfo *RegInfo = st.getRegisterInfo();
if (MBB.computeRegisterLiveness(RegInfo, ARM::CPSR, I)
== MachineBasicBlock::LQR_Dead) {
@@ -65,6 +62,35 @@ void Thumb1InstrInfo::copyPhysReg(MachineBasicBlock &MBB,
return;
}
+ // Can also use 'mov hi, $src; mov $dst, hi',
+ // with hi as either r10 or r11.
+ if (MBB.computeRegisterLiveness(RegInfo, ARM::R10, I) ==
+ MachineBasicBlock::LQR_Dead) {
+ // Use high register to move source to destination
+ BuildMI(MBB, I, DL, get(ARM::tMOVr), ARM::R10)
+ .addReg(SrcReg, getKillRegState(KillSrc))
+ .add(predOps(ARMCC::AL));
+ BuildMI(MBB, I, DL, get(ARM::tMOVr), DestReg)
+ .addReg(ARM::R10, getKillRegState(KillSrc))
+ .add(predOps(ARMCC::AL))
+ ->addRegisterDead(ARM::R10, RegInfo);
+ return;
+ }
+
+ if (MBB.computeRegisterLiveness(RegInfo, ARM::R11, I) ==
+ MachineBasicBlock::LQR_Dead) {
+ // Use high register to move source to destination
+ BuildMI(MBB, I, DL, get(ARM::tMOVr), ARM::R11)
+ .addReg(SrcReg, getKillRegState(KillSrc))
+ .add(predOps(ARMCC::AL));
+ BuildMI(MBB, I, DL, get(ARM::tMOVr), DestReg)
+ .addReg(ARM::R11, getKillRegState(KillSrc))
+ .add(predOps(ARMCC::AL))
+ ->addRegisterDead(ARM::R11, RegInfo);
+ ;
+ return;
+ }
+
// 'MOV lo, lo' is unpredictable on < v6, so use the stack to do it
BuildMI(MBB, I, DL, get(ARM::tPUSH))
.add(predOps(ARMCC::AL))
|
|
b34cf5b
to
3382508
Compare
Full source if anyone is curious: |
(See https://lists.llvm.org/pipermail/llvm-dev/2014-August/075967.html for additional context.) r10 and r11 are callee-save, so you'd need to make sure they get spilled. We have code to do the spilling, but it's not triggering here. I guess this code runs too late for that to trigger? |
Pretty sure only r12 is, hence why this is not an option. |
r10 - Caller save (in ARM mode, unused in Thumb mode) |
r10 and r11 are callee-save, so you'd need to make sure they get spilled I only think this is an issue if the compiler were to randomly switch between thumb and arm mode and vice versa mid function, which is a thing that only really happens in handwritten code anyway. |
Pretty sure that's just wrong (or referring to some other calling convention). From the specification (https://github.com/ARM-software/abi-aa/releases):
|
2ddc5ab
to
fbd4a85
Compare
@efriedma-quic Okay, I fixed the issues! |
ed0251c
to
f3cdd9c
Compare
For r10 and r11, as the tests show, they're almost never available; the compiler won't naturally spill them. You can force them to spill using inline asm, but that's so rare it's not really relevant. So I don't really see the point of checking, unless you're planning some sort of followup. For r12... this is probably okay? We don't use r12 in the scavenger anymore (see ab1d73e). We still use R12 as a scratch register in a few different places, but not in a way that would conflict, I think. It would be helpful if you could take some time to verify that. If there's some way r12 can actually be live (inline asm?), we should make sure there's at least one test that actually still exercises the code to generate push/pop. |
1fb69eb
to
7b6cca2
Compare
@efriedma-quic Ping? |
a3d675b
to
168ac85
Compare
@topperc Thoughts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks pretty sensible. Couple of comments.
@TNorthover Addressed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I think this looks good now.
ad1712f
to
9ab332a
Compare
Consider the following: ldr r0, [r4] ldr r7, [r0, #4] cmp r7, r3 bhi .LBB0_6 cmp r0, r2 push {r0} pop {r4} bne .LBB0_3 movs r0, r6 pop {r4, r5, r6, r7} pop {r1} bx r1 Here is a snippet of the generated THUMB1 code of the K&R malloc function that clang currently compiles to. push {r0} ends up being popped to pop {r4}. movs r4, r0 would destroy the flags set by cmp right above. The compiler has no alternative in this case, except one: the only alternative is to transfer through a high register. However, it seems like LLVM does not consider that this is a valid approach, even though it is a free clobbering a high register. This patch addresses the FIXME so the compiler can do that when it can in r10 or r11, or r12.
Consider the following:
Here is a snippet of the generated THUMB1 code of the K&R malloc function that clang currently compiles to.
push {r0} ends up being popped to pop {r4}.
movs r4, r0 would destroy the flags set by cmp right above.
The compiler has no alternative in this case, except one:
the only alternative is to transfer through a high register.
However, it seems like LLVM does not consider that this is a valid approach, even though it is a free clobbering a high register.
This patch addresses the FIXME so the compiler can do that when it can in r10 or r11, or r12.