-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NFC][PowerPC] Add test to check lanemasks for subregisters. #94363
Conversation
@llvm/pr-subscribers-backend-powerpc Author: Stefan Pintilie (stefanp-ibm) ChangesThis change adds a test case to check the lane masks for a varitey of Full diff: https://github.com/llvm/llvm-project/pull/94363.diff 1 Files Affected:
diff --git a/llvm/test/CodeGen/PowerPC/subreg-lanemasks.mir b/llvm/test/CodeGen/PowerPC/subreg-lanemasks.mir
new file mode 100644
index 0000000000000..e0bef0ece116e
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/subreg-lanemasks.mir
@@ -0,0 +1,67 @@
+# RUN: llc -mcpu=pwr10 -O3 -ppc-track-subreg-liveness -verify-machineinstrs \
+# RUN: -mtriple=powerpc64le-unknown-linux-gnu -run-pass=greedy,virtregrewriter \
+# RUN: -debug-only=regalloc -o - %s 2>&1 >/dev/null | FileCheck %s
+
+# Keep track of all of the lanemasks for various subregsiters.
+#
+# TODO: The mask for %6.sub_vsx1:accrc is the same as the mask for %10.sub_vsx1_then_sub_64:accrc
+# even though one is a 128 bit register and the other is a 64 bit subregister. This should
+# be fixed in a later patch.
+#
+# CHECK: %3 [80r,80d:0) 0@80r L0000000000000004 [80r,80d:0) 0@80r weight:0.000000e+00
+# CHECK: %4 [96r,96d:0) 0@96r L0000000000000800 [96r,96d:0) 0@96r weight:0.000000e+00
+# CHECK: %5 [112r,112d:0) 0@112r L0000000000000004 [112r,112d:0) 0@112r weight:0.000000e+00
+# CHECK: %6 [128r,128d:0) 0@128r L0000000000000800 [128r,128d:0) 0@128r weight:0.000000e+00
+# CHECK: %7 [144r,144d:0) 0@144r L0000000000000004 [144r,144d:0) 0@144r weight:0.000000e+00
+# CHECK: %8 [160r,160d:0) 0@160r L0000000000000800 [160r,160d:0) 0@160r weight:0.000000e+00
+# CHECK: %9 [176r,176d:0) 0@176r L0000000000000004 [176r,176d:0) 0@176r weight:0.000000e+00
+# CHECK: %10 [192r,192d:0) 0@192r L0000000000000800 [192r,192d:0) 0@192r weight:0.000000e+00
+# CHECK: %11 [208r,208d:0) 0@208r L0000000000001000 [208r,208d:0) 0@208r weight:0.000000e+00
+# CHECK: %12 [224r,224d:0) 0@224r L0000000000002000 [224r,224d:0) 0@224r weight:0.000000e+00
+# CHECK: %13 [240r,240d:0) 0@240r L0000000000000804 [240r,240d:0) 0@240r weight:0.000000e+00
+# CHECK: %14 [256r,256d:0) 0@256r L0000000000003000 [256r,256d:0) 0@256r weight:0.000000e+00
+
+
+# CHECK: 0B bb.0
+# CHECK-NEXT: liveins
+# CHECK-NEXT: 16B %0:vsrc = COPY $v2
+# CHECK-NEXT: 32B %float:fprc = COPY %0.sub_64:vsrc
+# CHECK-NEXT: 48B dead undef %pair.sub_vsx0:vsrprc = COPY $v2
+# CHECK-NEXT: 64B undef %15.sub_vsx1:vsrprc = COPY $v3
+# CHECK-NEXT: 80B dead undef %3.sub_vsx0:vsrprc = COPY %0:vsrc
+# CHECK-NEXT: 96B dead undef %4.sub_vsx1:vsrprc = COPY %0:vsrc
+# CHECK-NEXT: 112B dead undef %5.sub_vsx0:accrc = COPY %0:vsrc
+# CHECK-NEXT: 128B dead undef %6.sub_vsx1:accrc = COPY %0:vsrc
+# CHECK-NEXT: 144B dead undef %7.sub_64:vsrprc = COPY %float:fprc
+# CHECK-NEXT: 160B dead undef %8.sub_vsx1_then_sub_64:vsrprc = COPY %float:fprc
+# CHECK-NEXT: 176B dead undef %9.sub_64:accrc = COPY %float:fprc
+# CHECK-NEXT: 192B dead undef %10.sub_vsx1_then_sub_64:accrc = COPY %float:fprc
+# CHECK-NEXT: 208B dead undef %11.sub_pair1_then_sub_64:accrc = COPY %float:fprc
+# CHECK-NEXT: 224B dead undef %12.sub_pair1_then_sub_vsx1_then_sub_64:accrc = COPY %float:fprc
+# CHECK-NEXT: 240B dead undef %13.sub_pair0:accrc = COPY %15:vsrprc
+# CHECK-NEXT: 256B dead undef %14.sub_pair1:accrc = COPY %15:vsrprc
+
+
+---
+name: test
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $v2, $v3
+ %0:vsrc = COPY $v2
+ %float:fprc = COPY %0.sub_64
+ undef %pair.sub_vsx0:vsrprc = COPY $v2
+ undef %pair.sub_vsx1:vsrprc = COPY $v3
+ undef %1.sub_vsx0:vsrprc = COPY %0
+ undef %2.sub_vsx1:vsrprc = COPY %0
+ undef %3.sub_vsx0:accrc = COPY %0
+ undef %4.sub_vsx1:accrc = COPY %0
+ undef %5.sub_64:vsrprc = COPY %float
+ undef %6.sub_vsx1_then_sub_64:vsrprc = COPY %float
+ undef %7.sub_64:accrc = COPY %float
+ undef %8.sub_vsx1_then_sub_64:accrc = COPY %float
+ undef %9.sub_pair1_then_sub_64:accrc = COPY %float
+ undef %10.sub_pair1_then_sub_vsx1_then_sub_64:accrc = COPY %float
+ undef %11.sub_pair0:accrc = COPY %pair
+ undef %12.sub_pair1:accrc = COPY %pair
+...
|
@@ -0,0 +1,67 @@ | |||
# RUN: llc -mcpu=pwr10 -O3 -ppc-track-subreg-liveness -verify-machineinstrs \ | |||
# RUN: -mtriple=powerpc64le-unknown-linux-gnu -run-pass=greedy,virtregrewriter \ | |||
# RUN: -debug-only=regalloc -o - %s 2>&1 >/dev/null | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use -filetype=null instead of outputting to /dev/null. Also don't need -O3 or -verify-machineinstrs.
Also needs REQUIRES: asserts if you're checking the debug output
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for looking at this patch.
I've made the changes that you mentioned.
9edda06
to
42a8eb3
Compare
# TODO: The mask for %6.sub_vsx1:accrc is the same as the mask for %10.sub_vsx1_then_sub_64:accrc | ||
# even though one is a 128 bit register and the other is a 64 bit subregister. This should |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just because the bitwidths are different doesn't mean this is wrong. They would have the same lane mask if one was half of the other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to change the comment before I commit. I will shorten it to say that on PPC we want the mask for %6.sub_vsx1:accrc
and %10.sub_vsx1_then_sub_64:accrc
to be different.
However, I'm confused about how this is supposed to work. On PPC we have something like this:
| ------------- VSRL 128 Bits ------------ |
| --- FPR 64 Bit --- | - Not Used 64 Bit - |
The VSRL is marked as sub_vsx1:accrc
while the FPR is marked as sub_vsx1_then_sub_64:accrc
.
At one point we end up needing a COPY for a live range split and based on the lane masks we end up copying only the first 64 bits which leads to incorrect results. The solution on our end is to add a phony register for the section that is not used and this then creates different lane masks for the two subregisters. Is this how we are supposed to approach this issue? Is it not an issue in general when we generate COPY instrs that two subregsiters have different sizes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never really understood the generated subregister index logic. I'm not sure what "VSRL is marked as sub_vsx1:accrc" means
Removed a couple of options from the run line that were not required. Added the REQUIRES: asserts.
b265831
to
c7d3b76
Compare
This breaks tests: http://45.33.8.238/linux/140640/step_11.txt Please take a look and revert for now if it takes a while to fix. |
Also broke sanitizer bots: https://lab.llvm.org/buildbot/#/builders/169/builds/40 |
This change adds a test case to check the lane masks for a varitey of
subregisters.