[AMDGPU] Tidy SIPreAllocateWWMRegs after recent changes (NFCI) #111967

perlfu · 2024-10-11T09:09:49Z

V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM.
After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.

- V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM. - After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.

llvmbot · 2024-10-11T09:10:22Z

@llvm/pr-subscribers-backend-amdgpu

Author: Carl Ritson (perlfu)

Changes

V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM.
After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.

Full diff: https://github.com/llvm/llvm-project/pull/111967.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp (+3-6)
(modified) llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir (+4)

diff --git a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
index 3bf2ea0f9e53ef..b17fb7e70826b5 100644
--- a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
+++ b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
@@ -215,13 +215,10 @@ bool SIPreAllocateWWMRegs::runOnMachineFunction(MachineFunction &MF) {
   for (MachineBasicBlock *MBB : RPOT) {
     bool InWWM = false;
     for (MachineInstr &MI : *MBB) {
-      if (MI.getOpcode() == AMDGPU::V_SET_INACTIVE_B32)
-        RegsAssigned |= processDef(MI.getOperand(0));
-
       if (MI.getOpcode() == AMDGPU::SI_SPILL_S32_TO_VGPR) {
-        if (!PreallocateSGPRSpillVGPRs)
-          continue;
-        RegsAssigned |= processDef(MI.getOperand(0));
+        if (PreallocateSGPRSpillVGPRs)
+          RegsAssigned |= processDef(MI.getOperand(0));
+        continue;
       }
 
       if (MI.getOpcode() == AMDGPU::ENTER_STRICT_WWM ||
diff --git a/llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir b/llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
index bb0a707a7c90b7..94cbe568a6a447 100644
--- a/llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
+++ b/llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
@@ -111,7 +111,9 @@ body:             |
   ; GCN-NEXT:   $sgpr22 = IMPLICIT_DEF
   ; GCN-NEXT:   $vgpr5 = IMPLICIT_DEF
   ; GCN-NEXT:   $vgpr5 = SI_SPILL_S32_TO_VGPR $sgpr22, 0, killed $vgpr5
+  ; GCN-NEXT:   $sgpr0 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
   ; GCN-NEXT:   dead $vgpr1 = V_SET_INACTIVE_B32 0, $vgpr0, 0, 0, $sgpr_null, implicit $exec, implicit-def $scc
+  ; GCN-NEXT:   $exec_lo = EXIT_STRICT_WWM $sgpr0
   ; GCN-NEXT: {{  $}}
   ; GCN-NEXT: bb.1:
   ; GCN-NEXT:   successors: %bb.2(0x80000000)
@@ -211,7 +213,9 @@ body:             |
 
     $sgpr22 = IMPLICIT_DEF
     SI_SPILL_S32_SAVE $sgpr22, %stack.0, implicit $exec, implicit $sgpr32 :: (store (s32) into %stack.0, addrspace 5)
+    $sgpr0 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
     %0:vgpr_32 = V_SET_INACTIVE_B32 0, $vgpr0, 0, 0, $sgpr_null, implicit $exec, implicit-def $scc
+    $exec_lo = EXIT_STRICT_WWM $sgpr0
 
   bb.1:
     KILL implicit-def $vcc, implicit-def $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7_sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15_sgpr16_sgpr17_sgpr18_sgpr19_sgpr20_sgpr21_sgpr22_sgpr23_sgpr24_sgpr25_sgpr26_sgpr27_sgpr28_sgpr29_sgpr30_sgpr31, implicit-def $sgpr32_sgpr33_sgpr34_sgpr35_sgpr36_sgpr37_sgpr38_sgpr39_sgpr40_sgpr41_sgpr42_sgpr43_sgpr44_sgpr45_sgpr46_sgpr47_sgpr48_sgpr49_sgpr50_sgpr51_sgpr52_sgpr53_sgpr54_sgpr55_sgpr56_sgpr57_sgpr58_sgpr59_sgpr60_sgpr61_sgpr62_sgpr63, implicit-def $sgpr64_sgpr65_sgpr66_sgpr67_sgpr68_sgpr69_sgpr70_sgpr71_sgpr72_sgpr73_sgpr74_sgpr75_sgpr76_sgpr77_sgpr78_sgpr79_sgpr80_sgpr81_sgpr82_sgpr83_sgpr84_sgpr85_sgpr86_sgpr87_sgpr88_sgpr89_sgpr90_sgpr91_sgpr92_sgpr93_sgpr94_sgpr95, implicit-def $sgpr96_sgpr97_sgpr98_sgpr99_sgpr100_sgpr101_sgpr102_sgpr103

jayfoad · 2024-10-11T09:14:04Z

llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir

@@ -111,7 +111,9 @@ body:             |
  ; GCN-NEXT:   $sgpr22 = IMPLICIT_DEF
  ; GCN-NEXT:   $vgpr5 = IMPLICIT_DEF
  ; GCN-NEXT:   $vgpr5 = SI_SPILL_S32_TO_VGPR $sgpr22, 0, killed $vgpr5
+  ; GCN-NEXT:   $sgpr0 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec


How can the patch be NFCI if it is introducing these ENTER/EXIT MIs? Is there some reason they have no effect on the final codegen?

MIR test contained V_SET_INACTIVE outside of WWM, which is something we shouldn't be generating.
This updates the test input to match expected code gen.

…111967) - V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM. - After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.

[AMDGPU] Tidy SIPreAllocateWWMRegs after recent changes (NFCI)

47b7390

- V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM. - After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.

perlfu requested review from jayfoad, dstutt and piotrAMD October 11, 2024 09:09

llvmbot added the backend:AMDGPU label Oct 11, 2024

jayfoad reviewed Oct 11, 2024

View reviewed changes

jayfoad approved these changes Oct 14, 2024

View reviewed changes

perlfu merged commit 784230b into llvm:main Oct 15, 2024
10 checks passed

perlfu deleted the amdgpu-tidy-wwm-prealloc-202410 branch October 15, 2024 02:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AMDGPU] Tidy SIPreAllocateWWMRegs after recent changes (NFCI) #111967

[AMDGPU] Tidy SIPreAllocateWWMRegs after recent changes (NFCI) #111967

perlfu commented Oct 11, 2024

llvmbot commented Oct 11, 2024

jayfoad Oct 11, 2024

perlfu Oct 11, 2024

[AMDGPU] Tidy SIPreAllocateWWMRegs after recent changes (NFCI) #111967

[AMDGPU] Tidy SIPreAllocateWWMRegs after recent changes (NFCI) #111967

Conversation

perlfu commented Oct 11, 2024

llvmbot commented Oct 11, 2024

jayfoad Oct 11, 2024

Choose a reason for hiding this comment

perlfu Oct 11, 2024

Choose a reason for hiding this comment