-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU] Tidy SIPreAllocateWWMRegs after recent changes (NFCI) #111967
Conversation
perlfu
commented
Oct 11, 2024
- V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM.
- After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.
- V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM. - After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.
@llvm/pr-subscribers-backend-amdgpu Author: Carl Ritson (perlfu) Changes
Full diff: https://github.com/llvm/llvm-project/pull/111967.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
index 3bf2ea0f9e53ef..b17fb7e70826b5 100644
--- a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
+++ b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
@@ -215,13 +215,10 @@ bool SIPreAllocateWWMRegs::runOnMachineFunction(MachineFunction &MF) {
for (MachineBasicBlock *MBB : RPOT) {
bool InWWM = false;
for (MachineInstr &MI : *MBB) {
- if (MI.getOpcode() == AMDGPU::V_SET_INACTIVE_B32)
- RegsAssigned |= processDef(MI.getOperand(0));
-
if (MI.getOpcode() == AMDGPU::SI_SPILL_S32_TO_VGPR) {
- if (!PreallocateSGPRSpillVGPRs)
- continue;
- RegsAssigned |= processDef(MI.getOperand(0));
+ if (PreallocateSGPRSpillVGPRs)
+ RegsAssigned |= processDef(MI.getOperand(0));
+ continue;
}
if (MI.getOpcode() == AMDGPU::ENTER_STRICT_WWM ||
diff --git a/llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir b/llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
index bb0a707a7c90b7..94cbe568a6a447 100644
--- a/llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
+++ b/llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
@@ -111,7 +111,9 @@ body: |
; GCN-NEXT: $sgpr22 = IMPLICIT_DEF
; GCN-NEXT: $vgpr5 = IMPLICIT_DEF
; GCN-NEXT: $vgpr5 = SI_SPILL_S32_TO_VGPR $sgpr22, 0, killed $vgpr5
+ ; GCN-NEXT: $sgpr0 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
; GCN-NEXT: dead $vgpr1 = V_SET_INACTIVE_B32 0, $vgpr0, 0, 0, $sgpr_null, implicit $exec, implicit-def $scc
+ ; GCN-NEXT: $exec_lo = EXIT_STRICT_WWM $sgpr0
; GCN-NEXT: {{ $}}
; GCN-NEXT: bb.1:
; GCN-NEXT: successors: %bb.2(0x80000000)
@@ -211,7 +213,9 @@ body: |
$sgpr22 = IMPLICIT_DEF
SI_SPILL_S32_SAVE $sgpr22, %stack.0, implicit $exec, implicit $sgpr32 :: (store (s32) into %stack.0, addrspace 5)
+ $sgpr0 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
%0:vgpr_32 = V_SET_INACTIVE_B32 0, $vgpr0, 0, 0, $sgpr_null, implicit $exec, implicit-def $scc
+ $exec_lo = EXIT_STRICT_WWM $sgpr0
bb.1:
KILL implicit-def $vcc, implicit-def $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7_sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15_sgpr16_sgpr17_sgpr18_sgpr19_sgpr20_sgpr21_sgpr22_sgpr23_sgpr24_sgpr25_sgpr26_sgpr27_sgpr28_sgpr29_sgpr30_sgpr31, implicit-def $sgpr32_sgpr33_sgpr34_sgpr35_sgpr36_sgpr37_sgpr38_sgpr39_sgpr40_sgpr41_sgpr42_sgpr43_sgpr44_sgpr45_sgpr46_sgpr47_sgpr48_sgpr49_sgpr50_sgpr51_sgpr52_sgpr53_sgpr54_sgpr55_sgpr56_sgpr57_sgpr58_sgpr59_sgpr60_sgpr61_sgpr62_sgpr63, implicit-def $sgpr64_sgpr65_sgpr66_sgpr67_sgpr68_sgpr69_sgpr70_sgpr71_sgpr72_sgpr73_sgpr74_sgpr75_sgpr76_sgpr77_sgpr78_sgpr79_sgpr80_sgpr81_sgpr82_sgpr83_sgpr84_sgpr85_sgpr86_sgpr87_sgpr88_sgpr89_sgpr90_sgpr91_sgpr92_sgpr93_sgpr94_sgpr95, implicit-def $sgpr96_sgpr97_sgpr98_sgpr99_sgpr100_sgpr101_sgpr102_sgpr103
|
@@ -111,7 +111,9 @@ body: | | |||
; GCN-NEXT: $sgpr22 = IMPLICIT_DEF | |||
; GCN-NEXT: $vgpr5 = IMPLICIT_DEF | |||
; GCN-NEXT: $vgpr5 = SI_SPILL_S32_TO_VGPR $sgpr22, 0, killed $vgpr5 | |||
; GCN-NEXT: $sgpr0 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can the patch be NFCI if it is introducing these ENTER/EXIT MIs? Is there some reason they have no effect on the final codegen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MIR test contained V_SET_INACTIVE outside of WWM, which is something we shouldn't be generating.
This updates the test input to match expected code gen.
…111967) - V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM. - After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.
…111967) - V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM. - After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.
…111967) - V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM. - After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.