[llvm][NVPTX] Fix RAUW bug in NVPTXProxyRegErasure (llvm#105871)
Fix bug introduced in llvm#105730

The bug is in how the batch RAUW (replace all uses with) is implemented. If we have:

```
%0 = mov %src
%1 = mov %0

use %0
use %1
```

The use of `%1` is rewritten to `%0`, not `%src`, even though `%0` is itself
a proxy that gets erased. This PR checks whether the source register already
has a replacement and, if so, maps to that replacement instead, which
propagates the replacements transitively.
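
The difference between the two behaviours can be sketched outside of LLVM. This is a minimal illustration, not the pass itself: `Reg`, `ProxyMoves`, `buildBatchNaive`, `buildBatchResolved`, and `checkChain` are hypothetical names, registers are plain integers, and `insert` stands in for `try_emplace` (both leave an existing entry untouched). It assumes the moves are visited in definition order, so every earlier entry already points at its root and a single lookup is enough.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Register.
using Reg = unsigned;

// Each pair is (output, input) of one proxy move, in definition order.
using ProxyMoves = std::vector<std::pair<Reg, Reg>>;

// Pre-fix behaviour: record the immediate input of every proxy move.
std::unordered_map<Reg, Reg> buildBatchNaive(const ProxyMoves &Moves) {
  std::unordered_map<Reg, Reg> Batch;
  for (const auto &Move : Moves)
    Batch.insert({Move.first, Move.second});
  return Batch;
}

// Post-fix behaviour: if the input already has a replacement, reuse it.
std::unordered_map<Reg, Reg> buildBatchResolved(const ProxyMoves &Moves) {
  std::unordered_map<Reg, Reg> Batch;
  for (const auto &Move : Moves) {
    Reg Replacement = Move.second;
    // Earlier entries already point at their root, so one lookup suffices.
    auto It = Batch.find(Replacement);
    if (It != Batch.end())
      Replacement = It->second;
    Batch.insert({Move.first, Replacement});
  }
  return Batch;
}

// Replays the chain %1 = mov %0; %2 = mov %1; %3 = mov %2.
void checkChain() {
  const ProxyMoves Chain = {{1, 0}, {2, 1}, {3, 2}};
  // Naive batch maps %2 to %1, which is itself erased.
  assert(buildBatchNaive(Chain).at(2) == 1);
  // Resolved batch maps every proxy output to the root %0.
  assert(buildBatchResolved(Chain).at(2) == 0);
  assert(buildBatchResolved(Chain).at(3) == 0);
}
```

Calling `checkChain()` exercises a three-deep chain of the same shape as the `%8` to `%11` sequence in the new MIR test.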
Jeff Niu authored and dmpolukhin committed Sep 2, 2024
1 parent 186f814 commit 689ba04
Showing 3 changed files with 103 additions and 26 deletions.
6 changes: 5 additions & 1 deletion llvm/lib/Target/NVPTX/NVPTXProxyRegErasure.cpp
@@ -78,7 +78,11 @@ bool NVPTXProxyRegErasure::runOnMachineFunction(MachineFunction &MF) {
         assert(InOp.isReg() && "ProxyReg input should be a register.");
         assert(OutOp.isReg() && "ProxyReg output should be a register.");
         RemoveList.push_back(&MI);
-        RAUWBatch.try_emplace(OutOp.getReg(), InOp.getReg());
+        Register replacement = InOp.getReg();
+        // Check if the replacement itself has been replaced.
+        if (auto it = RAUWBatch.find(replacement); it != RAUWBatch.end())
+          replacement = it->second;
+        RAUWBatch.try_emplace(OutOp.getReg(), replacement);
         break;
       }
       }
25 changes: 0 additions & 25 deletions llvm/test/CodeGen/NVPTX/proxy-reg-erasure-mir.ll

This file was deleted.

98 changes: 98 additions & 0 deletions llvm/test/CodeGen/NVPTX/proxy-reg-erasure.mir
@@ -0,0 +1,98 @@
# RUN: llc %s --run-pass=nvptx-proxyreg-erasure -march=nvptx64 -o - | FileCheck %s

--- |
  ; ModuleID = 'third-party/llvm-project/llvm/test/CodeGen/NVPTX/proxy-reg-erasure-mir.ll'
  source_filename = "third-party/llvm-project/llvm/test/CodeGen/NVPTX/proxy-reg-erasure-mir.ll"
  target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"

  declare <4 x i32> @callee_vec_i32()

  define <4 x i32> @check_vec_i32() {
    %ret = call <4 x i32> @callee_vec_i32()
    ret <4 x i32> %ret
  }

...
---
name: check_vec_i32
alignment: 1
exposesReturnsTwice: false
legalized: false
regBankSelected: false
selected: false
failedISel: false
tracksRegLiveness: true
hasWinCFI: false
callsEHReturn: false
callsUnwindInit: false
hasEHCatchret: false
hasEHScopes: false
hasEHFunclets: false
isOutlined: false
debugInstrRef: false
failsVerification: false
tracksDebugUserValues: false
registers:
- { id: 0, class: int32regs, preferred-register: '' }
- { id: 1, class: int32regs, preferred-register: '' }
- { id: 2, class: int32regs, preferred-register: '' }
- { id: 3, class: int32regs, preferred-register: '' }
- { id: 4, class: int32regs, preferred-register: '' }
- { id: 5, class: int32regs, preferred-register: '' }
- { id: 6, class: int32regs, preferred-register: '' }
- { id: 7, class: int32regs, preferred-register: '' }
- { id: 8, class: int32regs, preferred-register: '' }
- { id: 9, class: int32regs, preferred-register: '' }
- { id: 10, class: int32regs, preferred-register: '' }
- { id: 11, class: int32regs, preferred-register: '' }
liveins: []
frameInfo:
  isFrameAddressTaken: false
  isReturnAddressTaken: false
  hasStackMap: false
  hasPatchPoint: false
  stackSize: 0
  offsetAdjustment: 0
  maxAlignment: 1
  adjustsStack: false
  hasCalls: true
  stackProtector: ''
  functionContext: ''
  maxCallFrameSize: 4294967295
  cvBytesOfCalleeSavedRegisters: 0
  hasOpaqueSPAdjustment: false
  hasVAStart: false
  hasMustTailInVarArgFunc: false
  hasTailCall: false
  isCalleeSavedInfoValid: false
  localFrameSize: 0
  savePoint: ''
  restorePoint: ''
fixedStack: []
stack: []
entry_values: []
callSites: []
debugValueSubstitutions: []
constants: []
machineFunctionInfo: {}
body: |
  bb.0:
    %0:int32regs, %1:int32regs, %2:int32regs, %3:int32regs = LoadParamMemV4I32 0
    ; CHECK-NOT: ProxyReg
    %4:int32regs = ProxyRegI32 killed %0
    %5:int32regs = ProxyRegI32 killed %1
    %6:int32regs = ProxyRegI32 killed %2
    %7:int32regs = ProxyRegI32 killed %3
    ; CHECK: StoreRetvalV4I32 killed %0, killed %1, killed %2, killed %3
    StoreRetvalV4I32 killed %4, killed %5, killed %6, killed %7, 0
    %8:int32regs = LoadParamMemI32 0
    ; CHECK-NOT: ProxyReg
    %9:int32regs = ProxyRegI32 killed %8
    %10:int32regs = ProxyRegI32 killed %9
    %11:int32regs = ProxyRegI32 killed %10
    ; CHECK: StoreRetvalI32 killed %8
    StoreRetvalI32 killed %11, 0
    Return
...
