-
Notifications
You must be signed in to change notification settings - Fork 12.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ArgPromotion] Remove incorrect TranspBlocks set for loads. #84835
Conversation
The TranspBlocks set was used to cache aliasing decision for all processed loads in the parent loop. This is incorrect, because each load can access a different location, which means one load not being modified in a block doesn't translate to another load not being modified in the same block. All loads access the same underlying object, so we could perhaps use a location without size for all loads and retain the cache, but that would mean we loose precision. For now, just drop the cache. Fixes llvm#84807
@llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesThe TranspBlocks set was used to cache aliasing decision for all processed loads in the parent loop. This is incorrect, because each load can access a different location, which means one load not being modified in a block doesn't translate to another load not being modified in the same block. All loads access the same underlying object, so we could perhaps use a location without size for all loads and retain the cache, but that would mean we loose precision. For now, just drop the cache. Fixes #84807 Full diff: https://github.com/llvm/llvm-project/pull/84835.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
index e89ec353487eef..3aa8ea3f514713 100644
--- a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
+++ b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
@@ -653,10 +653,6 @@ static bool findArgParts(Argument *Arg, const DataLayout &DL, AAResults &AAR,
// check to see if the pointer is guaranteed to not be modified from entry of
// the function to each of the load instructions.
- // Because there could be several/many load instructions, remember which
- // blocks we know to be transparent to the load.
- df_iterator_default_set<BasicBlock *, 16> TranspBlocks;
-
for (LoadInst *Load : Loads) {
// Check to see if the load is invalidated from the start of the block to
// the load itself.
@@ -670,7 +666,7 @@ static bool findArgParts(Argument *Arg, const DataLayout &DL, AAResults &AAR,
// To do this, we perform a depth first search on the inverse CFG from the
// loading block.
for (BasicBlock *P : predecessors(BB)) {
- for (BasicBlock *TranspBB : inverse_depth_first_ext(P, TranspBlocks))
+ for (BasicBlock *TranspBB : inverse_depth_first(P))
if (AAR.canBasicBlockModify(*TranspBB, Loc))
return false;
}
diff --git a/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll b/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
index 69385a7ea51a74..1c771d550b2192 100644
--- a/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
+++ b/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
@@ -14,10 +14,7 @@ define i32 @caller1(i1 %c) {
; CHECK-LABEL: define i32 @caller1(
; CHECK-SAME: i1 [[C:%.*]]) {
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[F_VAL:%.*]] = load i16, ptr @f, align 8
-; CHECK-NEXT: [[TMP0:%.*]] = getelementptr i8, ptr @f, i64 8
-; CHECK-NEXT: [[F_VAL1:%.*]] = load i64, ptr [[TMP0]], align 8
-; CHECK-NEXT: call void @callee1(i16 [[F_VAL]], i64 [[F_VAL1]], i1 [[C]])
+; CHECK-NEXT: call void @callee1(ptr noundef nonnull @f, i1 [[C]])
; CHECK-NEXT: ret i32 0
;
entry:
@@ -27,13 +24,16 @@ entry:
define internal void @callee1(ptr nocapture noundef readonly %q, i1 %c) {
; CHECK-LABEL: define internal void @callee1(
-; CHECK-SAME: i16 [[Q_0_VAL:%.*]], i64 [[Q_8_VAL:%.*]], i1 [[C:%.*]]) {
+; CHECK-SAME: ptr nocapture noundef readonly [[Q:%.*]], i1 [[C:%.*]]) {
; CHECK-NEXT: entry:
; CHECK-NEXT: br i1 [[C]], label [[THEN:%.*]], label [[EXIT:%.*]]
; CHECK: then:
; CHECK-NEXT: store i16 123, ptr @f, align 8
; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:
+; CHECK-NEXT: [[Q_0_VAL:%.*]] = load i16, ptr [[Q]], align 8
+; CHECK-NEXT: [[GEP_8:%.*]] = getelementptr inbounds i8, ptr [[Q]], i64 8
+; CHECK-NEXT: [[Q_8_VAL:%.*]] = load i64, ptr [[GEP_8]], align 8
; CHECK-NEXT: call void @use(i16 [[Q_0_VAL]], i64 [[Q_8_VAL]])
; CHECK-NEXT: ret void
;
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I don't think this code is particularly hot, but it would probably make sense to move it to use BatchAA at least.
Maybe it's worth looking into fixing the API of inverse_depth_first_ext/etc. to be less error-prone? For example, maybe we can make it automatically clear the storage when it's constructed, or something like that. |
Sounds good, we would need to add the functions used by ArgumentPromotion to BatchAA.
I am not sure, I had a look at the other sues of |
The TranspBlocks set was used to cache aliasing decision for all processed loads in the parent loop. This is incorrect, because each load can access a different location, which means one load not being modified in a block doesn't translate to another load not being modified in the same block. All loads access the same underlying object, so we could perhaps use a location without size for all loads and retain the cache, but that would mean we loose precision. For now, just drop the cache. Fixes llvm#84807 PR: llvm#84835 (cherry picked from commit bba4a1d)
The TranspBlocks set was used to cache aliasing decision for all processed loads in the parent loop. This is incorrect, because each load can access a different location, which means one load not being modified in a block doesn't translate to another load not being modified in the same block. All loads access the same underlying object, so we could perhaps use a location without size for all loads and retain the cache, but that would mean we loose precision. For now, just drop the cache. Fixes llvm#84807 PR: llvm#84835 (cherry picked from commit bba4a1d)
The TranspBlocks set was used to cache aliasing decision for all processed loads in the parent loop. This is incorrect, because each load can access a different location, which means one load not being modified in a block doesn't translate to another load not being modified in the same block.
All loads access the same underlying object, so we could perhaps use a location without size for all loads and retain the cache, but that would mean we loose precision.
For now, just drop the cache.
Fixes #84807