[MSSA] Don't check non-AA memory effects when using template access #76142
Conversation
Sometimes, we create a MemoryAccess for an instruction, which is later simplified (e.g. via devirtualization) such that the new instruction has no memory effects anymore. If we later clone the instruction (e.g. during unswitching), then MSSA will not create a MemoryAccess for the new instruction, breaking MSSAU assumptions.

There are principally two ways to solve this. The first is to remove the assumption that a MemoryAccess will be created from MSSAU. I initially pursued this approach, but it is not as simple as it sounds. The core problem is that if there is another later access in the block, we'll have to find a new defining access for it. This is easy if there is another MemoryDef before the MemoryDef we're no longer creating, but becomes complicated if it was the first access in the block. In that case we might have to insert a MemoryPhi and perform the necessary rewrites, which sounds like all too much hassle for this edge-case.

The alternative implemented here is to make sure that the template access is always used if it is specified -- even if we have obtained better information in the meantime. This was already the case wrt AA queries, but the separate non-AA check was performed unconditionally beforehand. Move this check into the non-template branch only.
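For concreteness, here is a minimal standalone reproducer in the same shape as the first test added in this PR (the pipeline is the one the new tests use; before this patch, running it in an assertions build trips the MSSAU assumption described above):

; RUN: opt -S -passes="loop-mssa(loop-instsimplify,simple-loop-unswitch<nontrivial>)" < %s

@vtable = constant ptr @foo

; loop-instsimplify devirtualizes the indirect call to @foo, which is
; memory(none), but the MemoryDef created for the original indirect call stays
; attached to it. simple-loop-unswitch then clones the loop body, and MSSA
; declines to create an access for the cloned call.
declare void @foo() memory(none)

define void @repro(i1 %c) {
  br label %loop

loop:
  %fn = load ptr, ptr @vtable, align 8
  call void %fn()
  br i1 %c, label %exit, label %loop

exit:
  ret void
}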
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-analysis

Author: Nikita Popov (nikic)

Changes: (same as the PR description above.) Full diff: https://github.com/llvm/llvm-project/pull/76142.diff, 2 Files Affected:
diff --git a/llvm/lib/Analysis/MemorySSA.cpp b/llvm/lib/Analysis/MemorySSA.cpp
index 2cf92ceba01032..ce4942ea1477af 100644
--- a/llvm/lib/Analysis/MemorySSA.cpp
+++ b/llvm/lib/Analysis/MemorySSA.cpp
@@ -1732,12 +1732,6 @@ MemoryUseOrDef *MemorySSA::createNewAccess(Instruction *I,
}
}
- // Using a nonstandard AA pipelines might leave us with unexpected modref
- // results for I, so add a check to not model instructions that may not read
- // from or write to memory. This is necessary for correctness.
- if (!I->mayReadFromMemory() && !I->mayWriteToMemory())
- return nullptr;
-
bool Def, Use;
if (Template) {
Def = isa<MemoryDef>(Template);
@@ -1757,6 +1751,12 @@ MemoryUseOrDef *MemorySSA::createNewAccess(Instruction *I,
}
#endif
} else {
+ // Using a nonstandard AA pipelines might leave us with unexpected modref
+ // results for I, so add a check to not model instructions that may not read
+ // from or write to memory. This is necessary for correctness.
+ if (!I->mayReadFromMemory() && !I->mayWriteToMemory())
+ return nullptr;
+
// Find out what affect this instruction has on memory.
ModRefInfo ModRef = AAP->getModRefInfo(I, std::nullopt);
// The isOrdered check is used to ensure that volatiles end up as defs
diff --git a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
new file mode 100644
index 00000000000000..2aaf777683e116
--- /dev/null
+++ b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
@@ -0,0 +1,117 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes="loop-mssa(loop-instsimplify,simple-loop-unswitch<nontrivial>)" < %s | FileCheck %s
+
+@vtable = constant ptr @foo
+
+declare void @foo() memory(none)
+declare void @bar()
+
+; The call becomes known readnone after simplification, but still have a
+; MemoryAccess. Make sure this does not lead to an assertion failure.
+define void @test(i1 %c) {
+; CHECK-LABEL: define void @test(
+; CHECK-SAME: i1 [[C:%.*]]) {
+; CHECK-NEXT: [[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT: br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label [[DOTSPLIT:%.*]]
+; CHECK: .split.us:
+; CHECK-NEXT: br label [[LOOP_US:%.*]]
+; CHECK: loop.us:
+; CHECK-NEXT: call void @foo()
+; CHECK-NEXT: br label [[EXIT_SPLIT_US:%.*]]
+; CHECK: exit.split.us:
+; CHECK-NEXT: br label [[EXIT:%.*]]
+; CHECK: .split:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: call void @foo()
+; CHECK-NEXT: br label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+ br label %loop
+
+loop:
+ %fn = load ptr, ptr @vtable, align 8
+ call void %fn()
+ br i1 %c, label %exit, label %loop
+
+exit:
+ ret void
+}
+
+; Variant with another access after the call.
+define void @test2(i1 %c, ptr %p) {
+; CHECK-LABEL: define void @test2(
+; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) {
+; CHECK-NEXT: [[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT: br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label [[DOTSPLIT:%.*]]
+; CHECK: .split.us:
+; CHECK-NEXT: br label [[LOOP_US:%.*]]
+; CHECK: loop.us:
+; CHECK-NEXT: call void @foo()
+; CHECK-NEXT: call void @bar()
+; CHECK-NEXT: br label [[EXIT_SPLIT_US:%.*]]
+; CHECK: exit.split.us:
+; CHECK-NEXT: br label [[EXIT:%.*]]
+; CHECK: .split:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: call void @foo()
+; CHECK-NEXT: call void @bar()
+; CHECK-NEXT: br label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+ br label %loop
+
+loop:
+ %fn = load ptr, ptr @vtable, align 8
+ call void %fn()
+ call void @bar()
+ br i1 %c, label %exit, label %loop
+
+exit:
+ ret void
+}
+
+; Variant with another access after the call and no access before the call.
+define void @test3(i1 %c, ptr %p) {
+; CHECK-LABEL: define void @test3(
+; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) {
+; CHECK-NEXT: [[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT: br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label [[DOTSPLIT:%.*]]
+; CHECK: .split.us:
+; CHECK-NEXT: br label [[LOOP_US:%.*]]
+; CHECK: loop.us:
+; CHECK-NEXT: br label [[SPLIT_US:%.*]]
+; CHECK: split.us:
+; CHECK-NEXT: call void @foo()
+; CHECK-NEXT: call void @bar()
+; CHECK-NEXT: br label [[EXIT_SPLIT_US:%.*]]
+; CHECK: exit.split.us:
+; CHECK-NEXT: br label [[EXIT:%.*]]
+; CHECK: .split:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br label [[SPLIT:%.*]]
+; CHECK: split:
+; CHECK-NEXT: call void @foo()
+; CHECK-NEXT: call void @bar()
+; CHECK-NEXT: br label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+ br label %loop
+
+loop:
+ %fn = load ptr, ptr @vtable, align 8
+ br label %split
+
+split:
+ call void %fn()
+ call void @bar()
+ br i1 %c, label %exit, label %loop
+
+exit:
+ ret void
+}
I think this makes sense as the simple way to address this.
There's also the option to generalize what to do in the case of accesses that are optimized further.
I think that's possible but I may be missing edge cases not covered by current tests. Something like this resolves the testcases in this PR:
diff --git llvm/lib/Analysis/MemorySSAUpdater.cpp llvm/lib/Analysis/MemorySSAUpdater.cpp
index 9ad60f774e9f..0e533cb54f15 100644
--- llvm/lib/Analysis/MemorySSAUpdater.cpp
+++ llvm/lib/Analysis/MemorySSAUpdater.cpp
@@ -578,18 +578,9 @@ static MemoryAccess *getNewDefiningAccessForClone(MemoryAccess *MA,
if (Instruction *NewDefMUDI =
cast_or_null<Instruction>(VMap.lookup(DefMUDI))) {
InsnDefining = MSSA->getMemoryAccess(NewDefMUDI);
- if (!CloneWasSimplified)
- assert(InsnDefining && "Defining instruction cannot be nullptr.");
- else if (!InsnDefining || isa<MemoryUse>(InsnDefining)) {
+ if (!InsnDefining || isa<MemoryUse>(InsnDefining)) {
// The clone was simplified, it's no longer a MemoryDef, look up.
- auto DefIt = DefMUD->getDefsIterator();
- // Since simplified clones only occur in single block cloning, a
- // previous definition must exist, otherwise NewDefMUDI would not
- // have been found in VMap.
- assert(DefIt != MSSA->getBlockDefs(DefMUD->getBlock())->begin() &&
- "Previous def must exist");
- InsnDefining = getNewDefiningAccessForClone(
- &*(--DefIt), VMap, MPhiMap, CloneWasSimplified, MSSA);
+ InsnDefining = getNewDefiningAccessForClone(DefMUD->getDefiningAccess(), VMap, MPhiMap, CloneWasSimplified, MSSA);
}
}
}
@@ -626,7 +617,7 @@ void MemorySSAUpdater::cloneUsesAndDefs(BasicBlock *BB, BasicBlock *NewBB,
getNewDefiningAccessForClone(MUD->getDefiningAccess(), VMap,
MPhiMap, CloneWasSimplified, MSSA),
/*Template=*/CloneWasSimplified ? nullptr : MUD,
- /*CreationMustSucceed=*/CloneWasSimplified ? false : true);
+ /*CreationMustSucceed=*/false);
if (NewUseOrDef)
MSSA->insertIntoListsForBlock(NewUseOrDef, NewBB, MemorySSA::End);
}
But that then needs a cleanup of CloneWasSimplified, which loses its purpose.
If you move forward with creating accesses for non-memory-affecting instructions, a follow-up optimization could be to remember the accesses that were still created using a template, then call removeMemoryAccess (leaving OptimizePhis at false) on those after cloning and new phi assignment are complete. This should redirect the defining accesses properly.
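A rough sketch of how that follow-up could look (this is only my reading of the suggestion, not code from this PR; the helper name and the way the cloned instructions are collected are made up for illustration):

#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/MemorySSA.h"
#include "llvm/Analysis/MemorySSAUpdater.h"
#include "llvm/IR/Instruction.h"

using namespace llvm;

// Hypothetical helper: after cloning and MemoryPhi assignment are complete,
// drop the accesses that only exist because a template forced their creation,
// even though the cloned instruction no longer touches memory.
static void removeTemplateOnlyAccesses(MemorySSA &MSSA, MemorySSAUpdater &MSSAU,
                                       ArrayRef<Instruction *> ClonedInsts) {
  SmallVector<MemoryAccess *, 8> ToRemove;
  for (Instruction *I : ClonedInsts)
    if (!I->mayReadFromMemory() && !I->mayWriteToMemory())
      if (MemoryAccess *MA = MSSA.getMemoryAccess(I))
        ToRemove.push_back(MA);

  // Leave OptimizePhis at false, as suggested above; removeMemoryAccess still
  // rewires users of the removed access to its defining access.
  for (MemoryAccess *MA : ToRemove)
    MSSAU.removeMemoryAccess(MA, /*OptimizePhis=*/false);
}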
I think this may choose an incorrect defining access in some cases. Let's say we have the following initial structure (the MemorySSA example from the original comment is not reproduced here); the important part is the defining access of the access that reads y. I think your new code would set the defining access of clone(reads y) to clone(writes x), even though the correct one would be clone(writes y).
Something we could also do is to already clean up such accesses inside LoopInstSimplify. That is, if we see that a callee has been replaced by a constant, check whether the access can be removed. This has the benefit that we'll simplify the original access as well, not just the cloned access. (Though I don't think we should rely on this happening, as there are just too many ways in which AA results can change.)
ping
Given your example, I'm OK with moving forward with this solution to resolve the asserts. I haven't found a counterexample for the generalization yet, but I need more time to check that it's safe; it's still possible it misses cases where it's not correct.
Ah yes, good point.
Hm, could it happen that the new defining access ends up in a different block, for which we haven't cloned accesses yet?
I don't think that's possible. The cloning logic works in RPO, so the clones of all defining accesses are already complete.
I went ahead and put up another PR with your variant: #76819. You convinced me that this approach is correct, and I think it's a bit cleaner overall.
Sometimes, we create a MemoryAccess for an instruction, which is later simplified (e.g. via devirtualization) such that the new instruction has no memory effects anymore. If we later clone the instruction (e.g. during unswitching), then MSSA will not create a MemoryAccess for the new instruction, triggering an assert. Disable the assertion (by passing CreationMustSucceed=false) and adjust getDefiningAccessForClone() to work correctly in that case. This PR implements the alternative suggestion by alinas from #76142.
Closing in favor of #76819.