-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DebugInfo][InstCombine] Do not overwrite prior DILocation for new Insts #108565
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-debuginfo Author: Stephen Tozer (SLTozer) ChangesWhen InstCombine replaces an old instruction with a new instruction, it copies !dbg and !annotation metadata from old to new. For some InstCombine patterns we set a specific DILocation on the new instruction prior to insertion, however, which more accurately reflects the new instruction. This more specific DILocation may be overwritten on insertion by a less appropriate one, resulting in a less correct line mapping. This patch changes this behaviour to only copy the DILocation from old to new if the new instruction has no existing DILocation (which will always be the case for a new instruction unless InstCombine has specifically set one). This isn't a perfect solution. In the test case given, we create a load instruction and give it the merged location of the two existing load instructions it replaces. Prior to this patch, if the PHI that is being RAUWd has a DILocation, it will overwrite the merged location on the new load, which is bad. However even with this patch, if either of the load instructions has no DILocation, then the new load won't have one either, and it will thus receive the DILocation from the PHI. This is probably bad - the PHI could have a valid DILocation corresponding to a different line, meaning that if the load results in an invalid memory access then we will attribute the resulting crash to the PHI's line, which could be worse than giving no attribution at all. I don't see any easy way to solve this problem however without fundamentally changing how InstCombine assigns DILocations; the most likely fix would be to exclusively assign DILocations within the InstCombine patterns, or to otherwise allow them to specify whether or not a DILocation ought to be copied from the original instruction or not. Full diff: https://github.com/llvm/llvm-project/pull/108565.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 8195e0539305cc..bd9af1ded431e9 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -5163,8 +5163,13 @@ bool InstCombinerImpl::run() {
LLVM_DEBUG(dbgs() << "IC: Old = " << *I << '\n'
<< " New = " << *Result << '\n');
- Result->copyMetadata(*I,
- {LLVMContext::MD_dbg, LLVMContext::MD_annotation});
+ // We copy the old instruction's DebugLoc to the new instruction, unless
+ // InstCombine already assigned a DebugLoc to it, in which case we
+ // should trust the more specifically selected DebugLoc.
+ if (!Result->getDebugLoc())
+ Result->setDebugLoc(I->getDebugLoc());
+ // We also copy annotation metadata to the new instruction.
+ Result->copyMetadata(*I, LLVMContext::MD_annotation);
// Everything uses the new instruction now.
I->replaceAllUsesWith(Result);
diff --git a/llvm/test/Transforms/InstCombine/new-inst-dbgloc-overwrite.ll b/llvm/test/Transforms/InstCombine/new-inst-dbgloc-overwrite.ll
new file mode 100644
index 00000000000000..c8ce8d3b433f4d
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/new-inst-dbgloc-overwrite.ll
@@ -0,0 +1,77 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals smart
+;; Tests that when InstCombine sets a DILocation on the new instruction when it
+;; is created, we do not try to overwrite that DILocation when we later insert
+;; the new instruction.
+;; In this test, InstCombine replaces two loads joined by a PHI into a PHI of
+;; the addresses followed by a load, and gives the new load a merge of the two
+;; incoming load DILocations. This test verifies that the new load keeps that
+;; DILocation, and doesn't have it overwritten by the DILocation of the original
+;; PHI. It will, however, receive the !annotation attached to the original PHI.
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+define void @test(ptr %xfA, ptr %xfB, i1 %cmp5) {
+; CHECK-LABEL: @test(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br i1 [[CMP5:%.*]], label [[IF_ELSE:%.*]], label [[IF_THEN6:%.*]]
+; CHECK: if.then6:
+; CHECK-NEXT: br label [[IF_END11:%.*]]
+; CHECK: if.else:
+; CHECK-NEXT: br label [[IF_END11]]
+; CHECK: if.end11:
+; CHECK-NEXT: [[XFA_PN:%.*]] = phi ptr [ [[XFA:%.*]], [[IF_ELSE]] ], [ [[XFB:%.*]], [[IF_THEN6]] ]
+; CHECK-NEXT: [[XF1_SROA_8_0_IN:%.*]] = getelementptr i8, ptr [[XFA_PN]], i64 4
+; CHECK-NEXT: [[XF1_SROA_8_0:%.*]] = load float, ptr [[XF1_SROA_8_0_IN]], align 4, !dbg [[DBG3:![0-9]+]], !annotation [[META7:![0-9]+]]
+; CHECK-NEXT: [[CMP_I:%.*]] = fcmp ugt float [[XF1_SROA_8_0]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP_I]], label [[IF_END_I:%.*]], label [[IF_THEN_I:%.*]]
+; CHECK: if.then.i:
+; CHECK-NEXT: br label [[IF_END_I]]
+; CHECK: if.end.i:
+; CHECK-NEXT: ret void
+;
+entry:
+ br i1 %cmp5, label %if.else, label %if.then6
+
+if.then6: ; preds = %entry
+ %xf1.sroa.8.0.xfB.sroa_idx = getelementptr i8, ptr %xfB, i64 4
+ %xf1.sroa.8.0.copyload = load float, ptr %xf1.sroa.8.0.xfB.sroa_idx, align 4, !dbg !3
+ br label %if.end11
+
+if.else: ; preds = %entry
+ %xf1.sroa.8.0.xfA.sroa_idx = getelementptr i8, ptr %xfA, i64 4
+ %xf1.sroa.8.0.copyload494 = load float, ptr %xf1.sroa.8.0.xfA.sroa_idx, align 4, !dbg !7
+ br label %if.end11
+
+if.end11: ; preds = %if.else, %if.then6
+ %xf1.sroa.8.0 = phi float [ %xf1.sroa.8.0.copyload494, %if.else ], [ %xf1.sroa.8.0.copyload, %if.then6 ], !dbg !8, !annotation !9
+ %cmp.i = fcmp ugt float %xf1.sroa.8.0, 0.000000e+00
+ br i1 %cmp.i, label %if.end.i, label %if.then.i
+
+if.then.i: ; preds = %if.end11
+ br label %if.end.i
+
+if.end.i: ; preds = %if.then.i, %if.end11
+ ret void
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "clang version 20.0.0git", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
+!1 = !DIFile(filename: "test.cpp", directory: "/tmp")
+!2 = !{i32 2, !"Debug Info Version", i32 3}
+!3 = !DILocation(line: 63, column: 12, scope: !4)
+!4 = distinct !DISubprogram(name: "operator=", linkageName: "_ZN11btMatrix3x3aSERKS_", scope: null, file: !1, line: 61, type: !5, scopeLine: 62, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !6)
+!5 = distinct !DISubroutineType(types: !6)
+!6 = !{}
+!7 = !DILocation(line: 63, column: 15, scope: !4)
+!8 = !DILocation(line: 64, scope: !4)
+!9 = !{!"Test String"}
+;.
+; CHECK: [[META0:![0-9]+]] = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: [[META1:![0-9]+]], producer: "{{.*}}clang version {{.*}}", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
+; CHECK: [[META1]] = !DIFile(filename: "test.cpp", directory: {{.*}})
+; CHECK: [[DBG3]] = !DILocation(line: 64, scope: [[META4:![0-9]+]])
+; CHECK: [[META4]] = distinct !DISubprogram(name: "operator=", linkageName: "_ZN11btMatrix3x3aSERKS_", scope: null, file: [[META1]], line: 61, type: [[META5:![0-9]+]], scopeLine: 62, spFlags: DISPFlagDefinition, unit: [[META0]], retainedNodes: [[META6:![0-9]+]])
+; CHECK: [[META5]] = distinct !DISubroutineType(types: [[META6]])
+; CHECK: [[META6]] = !{}
+; CHECK: [[META7]] = !{!"Test String"}
+;.
|
28e2b39
to
aa961ec
Compare
If we merge locations on two instructions, but only one has a location, the merged result is no location? How often do we get a load with no DILocation in practice? It seems like that would be unusual. |
Correct; when we merge locations, we choose the innermost scope that contains both; if either has no location whatsoever (i.e. has no scope), then we can't assign any meaningful location that isn't lying about one of them.
Hard to say, but I uploaded another patch earlier today that fixes a case where we generate loads without a location (#108531); there are also all sorts of ways that DILocations can be dropped right now, and although I'm trying to eliminate as many as possible right now it seems safer to be defensive about the possibility. As a small and not-totally-reliable data point, I found ~45 loads generated (at some point) without a DILocation while building bullet. Edit: And also, in at least one place we explicitly drop locations for a load instruction (hoisting a load into the loop header in LICM). |
(the "unread" list in my inbox is looong,) I'm not fully understanding the second paragraph about the bad case: in one mode the DILocation of a PHI is given to the new load instruction, and that's considered bad. In the second mode, there's a risk the new load "...will thus receive the DILocation from the PHI" -- so the same location as before, so there's no regression? In principle I think this is a good patch that tries that keep source locations as specific as possible, I just want to understand the use case more. |
That's correct - there's no regression from this patch, I'm simply noting that it's not a perfect solution. A more perfect solution would be one where we could state explicitly in this case, "Do not copy the original instruction's DebugLoc to the replacement instruction", but that would require making more intrusive changes to InstCombine with probably a large diff footprint, hence I'm considering this a worthwhile partial solution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but with the dumbest nit: it looks like the test is checking we get line number 63, instead of line number 64, which is fine. Could we replace the line numbers with something much more obvious, such as line number 1 to line number 55555 perhaps: IMO it makes the test much more readable and the relevant parts stand out.
…w Insts When InstCombine replaces an old instruction with a new instruction, it copies !dbg and !annotation metadata from old to new. For some InstCombine patterns we set a specific DILocation on the new instruction prior to insertion, however, which more accurately reflects the new instruction. This more specific DILocation may be overwritten on insertion by a less appropriate one, resulting in a less correct line mapping. This patch changes this behaviour to only copy the DILocation from old to new if the new instruction has no existing DILocation (which will always be the case for a new instruction unless InstCombine has specifically set one).
aa961ec
to
1860431
Compare
…sts (llvm#108565) When InstCombine replaces an old instruction with a new instruction, it copies !dbg and !annotation metadata from old to new. For some InstCombine patterns we set a specific DILocation on the new instruction prior to insertion, however, which more accurately reflects the new instruction. This more specific DILocation may be overwritten on insertion by a less appropriate one, resulting in a less correct line mapping. This patch changes this behaviour to only copy the DILocation from old to new if the new instruction has no existing DILocation (which will always be the case for a new instruction unless InstCombine has specifically set one).
When InstCombine replaces an old instruction with a new instruction, it copies !dbg and !annotation metadata from old to new. For some InstCombine patterns we set a specific DILocation on the new instruction prior to insertion, however, which more accurately reflects the new instruction. This more specific DILocation may be overwritten on insertion by a less appropriate one, resulting in a less correct line mapping. This patch changes this behaviour to only copy the DILocation from old to new if the new instruction has no existing DILocation (which will always be the case for a new instruction unless InstCombine has specifically set one).
This isn't a perfect solution. In the test case given, we create a load instruction and give it the merged location of the two existing load instructions it replaces. Prior to this patch, if the PHI that is being RAUWd has a DILocation, it will overwrite the merged location on the new load, which is bad. However even with this patch, if either of the load instructions has no DILocation, then the new load won't have one either, and it will thus receive the DILocation from the PHI. This is probably bad - the PHI could have a valid DILocation corresponding to a different line, meaning that if the load results in an invalid memory access then we will attribute the resulting crash to the PHI's line, which could be worse than giving no attribution at all. I don't see any easy way to solve this problem however without fundamentally changing how InstCombine assigns DILocations; the most likely fix would be to exclusively assign DILocations within the InstCombine patterns, or to otherwise allow them to specify whether or not a DILocation ought to be copied from the original instruction or not.