-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[instCombine][bugfix] Fix crash caused by using of cast in instCombineSVECmpNE #102472
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-aarch64 Author: None (cceerczw) Changes…eSVECmpNE. Func instCombineSVECmpNE is used to identify specific pattern of instruction 'svecmene', Full diff: https://github.com/llvm/llvm-project/pull/102472.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 79c0e45e3aa5b5..c27c56d4557b14 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -1174,7 +1174,8 @@ static std::optional<Instruction *> instCombineSVECmpNE(InstCombiner &IC,
return std::nullopt;
// Where the dupq is a lane 0 replicate of a vector insert
- if (!cast<ConstantInt>(DupQLane->getArgOperand(1))->isZero())
+ auto *DupQLaneOp1 = dyn_cast<ConstantInt>(DupQLane->getArgOperand(1));
+ if (!DupQLaneOp1 || !DupQLaneOp1->isZero())
return std::nullopt;
auto *VecIns = dyn_cast<IntrinsicInst>(DupQLane->getArgOperand(0));
diff --git a/llvm/test/Transforms/InstCombine/AArch64/sve-inst-combine-cmpne.ll b/llvm/test/Transforms/InstCombine/AArch64/sve-inst-combine-cmpne.ll
new file mode 100644
index 00000000000000..b21f2f538d3cab
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/AArch64/sve-inst-combine-cmpne.ll
@@ -0,0 +1,58 @@
+; RUN: opt -S -mtriple=aarch64-unknown-linux-gnu -O2 < %s | FileCheck %s
+
+; Function Attrs: nofree nosync nounwind readnone uwtable vscale_range(1,16)
+define dso_local i32 @testInstCombineSVECmpNE() local_unnamed_addr #0 {
+entry:
+ %0 = tail call <vscale x 16 x i8> @llvm.aarch64.sve.index.nxv16i8(i8 42, i8 1)
+ %1 = tail call <vscale x 16 x i1> @llvm.aarch64.sve.ptrue.nxv16i1(i32 31)
+ br label %for.body
+
+for.cond.cleanup: ; preds = %for.inc
+ %2 = tail call i1 @llvm.aarch64.sve.ptest.any.nxv16i1(<vscale x 16 x i1> %1, <vscale x 16 x i1> %cmp_rslt.1)
+ %not. = xor i1 %2, true
+ %. = zext i1 %not. to i32
+ ret i32 %.
+
+for.body: ; preds = %entry, %for.inc
+ %i.010 = phi i64 [ 0, %entry ], [ %inc, %for.inc ]
+ %cmp1 = icmp ugt i64 %i.010, 32
+ %3 = tail call <vscale x 16 x i8> @llvm.aarch64.sve.dupq.lane.nxv16i8(<vscale x 16 x i8> %0, i64 %i.010)
+ br i1 %cmp1, label %if.then, label %if.else
+
+if.then: ; preds = %for.body
+ %4 = tail call <vscale x 16 x i1> @llvm.aarch64.sve.cmpne.nxv16i8(<vscale x 16 x i1> %1, <vscale x 16 x i8> %3, <vscale x 16 x i8> zeroinitializer)
+ br label %for.inc
+ ; CHECK: %4 = tail call <vscale x 16 x i1> @llvm.aarch64.sve.cmpne.nxv16i8(<vscale x 16 x i1> %1, <vscale x 16 x i8> %3, <vscale x 16 x i8> zeroinitializer)
+ ; CHECK-NEXT: br label %for.inc
+
+if.else: ; preds = %for.body
+ %5 = tail call <vscale x 16 x i1> @llvm.aarch64.sve.cmpne.nxv16i8(<vscale x 16 x i1> %1, <vscale x 16 x i8> %3, <vscale x 16 x i8> shufflevector (<vscale x 16 x i8> insertelement (<vscale x 16 x i8> poison, i8 1, i32 0), <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer))
+ br label %for.inc
+
+for.inc: ; preds = %if.then, %if.else
+ %cmp_rslt.1 = phi <vscale x 16 x i1> [ %4, %if.then ], [ %5, %if.else ]
+ %inc = add nuw nsw i64 %i.010, 1
+ %exitcond.not = icmp eq i64 %inc, 63
+ br i1 %exitcond.not, label %for.cond.cleanup, label %for.body, !llvm.loop !6
+}
+
+; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
+declare <vscale x 16 x i8> @llvm.aarch64.sve.index.nxv16i8(i8, i8) #1
+
+; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
+declare <vscale x 16 x i1> @llvm.aarch64.sve.ptrue.nxv16i1(i32 immarg) #1
+
+; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
+declare <vscale x 16 x i8> @llvm.aarch64.sve.dupq.lane.nxv16i8(<vscale x 16 x i8>, i64) #1
+
+; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
+declare <vscale x 16 x i1> @llvm.aarch64.sve.cmpne.nxv16i8(<vscale x 16 x i1>, <vscale x 16 x i8>, <vscale x 16 x i8>) #1
+
+; Function Attrs: mustprogress nocallback nofree nosync nounwind readnone willreturn
+declare i1 @llvm.aarch64.sve.ptest.any.nxv16i1(<vscale x 16 x i1>, <vscale x 16 x i1>) #1
+
+attributes #0 = { nofree nosync nounwind readnone uwtable vscale_range(1,16) "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+neon,+sve,+v8.2a" }
+attributes #1 = { mustprogress nocallback nofree nosync nounwind readnone willreturn }
+
+!6 = distinct !{!6, !7}
+!7 = !{!"llvm.loop.mustprogress"}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix looks good but the test case can be significantly reduced. It looks like you can just add to llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-cmpne.ll
by cloning dupq_b_0
as dupq_b_idx
whereby it takes an %idx
function parameter that it uses when calling sve.dupq.lane
.
@@ -1174,7 +1174,8 @@ static std::optional<Instruction *> instCombineSVECmpNE(InstCombiner &IC, | |||
return std::nullopt; | |||
|
|||
// Where the dupq is a lane 0 replicate of a vector insert | |||
if (!cast<ConstantInt>(DupQLane->getArgOperand(1))->isZero()) | |||
auto *DupQLaneOp1 = dyn_cast<ConstantInt>(DupQLane->getArgOperand(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you but I think DupQLaneIdx
would be more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, fixed.
Thank you for your review! I have simplified the case in 'sve-inst-combine-cmpne.ll'. But it can't be merged into sve-intrinsic-opts-cmpne.ll, because the compile command of my new test case must contain '-mtriple=aarch64-unknown-linux-gnu' and '-O2'. |
0b7ff03
to
3771bbf
Compare
Are you sure?
|
When I try your suggestion, the compiler didn't generate any failure. I have also test in https://www.godbolt.org/ using 'opt(trunk)'. Besides, the original failure which I want to fix is like: `PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
|
Sorry I really don't mean to be a pain here and I'll accept this version of the PR if you continue to see something different to me, but is it possible you're not using a debug or asserts build? https://www.godbolt.org/z/8McfPMqTz shows the same error quite clearly. |
The result of https://www.godbolt.org/z/8McfPMqTz shows that the recommended case crashes when compiled with the debug version of the compiler, but it does not crash when compiled with the release version. On the other hand, the case I uploaded crashes even when compiled with the release version of the compiler. So what I don't understand is, if this case can only detect the crash error when compiled with the debug version of the compiler, is this case really meaningful? If I receive your confirmation, I will make the necessary changes. Once again, thank you for your review! |
Yes, having a test case that only exposes a failure when asserts are enable is legitimate and expected because all active LLVM development should be done when asserts are enabled, otherwise you're minimising your test coverage. Whilst the other test case crashes with a release compiler today, a crash is inherently unexplained and so the same test might not crash in the future. For example, you already mention the need for |
Thank you for your explanation. I will modify the case. |
…eSVECmpNE. Func instCombineSVECmpNE is used to identify specific pattern of instruction 'svecmene', and then predict its result, use the result to replace instruction 'svecmene'. The specific pattern can be descriped below: 1.The svecmpne must compare all elements of vec. 2.The svecmpne inst compare its ves with zero. 3.The vec in svecmpne inst is generated by inst dupqlane, and the copy value of this dupqlane must be zero. In NO.3 above, func instCombineSVECmpNE uses 'cast' to transform op1 of dupqlane without checking if the cast is success, then generate a crash in some situation.
3771bbf
to
902b0d3
Compare
Hi,@paulwalker-arm, thanks for the review again, can this patch merge? |
I see no reason why not. The windows failure looks unrelated to this work. |
yes, so can you help me to merge this patch? |
Func instCombineSVECmpNE is used to identify specific pattern of instruction 'svecmene',
and then predict its result, use the result to replace instruction 'svecmene'. The specific pattern can be descriped below:
1.The svecmpne must compare all elements of vec.
2.The svecmpne inst compare its ves with zero.
3.The vec in svecmpne inst is generated by inst dupqlane, and the copy value of this dupqlane must be zero.
In NO.3 above, func instCombineSVECmpNE uses 'cast' to transform op1 of dupqlane without checking if the cast is success, then generate a crash in some situation.