InstCombine: extend select-equiv to support vectors #111966

Merged into llvm:main on Oct 15, 2024 (1 commit)

@artagnon (Contributor) commented Oct 11, 2024

foldSelectEquivalence currently doesn't support GVN-like replacements on vector types. Put in the checks for potentially lane-crossing operations, and lift the limitation.
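
For readers less familiar with the lane-crossing concern, here is a minimal standalone sketch in plain C++ (no LLVM headers). It models a `<2 x i8>` value as a `std::array`. All helper names (`cmpEq`, `selectLanes`, `xorLanes`, `reverseLanes`, and the two `...Holds` checks) are made up for illustration and are not LLVM APIs. The idea it tries to show: a vector compare only proves equality in the lanes where it is true, so replacing `y` with the constant is safe only if lane `i` of the true operand depends on lane `i` of `y` alone.

```cpp
#include <array>
#include <cstdint>

using Vec2 = std::array<std::uint8_t, 2>;
using Mask2 = std::array<bool, 2>;

// Per-lane equality, modelling `icmp eq <2 x i8>`.
Mask2 cmpEq(const Vec2 &a, const Vec2 &b) {
  return {a[0] == b[0], a[1] == b[1]};
}

// Per-lane select, modelling `select <2 x i1> %c, %t, %f`.
Vec2 selectLanes(const Mask2 &c, const Vec2 &t, const Vec2 &f) {
  return {c[0] ? t[0] : f[0], c[1] ? t[1] : f[1]};
}

// A lanewise operation: result lane i depends only on input lane i.
Vec2 xorLanes(const Vec2 &a, const Vec2 &b) {
  return {static_cast<std::uint8_t>(a[0] ^ b[0]),
          static_cast<std::uint8_t>(a[1] ^ b[1])};
}

// A lane-crossing operation: a shuffle with mask <1, 0>.
Vec2 reverseLanes(const Vec2 &a) { return {a[1], a[0]}; }

// Does select(y == c, xor(y, z), x) equal select(y == c, xor(c, z), x)?
bool xorSubstitutionHolds(const Vec2 &y, const Vec2 &z, const Vec2 &x,
                          const Vec2 &c) {
  Mask2 m = cmpEq(y, c);
  return selectLanes(m, xorLanes(y, z), x) == selectLanes(m, xorLanes(c, z), x);
}

// Same question with the lane-crossing shuffle in place of the xor.
bool reverseSubstitutionHolds(const Vec2 &y, const Vec2 &x, const Vec2 &c) {
  Mask2 m = cmpEq(y, c);
  return selectLanes(m, reverseLanes(y), x) ==
         selectLanes(m, reverseLanes(c), x);
}
```

With `y = <2, 7>` and the constant `<2, 2>`, only lane 0 compares equal. The xor check still holds, but the reversed-shuffle check fails, because lane 0 of the shuffled value reads lane 1 of `y` (7, not 2).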

@llvmbot (Member) commented Oct 11, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Ramkumar Ramachandra (artagnon)

Changes

foldSelectEquivalence currently doesn't support comparisons on vector types due to correctness concerns. Note that the only concern is lane-crossing; ShuffleVector is the only possible lane-crossing instruction, and ShuffleVectorInst::{isIdentity,isSelect} are the exact properties that should not be broken for valid vector-replacements. Put in the checks, and lift the limitation.
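
As a rough illustration of the shuffle property this description relies on, the sketch below classifies a shuffle mask as lane-preserving or not. It is a simplified standalone model, not LLVM's implementation of `isIdentity()` or `isSelect()`. The function name `maskStaysInLane` is hypothetical, and using `-1` for an undefined mask element is an assumption of this sketch.

```cpp
#include <cstddef>
#include <vector>

// Returns true when every defined mask element i selects lane i of the first
// source (value i) or lane i of the second source (value i + numSrcElts).
// Such a shuffle never moves data between lanes.
bool maskStaysInLane(const std::vector<int> &mask, int numSrcElts) {
  // A length-changing shuffle cannot map lanes one-to-one.
  if (mask.size() != static_cast<std::size_t>(numSrcElts))
    return false;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    int m = mask[i];
    if (m == -1) // Undefined element (sketch convention): no constraint.
      continue;
    int lane = static_cast<int>(i);
    if (m != lane && m != lane + numSrcElts)
      return false;
  }
  return true;
}
```

Applied to the masks in the new tests, `<0, 1>` (identity) and `<4, 1, 6, 3>` (select) are accepted, while `<1, 0>` (lane-crossing) is rejected.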

-- 8< --
Based on #111694.


Full diff: https://github.com/llvm/llvm-project/pull/111966.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+8-2)
  • (modified) llvm/test/Transforms/InstCombine/and-or-icmps.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/select-binop-cmp.ll (+5-5)
  • (added) llvm/test/Transforms/InstCombine/select-value-equivalence.ll (+267)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 3f780285efe423..47dfe356cdf771 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -1288,6 +1288,13 @@ bool InstCombinerImpl::replaceInInstruction(Value *V, Value *Old, Value *New,
       !isSafeToSpeculativelyExecuteWithVariableReplaced(I))
     return false;
 
+  // ShuffleVector is the only possible lane-crossing instruction. isIdentity()
+  // or isSelect() must be satisfied for length-preserving non-lane-crossing
+  // shuffles.
+  auto *Shuffle = dyn_cast<ShuffleVectorInst>(I);
+  if (Shuffle && !Shuffle->isIdentity() && !Shuffle->isSelect())
+    return false;
+
   bool Changed = false;
   for (Use &U : I->operands()) {
     if (U == Old) {
@@ -1366,9 +1373,8 @@ Instruction *InstCombinerImpl::foldSelectValueEquivalence(SelectInst &Sel,
     // with different operands, which should not cause side-effects or trigger
     // undefined behavior). Only do this if CmpRHS is a constant, as
     // profitability is not clear for other cases.
-    // FIXME: Support vectors.
     if (OldOp == CmpLHS && match(NewOp, m_ImmConstant()) &&
-        !match(OldOp, m_Constant()) && !Cmp.getType()->isVectorTy() &&
+        !match(OldOp, m_Constant()) &&
         isGuaranteedNotToBeUndef(NewOp, SQ.AC, &Sel, &DT))
       if (replaceInInstruction(TrueVal, OldOp, NewOp))
         return &Sel;
diff --git a/llvm/test/Transforms/InstCombine/and-or-icmps.ll b/llvm/test/Transforms/InstCombine/and-or-icmps.ll
index ad28ad980de5b4..eb4723c86542de 100644
--- a/llvm/test/Transforms/InstCombine/and-or-icmps.ll
+++ b/llvm/test/Transforms/InstCombine/and-or-icmps.ll
@@ -983,7 +983,7 @@ define <2 x i1> @substitute_constant_or_ne_slt_swap_vec_poison(<2 x i8> %x, <2 x
 define <2 x i1> @substitute_constant_or_ne_slt_swap_vec_logical(<2 x i8> %x, <2 x i8> %y) {
 ; CHECK-LABEL: @substitute_constant_or_ne_slt_swap_vec_logical(
 ; CHECK-NEXT:    [[C1:%.*]] = icmp ne <2 x i8> [[X:%.*]], <i8 42, i8 poison>
-; CHECK-NEXT:    [[C2:%.*]] = icmp slt <2 x i8> [[Y:%.*]], [[X]]
+; CHECK-NEXT:    [[C2:%.*]] = icmp slt <2 x i8> [[Y:%.*]], <i8 42, i8 poison>
 ; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[C1]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[C2]]
 ; CHECK-NEXT:    ret <2 x i1> [[R]]
 ;
diff --git a/llvm/test/Transforms/InstCombine/select-binop-cmp.ll b/llvm/test/Transforms/InstCombine/select-binop-cmp.ll
index 647287ef5ebad1..cd8c29ba4cd819 100644
--- a/llvm/test/Transforms/InstCombine/select-binop-cmp.ll
+++ b/llvm/test/Transforms/InstCombine/select-binop-cmp.ll
@@ -552,12 +552,12 @@ define i32 @select_xor_icmp_bad_6(i32 %x, i32 %y, i32 %z) {
   ret i32 %C
 }
 
-; Value equivalence substitution is all-or-nothing, so needs a scalar compare.
+; Value equivalence substitution is valid.
 
-define <2 x i8> @select_xor_icmp_vec_bad(<2 x i8> %x, <2 x i8> %y, <2 x i8> %z) {
-; CHECK-LABEL: @select_xor_icmp_vec_bad(
+define <2 x i8> @select_xor_icmp_vec_equivalence(<2 x i8> %x, <2 x i8> %y, <2 x i8> %z) {
+; CHECK-LABEL: @select_xor_icmp_vec_equivalence(
 ; CHECK-NEXT:    [[A:%.*]] = icmp eq <2 x i8> [[X:%.*]], <i8 5, i8 3>
-; CHECK-NEXT:    [[B:%.*]] = xor <2 x i8> [[X]], [[Z:%.*]]
+; CHECK-NEXT:    [[B:%.*]] = xor <2 x i8> [[Z:%.*]], <i8 5, i8 3>
 ; CHECK-NEXT:    [[C:%.*]] = select <2 x i1> [[A]], <2 x i8> [[B]], <2 x i8> [[Y:%.*]]
 ; CHECK-NEXT:    ret <2 x i8> [[C]]
 ;
@@ -567,7 +567,7 @@ define <2 x i8> @select_xor_icmp_vec_bad(<2 x i8> %x, <2 x i8> %y, <2 x i8> %z)
   ret <2 x i8>  %C
 }
 
-; Value equivalence substitution is all-or-nothing, so needs a scalar compare.
+; Value equivalence substitution is invalid due to lane-crossing shufflevector.
 
 define <2 x i32> @vec_select_no_equivalence(<2 x i32> %x) {
 ; CHECK-LABEL: @vec_select_no_equivalence(
diff --git a/llvm/test/Transforms/InstCombine/select-value-equivalence.ll b/llvm/test/Transforms/InstCombine/select-value-equivalence.ll
new file mode 100644
index 00000000000000..da8970a7630f28
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/select-value-equivalence.ll
@@ -0,0 +1,267 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=instcombine -S %s | FileCheck %s
+
+define <2 x i8> @select_icmp_insertelement_eq(<2 x i8> %x, <2 x i8> %y, i8 %i) {
+; CHECK-LABEL: define <2 x i8> @select_icmp_insertelement_eq(
+; CHECK-SAME: <2 x i8> [[X:%.*]], <2 x i8> [[Y:%.*]], i8 [[I:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <2 x i8> [[Y]], <i8 2, i8 2>
+; CHECK-NEXT:    [[INSERT:%.*]] = insertelement <2 x i8> <i8 2, i8 2>, i8 0, i8 [[I]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[CMP]], <2 x i8> [[INSERT]], <2 x i8> [[X]]
+; CHECK-NEXT:    ret <2 x i8> [[RETVAL]]
+;
+  %cmp = icmp eq <2 x i8> %y, <i8 2, i8 2>
+  %insert = insertelement <2 x i8> %y, i8 0, i8 %i
+  %retval = select <2 x i1> %cmp, <2 x i8> %insert, <2 x i8> %x
+  ret <2 x i8> %retval
+}
+
+define <2 x i8> @select_icmp_insertelement_ne(<2 x i8> %x, <2 x i8> %y, i8 %i) {
+; CHECK-LABEL: define <2 x i8> @select_icmp_insertelement_ne(
+; CHECK-SAME: <2 x i8> [[X:%.*]], <2 x i8> [[Y:%.*]], i8 [[I:%.*]]) {
+; CHECK-NEXT:    [[CMP_NOT:%.*]] = icmp eq <2 x i8> [[Y]], <i8 2, i8 2>
+; CHECK-NEXT:    [[INSERT:%.*]] = insertelement <2 x i8> <i8 2, i8 2>, i8 0, i8 [[I]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[CMP_NOT]], <2 x i8> [[INSERT]], <2 x i8> [[X]]
+; CHECK-NEXT:    ret <2 x i8> [[RETVAL]]
+;
+  %cmp = icmp ne <2 x i8> %y, <i8 2, i8 2>
+  %insert = insertelement <2 x i8> %y, i8 0, i8 %i
+  %retval = select <2 x i1> %cmp, <2 x i8> %x, <2 x i8> %insert
+  ret <2 x i8> %retval
+}
+
+define <2 x i8> @select_icmp_shufflevector_identity(<2 x i8> %x, <2 x i8> %y) {
+; CHECK-LABEL: define <2 x i8> @select_icmp_shufflevector_identity(
+; CHECK-SAME: <2 x i8> [[X:%.*]], <2 x i8> [[Y:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <2 x i8> [[Y]], <i8 2, i8 2>
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[CMP]], <2 x i8> <i8 2, i8 2>, <2 x i8> [[X]]
+; CHECK-NEXT:    ret <2 x i8> [[RETVAL]]
+;
+  %cmp = icmp eq <2 x i8> %y, <i8 2, i8 2>
+  %shuffle = shufflevector <2 x i8> %y, <2 x i8> poison, <2 x i32> <i32 0, i32 1>
+  %retval = select <2 x i1> %cmp, <2 x i8> %shuffle, <2 x i8> %x
+  ret <2 x i8> %retval
+}
+
+define <4 x i8> @select_icmp_shufflevector_select(<4 x i8> %x, <4 x i8> %y, <4 x i8> %z) {
+; CHECK-LABEL: define <4 x i8> @select_icmp_shufflevector_select(
+; CHECK-SAME: <4 x i8> [[X:%.*]], <4 x i8> [[Y:%.*]], <4 x i8> [[Z:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <4 x i8> [[Y]], <i8 2, i8 2, i8 2, i8 2>
+; CHECK-NEXT:    [[SHUFFLE:%.*]] = shufflevector <4 x i8> [[Z]], <4 x i8> <i8 poison, i8 2, i8 poison, i8 2>, <4 x i32> <i32 0, i32 5, i32 2, i32 7>
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <4 x i1> [[CMP]], <4 x i8> [[SHUFFLE]], <4 x i8> [[X]]
+; CHECK-NEXT:    ret <4 x i8> [[RETVAL]]
+;
+  %cmp = icmp eq <4 x i8> %y, <i8 2, i8 2, i8 2, i8 2>
+  %shuffle = shufflevector <4 x i8> %y, <4 x i8> %z, <4 x i32> <i32 4, i32 1, i32 6, i32 3>
+  %retval = select <4 x i1> %cmp, <4 x i8> %shuffle, <4 x i8> %x
+  ret <4 x i8> %retval
+}
+
+define <2 x i8> @select_icmp_shufflevector_lanecrossing(<2 x i8> %x, <2 x i8> %y) {
+; CHECK-LABEL: define <2 x i8> @select_icmp_shufflevector_lanecrossing(
+; CHECK-SAME: <2 x i8> [[X:%.*]], <2 x i8> [[Y:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <2 x i8> [[Y]], <i8 2, i8 2>
+; CHECK-NEXT:    [[SHUFFLE:%.*]] = shufflevector <2 x i8> [[Y]], <2 x i8> poison, <2 x i32> <i32 1, i32 0>
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[CMP]], <2 x i8> [[SHUFFLE]], <2 x i8> [[X]]
+; CHECK-NEXT:    ret <2 x i8> [[RETVAL]]
+;
+  %cmp = icmp eq <2 x i8> %y, <i8 2, i8 2>
+  %shuffle = shufflevector <2 x i8> %y, <2 x i8> poison, <2 x i32> <i32 1, i32 0>
+  %retval = select <2 x i1> %cmp, <2 x i8> %shuffle, <2 x i8> %x
+  ret <2 x i8> %retval
+}
+
+define i8 @select_icmp_udiv(i8 %x, i8 %y) {
+; CHECK-LABEL: define i8 @select_icmp_udiv(
+; CHECK-SAME: i8 [[X:%.*]], i8 [[Y:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[Y]], 2
+; CHECK-NEXT:    [[UDIV:%.*]] = udiv i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select i1 [[CMP]], i8 [[UDIV]], i8 [[X]]
+; CHECK-NEXT:    ret i8 [[RETVAL]]
+;
+  %cmp = icmp eq i8 %y, 2
+  %udiv = udiv i8 %x, %y
+  %retval = select i1 %cmp, i8 %udiv, i8 %x
+  ret i8 %retval
+}
+
+define i8 @select_icmp_urem(i8 %x, i8 %y) {
+; CHECK-LABEL: define i8 @select_icmp_urem(
+; CHECK-SAME: i8 [[X:%.*]], i8 [[Y:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[Y]], 2
+; CHECK-NEXT:    [[UREM:%.*]] = urem i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select i1 [[CMP]], i8 [[UREM]], i8 [[X]]
+; CHECK-NEXT:    ret i8 [[RETVAL]]
+;
+  %cmp = icmp eq i8 %y, 2
+  %urem = urem i8 %x, %y
+  %retval = select i1 %cmp, i8 %urem, i8 %x
+  ret i8 %retval
+}
+
+define <2 x i8> @select_icmp_udiv_vec(<2 x i8> %x, <2 x i8> %y) {
+; CHECK-LABEL: define <2 x i8> @select_icmp_udiv_vec(
+; CHECK-SAME: <2 x i8> [[X:%.*]], <2 x i8> [[Y:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <2 x i8> [[Y]], <i8 2, i8 2>
+; CHECK-NEXT:    [[UDIV:%.*]] = udiv <2 x i8> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[CMP]], <2 x i8> [[UDIV]], <2 x i8> [[X]]
+; CHECK-NEXT:    ret <2 x i8> [[RETVAL]]
+;
+  %cmp = icmp eq <2 x i8> %y, <i8 2, i8 2>
+  %udiv = udiv <2 x i8> %x, %y
+  %retval = select <2 x i1> %cmp, <2 x i8> %udiv, <2 x i8> %x
+  ret <2 x i8> %retval
+}
+
+define <2 x i8> @select_icmp_urem_vec(<2 x i8> %x, <2 x i8> %y) {
+; CHECK-LABEL: define <2 x i8> @select_icmp_urem_vec(
+; CHECK-SAME: <2 x i8> [[X:%.*]], <2 x i8> [[Y:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <2 x i8> [[Y]], <i8 2, i8 2>
+; CHECK-NEXT:    [[UREM:%.*]] = urem <2 x i8> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[CMP]], <2 x i8> [[UREM]], <2 x i8> [[X]]
+; CHECK-NEXT:    ret <2 x i8> [[RETVAL]]
+;
+  %cmp = icmp eq <2 x i8> %y, <i8 2, i8 2>
+  %urem = urem <2 x i8> %x, %y
+  %retval = select <2 x i1> %cmp, <2 x i8> %urem, <2 x i8> %x
+  ret <2 x i8> %retval
+}
+
+define <2 x double> @select_fcmp_fadd_oeq_not_zero_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fadd_oeq_not_zero_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp oeq <2 x double> [[Y]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT:    [[FADD:%.*]] = fadd <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[FADD]], <2 x double> [[X]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp oeq <2 x double> %y, <double 2.0, double 2.0>
+  %fadd = fadd <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %fadd, <2 x double> %x
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fadd_une_not_zero_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fadd_une_not_zero_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp une <2 x double> [[Y]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT:    [[FADD:%.*]] = fadd <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[X]], <2 x double> [[FADD]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp une <2 x double> %y, <double 2.0, double 2.0>
+  %fadd = fadd <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %x, <2 x double> %fadd
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fadd_ueq_nnan_not_zero_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fadd_ueq_nnan_not_zero_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp nnan ueq <2 x double> [[Y]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT:    [[FADD:%.*]] = fadd <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[FADD]], <2 x double> [[X]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp nnan ueq <2 x double> %y, <double 2.0, double 2.0>
+  %fadd = fadd <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %fadd, <2 x double> %x
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fadd_one_nnan_not_zero_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fadd_one_nnan_not_zero_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp nnan one <2 x double> [[Y]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT:    [[FADD:%.*]] = fadd <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[X]], <2 x double> [[FADD]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp nnan one <2 x double> %y, <double 2.0, double 2.0>
+  %fadd = fadd <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %x, <2 x double> %fadd
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fadd_ueq_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fadd_ueq_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp ueq <2 x double> [[Y]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT:    [[FADD:%.*]] = fadd <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[FADD]], <2 x double> [[X]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp ueq <2 x double> %y, <double 2.0, double 2.0>
+  %fadd = fadd <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %fadd, <2 x double> %x
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fadd_one_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fadd_one_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp one <2 x double> [[Y]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT:    [[FADD:%.*]] = fadd <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[X]], <2 x double> [[FADD]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp one <2 x double> %y, <double 2.0, double 2.0>
+  %fadd = fadd <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %x, <2 x double> %fadd
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fadd_oeq_zero_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fadd_oeq_zero_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp oeq <2 x double> [[Y]], zeroinitializer
+; CHECK-NEXT:    [[FADD:%.*]] = fadd <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[FADD]], <2 x double> [[X]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp oeq <2 x double> %y, zeroinitializer
+  %fadd = fadd <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %fadd, <2 x double> %x
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fadd_une_zero_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fadd_une_zero_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp une <2 x double> [[Y]], zeroinitializer
+; CHECK-NEXT:    [[FADD:%.*]] = fadd <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[X]], <2 x double> [[FADD]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp une <2 x double> %y, zeroinitializer
+  %fadd = fadd <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %x, <2 x double> %fadd
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fdiv_oeq_not_zero_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fdiv_oeq_not_zero_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp oeq <2 x double> [[Y]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[DIV]], <2 x double> [[X]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp oeq <2 x double> %y, <double 2.0, double 2.0>
+  %div = fdiv <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %div, <2 x double> %x
+  ret <2 x double> %retval
+}
+
+define <2 x double> @select_fcmp_fdiv_une_not_zero_vec(<2 x double> %x, <2 x double> %y) {
+; CHECK-LABEL: define <2 x double> @select_fcmp_fdiv_une_not_zero_vec(
+; CHECK-SAME: <2 x double> [[X:%.*]], <2 x double> [[Y:%.*]]) {
+; CHECK-NEXT:    [[FCMP:%.*]] = fcmp une <2 x double> [[Y]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv <2 x double> [[X]], [[Y]]
+; CHECK-NEXT:    [[RETVAL:%.*]] = select <2 x i1> [[FCMP]], <2 x double> [[X]], <2 x double> [[DIV]]
+; CHECK-NEXT:    ret <2 x double> [[RETVAL]]
+;
+  %fcmp = fcmp une <2 x double> %y, <double 2.0, double 2.0>
+  %div = fdiv <2 x double> %x, %y
+  %retval = select <2 x i1> %fcmp, <2 x double> %x, <2 x double> %div
+  ret <2 x double> %retval
+}

@nikic (Contributor) left a comment

Shufflevector is not the only lane-crossing operation, see

    // For vector types, the simplification must hold per-lane, so forbid
    // potentially cross-lane operations like shufflevector.
    if (!I->getType()->isVectorTy() || isa<ShuffleVectorInst>(I) ||
        isa<CallBase>(I) || isa<BitCastInst>(I))
      return nullptr;

for a more accurate heuristic.

@goldsteinn (Contributor) commented

> Shufflevector is not the only lane-crossing operation, see
>
>     // For vector types, the simplification must hold per-lane, so forbid
>     // potentially cross-lane operations like shufflevector.
>     if (!I->getType()->isVectorTy() || isa<ShuffleVectorInst>(I) ||
>         isa<CallBase>(I) || isa<BitCastInst>(I))
>       return nullptr;
>
> for a more accurate heuristic.

Might be nice to make this a helper

@nikic (Contributor) commented Oct 11, 2024

Yes, we should move that code plus

    // Try to fold intrinsic into select operands. This is legal if:
    //  * The intrinsic is speculatable.
    //  * The select condition is not a vector, or the intrinsic does not
    //    perform cross-lane operations.
    switch (IID) {
    case Intrinsic::ctlz:
    case Intrinsic::cttz:
    case Intrinsic::ctpop:
    case Intrinsic::umin:
    case Intrinsic::umax:
    case Intrinsic::smin:
    case Intrinsic::smax:
    case Intrinsic::usub_sat:
    case Intrinsic::uadd_sat:
    case Intrinsic::ssub_sat:
    case Intrinsic::sadd_sat:

into a ValueTracking function isLanewiseOperation() or so.
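
To make the suggestion concrete, here is one possible shape for such a helper, written as a standalone toy model rather than against LLVM's headers. `ToyOpcode`, `ToyInst`, and `kLanewiseIntrinsics` are invented for this sketch. Only the name `isLanewiseOperation` and the intrinsic list come from the comments above, and the real helper (if one lands) may well look different. The sketch is conservative: it rejects every shuffle, bitcast, and call that is not on the list.

```cpp
#include <string>

// Toy stand-ins for instruction kinds (hypothetical, not LLVM types).
enum class ToyOpcode { Add, Xor, ICmp, InsertElement, ShuffleVector, BitCast, Call };

struct ToyInst {
  ToyOpcode opcode;
  bool producesVector;       // Models I->getType()->isVectorTy().
  std::string intrinsicName; // Only meaningful for ToyOpcode::Call.
};

// Intrinsics quoted in the review comment as not crossing lanes.
static const char *const kLanewiseIntrinsics[] = {
    "ctlz", "cttz", "ctpop",    "umin",     "umax",     "smin",
    "smax", "usub_sat", "uadd_sat", "ssub_sat", "sadd_sat"};

// Conservative check: true only when the operation is known to compute each
// result lane from the matching input lanes.
bool isLanewiseOperation(const ToyInst &inst) {
  // A non-vector result (for example a reduction) cannot be reasoned about
  // lane by lane.
  if (!inst.producesVector)
    return false;
  switch (inst.opcode) {
  case ToyOpcode::ShuffleVector: // May permute lanes.
  case ToyOpcode::BitCast:       // May regroup bits across lanes.
    return false;
  case ToyOpcode::Call:
    // Unknown calls are rejected; only the listed intrinsics pass.
    for (const char *name : kLanewiseIntrinsics)
      if (inst.intrinsicName == name)
        return true;
    return false;
  default:
    return true;
  }
}
```

A caller in the style of the snippets above would bail out of the fold whenever this returns false for a vector-typed compare.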

@artagnon (Contributor, Author) commented

Rebased, ready to review now.

@nikic (Contributor) left a comment

LGTM

foldSelectEquivalence currently doesn't support comparisons on vector
types due to correctness concerns. Note that the only concern is
lane-crossing; ShuffleVector is the only possible lane-crossing
instruction, and ShuffleVectorInst::{isIdentity,isSelect} are the exact
properties that should not be broken for valid vector-replacements. Put
in the checks, and lift the limitation.
@artagnon artagnon force-pushed the ic-select-equiv-vec branch from 866830e to 8e764f1 Compare October 15, 2024 10:09
@artagnon artagnon merged commit 1c6c850 into llvm:main Oct 15, 2024
5 of 7 checks passed
@artagnon artagnon deleted the ic-select-equiv-vec branch October 15, 2024 10:10
DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this pull request Oct 16, 2024
foldSelectEquivalence currently doesn't support GVN-like replacements on
vector types. Put in the checks for potentially lane-crossing
operations, and lift the limitation.
bricknerb pushed a commit to bricknerb/llvm-project that referenced this pull request Oct 17, 2024
foldSelectEquivalence currently doesn't support GVN-like replacements on
vector types. Put in the checks for potentially lane-crossing
operations, and lift the limitation.
EricWF pushed a commit to efcs/llvm-project that referenced this pull request Oct 22, 2024
foldSelectEquivalence currently doesn't support GVN-like replacements on
vector types. Put in the checks for potentially lane-crossing
operations, and lift the limitation.