Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IR] Add zext nneg flag #67982

Merged
merged 3 commits into from
Oct 30, 2023
Merged

[IR] Add zext nneg flag #67982

merged 3 commits into from
Oct 30, 2023

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Oct 2, 2023

Add an nneg flag to the zext instruction, which specifies that the argument is non-negative. Otherwise, the result is a poison value.

The primary use-case for the flag is to preserve information when sext gets replaced with zext due to range-based canonicalization. The nneg flag allows us to convert the zext back into an sext later. This is useful for some optimizations (e.g. a signed icmp can fold with sext but not zext), as well as some targets (e.g. RISCV prefers sext over zext).

This patch is based on https://reviews.llvm.org/D156444 by @Panagiotis156, with some implementation simplifications and additional tests.

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 2, 2023

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Changes

Add an nneg flag to the zext instruction, which specifies that the argument is non-negative. Otherwise, the result is a poison value.

The primary use-case for the flag is to preserve information when sext gets replaced with zext due to range-based canonicalization. The nneg flag allows us to convert the zext back into an sext later. This is useful for some optimizations (e.g. a signed icmp can fold with sext but not zext), as well as some targets (e.g. RISCV prefers sext over zext).

This patch is based on https://reviews.llvm.org/D156444, with some implementation simplifications and additional tests.


Full diff: https://github.com/llvm/llvm-project/pull/67982.diff

16 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+6)
  • (modified) llvm/include/llvm/AsmParser/LLToken.h (+1)
  • (modified) llvm/include/llvm/Bitcode/LLVMBitCodes.h (+3)
  • (modified) llvm/include/llvm/IR/InstrTypes.h (+14)
  • (modified) llvm/include/llvm/IR/Instruction.h (+7)
  • (modified) llvm/lib/AsmParser/LLLexer.cpp (+1)
  • (modified) llvm/lib/AsmParser/LLParser.cpp (+10-1)
  • (modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+8-3)
  • (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+22)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+3)
  • (modified) llvm/lib/IR/Instruction.cpp (+24)
  • (modified) llvm/lib/IR/Operator.cpp (+4)
  • (modified) llvm/test/Assembler/flags.ll (+6)
  • (modified) llvm/test/Bitcode/flags.ll (+4)
  • (modified) llvm/test/Transforms/InstCombine/freeze.ll (+11)
  • (modified) llvm/test/Transforms/SimplifyCFG/HoistCode.ll (+30)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index c8f74c19bd6b3cf..4c4f3cbbaf0d541 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -11156,6 +11156,9 @@ until it reaches the size of the destination type, ``ty2``.
 
 When zero extending from i1, the result will always be either 0 or 1.
 
+If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result
+is a poison value.
+
 Example:
 """"""""
 
@@ -11165,6 +11168,9 @@ Example:
       %Y = zext i1 true to i32              ; yields i32:1
       %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
 
+      %a = zext nneg i8 127 to i16 ; yields i16 127
+      %b = zext nneg i8 -1 to i16  ; yields i16 poison
+
 .. _i_sext:
 
 '``sext .. to``' Instruction
diff --git a/llvm/include/llvm/AsmParser/LLToken.h b/llvm/include/llvm/AsmParser/LLToken.h
index 673dc58ce6451e3..ae566672b97986b 100644
--- a/llvm/include/llvm/AsmParser/LLToken.h
+++ b/llvm/include/llvm/AsmParser/LLToken.h
@@ -110,6 +110,7 @@ enum Kind {
   kw_nsw,
   kw_exact,
   kw_inbounds,
+  kw_nneg,
   kw_inrange,
   kw_addrspace,
   kw_section,
diff --git a/llvm/include/llvm/Bitcode/LLVMBitCodes.h b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
index 52e76356a892e45..6f11bb0b30c379a 100644
--- a/llvm/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
@@ -505,6 +505,9 @@ enum FastMathMap {
   AllowReassoc    = (1 << 7)
 };
 
+/// Flags for serializing NonNegInstruction's SubclassOptionalData contents.
+enum NonNegInstructionOptionalFlags { NNI_NON_NEG = 0 };
+
 /// PossiblyExactOperatorOptionalFlags - Flags for serializing
 /// PossiblyExactOperator's SubclassOptionalData contents.
 enum PossiblyExactOperatorOptionalFlags { PEO_EXACT = 0 };
diff --git a/llvm/include/llvm/IR/InstrTypes.h b/llvm/include/llvm/IR/InstrTypes.h
index 6095b0a1be69cb3..b6da2bf836cb56f 100644
--- a/llvm/include/llvm/IR/InstrTypes.h
+++ b/llvm/include/llvm/IR/InstrTypes.h
@@ -692,6 +692,20 @@ class CastInst : public UnaryInstruction {
   }
 };
 
+/// Instruction that can have a nneg flag (only zext).
+class NonNegInstruction : public CastInst {
+public:
+  enum { NonNeg = (1 << 0) };
+
+  static bool classof(const Instruction *I) {
+    return I->getOpcode() == Instruction::ZExt;
+  }
+
+  static bool classof(const Value *V) {
+    return isa<Instruction>(V) && classof(cast<Instruction>(V));
+  }
+};
+
 //===----------------------------------------------------------------------===//
 //                               CmpInst Class
 //===----------------------------------------------------------------------===//
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 69c3af5b76103f6..781a69e586a9e6b 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -407,12 +407,19 @@ class Instruction : public User,
   /// which supports this flag. See LangRef.html for the meaning of this flag.
   void setIsExact(bool b = true);
 
+  /// Set or clear the nneg flag on this instruction, which must be a zext
+  /// instruction.
+  void setNonNeg(bool b = true);
+
   /// Determine whether the no unsigned wrap flag is set.
   bool hasNoUnsignedWrap() const LLVM_READONLY;
 
   /// Determine whether the no signed wrap flag is set.
   bool hasNoSignedWrap() const LLVM_READONLY;
 
+  /// Determine whether the the nneg flag is set.
+  bool hasNonNeg() const LLVM_READONLY;
+
   /// Return true if this operator has flags which may cause this instruction
   /// to evaluate to poison despite having non-poison inputs.
   bool hasPoisonGeneratingFlags() const LLVM_READONLY;
diff --git a/llvm/lib/AsmParser/LLLexer.cpp b/llvm/lib/AsmParser/LLLexer.cpp
index 466bdebc001f589..02e1a1dce3c01b5 100644
--- a/llvm/lib/AsmParser/LLLexer.cpp
+++ b/llvm/lib/AsmParser/LLLexer.cpp
@@ -565,6 +565,7 @@ lltok::Kind LLLexer::LexIdentifier() {
   KEYWORD(nsw);
   KEYWORD(exact);
   KEYWORD(inbounds);
+  KEYWORD(nneg);
   KEYWORD(inrange);
   KEYWORD(addrspace);
   KEYWORD(section);
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 04eabc94cfc6abe..2416814c53c0156 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -6381,8 +6381,17 @@ int LLParser::parseInstruction(Instruction *&Inst, BasicBlock *BB,
   }
 
   // Casts.
+  case lltok::kw_zext: {
+    bool NonNeg = EatIfPresent(lltok::kw_nneg);
+    bool Res = parseCast(Inst, PFS, KeywordVal);
+    if (NonNeg)
+      Inst->setNonNeg();
+    if (Res != 0) {
+      return Res;
+    }
+    return 0;
+  }
   case lltok::kw_trunc:
-  case lltok::kw_zext:
   case lltok::kw_sext:
   case lltok::kw_fptrunc:
   case lltok::kw_fpext:
diff --git a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
index e56291859022eec..28ba79e48e7697d 100644
--- a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -4875,12 +4875,13 @@ Error BitcodeReader::parseFunctionBody(Function *F) {
       Value *Op;
       unsigned OpTypeID;
       if (getValueTypePair(Record, OpNum, NextValueNo, Op, OpTypeID, CurBB) ||
-          OpNum+2 != Record.size())
+          OpNum + 1 > Record.size())
         return error("Invalid record");
 
-      ResTypeID = Record[OpNum];
+      ResTypeID = Record[OpNum++];
       Type *ResTy = getTypeByID(ResTypeID);
-      int Opc = getDecodedCastOpcode(Record[OpNum + 1]);
+      int Opc = getDecodedCastOpcode(Record[OpNum++]);
+
       if (Opc == -1 || !ResTy)
         return error("Invalid record");
       Instruction *Temp = nullptr;
@@ -4892,10 +4893,14 @@ Error BitcodeReader::parseFunctionBody(Function *F) {
         }
       } else {
         auto CastOp = (Instruction::CastOps)Opc;
+
         if (!CastInst::castIsValid(CastOp, Op, ResTy))
           return error("Invalid cast");
         I = CastInst::Create(CastOp, Op, ResTy);
       }
+      if (OpNum < Record.size() && isa<NonNegInstruction>(I) &&
+          (Record[OpNum] & (1 << bitc::NNI_NON_NEG)))
+        I->setNonNeg(true);
       InstructionList.push_back(I);
       break;
     }
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index e991d055f33474b..75ddc058c5bcd79 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -123,6 +123,7 @@ enum {
   FUNCTION_INST_BINOP_ABBREV,
   FUNCTION_INST_BINOP_FLAGS_ABBREV,
   FUNCTION_INST_CAST_ABBREV,
+  FUNCTION_INST_CAST_FLAGS_ABBREV,
   FUNCTION_INST_RET_VOID_ABBREV,
   FUNCTION_INST_RET_VAL_ABBREV,
   FUNCTION_INST_UNREACHABLE_ABBREV,
@@ -1549,6 +1550,9 @@ static uint64_t getOptimizationFlags(const Value *V) {
       Flags |= bitc::AllowContract;
     if (FPMO->hasApproxFunc())
       Flags |= bitc::ApproxFunc;
+  } else if (const auto *NNI = dyn_cast<NonNegInstruction>(V)) {
+    if (NNI->hasNonNeg())
+      Flags |= 1 << bitc::NNI_NON_NEG;
   }
 
   return Flags;
@@ -2825,6 +2829,12 @@ void ModuleBitcodeWriter::writeInstruction(const Instruction &I,
         AbbrevToUse = FUNCTION_INST_CAST_ABBREV;
       Vals.push_back(VE.getTypeID(I.getType()));
       Vals.push_back(getEncodedCastOpcode(I.getOpcode()));
+      uint64_t Flags = getOptimizationFlags(&I);
+      if (Flags != 0) {
+        if (AbbrevToUse == FUNCTION_INST_CAST_ABBREV)
+          AbbrevToUse = FUNCTION_INST_CAST_FLAGS_ABBREV;
+        Vals.push_back(Flags);
+      }
     } else {
       assert(isa<BinaryOperator>(I) && "Unknown instruction!");
       Code = bitc::FUNC_CODE_INST_BINOP;
@@ -3646,6 +3656,18 @@ void ModuleBitcodeWriter::writeBlockInfo() {
         FUNCTION_INST_CAST_ABBREV)
       llvm_unreachable("Unexpected abbrev ordering!");
   }
+  { // INST_CAST_FLAGS abbrev for FUNCTION_BLOCK.
+    auto Abbv = std::make_shared<BitCodeAbbrev>();
+    Abbv->Add(BitCodeAbbrevOp(bitc::FUNC_CODE_INST_CAST));
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // OpVal
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,    // dest ty
+                              VE.computeBitsRequiredForTypeIndicies()));
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 4)); // opc
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 8)); // flags
+    if (Stream.EmitBlockInfoAbbrev(bitc::FUNCTION_BLOCK_ID, Abbv) !=
+        FUNCTION_INST_CAST_FLAGS_ABBREV)
+      llvm_unreachable("Unexpected abbrev ordering!");
+  }
 
   { // INST_RET abbrev for FUNCTION_BLOCK.
     auto Abbv = std::make_shared<BitCodeAbbrev>();
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index e190d82127908db..abcee3f599da633 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -1347,6 +1347,9 @@ static void WriteOptimizationInfo(raw_ostream &Out, const User *U) {
   } else if (const GEPOperator *GEP = dyn_cast<GEPOperator>(U)) {
     if (GEP->isInBounds())
       Out << " inbounds";
+  } else if (const auto *NNI = dyn_cast<NonNegInstruction>(U)) {
+    if (NNI->hasNonNeg())
+      Out << " nneg";
   }
 }
 
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index b497951a598cc50..e7c8d1090d95ac1 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -172,6 +172,12 @@ void Instruction::setIsExact(bool b) {
   cast<PossiblyExactOperator>(this)->setIsExact(b);
 }
 
+void Instruction::setNonNeg(bool b) {
+  assert(isa<NonNegInstruction>(this) && "Must be zext");
+  SubclassOptionalData = (SubclassOptionalData & ~NonNegInstruction::NonNeg) |
+                         (b * NonNegInstruction::NonNeg);
+}
+
 bool Instruction::hasNoUnsignedWrap() const {
   return cast<OverflowingBinaryOperator>(this)->hasNoUnsignedWrap();
 }
@@ -180,6 +186,11 @@ bool Instruction::hasNoSignedWrap() const {
   return cast<OverflowingBinaryOperator>(this)->hasNoSignedWrap();
 }
 
+bool Instruction::hasNonNeg() const {
+  assert(isa<NonNegInstruction>(this) && "Must be zext");
+  return (SubclassOptionalData & NonNegInstruction::NonNeg) != 0;
+}
+
 bool Instruction::hasPoisonGeneratingFlags() const {
   return cast<Operator>(this)->hasPoisonGeneratingFlags();
 }
@@ -204,7 +215,12 @@ void Instruction::dropPoisonGeneratingFlags() {
   case Instruction::GetElementPtr:
     cast<GetElementPtrInst>(this)->setIsInBounds(false);
     break;
+
+  case Instruction::ZExt:
+    setNonNeg(false);
+    break;
   }
+
   if (isa<FPMathOperator>(this)) {
     setHasNoNaNs(false);
     setHasNoInfs(false);
@@ -379,6 +395,10 @@ void Instruction::copyIRFlags(const Value *V, bool IncludeWrapFlags) {
   if (auto *SrcGEP = dyn_cast<GetElementPtrInst>(V))
     if (auto *DestGEP = dyn_cast<GetElementPtrInst>(this))
       DestGEP->setIsInBounds(SrcGEP->isInBounds() || DestGEP->isInBounds());
+
+  if (auto *NNI = dyn_cast<NonNegInstruction>(V))
+    if (isa<NonNegInstruction>(this))
+      setNonNeg(NNI->hasNonNeg());
 }
 
 void Instruction::andIRFlags(const Value *V) {
@@ -404,6 +424,10 @@ void Instruction::andIRFlags(const Value *V) {
   if (auto *SrcGEP = dyn_cast<GetElementPtrInst>(V))
     if (auto *DestGEP = dyn_cast<GetElementPtrInst>(this))
       DestGEP->setIsInBounds(SrcGEP->isInBounds() && DestGEP->isInBounds());
+
+  if (auto *NNI = dyn_cast<NonNegInstruction>(V))
+    if (isa<NonNegInstruction>(this))
+      setNonNeg(hasNonNeg() && NNI->hasNonNeg());
 }
 
 const char *Instruction::getOpcodeName(unsigned OpCode) {
diff --git a/llvm/lib/IR/Operator.cpp b/llvm/lib/IR/Operator.cpp
index d2a1f2eb49dafed..a8be4233991d362 100644
--- a/llvm/lib/IR/Operator.cpp
+++ b/llvm/lib/IR/Operator.cpp
@@ -37,6 +37,10 @@ bool Operator::hasPoisonGeneratingFlags() const {
     // Note: inrange exists on constexpr only
     return GEP->isInBounds() || GEP->getInRangeIndex() != std::nullopt;
   }
+  case Instruction::ZExt:
+    if (auto *NNI = dyn_cast<NonNegInstruction>(this))
+      return NNI->hasNonNeg();
+    return false;
   default:
     if (const auto *FP = dyn_cast<FPMathOperator>(this))
       return FP->hasNoNaNs() || FP->hasNoInfs();
diff --git a/llvm/test/Assembler/flags.ll b/llvm/test/Assembler/flags.ll
index 3b54b06b81d4e2b..8331edf52a1699d 100644
--- a/llvm/test/Assembler/flags.ll
+++ b/llvm/test/Assembler/flags.ll
@@ -260,3 +260,9 @@ define i64 @mul_unsigned_ce() {
 	ret i64 mul nuw (i64 ptrtoint (ptr @addr to i64), i64 91)
 }
 
+define i64 @test_zext(i32 %a) {
+; CHECK: %res = zext nneg i32 %a to i64
+  %res = zext nneg i32 %a to i64
+  ret i64 %res
+}
+
diff --git a/llvm/test/Bitcode/flags.ll b/llvm/test/Bitcode/flags.ll
index 6febaa6b40df863..a6e368b7e76327f 100644
--- a/llvm/test/Bitcode/flags.ll
+++ b/llvm/test/Bitcode/flags.ll
@@ -16,6 +16,8 @@ second:                                           ; preds = %first
   %s = add nsw i32 %a, 0                          ; <i32> [#uses=0]
   %us = add nuw nsw i32 %a, 0                     ; <i32> [#uses=0]
   %z = add i32 %a, 0                              ; <i32> [#uses=0]
+  %hh = zext nneg i32 %a to i64
+  %ll = zext i32 %s to i64
   unreachable
 
 first:                                            ; preds = %entry
@@ -24,5 +26,7 @@ first:                                            ; preds = %entry
   %ss = add nsw i32 %a, 0                         ; <i32> [#uses=0]
   %uuss = add nuw nsw i32 %a, 0                   ; <i32> [#uses=0]
   %zz = add i32 %a, 0                             ; <i32> [#uses=0]
+  %kk = zext nneg i32 %a to i64
+  %rr = zext i32 %ss to i64
   br label %second
 }
diff --git a/llvm/test/Transforms/InstCombine/freeze.ll b/llvm/test/Transforms/InstCombine/freeze.ll
index f8e9a757e2bc938..3fde49d08481278 100644
--- a/llvm/test/Transforms/InstCombine/freeze.ll
+++ b/llvm/test/Transforms/InstCombine/freeze.ll
@@ -1116,6 +1116,17 @@ define i32 @freeze_ctpop(i32 %x) {
   ret i32 %fr
 }
 
+define i32 @freeze_zext_nneg(i8 %x) {
+; CHECK-LABEL: @freeze_zext_nneg(
+; CHECK-NEXT:    [[X_FR:%.*]] = freeze i8 [[X:%.*]]
+; CHECK-NEXT:    [[ZEXT:%.*]] = zext i8 [[X_FR]] to i32
+; CHECK-NEXT:    ret i32 [[ZEXT]]
+;
+  %zext = zext nneg i8 %x to i32
+  %fr = freeze i32 %zext
+  ret i32 %fr
+}
+
 !0 = !{}
 !1 = !{i64 4}
 !2 = !{i32 0, i32 100}
diff --git a/llvm/test/Transforms/SimplifyCFG/HoistCode.ll b/llvm/test/Transforms/SimplifyCFG/HoistCode.ll
index 4088ecfc818982f..08cf6cd5be80cf7 100644
--- a/llvm/test/Transforms/SimplifyCFG/HoistCode.ll
+++ b/llvm/test/Transforms/SimplifyCFG/HoistCode.ll
@@ -94,3 +94,33 @@ end:
   %cond = phi fast float [ 0.0, %bb0 ], [ %x, %bb1 ], [ %x, %bb2 ]
   ret float %cond
 }
+
+define i32 @hoist_zext_flags_preserve(i1 %C, i8 %x) {
+; CHECK-LABEL: @hoist_zext_flags_preserve(
+; CHECK-NEXT:  common.ret:
+; CHECK-NEXT:    [[Z1:%.*]] = zext nneg i8 [[X:%.*]] to i32
+; CHECK-NEXT:    ret i32 [[Z1]]
+;
+  br i1 %C, label %T, label %F
+T:
+  %z1 = zext nneg i8 %x to i32
+  ret i32 %z1
+F:
+  %z2 = zext nneg i8 %x to i32
+  ret i32 %z2
+}
+
+define i32 @hoist_zext_flags_drop(i1 %C, i8 %x) {
+; CHECK-LABEL: @hoist_zext_flags_drop(
+; CHECK-NEXT:  common.ret:
+; CHECK-NEXT:    [[Z1:%.*]] = zext i8 [[X:%.*]] to i32
+; CHECK-NEXT:    ret i32 [[Z1]]
+;
+  br i1 %C, label %T, label %F
+T:
+  %z1 = zext nneg i8 %x to i32
+  ret i32 %z1
+F:
+  %z2 = zext i8 %x to i32
+  ret i32 %z2
+}

@nikic
Copy link
Contributor Author

nikic commented Oct 2, 2023

Wonder whether the flag should also be supported on sext. I don't think it would be terribly useful (as we canonicalize sext -> zext, not the other way around), but it might make sense in the interest of consistency.

(The counter-argument is that it increases the places whether flags needs to be preserved by a factor of two, for a case where it's not worthwhile.)

@nikic
Copy link
Contributor Author

nikic commented Oct 2, 2023

Just saw https://discourse.llvm.org/t/aggressive-conversion-of-sext-to-zext-blocks-indvarsimplify/61561/9?u=nikic, which proposed to call this zext nsw instead (and the equivalent sext would be sext nuw). I can definitely see the appeal in reusing existing nowrap flags, though it seems somewhat awkward in this instance, especially with one of the flags always being implied.

@leo-ard
Copy link
Contributor

leo-ard commented Oct 2, 2023

Thanks for doing this PR, I am the author of #67594 and this is very useful. I just had some ideas from your though/comments and wanted to tag along

Wonder whether the flag should also be supported on sext.

I don't think this would be useful, even for consistency, as zext nneg and sext nneg are exactly the same instruction (pad zeros at the front of the value, and we know that the argument is positive). I think a better question is : should we add the nneg attribute to zext or sext ? or even : Is there an underlying instruction, such as ext that could represent what we want ? We could have ext sign ... ext zero ... and ext nneg .... All of the tree scenarios (signed, zero and zero but we know its positive) seems to be mutually exclusive. But at the end, probably that zext nneg is the easiest to implement (compared to renaming sext -> ext sign and zext -> ext zero) and makes the most sens as we are zero-expanding

@nikic
Copy link
Contributor Author

nikic commented Oct 3, 2023

I think a better question is : should we add the nneg attribute to zext or sext ? or even : Is there an underlying instruction, such as ext that could represent what we want ? We could have ext sign ... ext zero ... and ext nneg .... All of the tree scenarios (signed, zero and zero but we know its positive) seems to be mutually exclusive. But at the end, probably that zext nneg is the easiest to implement (compared to renaming sext -> ext sign and zext -> ext zero) and makes the most sens as we are zero-expanding

Yeah, I think this basically comes down the how to hard it is to implement. Adding a new flag to zext is based on the well-understood existing mechanism of poison-generating flags and degrades gracefully: All existing zext folds continue working. If a fold doesn't consider the nneg flag (as will be the case for all folds initially), the worst that can happen is that the flag is dropped or a fold that could make use of it doesn't, but we'll never end up doing something worse than we do now. This allows gradual introduction without breaking anything.

The main disadvantage to zext nneg I see is that zext continues to be "privileged" as our choice of canonicalization. Optimizations expecting a zext can just look for a zext. Optimizations expecting an sext may have to look for a "sext or zext nneg".


For the "common ext instruction" approach, the way I would frame this is probably that there are ext zero, ext sign and ext zero sign, where the latter is both a zero and a sign extend, returning poison if those two options conflict. The "m_ZExt" would match any ext with zero flag (either ext zero or ext zero sign) and m_SExt would do the same with the sign flag. Two exts would be considered "identical if defined" if they have at least one common flag. Flag intersection would then be that common flag. This means that ext zero sign could be combined with both ext zero and ext sign by dropping the appropriate flag. The "drop poison-generating flags" operation would still have to make an arbitrary choice, such that freeze(ext zero sign) would presumably become ext zero (freeze).

The nice thing about this approach is that zext is no longer privileged, the model forces handling zext and sext on the same footing. On the other hand, this would be a major IR change, and it doesn't quite fit into the usual paradigm of poison-generating flags (because these "flags" aren't the kind that can be simply dropped).

(A further extension of this would be to also have plain ext which is extension without forcing the high bits to be zero or sign. In other words, this represents the anyext operation. But I don't think we actually want something like this at the IR level, because this would have to be based on undef bits, which is a concept we'd rather get rid off.)

@nikic
Copy link
Contributor Author

nikic commented Oct 6, 2023

I've submitted an RFC for this change: https://discourse.llvm.org/t/rfc-add-zext-nneg-flag/73914

@@ -11156,6 +11156,9 @@ until it reaches the size of the destination type, ``ty2``.

When zero extending from i1, the result will always be either 0 or 1.

If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description of nneg is very minimal, vs the more detailed description in the RFC and PR. Maybe add to "Overview" something like "The nneg (non-negative) flag, if present, specifies that the operand is non-negative. This property may be used by optimization passes to later convert the zext to a semantically equivalent sext."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! (I dropped the "semantically equivalent" bit because it's a refinement, not an equivalence.)

@nikic
Copy link
Contributor Author

nikic commented Oct 16, 2023

ping

@goldsteinn
Copy link
Contributor

The main disadvantage to zext nneg I see is that zext continues to be "privileged" as our choice of canonicalization. Optimizations expecting a zext can just look for a zext. Optimizations expecting an sext may have to look for a "sext or zext nneg".

This seems easy enough to fix with a new pattern m_SExtLike. Few transforms actually check that opcode is sext and those could almost certainly be refactored relatively easily.

@@ -692,6 +692,20 @@ class CastInst : public UnaryInstruction {
}
};

/// Instruction that can have a nneg flag (only zext).
class NonNegInstruction : public CastInst {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally not a fan of NonNegInstruction as the name. To me atleast, that seems to imply a lot more coverage than just zext.
Maybe a bit more specific NonNegCastInstruction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be best to just drop it entirely? Given that we have just the one instruction supporting the flag, it might be better to stick with isa<ZExtInst> over isa<NonNegInstruction>. There is some risk that this may require future refactoring if we add the flag on additional instructions -- but that might be better than doing an unnecessary over-generalization now.

(We might want to add nneg on GEPs, but in that case the semantics of the flag would be sufficiently different that they could not share any common handling anyway.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to mirror how we use OverflowingBinaryOperator now?
But I'd say if there is even a change we might expand this, keeping the class makes sense but would leave it's name as NonNegCastInstruction for now. The refactor to rename NonNegCastInstruction -> NonNegInstruction will probably be cheaper than if we skip it entirely.
I guess when I made the comment I wasn't thinking the flag may be applied to other instructions, which makes me less opposed to the current name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to mirror how we use OverflowingBinaryOperator now?

Yes. I guess the name is confusing in that it sounds like "an instruction with the nneg flag", while in reality this is "an instruction that may or may not have the nneg flag". A more accurate name, taking inspiration from PossiblyExactOperator, would be PossiblyNonNegInstruction. Or I guess PossiblyNonNegCastInst taking your suggestion into account.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I would go with PossiblyNonNegInstruction. Think with that name (since its so explicitly a query) the CastInst isn't so important.

if (!CastInst::castIsValid(CastOp, Op, ResTy))
return error("Invalid cast");
I = CastInst::Create(CastOp, Op, ResTy);
}
if (OpNum < Record.size() && isa<NonNegInstruction>(I) &&
(Record[OpNum] & (1 << bitc::NNI_NON_NEG)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: think bitc:NNI_NON_NEG can just be 1 << bitc::NNI_NON_NEG (as opposed to the log). All the uses I see are as the mask and that seems to fit the pattern of other enums better as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a reason to keep as log then this should be ResTypeID(1) << ... and below should be uint64_t(1) << ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just following the same convention used for the other flags (OBO and PEO):

if (OpNum < Record.size()) {
if (Opc == Instruction::Add ||
Opc == Instruction::Sub ||
Opc == Instruction::Mul ||
Opc == Instruction::Shl) {
if (Record[OpNum] & (1 << bitc::OBO_NO_SIGNED_WRAP))
cast<BinaryOperator>(I)->setHasNoSignedWrap(true);
if (Record[OpNum] & (1 << bitc::OBO_NO_UNSIGNED_WRAP))
cast<BinaryOperator>(I)->setHasNoUnsignedWrap(true);
} else if (Opc == Instruction::SDiv ||
Opc == Instruction::UDiv ||
Opc == Instruction::LShr ||
Opc == Instruction::AShr) {
if (Record[OpNum] & (1 << bitc::PEO_EXACT))
cast<BinaryOperator>(I)->setIsExact(true);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay to keep as log then (although I'd say preferably all would be changed to be the mask), but then can you cast 1 to the proper shift width. Otherwise just feels like bug waiting to happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the "bug waiting to happen" that we are going to add 30 more flags to this instructions? That seems like a pretty contrived concern, especially when instructions can only have up to 7 flags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough

karouzakisp and others added 3 commits October 18, 2023 09:35
Add an nneg flag to the zext instruction, which specifies that the
argument is non-negative. Otherwise, the result is a poison value.

The primary use-case for the flag is to preserve information when
sext gets replaced with zext due to range-based canonicalization.
The nneg flag allows us to convert the zext back into an sext
later. This is useful for some optimizations (e.g. a signed icmp
can fold with sext but not zext), as well as some targets (e.g.
RISCV prefers sext over zext).

Co-authored-by: Nikita Popov <npopov@redhat.com>
@nikic
Copy link
Contributor Author

nikic commented Oct 26, 2023

ping

@goldsteinn
Copy link
Contributor

LGTM, but if another could also check it this is not an area I'm fully qualified to review.

@@ -37,6 +37,10 @@ bool Operator::hasPoisonGeneratingFlags() const {
// Note: inrange exists on constexpr only
return GEP->isInBounds() || GEP->getInRangeIndex() != std::nullopt;
}
case Instruction::ZExt:
if (auto *NNI = dyn_cast<PossiblyNonNegInst>(this))
Copy link
Collaborator

@topperc topperc Oct 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a dyn_cast instead of a cast like the other cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in case it's a constant expression. I haven't completely finished the zext expr removal yet.

Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nikic nikic merged commit ed3f06b into llvm:main Oct 30, 2023
4 checks passed
preames added a commit that referenced this pull request Oct 31, 2023
zext nneg was recently added to the IR in #67982. Teaching SCEVExpander
to emit nneg when possible is valuable since SCEV may have proved
non-trivial facts about loop bounds which would otherwise be lost when
materializing the value.
preames added a commit to preames/llvm-project that referenced this pull request Nov 2, 2023
This fixes a miscompile introduced in the recent llvm#67982, and likely exposed
in changes since to infer and leverage the same.  No active bug reports as
of yet.

This was noticed in llvm#70858 (comment).
preames added a commit that referenced this pull request Nov 2, 2023
This fixes a miscompile introduced in the recent #67982, and likely
exposed in changes since to infer and leverage the same. No active bug
reports as of yet.

This was noticed in
#70858 (comment).
preames added a commit to preames/llvm-project that referenced this pull request Nov 2, 2023
zext nneg was recently added to the IR in llvm#67982.   This patch teaches
demanded bits and known bits about the semantics of the instruction, and
adds a couple of test cases to illustrate basic functionality.
preames added a commit that referenced this pull request Nov 3, 2023
zext nneg was recently added to the IR in #67982.  This patch teaches
SimplifyIndVars to prefer zext nneg over *both* sext and plain zext,
when a local SCEV query indicates the source is non-negative.

The choice to prefer zext nneg over sext looks slightly aggressive
here, but probably isn't so much in practice.  For cases where we'd
"remember" the range fact, instcombine would convert the sext into
a zext nneg anyways.  The only cases where this produces a different
result overall are when SCEV knows a non-local fact, and it doesn't
get materialized into the IR.  Those are exactly the cases where
using zext nneg are most useful.  We do run the risk of e.g. a
missing combine - since we haven't updated most of them yet - but
that seems like a manageable risk.

Note that there are much deeper algorithmic changes we could make
to this code to exploit zext nneg, but this seemed like a reasonable
and low risk starting point.
preames added a commit that referenced this pull request Nov 7, 2023
zext nneg was recently added to the IR in #67982.   This patch teaches
demanded bits and known bits about the semantics of the instruction, and
adds a couple of test cases to illustrate basic functionality.
dtcxzyw pushed a commit that referenced this pull request Nov 12, 2023
This PR fixes #55013 : the
max intrinsics is not generated for this simple loop case :
https://godbolt.org/z/hxz1xhMPh. This is caused by a ICMP not being
folded into a select, thus not generating the max intrinsics.

For the story :

Since LLVM 14, SCCP pass got smarter by folding sext into zext for
positive ranges : https://reviews.llvm.org/D81756. After this change,
InstCombine was sometimes unable to fold ICMP correctly as both of the
arguments pointed to mismatched zext/sext. To fix this, @rotateright
implemented this fix : https://reviews.llvm.org/D124419 that tries to
resolve the mismatch by knowing if the argument of a zext is positive
(in which case, it is like a sext) by using ValueTracking, however
ValueTracking is not smart enough to infer that the value is positive in
some cases. Recently, @nikic implemented #67982 which keeps the
information that a zext is non-negative. This PR simply uses this
information to do the folding accordingly.

TLDR : This PR uses the recent nneg tag on zext to fold the icmp
accordingly in instcombine.

This PR also contains test cases for sext/zext folding with InstCombine
as well as a x86 regression tests for the max/min case.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
This PR fixes llvm#55013 : the
max intrinsics is not generated for this simple loop case :
https://godbolt.org/z/hxz1xhMPh. This is caused by a ICMP not being
folded into a select, thus not generating the max intrinsics.

For the story :

Since LLVM 14, SCCP pass got smarter by folding sext into zext for
positive ranges : https://reviews.llvm.org/D81756. After this change,
InstCombine was sometimes unable to fold ICMP correctly as both of the
arguments pointed to mismatched zext/sext. To fix this, @rotateright
implemented this fix : https://reviews.llvm.org/D124419 that tries to
resolve the mismatch by knowing if the argument of a zext is positive
(in which case, it is like a sext) by using ValueTracking, however
ValueTracking is not smart enough to infer that the value is positive in
some cases. Recently, @nikic implemented llvm#67982 which keeps the
information that a zext is non-negative. This PR simply uses this
information to do the folding accordingly.

TLDR : This PR uses the recent nneg tag on zext to fold the icmp
accordingly in instcombine.

This PR also contains test cases for sext/zext folding with InstCombine
as well as a x86 regression tests for the max/min case.
nikic pushed a commit that referenced this pull request Nov 29, 2023
…tions (#73592)

This flag was added in #67982, but was not yet accessible via the C API.
This commit adds a getter/setter for this flag, and a test for it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants