[GlobalIsel] Import GEP flags #93850

Merged: 6 commits, Jun 14, 2024

llvmbot commented May 30, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-aarch64

Author: Thorsten Schütt (tschuett)

Changes

#90824


Patch is 275.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93850.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+2-4)
  • (modified) llvm/lib/CodeGen/MachineInstr.cpp (+5)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll (+4262-627)
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index 5289b993476db..a93856257dfba 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -1583,10 +1583,8 @@ bool IRTranslator::translateGetElementPtr(const User &U,
   LLT OffsetTy = getLLTForType(*OffsetIRTy, *DL);
 
   uint32_t Flags = 0;
-  if (isa<Instruction>(U)) {
-    const Instruction &I = cast<Instruction>(U);
-    Flags = MachineInstr::copyFlagsFromInstruction(I);
-  }
+  if (const Instruction *I = dyn_cast<Instruction>(&U))
+    Flags = MachineInstr::copyFlagsFromInstruction(*I);
 
   // Normalize Vector GEP - all scalar operands should be converted to the
   // splat vector.
diff --git a/llvm/lib/CodeGen/MachineInstr.cpp b/llvm/lib/CodeGen/MachineInstr.cpp
index 02479f31f0b69..b3c0abe4688eb 100644
--- a/llvm/lib/CodeGen/MachineInstr.cpp
+++ b/llvm/lib/CodeGen/MachineInstr.cpp
@@ -576,6 +576,11 @@ uint32_t MachineInstr::copyFlagsFromInstruction(const Instruction &I) {
       MIFlags |= MachineInstr::MIFlag::NoSWrap;
     if (TI->hasNoUnsignedWrap())
       MIFlags |= MachineInstr::MIFlag::NoUWrap;
+  } else if (const GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(&I)) {
+    if (GEP->hasNoUnsignedSignedWrap())
+      MIFlags |= MachineInstr::MIFlag::NoSWrap;
+    if (GEP->hasNoUnsignedWrap())
+      MIFlags |= MachineInstr::MIFlag::NoUWrap;
   }
 
   // Copy the nonneg flag.
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
index a61931b898aea..28c4965d647d7 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
@@ -1,3 +1,4 @@
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc -O0 -aarch64-enable-atomic-cfg-tidy=0 -mattr=+lse -stop-after=irtranslator -global-isel -verify-machineinstrs %s -o - 2>&1 | FileCheck %s
 ; RUN: llc -O3 -aarch64-enable-atomic-cfg-tidy=0 -mattr=+lse -stop-after=irtranslator -global-isel -verify-machineinstrs %s -o - 2>&1 | FileCheck %s --check-prefix=O3
 
@@ -14,6 +15,25 @@ target triple = "aarch64--"
 ; CHECK-NEXT: $x0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $x0
 define i64 @addi64(i64 %arg1, i64 %arg2) {
+  ; CHECK-LABEL: name: addi64
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $x0, $x1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; CHECK-NEXT:   [[ADD:%[0-9]+]]:_(s64) = G_ADD [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $x0 = COPY [[ADD]](s64)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  ;
+  ; O3-LABEL: name: addi64
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $x0, $x1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; O3-NEXT:   [[ADD:%[0-9]+]]:_(s64) = G_ADD [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $x0 = COPY [[ADD]](s64)
+  ; O3-NEXT:   RET_ReallyLR implicit $x0
   %res = add i64 %arg1, %arg2
   ret i64 %res
 }
@@ -25,6 +45,25 @@ define i64 @addi64(i64 %arg1, i64 %arg2) {
 ; CHECK-NEXT: $x0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $x0
 define i64 @muli64(i64 %arg1, i64 %arg2) {
+  ; CHECK-LABEL: name: muli64
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $x0, $x1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $x0 = COPY [[MUL]](s64)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  ;
+  ; O3-LABEL: name: muli64
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $x0, $x1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; O3-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $x0 = COPY [[MUL]](s64)
+  ; O3-NEXT:   RET_ReallyLR implicit $x0
   %res = mul i64 %arg1, %arg2
   ret i64 %res
 }
@@ -47,6 +86,21 @@ define i64 @muli64(i64 %arg1, i64 %arg2) {
 ; CHECK: %{{[0-9]+}}:_(p0) = G_FRAME_INDEX %stack.2.ptr3
 ; CHECK: %{{[0-9]+}}:_(p0) = G_FRAME_INDEX %stack.3.ptr4
 define void @allocai64() {
+  ; CHECK-LABEL: name: allocai64
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   [[FRAME_INDEX:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.0.ptr1
+  ; CHECK-NEXT:   [[FRAME_INDEX1:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.1.ptr2
+  ; CHECK-NEXT:   [[FRAME_INDEX2:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.2.ptr3
+  ; CHECK-NEXT:   [[FRAME_INDEX3:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.3.ptr4
+  ; CHECK-NEXT:   RET_ReallyLR
+  ;
+  ; O3-LABEL: name: allocai64
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   [[FRAME_INDEX:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.0.ptr1
+  ; O3-NEXT:   [[FRAME_INDEX1:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.1.ptr2
+  ; O3-NEXT:   [[FRAME_INDEX2:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.2.ptr3
+  ; O3-NEXT:   [[FRAME_INDEX3:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.3.ptr4
+  ; O3-NEXT:   RET_ReallyLR
   %ptr1 = alloca i64
   %ptr2 = alloca i64, align 1
   %ptr3 = alloca i64, i32 16
@@ -75,6 +129,23 @@ define void @allocai64() {
 ; CHECK-NEXT: successors: %[[END]](0x80000000)
 ; CHECK: G_BR %[[END]]
 define void @uncondbr() {
+  ; CHECK-LABEL: name: uncondbr
+  ; CHECK: bb.1.entry:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   G_BR %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2.end:
+  ; CHECK-NEXT:   RET_ReallyLR
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3.bb2:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   G_BR %bb.2
+  ;
+  ; O3-LABEL: name: uncondbr
+  ; O3: bb.1.entry:
+  ; O3-NEXT:   RET_ReallyLR
 entry:
   br label %bb2
 end:
@@ -90,6 +161,21 @@ bb2:
 ; CHECK: [[END]].{{[a-zA-Z0-9.]+}}:
 ; CHECK-NEXT: RET_ReallyLR
 define void @uncondbr_fallthrough() {
+  ; CHECK-LABEL: name: uncondbr_fallthrough
+  ; CHECK: bb.1.entry:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   G_BR %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2.end:
+  ; CHECK-NEXT:   RET_ReallyLR
+  ;
+  ; O3-LABEL: name: uncondbr_fallthrough
+  ; O3: bb.1.entry:
+  ; O3-NEXT:   successors: %bb.2(0x80000000)
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT: bb.2.end:
+  ; O3-NEXT:   RET_ReallyLR
 entry:
   br label %end
 end:
@@ -119,6 +205,37 @@ end:
 ; CHECK: [[FALSE]].{{[a-zA-Z0-9.]+}}:
 ; CHECK-NEXT: RET_ReallyLR
 define void @condbr(ptr %tstaddr) {
+  ; CHECK-LABEL: name: condbr
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.3(0x40000000)
+  ; CHECK-NEXT:   liveins: $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[LOAD:%[0-9]+]]:_(s1) = G_LOAD [[COPY]](p0) :: (load (s1) from %ir.tstaddr)
+  ; CHECK-NEXT:   G_BRCOND [[LOAD]](s1), %bb.2
+  ; CHECK-NEXT:   G_BR %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2.true:
+  ; CHECK-NEXT:   RET_ReallyLR
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3.false:
+  ; CHECK-NEXT:   RET_ReallyLR
+  ;
+  ; O3-LABEL: name: condbr
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   successors: %bb.2(0x40000000), %bb.3(0x40000000)
+  ; O3-NEXT:   liveins: $x0
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; O3-NEXT:   [[LOAD:%[0-9]+]]:_(s1) = G_LOAD [[COPY]](p0) :: (load (s1) from %ir.tstaddr)
+  ; O3-NEXT:   G_BRCOND [[LOAD]](s1), %bb.2
+  ; O3-NEXT:   G_BR %bb.3
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT: bb.2.true:
+  ; O3-NEXT:   RET_ReallyLR
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT: bb.3.false:
+  ; O3-NEXT:   RET_ReallyLR
   %tst = load i1, ptr %tstaddr
   br i1 %tst, label %true, label %false
 true:
@@ -149,6 +266,58 @@ false:
 @indirectbr.L = internal unnamed_addr constant [3 x ptr] [ptr blockaddress(@indirectbr, %L1), ptr blockaddress(@indirectbr, %L2), ptr null], align 8
 
 define void @indirectbr() {
+  ; CHECK-LABEL: name: indirectbr
+  ; CHECK: bb.1.entry:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+  ; CHECK-NEXT:   [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @indirectbr.L
+  ; CHECK-NEXT:   [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+  ; CHECK-NEXT:   G_BR %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2.L1 (ir-block-address-taken %ir-block.L1):
+  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.3(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[PHI:%[0-9]+]]:_(s32) = G_PHI [[C1]](s32), %bb.1, %2(s32), %bb.2
+  ; CHECK-NEXT:   [[ADD:%[0-9]+]]:_(s32) = G_ADD [[PHI]], [[C]]
+  ; CHECK-NEXT:   [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[PHI]](s32)
+  ; CHECK-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
+  ; CHECK-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[ZEXT]], [[C2]]
+  ; CHECK-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[GV]], [[MUL]](s64)
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
+  ; CHECK-NEXT:   [[LOAD:%[0-9]+]]:_(p0) = G_LOAD [[COPY]](p0) :: (load (p0) from %ir.arrayidx)
+  ; CHECK-NEXT:   G_BRINDIRECT [[LOAD]](p0)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3.L2 (ir-block-address-taken %ir-block.L2):
+  ; CHECK-NEXT:   RET_ReallyLR
+  ;
+  ; O3-LABEL: name: indirectbr
+  ; O3: bb.1.entry:
+  ; O3-NEXT:   successors: %bb.3(0x80000000)
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+  ; O3-NEXT:   [[GV:%[0-9]+]]:_(p0) = G_GLOBAL_VALUE @indirectbr.L
+  ; O3-NEXT:   [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+  ; O3-NEXT:   G_BR %bb.3
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT: bb.2.L1 (ir-block-address-taken %ir-block.L1):
+  ; O3-NEXT:   successors: %bb.3(0x80000000)
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT: bb.3..split:
+  ; O3-NEXT:   successors: %bb.2(0x40000000), %bb.4(0x40000000)
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[PHI:%[0-9]+]]:_(s32) = G_PHI %2(s32), %bb.2, [[C1]](s32), %bb.1
+  ; O3-NEXT:   [[ADD:%[0-9]+]]:_(s32) = G_ADD [[PHI]], [[C]]
+  ; O3-NEXT:   [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[PHI]](s32)
+  ; O3-NEXT:   [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
+  ; O3-NEXT:   [[MUL:%[0-9]+]]:_(s64) = G_MUL [[ZEXT]], [[C2]]
+  ; O3-NEXT:   [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[GV]], [[MUL]](s64)
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY [[PTR_ADD]](p0)
+  ; O3-NEXT:   [[LOAD:%[0-9]+]]:_(p0) = G_LOAD [[COPY]](p0) :: (invariant load (p0) from %ir.arrayidx)
+  ; O3-NEXT:   G_BRINDIRECT [[LOAD]](p0)
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT: bb.4.L2 (ir-block-address-taken %ir-block.L2):
+  ; O3-NEXT:   RET_ReallyLR
 entry:
   br label %L1
 L1:                                               ; preds = %entry, %L1
@@ -170,6 +339,25 @@ L2:                                               ; preds = %L1
 ; CHECK-NEXT: $x0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $x0
 define i64 @ori64(i64 %arg1, i64 %arg2) {
+  ; CHECK-LABEL: name: ori64
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $x0, $x1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; CHECK-NEXT:   [[OR:%[0-9]+]]:_(s64) = G_OR [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $x0 = COPY [[OR]](s64)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  ;
+  ; O3-LABEL: name: ori64
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $x0, $x1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; O3-NEXT:   [[OR:%[0-9]+]]:_(s64) = G_OR [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $x0 = COPY [[OR]](s64)
+  ; O3-NEXT:   RET_ReallyLR implicit $x0
   %res = or i64 %arg1, %arg2
   ret i64 %res
 }
@@ -181,6 +369,25 @@ define i64 @ori64(i64 %arg1, i64 %arg2) {
 ; CHECK-NEXT: $w0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $w0
 define i32 @ori32(i32 %arg1, i32 %arg2) {
+  ; CHECK-LABEL: name: ori32
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w0, $w1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[OR:%[0-9]+]]:_(s32) = G_OR [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $w0 = COPY [[OR]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+  ;
+  ; O3-LABEL: name: ori32
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $w0, $w1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; O3-NEXT:   [[OR:%[0-9]+]]:_(s32) = G_OR [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $w0 = COPY [[OR]](s32)
+  ; O3-NEXT:   RET_ReallyLR implicit $w0
   %res = or i32 %arg1, %arg2
   ret i32 %res
 }
@@ -193,6 +400,25 @@ define i32 @ori32(i32 %arg1, i32 %arg2) {
 ; CHECK-NEXT: $x0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $x0
 define i64 @xori64(i64 %arg1, i64 %arg2) {
+  ; CHECK-LABEL: name: xori64
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $x0, $x1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; CHECK-NEXT:   [[XOR:%[0-9]+]]:_(s64) = G_XOR [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $x0 = COPY [[XOR]](s64)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  ;
+  ; O3-LABEL: name: xori64
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $x0, $x1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; O3-NEXT:   [[XOR:%[0-9]+]]:_(s64) = G_XOR [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $x0 = COPY [[XOR]](s64)
+  ; O3-NEXT:   RET_ReallyLR implicit $x0
   %res = xor i64 %arg1, %arg2
   ret i64 %res
 }
@@ -204,6 +430,25 @@ define i64 @xori64(i64 %arg1, i64 %arg2) {
 ; CHECK-NEXT: $w0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $w0
 define i32 @xori32(i32 %arg1, i32 %arg2) {
+  ; CHECK-LABEL: name: xori32
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w0, $w1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[XOR:%[0-9]+]]:_(s32) = G_XOR [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $w0 = COPY [[XOR]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+  ;
+  ; O3-LABEL: name: xori32
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $w0, $w1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; O3-NEXT:   [[XOR:%[0-9]+]]:_(s32) = G_XOR [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $w0 = COPY [[XOR]](s32)
+  ; O3-NEXT:   RET_ReallyLR implicit $w0
   %res = xor i32 %arg1, %arg2
   ret i32 %res
 }
@@ -216,6 +461,25 @@ define i32 @xori32(i32 %arg1, i32 %arg2) {
 ; CHECK-NEXT: $x0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $x0
 define i64 @andi64(i64 %arg1, i64 %arg2) {
+  ; CHECK-LABEL: name: andi64
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $x0, $x1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; CHECK-NEXT:   [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $x0 = COPY [[AND]](s64)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  ;
+  ; O3-LABEL: name: andi64
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $x0, $x1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; O3-NEXT:   [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $x0 = COPY [[AND]](s64)
+  ; O3-NEXT:   RET_ReallyLR implicit $x0
   %res = and i64 %arg1, %arg2
   ret i64 %res
 }
@@ -227,6 +491,25 @@ define i64 @andi64(i64 %arg1, i64 %arg2) {
 ; CHECK-NEXT: $w0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $w0
 define i32 @andi32(i32 %arg1, i32 %arg2) {
+  ; CHECK-LABEL: name: andi32
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w0, $w1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $w0 = COPY [[AND]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+  ;
+  ; O3-LABEL: name: andi32
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $w0, $w1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; O3-NEXT:   [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $w0 = COPY [[AND]](s32)
+  ; O3-NEXT:   RET_ReallyLR implicit $w0
   %res = and i32 %arg1, %arg2
   ret i32 %res
 }
@@ -239,6 +522,25 @@ define i32 @andi32(i32 %arg1, i32 %arg2) {
 ; CHECK-NEXT: $x0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $x0
 define i64 @subi64(i64 %arg1, i64 %arg2) {
+  ; CHECK-LABEL: name: subi64
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $x0, $x1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; CHECK-NEXT:   [[SUB:%[0-9]+]]:_(s64) = G_SUB [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $x0 = COPY [[SUB]](s64)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  ;
+  ; O3-LABEL: name: subi64
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $x0, $x1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
+  ; O3-NEXT:   [[SUB:%[0-9]+]]:_(s64) = G_SUB [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $x0 = COPY [[SUB]](s64)
+  ; O3-NEXT:   RET_ReallyLR implicit $x0
   %res = sub i64 %arg1, %arg2
   ret i64 %res
 }
@@ -250,6 +552,25 @@ define i64 @subi64(i64 %arg1, i64 %arg2) {
 ; CHECK-NEXT: $w0 = COPY [[RES]]
 ; CHECK-NEXT: RET_ReallyLR implicit $w0
 define i32 @subi32(i32 %arg1, i32 %arg2) {
+  ; CHECK-LABEL: name: subi32
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $w0, $w1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; CHECK-NEXT:   [[SUB:%[0-9]+]]:_(s32) = G_SUB [[COPY]], [[COPY1]]
+  ; CHECK-NEXT:   $w0 = COPY [[SUB]](s32)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $w0
+  ;
+  ; O3-LABEL: name: subi32
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $w0, $w1
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s32) = COPY $w0
+  ; O3-NEXT:   [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
+  ; O3-NEXT:   [[SUB:%[0-9]+]]:_(s32) = G_SUB [[COPY]], [[COPY1]]
+  ; O3-NEXT:   $w0 = COPY [[SUB]](s32)
+  ; O3-NEXT:   RET_ReallyLR implicit $w0
   %res = sub i32 %arg1, %arg2
   ret i32 %res
 }
@@ -260,6 +581,23 @@ define i32 @subi32(i32 %arg1, i32 %arg2) {
 ; CHECK: $x0 = COPY [[RES]]
 ; CHECK: RET_ReallyLR implicit $x0
 define i64 @ptrtoint(ptr %a) {
+  ; CHECK-LABEL: name: ptrtoint
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; CHECK-NEXT:   [[PTRTOINT:%[0-9]+]]:_(s64) = G_PTRTOINT [[COPY]](p0)
+  ; CHECK-NEXT:   $x0 = COPY [[PTRTOINT]](s64)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  ;
+  ; O3-LABEL: name: ptrtoint
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $x0
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(p0) = COPY $x0
+  ; O3-NEXT:   [[PTRTOINT:%[0-9]+]]:_(s64) = G_PTRTOINT [[COPY]](p0)
+  ; O3-NEXT:   $x0 = COPY [[PTRTOINT]](s64)
+  ; O3-NEXT:   RET_ReallyLR implicit $x0
   %val = ptrtoint ptr %a to i64
   ret i64 %val
 }
@@ -270,6 +608,23 @@ define i64 @ptrtoint(ptr %a) {
 ; CHECK: $x0 = COPY [[RES]]
 ; CHECK: RET_ReallyLR implicit $x0
 define ptr @inttoptr(i64 %a) {
+  ; CHECK-LABEL: name: inttoptr
+  ; CHECK: bb.1 (%ir-block.0):
+  ; CHECK-NEXT:   liveins: $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; CHECK-NEXT:   [[INTTOPTR:%[0-9]+]]:_(p0) = G_INTTOPTR [[COPY]](s64)
+  ; CHECK-NEXT:   $x0 = COPY [[INTTOPTR]](p0)
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  ;
+  ; O3-LABEL: name: inttoptr
+  ; O3: bb.1 (%ir-block.0):
+  ; O3-NEXT:   liveins: $x0
+  ; O3-NEXT: {{  $}}
+  ; O3-NEXT:   [[COPY:%[0-9]+]]:_(s64) = COPY $x0
+  ; O3-NEXT:   [[INTTOPTR:%[0-9]+]]:_(p0) = G_INTTOPTR [[COPY]](s64)
+  ; O3-NEXT:   $x0 = COPY [[INTTOPTR]](p0)
+  ; O3-NEXT:   RET_ReallyLR implicit $x0
   %val = intto...
[truncated]

@tschuett tschuett requested a review from arsenm May 30, 2024 17:49
@@ -576,6 +576,11 @@ uint32_t MachineInstr::copyFlagsFromInstruction(const Instruction &I) {
       MIFlags |= MachineInstr::MIFlag::NoSWrap;
     if (TI->hasNoUnsignedWrap())
       MIFlags |= MachineInstr::MIFlag::NoUWrap;
+  } else if (const GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(&I)) {
+    if (GEP->hasNoUnsignedSignedWrap())
+      MIFlags |= MachineInstr::MIFlag::NoSWrap;
Contributor

Is it equivalent to nusw?

Author

See

bool hasNoUnsignedSignedWrap() const;

Contributor

I've read the RFC and got the impression that nsw and nusw are different.
CC @nikic

Author

The comment reads:
/// Determine whether the GEP has the nusw flag.

Contributor

That's right. But you're setting an nsw flag. Shouldn't we introduce nusw?

Author

The comment reads:
/// Determine whether the GEP has the nusw flag.

Contributor

Yes, nusw and nsw are not the same thing. You can transfer nsw to the scale multiplication, but not to the PTRADD, as this PR appears to be doing.
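
To make the distinction concrete, here is a minimal sketch in a toy 8-bit model. It assumes the LangRef reading of `nusw` for the final address computation (base taken as unsigned, offset taken as signed, result must stay in the address range) and compares it with a plain signed add. The helper names are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for a toy 8-bit address space; not LLVM APIs.

// True if a + b leaves the signed 8-bit range when both operands are read
// as signed values. An `nsw` add promises this never happens.
constexpr bool signedAddWraps(int8_t a, int8_t b) {
  return int(a) + int(b) < INT8_MIN || int(a) + int(b) > INT8_MAX;
}

// True if base + offset leaves the 8-bit address range when the base is
// read as unsigned and the offset as signed. This models the final
// address computation that `nusw` constrains.
constexpr bool unsignedSignedAddWraps(uint8_t base, int8_t offset) {
  return int(base) + int(offset) < 0 || int(base) + int(offset) > UINT8_MAX;
}

// Address 0x7F plus 1 is the valid address 0x80, so the nusw-style check
// passes, yet the same bits added as two signed values overflow (127 + 1).
static_assert(!unsignedSignedAddWraps(0x7F, 1), "nusw-style add is fine");
static_assert(signedAddWraps(127, 1), "signed add overflows");

// The other direction: address 0xFF plus 1 runs off the end of the address
// space, while the signed view of the same bits (-1 + 1) is harmless.
static_assert(unsignedSignedAddWraps(0xFF, 1), "nusw-style add wraps");
static_assert(!signedAddWraps(-1, 1), "signed add is fine");
```

In this model neither property implies the other, which is why a flag that is sound on the offset arithmetic may not be sound on the pointer add itself.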

@tschuett
Author

MachineInstr::copyFlagsFromInstruction had no support for getelementptr. This PR adds support for copying the flags. IRTranslator::translateGetElementPtr blindly takes the flags from the GEP. In some cases, it already added nuw to the ptradd. There is no filter mechanism in translateGetElementPtr.
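
The mapping described above can be sketched in isolation. This is a standalone model of the branch added to `copyFlagsFromInstruction` in the diff, not the LLVM code itself: `GepFlags`, `MIFlagBits`, `copyGepFlags`, and the bit values are hypothetical stand-ins for `GetElementPtrInst` and `MachineInstr::MIFlag`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two GEP flags the diff queries via
// hasNoUnsignedSignedWrap() and hasNoUnsignedWrap().
struct GepFlags {
  bool nusw;
  bool nuw;
};

// Hypothetical bit values; the real ones live in MachineInstr::MIFlag.
enum MIFlagBits : uint32_t {
  NoUWrap = 1u << 0,
  NoSWrap = 1u << 1,
};

// Mirrors the shape of the added branch: nusw is recorded as NoSWrap and
// nuw as NoUWrap, with no filtering of either flag.
constexpr uint32_t copyGepFlags(GepFlags F) {
  return (F.nusw ? uint32_t(NoSWrap) : 0u) | (F.nuw ? uint32_t(NoUWrap) : 0u);
}

static_assert(copyGepFlags(GepFlags{true, false}) == NoSWrap, "nusw only");
static_assert(copyGepFlags(GepFlags{false, true}) == NoUWrap, "nuw only");
static_assert(copyGepFlags(GepFlags{false, false}) == 0u, "no flags");
```

The first assertion is the case the reviewers question: a GEP carrying only `nusw` ends up with the signed no-wrap bit.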

@tschuett
Author

Are there still any outstanding issues with importing the GEP flags to ptradd?

  %v2 = load i32, ptr %gep2
  %res = add i32 %v1, %v2
  ret i32 %res
}
Contributor

Test a vector case?

Author

The GEP is a vector.

Author

I would prefer not to touch this code: 1. It is copy-and-paste. 2. There are odd rules for attaching flags to ptradd.

Contributor

None of these GEPs are vectors.

Author

My concern is with:

MIRBuilder.buildPtrAdd(getOrCreateVReg(U), BaseReg, OffsetMIB.getReg(0),

In one corner case, the IRTranslator attaches flags to ptradds. I am afraid that there are many GEPs that do not hit this line.
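
A rough way to picture this concern is a toy model of the lowering, written under assumptions: constant offsets are accumulated, each variable index becomes a `G_MUL` plus an unflagged `G_PTR_ADD`, and only a trailing constant offset reaches the flagged call. The opcode sequence for a single variable index matches the `indirectbr` CHECK lines earlier in this thread (`G_MUL`, `G_PTR_ADD`, `COPY`); everything else, including `Index`, `lowerGep`, and the flush step, is a hypothetical simplification and not IRTranslator code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy model, not IRTranslator code. An index is either a
// constant byte offset or a variable index scaled by an element size.
struct Index {
  bool isConstant;
  int64_t value;  // byte offset if constant, element size if variable
};

// Returns the generic opcodes emitted for one GEP in this model. Only the
// trailing constant-offset G_PTR_ADD is prefixed with the flags string.
std::vector<std::string> lowerGep(const std::vector<Index> &indices,
                                  const std::string &flags) {
  std::vector<std::string> ops;
  int64_t pending = 0;  // constant bytes not yet materialized
  for (const Index &idx : indices) {
    if (idx.isConstant) {
      pending += idx.value;
      continue;
    }
    if (pending != 0) {  // assumed: flush constants before a variable index
      ops.push_back("G_PTR_ADD");
      pending = 0;
    }
    ops.push_back("G_MUL");      // index * element size
    ops.push_back("G_PTR_ADD");  // no flags on this path
  }
  if (pending != 0)
    ops.push_back(flags + "G_PTR_ADD");  // the single flagged call site
  else
    ops.push_back("COPY");
  return ops;
}
```

In this model, a GEP with only a variable index never produces a flagged `G_PTR_ADD`, while a GEP with only a constant offset produces exactly one.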

Contributor

@arsenm left a comment

Should really fix whatever the vector issue is, though.

@tschuett
Author

Thanks!

@tschuett tschuett merged commit b1f9440 into llvm:main Jun 14, 2024
7 checks passed
@tschuett tschuett deleted the gisel-gep-flags branch June 14, 2024 18:56
5 participants