Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GlobalISel][ARM] Legalization of G_CONSTANT using constant pool #98308

Merged
merged 4 commits into from
Oct 14, 2024

Conversation

spavloff
Copy link
Collaborator

@spavloff spavloff commented Jul 10, 2024

ARM uses complex encoding of immediate values using small number of bits. As a result, some values cannot be represented as immediate operands, they need to be synthesized in a register. This change implements legalization of such constants with loading values from constant pool.

@spavloff spavloff self-assigned this Jul 10, 2024
Copy link

github-actions bot commented Jul 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@tschuett
Copy link

Can we revisit #96225 before landing this PR?

@spavloff
Copy link
Collaborator Author

Can we revisit #96225 before landing this PR?

Sure!

@llvmbot
Copy link
Member

llvmbot commented Jul 10, 2024

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-arm

Author: Serge Pavlov (spavloff)

Changes

ARM uses complex encoding of immediate values using small number of bits. As a result, some values cannot be represented as immediate operands, they need to be synthesized in a register. This change implements legalization of such constants with loading values from constant pool.


Full diff: https://github.com/llvm/llvm-project/pull/98308.diff

7 Files Affected:

  • (modified) llvm/lib/Target/ARM/ARMInstructionSelector.cpp (+20)
  • (modified) llvm/lib/Target/ARM/ARMLegalizerInfo.cpp (+12-2)
  • (modified) llvm/lib/Target/ARM/ARMLegalizerInfo.h (+2)
  • (modified) llvm/lib/Target/ARM/ARMRegisterBankInfo.cpp (+1)
  • (added) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-const.mir (+26)
  • (modified) llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-fp.mir (+16-6)
  • (added) llvm/test/CodeGen/ARM/GlobalISel/select-constpool.mir (+31)
diff --git a/llvm/lib/Target/ARM/ARMInstructionSelector.cpp b/llvm/lib/Target/ARM/ARMInstructionSelector.cpp
index 3b3c380e1e1b3..713f8da9b05c5 100644
--- a/llvm/lib/Target/ARM/ARMInstructionSelector.cpp
+++ b/llvm/lib/Target/ARM/ARMInstructionSelector.cpp
@@ -1103,6 +1103,26 @@ bool ARMInstructionSelector::select(MachineInstr &I) {
     assert((ValSize != 64 || STI.hasVFP2Base()) &&
            "Don't know how to load/store 64-bit value without VFP");
 
+    MachineInstr *Ptr = MRI.getVRegDef(I.getOperand(1).getReg());
+    if (Ptr->getOpcode() == TargetOpcode::G_CONSTANT_POOL) {
+      unsigned Opcode;
+      if (Subtarget->isThumb())
+        Opcode = ARM::tLDRpci;
+      else
+        Opcode = ARM::LDRcp;
+
+      auto Instr = BuildMI(MBB, I, I.getDebugLoc(), TII.get(Opcode))
+                       .addDef(Reg)
+                       .add(Ptr->getOperand(1))
+                       .addImm(0)
+                       .add(predOps(ARMCC::AL))
+                       .addMemOperand(I.memoperands().front());
+      if (!constrainSelectedInstRegOperands(*Instr, TII, TRI, RBI))
+        return false;
+      I.eraseFromParent();
+      return true;
+    }
+
     const auto NewOpc = selectLoadStoreOpCode(I.getOpcode(), RegBank, ValSize);
     if (NewOpc == G_LOAD || NewOpc == G_STORE)
       return false;
diff --git a/llvm/lib/Target/ARM/ARMLegalizerInfo.cpp b/llvm/lib/Target/ARM/ARMLegalizerInfo.cpp
index 660f351bae64b..b41c6183f2b88 100644
--- a/llvm/lib/Target/ARM/ARMLegalizerInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMLegalizerInfo.cpp
@@ -29,7 +29,7 @@ static bool AEABI(const ARMSubtarget &ST) {
   return ST.isTargetAEABI() || ST.isTargetGNUAEABI() || ST.isTargetMuslAEABI();
 }
 
-ARMLegalizerInfo::ARMLegalizerInfo(const ARMSubtarget &ST) {
+ARMLegalizerInfo::ARMLegalizerInfo(const ARMSubtarget &ST) : ST(&ST) {
   using namespace TargetOpcode;
 
   const LLT p0 = LLT::pointer(0, 32);
@@ -99,9 +99,11 @@ ARMLegalizerInfo::ARMLegalizerInfo(const ARMSubtarget &ST) {
       .minScalar(0, s32);
 
   getActionDefinitionsBuilder(G_CONSTANT)
-      .legalFor({s32, p0})
+      .customFor({s32, p0})
       .clampScalar(0, s32, s32);
 
+  getActionDefinitionsBuilder(G_CONSTANT_POOL).legalFor({p0});
+
   getActionDefinitionsBuilder(G_ICMP)
       .legalForCartesianProduct({s1}, {s32, p0})
       .minScalar(1, s32);
@@ -435,6 +437,14 @@ bool ARMLegalizerInfo::legalizeCustom(LegalizerHelper &Helper, MachineInstr &MI,
     }
     break;
   }
+  case G_CONSTANT: {
+    const ConstantInt *ConstVal = MI.getOperand(1).getCImm();
+    uint64_t ImmVal = ConstVal->getZExtValue();
+    if (ConstantMaterializationCost(ImmVal, ST) > 2 &&
+        !ST->genExecuteOnly())
+      return Helper.lowerConstant(MI) == LegalizerHelper::Legalized;
+    return true;
+  }
   case G_FCONSTANT: {
     // Convert to integer constants, while preserving the binary representation.
     auto AsInteger =
diff --git a/llvm/lib/Target/ARM/ARMLegalizerInfo.h b/llvm/lib/Target/ARM/ARMLegalizerInfo.h
index d6ce4eb1055b4..333e24ba2da62 100644
--- a/llvm/lib/Target/ARM/ARMLegalizerInfo.h
+++ b/llvm/lib/Target/ARM/ARMLegalizerInfo.h
@@ -59,6 +59,8 @@ class ARMLegalizerInfo : public LegalizerInfo {
   // Get the libcall(s) corresponding to \p Predicate for operands of \p Size
   // bits.
   FCmpLibcallsList getFCmpLibcalls(CmpInst::Predicate, unsigned Size) const;
+
+  const ARMSubtarget *ST;
 };
 } // End llvm namespace.
 #endif
diff --git a/llvm/lib/Target/ARM/ARMRegisterBankInfo.cpp b/llvm/lib/Target/ARM/ARMRegisterBankInfo.cpp
index 9234881c9407e..2d0619c598987 100644
--- a/llvm/lib/Target/ARM/ARMRegisterBankInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMRegisterBankInfo.cpp
@@ -367,6 +367,7 @@ ARMRegisterBankInfo::getInstrMapping(const MachineInstr &MI) const {
   case G_CONSTANT:
   case G_FRAME_INDEX:
   case G_GLOBAL_VALUE:
+  case G_CONSTANT_POOL:
     OperandsMapping =
         getOperandsMapping({&ARM::ValueMappings[ARM::GPR3OpsIdx], nullptr});
     break;
diff --git a/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-const.mir b/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-const.mir
new file mode 100644
index 0000000000000..6d19c650c6e1c
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-const.mir
@@ -0,0 +1,26 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple arm-- -run-pass=legalizer %s -o - | FileCheck %s
+
+--- |
+  define i32 @get_const() {
+  entry:
+    ret i32 287454020
+  }
+...
+---
+name:            get_const
+legalized:       false
+regBankSelected: false
+selected:        false
+tracksRegLiveness: true
+body:             |
+  bb.1.entry:
+    ; CHECK-LABEL: name: get_const
+    ; CHECK: [[CONSTANT_POOL:%[0-9]+]]:_(p0) = G_CONSTANT_POOL %const.0
+    ; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[CONSTANT_POOL]](p0) :: (load (s32) from constant-pool)
+    ; CHECK-NEXT: $r0 = COPY [[LOAD]](s32)
+    ; CHECK-NEXT: MOVPCLR 14 /* CC::al */, $noreg, implicit $r0
+    %0:_(s32) = G_CONSTANT i32 287454020
+    $r0 = COPY %0(s32)
+    MOVPCLR 14 /* CC::al */, $noreg, implicit $r0
+...
diff --git a/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-fp.mir b/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-fp.mir
index 96461e044813d..a35f82722adaf 100644
--- a/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-fp.mir
+++ b/llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-fp.mir
@@ -1,9 +1,9 @@
 # RUN: llc -O0 -mtriple arm-linux-gnueabihf -mattr=+vfp2 -float-abi=hard -run-pass=legalizer %s -o - | FileCheck %s -check-prefix CHECK -check-prefix HARD
-# RUN: llc -O0 -mtriple arm-linux-gnueabi -mattr=+vfp2,+soft-float -float-abi=soft -run-pass=legalizer %s -o - | FileCheck %s -check-prefix CHECK -check-prefix SOFT -check-prefix SOFT-AEABI
-# RUN: llc -O0 -mtriple arm-linux-gnu -mattr=+soft-float -float-abi=soft -run-pass=legalizer %s -o - | FileCheck %s  -check-prefix CHECK -check-prefix SOFT -check-prefix SOFT-DEFAULT
+# RUN: llc -O0 -mtriple arm-linux-gnueabi -mattr=+vfp2,+soft-float -float-abi=soft -run-pass=legalizer %s -o - | FileCheck %s -check-prefix CHECK -check-prefix SOFT -check-prefix SOFT-AEABI -check-prefix SOFT_POOL
+# RUN: llc -O0 -mtriple arm-linux-gnu -mattr=+soft-float -float-abi=soft -run-pass=legalizer %s -o - | FileCheck %s  -check-prefix CHECK -check-prefix SOFT -check-prefix SOFT-DEFAULT -check-prefix SOFT_POOL
 # RUN: llc -O0 -mtriple thumb-linux-gnueabihf -mattr=+v6t2,+vfp2 -float-abi=hard -run-pass=legalizer %s -o - | FileCheck %s -check-prefix CHECK -check-prefix HARD
-# RUN: llc -O0 -mtriple thumb-linux-gnueabi -mattr=+v6t2,+vfp2,+soft-float -float-abi=soft -run-pass=legalizer %s -o - | FileCheck %s -check-prefix CHECK -check-prefix SOFT -check-prefix SOFT-AEABI
-# RUN: llc -O0 -mtriple thumb-linux-gnu -mattr=+v6t2,+soft-float -float-abi=soft -run-pass=legalizer %s -o - | FileCheck %s  -check-prefix CHECK -check-prefix SOFT -check-prefix SOFT-DEFAULT
+# RUN: llc -O0 -mtriple thumb-linux-gnueabi -mattr=+v6t2,+vfp2,+soft-float -float-abi=soft -run-pass=legalizer %s -o - | FileCheck %s -check-prefix CHECK -check-prefix SOFT -check-prefix=SOFT-AEABI -check-prefix SOFT_CONST
+# RUN: llc -O0 -mtriple thumb-linux-gnu -mattr=+v6t2,+soft-float -float-abi=soft -run-pass=legalizer %s -o - | FileCheck %s  -check-prefix CHECK -check-prefix SOFT -check-prefix SOFT-DEFAULT -check-prefix SOFT_CONST
 --- |
   define void @test_frem_float() { ret void }
   define void @test_frem_double() { ret void }
@@ -657,10 +657,20 @@ body:             |
   bb.0:
     liveins:
 
+    ; SOFT_POOL: constants:
+    ; SOFT_POOL: id: 0
+    ; SOFT_POOL: value: i32 -1073532109
+    ; SOFT_POOL: id: 1
+    ; SOFT_POOL: value: i32 858993459
+
     ; HARD: [[R:%[0-9]+]]:_(s64) = G_FCONSTANT double -2.4
     ; SOFT-NOT: G_FCONSTANT
-    ; SOFT-DAG: [[HI:%[0-9]+]]:_(s32) = G_CONSTANT i32 -1073532109
-    ; SOFT-DAG: [[LO:%[0-9]+]]:_(s32) = G_CONSTANT i32 858993459
+    ; SOFT_CONST-DAG: [[HI:%[0-9]+]]:_(s32) = G_CONSTANT i32 -1073532109
+    ; SOFT_CONST-DAG: [[LO:%[0-9]+]]:_(s32) = G_CONSTANT i32 858993459
+    ; SOFT_POOL-DAG: [[HIPTR:%[0-9]+]]:_(p0) = G_CONSTANT_POOL %const.0
+    ; SOFT_POOL-DAG: [[HI:%[0-9]+]]:_(s32) = G_LOAD [[HIPTR]]
+    ; SOFT_POOL-DAG: [[LOPTR:%[0-9]+]]:_(p0) = G_CONSTANT_POOL %const.1
+    ; SOFT_POOL-DAG: [[LO:%[0-9]+]]:_(s32) = G_LOAD [[LOPTR]]
     ; SOFT-NOT: G_FCONSTANT
     %0(s64) = G_FCONSTANT double -2.4
     ; HARD-DAG: G_UNMERGE_VALUES [[R]](s64)
diff --git a/llvm/test/CodeGen/ARM/GlobalISel/select-constpool.mir b/llvm/test/CodeGen/ARM/GlobalISel/select-constpool.mir
new file mode 100644
index 0000000000000..35d1e932f8d76
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/GlobalISel/select-constpool.mir
@@ -0,0 +1,31 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple arm-- -run-pass=instruction-select -verify-machineinstrs %s -o - | FileCheck %s
+
+--- |
+  define i32 @get_const() {
+  entry:
+    ret i32 287454020
+  }
+...
+---
+name:            get_const
+legalized:       true
+regBankSelected: true
+selected:        false
+tracksRegLiveness: true
+constants:
+  - id:              0
+    value:           i32 287454020
+    alignment:       4
+    isTargetSpecific: false
+body:             |
+  bb.1.entry:
+    ; CHECK-LABEL: name: get_const
+    ; CHECK: [[LDRcp:%[0-9]+]]:gpr = LDRcp %const.0, 0, 14 /* CC::al */, $noreg :: (load (s32) from constant-pool)
+    ; CHECK-NEXT: $r0 = COPY [[LDRcp]]
+    ; CHECK-NEXT: MOVPCLR 14 /* CC::al */, $noreg, implicit $r0
+    %1:gprb(p0) = G_CONSTANT_POOL %const.0
+    %0:gprb(s32) = G_LOAD %1(p0) :: (load (s32) from constant-pool)
+    $r0 = COPY %0(s32)
+    MOVPCLR 14 /* CC::al */, $noreg, implicit $r0
+...

Copy link
Contributor

@s-barannikov s-barannikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an ARM expert, so only general nitpicks.

llvm/lib/Target/ARM/ARMLegalizerInfo.h Outdated Show resolved Hide resolved
llvm/lib/Target/ARM/ARMInstructionSelector.cpp Outdated Show resolved Hide resolved
@tschuett
Copy link

The real issue is .legalFor({s32, p0}) . s32 and p0 are legal types on Arm. It implies that the selector can select them. If the selector wants to explore 10^50 different selection strategies then it is up to the selector.

@tschuett
Copy link

tschuett commented Jul 11, 2024

We should follow #96225.

# RUN: llc -mtriple arm-- -run-pass=instruction-select -verify-machineinstrs %s -o - | FileCheck %s

--- |
define i32 @get_const() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the IR section

@tschuett
Copy link

tschuett commented Jul 11, 2024

This is the wrong approach. Please reconsider #96225. We loose the .legalFor. #97513 showed issues with .customFor . There are less constants in the pipeline.
AArch64 shows

how to handle constant pool loads in the selector.

Gisel has a type-based legalizer. There are no illegal values.

@arsenm
Copy link
Contributor

arsenm commented Jul 11, 2024

This is the wrong approach. Please reconsider #96225. We loose the .legalFor. #97513 showed issues with .customFor . There are less constants in the pipeline. AArch64 shows

Disagree. AArch64 should remove this custom handling. This is not how the DAG

how to handle constant pool loads in the selector.
Gisel has a type-based legalizer. There are no illegal values.

Not exactly, legal means can be selected. It doesn't mean custom cannot add restrictions. We could also always augment legality with immediate values (like we already do for sext_inreg)

@tschuett
Copy link

#97513 showed that the standard isLegalOrBeforeLegalizer does not work with customFor.

Arm can select s32 and p0. It is up to the selector find out how.

legalFor is preferably over customFor.

We signed a declaration of independence. :-)

@tschuett
Copy link

I may be blind but,

 getActionDefinitionsBuilder(G_SEXT_INREG)
      .legalFor({s32, s64})
      .legalFor(PackedVectorAllTypeList)
      .maxScalar(0, s64)
      .clampNumElements(0, v8s8, v16s8)
      .clampNumElements(0, v4s16, v8s16)
      .clampNumElements(0, v2s32, v4s32)
      .clampMaxNumElements(0, s64, 2)
      .lower();

looks type-based. I am still searching for immediates/values.

@aemerson
Copy link
Contributor

aemerson commented Jul 11, 2024

This is the wrong approach. Please reconsider #96225. We loose the .legalFor. #97513 showed issues with .customFor . There are less constants in the pipeline. AArch64 shows

Disagree. AArch64 should remove this custom handling. This is not how the DAG

(truncated reply?) I'm not following what the issue is here.

@tschuett
Copy link

We should progress with #96225 and stop this PR.

@tschuett
Copy link

Neutral version:
This PR lowers to constant pool loads if:

if (ConstantMaterializationCost(ImmVal, ST) > 2 &&
        !ST->genExecuteOnly())

using the standard legalizer tools.

#96225 does the same manually in the selector pass.

@spavloff
Copy link
Collaborator Author

Neutral version: This PR lowers to constant pool loads if:

if (ConstantMaterializationCost(ImmVal, ST) > 2 &&
        !ST->genExecuteOnly())

using the standard legalizer tools.

#96225 does the same manually in the selector pass.

Not sure I understand the proposed solution. This condition checks the value of immediate operand but the Legalizer makes decision based on types only, no?

@spavloff
Copy link
Collaborator Author

spavloff commented Aug 12, 2024

Can we decide, which way to choose, this patch or #96225?

@tschuett
Copy link

#96225

@s-barannikov
Copy link
Contributor

Lowering it early disrupts constant folding optimizations. Lowering it at instruction selection is awkward, but may be not too much.
I really wish GISel had the two-step (type and operation) legalization as SDagISel has so that the constants could have a chance to be optimized between type and operation legalization (with the latter finally turning them into a constant pool load).

@tschuett
Copy link

Why should ARM deviate from the way AArch64 handles constants?

Legalizing types with operations has advantages.

@spavloff
Copy link
Collaborator Author

Lowering it at instruction selection is awkward

It is equivalent to the introduction of a pseudo instruction, which is then lowered at the selection phase, no?

@tschuett
Copy link

The pseudo instruction is call G_CONSTANT.

@s-barannikov
Copy link
Contributor

Lowering it at instruction selection is awkward

It is equivalent to the introduction of a pseudo instruction, which is then lowered at the selection phase, no?

I'm not sure what you mean by that?
By awkward I mean that instruction selection should just select instructions, that is, map generic instructions to native ones and do nothing more. If there is no more or less direct mapping, then the instruction should've been lowered prior to instruction selection, potentially creating some optimization opportunities.

@tschuett
Copy link

AArch64 legalizes all constants:

 // Constants
  getActionDefinitionsBuilder(G_CONSTANT)
      .legalFor({p0, s8, s16, s32, s64})
      .widenScalarToNextPow2(0)
      .clampScalar(0, s8, s64);
  getActionDefinitionsBuilder(G_FCONSTANT)
      .legalIf([=](const LegalityQuery &Query) {
        const auto &Ty = Query.Types[0];
        if (HasFP16 && Ty == s16)
          return true;
        return Ty == s32 || Ty == s64 || Ty == s128;
      })
      .clampScalar(0, MinFPScalar, s128);

and selects all constants in the instruction selector:

Why should ARM deviate from that model?

@arsenm
Copy link
Contributor

arsenm commented Sep 24, 2024

AArch64 legalizes all constants:

Why should ARM deviate from that model?

They are not trying to achieve the same result. AArch64 does not use a constant pool, and ARM does. The constant pool expansion can be shared to generic code, and is a higher level transform than belongs in selection.

Comment on lines 1112 to 1116
unsigned Opcode;
if (Subtarget->isThumb())
Opcode = ARM::tLDRpci;
else
Opcode = ARM::LDRcp;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initialize with ternary operator

@tschuett tschuett removed their request for review September 24, 2024 19:34
llvm/lib/Target/ARM/ARMInstructionSelector.cpp Outdated Show resolved Hide resolved
spavloff and others added 4 commits October 14, 2024 10:22
ARM uses complex encoding of immediate values using small number of
bits. As a result, some values cannot be represented as immediate
operands, they need to be synthesized in a register. This change
implements ilegalization of such constants with loading values from
constant pool.
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
@spavloff spavloff merged commit 52e5683 into llvm:main Oct 14, 2024
8 checks passed
DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this pull request Oct 16, 2024
…m#98308)

ARM uses complex encoding of immediate values using small number of
bits. As a result, some values cannot be represented as immediate
operands, they need to be synthesized in a register. This change
implements legalization of such constants with loading values from
constant pool.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
bricknerb pushed a commit to bricknerb/llvm-project that referenced this pull request Oct 17, 2024
…m#98308)

ARM uses complex encoding of immediate values using small number of
bits. As a result, some values cannot be represented as immediate
operands, they need to be synthesized in a register. This change
implements legalization of such constants with loading values from
constant pool.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
EricWF pushed a commit to efcs/llvm-project that referenced this pull request Oct 22, 2024
…m#98308)

ARM uses complex encoding of immediate values using small number of
bits. As a result, some values cannot be represented as immediate
operands, they need to be synthesized in a register. This change
implements legalization of such constants with loading values from
constant pool.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants