Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64][GISel] Don't pointlessly lower G_TRUNC #81479

Merged
merged 1 commit into from
Feb 13, 2024

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Feb 12, 2024

If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop.

Fixes #81244.

If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC
from v2s16 to v2s8 does not bring us any closer to legality. In
fact, the first part of that will get legalized back into G_TRUNC
from v2s32 to v2s16, and both G_TRUNC will then get combinde to
the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the
original vector is >128 bits, which is I believe the only case
where this specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the
alwaysLegal is a lie, as before), but it will cause a proper
globalisel abort instead of an infinite legalization loop.
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 12, 2024

@llvm/pr-subscribers-llvm-globalisel

Author: Nikita Popov (nikic)

Changes

If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop.

Fixes #81244.


Full diff: https://github.com/llvm/llvm-project/pull/81479.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (+2-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir (+24)
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index cbf5655706e694..2ce1a7bda608e2 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -615,9 +615,8 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
       .lowerIf([=](const LegalityQuery &Query) {
         LLT DstTy = Query.Types[0];
         LLT SrcTy = Query.Types[1];
-        return DstTy.isVector() && (SrcTy.getSizeInBits() > 128 ||
-                                    (DstTy.getScalarSizeInBits() * 2 <
-                                     SrcTy.getScalarSizeInBits()));
+        return DstTy.isVector() && SrcTy.getSizeInBits() > 128 &&
+               DstTy.getScalarSizeInBits() * 2 <= SrcTy.getScalarSizeInBits();
       })
 
       .alwaysLegal();
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
index 16b780a8397347..661265173ae82b 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
@@ -529,3 +529,27 @@ body:             |
     RET_ReallyLR implicit $q0
 
 ...
+
+---
+name:            pr81244
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $d0
+    ; CHECK-LABEL: name: pr81244
+    ; CHECK: liveins: $d0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $d0
+    ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(<2 x s8>) = G_TRUNC [[COPY]](<2 x s32>)
+    ; CHECK-NEXT: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<2 x s8>), [[TRUNC]](<2 x s8>)
+    ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(<4 x s16>) = G_ANYEXT [[CONCAT_VECTORS]](<4 x s8>)
+    ; CHECK-NEXT: $d0 = COPY [[ANYEXT]](<4 x s16>)
+    ; CHECK-NEXT: RET_ReallyLR implicit $d0
+    %0:_(<2 x s32>) = COPY $d0
+    %1:_(<2 x s8>) = G_TRUNC %0(<2 x s32>)
+    %2:_(<4 x s8>) = G_CONCAT_VECTORS %1(<2 x s8>), %1(<2 x s8>)
+    %3:_(<4 x s16>) = G_ANYEXT %2(<4 x s8>)
+    $d0 = COPY %3(<4 x s16>)
+    RET_ReallyLR implicit $d0
+
+...

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 12, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Nikita Popov (nikic)

Changes

If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop.

Fixes #81244.


Full diff: https://github.com/llvm/llvm-project/pull/81479.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (+2-3)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir (+24)
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index cbf5655706e694..2ce1a7bda608e2 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -615,9 +615,8 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
       .lowerIf([=](const LegalityQuery &Query) {
         LLT DstTy = Query.Types[0];
         LLT SrcTy = Query.Types[1];
-        return DstTy.isVector() && (SrcTy.getSizeInBits() > 128 ||
-                                    (DstTy.getScalarSizeInBits() * 2 <
-                                     SrcTy.getScalarSizeInBits()));
+        return DstTy.isVector() && SrcTy.getSizeInBits() > 128 &&
+               DstTy.getScalarSizeInBits() * 2 <= SrcTy.getScalarSizeInBits();
       })
 
       .alwaysLegal();
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
index 16b780a8397347..661265173ae82b 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
@@ -529,3 +529,27 @@ body:             |
     RET_ReallyLR implicit $q0
 
 ...
+
+---
+name:            pr81244
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $d0
+    ; CHECK-LABEL: name: pr81244
+    ; CHECK: liveins: $d0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $d0
+    ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(<2 x s8>) = G_TRUNC [[COPY]](<2 x s32>)
+    ; CHECK-NEXT: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<2 x s8>), [[TRUNC]](<2 x s8>)
+    ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(<4 x s16>) = G_ANYEXT [[CONCAT_VECTORS]](<4 x s8>)
+    ; CHECK-NEXT: $d0 = COPY [[ANYEXT]](<4 x s16>)
+    ; CHECK-NEXT: RET_ReallyLR implicit $d0
+    %0:_(<2 x s32>) = COPY $d0
+    %1:_(<2 x s8>) = G_TRUNC %0(<2 x s32>)
+    %2:_(<4 x s8>) = G_CONCAT_VECTORS %1(<2 x s8>), %1(<2 x s8>)
+    %3:_(<4 x s16>) = G_ANYEXT %2(<4 x s8>)
+    $d0 = COPY %3(<4 x s16>)
+    RET_ReallyLR implicit $d0
+
+...

Copy link
Contributor

@aemerson aemerson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@nikic nikic merged commit 070848c into llvm:main Feb 13, 2024
7 checks passed
@nikic nikic deleted the aarch64-trunc-legalize branch February 13, 2024 08:29
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Feb 13, 2024
If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from
v2s16 to v2s8 does not bring us any closer to legality. In fact, the
first part of that is a G_BUILD_VECTOR whose legalization will produce a
new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get
combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original
vector is >128 bits, which is I believe the only case where this
specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal
is a lie, as before), but it will cause a proper globalisel abort
instead of an infinite legalization loop.

Fixes llvm#81244.

(cherry picked from commit 070848c)
nikic added a commit to rust-lang/llvm-project that referenced this pull request Feb 13, 2024
If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from
v2s16 to v2s8 does not bring us any closer to legality. In fact, the
first part of that is a G_BUILD_VECTOR whose legalization will produce a
new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get
combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original
vector is >128 bits, which is I believe the only case where this
specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal
is a lie, as before), but it will cause a proper globalisel abort
instead of an infinite legalization loop.

Fixes llvm#81244.

(cherry picked from commit 070848c)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Feb 13, 2024
If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from
v2s16 to v2s8 does not bring us any closer to legality. In fact, the
first part of that is a G_BUILD_VECTOR whose legalization will produce a
new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get
combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original
vector is >128 bits, which is I believe the only case where this
specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal
is a lie, as before), but it will cause a proper globalisel abort
instead of an infinite legalization loop.

Fixes llvm#81244.

(cherry picked from commit 070848c)
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from
v2s16 to v2s8 does not bring us any closer to legality. In fact, the
first part of that is a G_BUILD_VECTOR whose legalization will produce a
new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get
combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original
vector is >128 bits, which is I believe the only case where this
specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal
is a lie, as before), but it will cause a proper globalisel abort
instead of an infinite legalization loop.

Fixes llvm#81244.

(cherry picked from commit 070848c)
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from
v2s16 to v2s8 does not bring us any closer to legality. In fact, the
first part of that is a G_BUILD_VECTOR whose legalization will produce a
new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get
combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original
vector is >128 bits, which is I believe the only case where this
specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal
is a lie, as before), but it will cause a proper globalisel abort
instead of an infinite legalization loop.

Fixes llvm#81244.

(cherry picked from commit 070848c)
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from
v2s16 to v2s8 does not bring us any closer to legality. In fact, the
first part of that is a G_BUILD_VECTOR whose legalization will produce a
new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get
combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original
vector is >128 bits, which is I believe the only case where this
specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal
is a lie, as before), but it will cause a proper globalisel abort
instead of an infinite legalization loop.

Fixes llvm#81244.

(cherry picked from commit 070848c)
@pointhex pointhex mentioned this pull request May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[AArch64][GlobalISel] Legalization cycle with shufflevector
3 participants