-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64][GISel] Don't pointlessly lower G_TRUNC #81479
Conversation
If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that will get legalized back into G_TRUNC from v2s32 to v2s16, and both G_TRUNC will then get combinde to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop.
@llvm/pr-subscribers-llvm-globalisel Author: Nikita Popov (nikic) ChangesIf we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop. Fixes #81244. Full diff: https://github.com/llvm/llvm-project/pull/81479.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index cbf5655706e694..2ce1a7bda608e2 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -615,9 +615,8 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
.lowerIf([=](const LegalityQuery &Query) {
LLT DstTy = Query.Types[0];
LLT SrcTy = Query.Types[1];
- return DstTy.isVector() && (SrcTy.getSizeInBits() > 128 ||
- (DstTy.getScalarSizeInBits() * 2 <
- SrcTy.getScalarSizeInBits()));
+ return DstTy.isVector() && SrcTy.getSizeInBits() > 128 &&
+ DstTy.getScalarSizeInBits() * 2 <= SrcTy.getScalarSizeInBits();
})
.alwaysLegal();
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
index 16b780a8397347..661265173ae82b 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
@@ -529,3 +529,27 @@ body: |
RET_ReallyLR implicit $q0
...
+
+---
+name: pr81244
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $d0
+ ; CHECK-LABEL: name: pr81244
+ ; CHECK: liveins: $d0
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $d0
+ ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(<2 x s8>) = G_TRUNC [[COPY]](<2 x s32>)
+ ; CHECK-NEXT: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<2 x s8>), [[TRUNC]](<2 x s8>)
+ ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(<4 x s16>) = G_ANYEXT [[CONCAT_VECTORS]](<4 x s8>)
+ ; CHECK-NEXT: $d0 = COPY [[ANYEXT]](<4 x s16>)
+ ; CHECK-NEXT: RET_ReallyLR implicit $d0
+ %0:_(<2 x s32>) = COPY $d0
+ %1:_(<2 x s8>) = G_TRUNC %0(<2 x s32>)
+ %2:_(<4 x s8>) = G_CONCAT_VECTORS %1(<2 x s8>), %1(<2 x s8>)
+ %3:_(<4 x s16>) = G_ANYEXT %2(<4 x s8>)
+ $d0 = COPY %3(<4 x s16>)
+ RET_ReallyLR implicit $d0
+
+...
|
@llvm/pr-subscribers-backend-aarch64 Author: Nikita Popov (nikic) ChangesIf we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop. Fixes #81244. Full diff: https://github.com/llvm/llvm-project/pull/81479.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index cbf5655706e694..2ce1a7bda608e2 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -615,9 +615,8 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
.lowerIf([=](const LegalityQuery &Query) {
LLT DstTy = Query.Types[0];
LLT SrcTy = Query.Types[1];
- return DstTy.isVector() && (SrcTy.getSizeInBits() > 128 ||
- (DstTy.getScalarSizeInBits() * 2 <
- SrcTy.getScalarSizeInBits()));
+ return DstTy.isVector() && SrcTy.getSizeInBits() > 128 &&
+ DstTy.getScalarSizeInBits() * 2 <= SrcTy.getScalarSizeInBits();
})
.alwaysLegal();
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
index 16b780a8397347..661265173ae82b 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
@@ -529,3 +529,27 @@ body: |
RET_ReallyLR implicit $q0
...
+
+---
+name: pr81244
+tracksRegLiveness: true
+body: |
+ bb.0:
+ liveins: $d0
+ ; CHECK-LABEL: name: pr81244
+ ; CHECK: liveins: $d0
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $d0
+ ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(<2 x s8>) = G_TRUNC [[COPY]](<2 x s32>)
+ ; CHECK-NEXT: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<2 x s8>), [[TRUNC]](<2 x s8>)
+ ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(<4 x s16>) = G_ANYEXT [[CONCAT_VECTORS]](<4 x s8>)
+ ; CHECK-NEXT: $d0 = COPY [[ANYEXT]](<4 x s16>)
+ ; CHECK-NEXT: RET_ReallyLR implicit $d0
+ %0:_(<2 x s32>) = COPY $d0
+ %1:_(<2 x s8>) = G_TRUNC %0(<2 x s32>)
+ %2:_(<4 x s8>) = G_CONCAT_VECTORS %1(<2 x s8>), %1(<2 x s8>)
+ %3:_(<4 x s16>) = G_ANYEXT %2(<4 x s8>)
+ $d0 = COPY %3(<4 x s16>)
+ RET_ReallyLR implicit $d0
+
+...
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop. Fixes llvm#81244. (cherry picked from commit 070848c)
If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop. Fixes llvm#81244. (cherry picked from commit 070848c)
If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop. Fixes llvm#81244. (cherry picked from commit 070848c)
If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop. Fixes llvm#81244. (cherry picked from commit 070848c)
If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop. Fixes llvm#81244. (cherry picked from commit 070848c)
If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle. Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful. Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop. Fixes llvm#81244. (cherry picked from commit 070848c)
If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle.
Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful.
Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop.
Fixes #81244.