-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[X86][Inline] Skip inline asm in inlining target feature check #83820
Conversation
When inlining across functions with different target features, we perform roughly two checks: 1. The caller features must be a superset of the callee features. 2. Calls in the callee cannot use types where the target feeatures would change the call ABI (e.g. by changing whether something is passed in a zmm or two ymm registers). The latter check is very crude right now. The latter check currently also catches inline asm "calls". I believe that inline asm should be excluded from this check, as it is independent from the usual call ABI, and instead governed by the inline asm constraint string.
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-x86 Author: Nikita Popov (nikic) ChangesWhen inlining across functions with different target features, we perform roughly two checks:
The latter check currently also catches inline asm "calls". I believe that inline asm should be excluded from this check, as it is independent from the usual call ABI, and instead governed by the inline asm constraint string. Fixes #67054. Full diff: https://github.com/llvm/llvm-project/pull/83820.diff 2 Files Affected:
diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index 18bf32fe1acaad..4cca291a245622 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -6087,6 +6087,10 @@ bool X86TTIImpl::areInlineCompatible(const Function *Caller,
for (const Instruction &I : instructions(Callee)) {
if (const auto *CB = dyn_cast<CallBase>(&I)) {
+ // Having more target features is fine for inline ASM.
+ if (CB->isInlineAsm())
+ continue;
+
SmallVector<Type *, 8> Types;
for (Value *Arg : CB->args())
Types.push_back(Arg->getType());
diff --git a/llvm/test/Transforms/Inline/X86/call-abi-compatibility.ll b/llvm/test/Transforms/Inline/X86/call-abi-compatibility.ll
index f03270bafea999..6f582cab2f1452 100644
--- a/llvm/test/Transforms/Inline/X86/call-abi-compatibility.ll
+++ b/llvm/test/Transforms/Inline/X86/call-abi-compatibility.ll
@@ -94,27 +94,22 @@ define internal void @caller_not_avx4() {
declare i64 @caller_unknown_simple(i64)
-; FIXME: This call should get inlined, because the callee only contains
+; This call should get inlined, because the callee only contains
; inline ASM, not real calls.
define <8 x i64> @caller_inline_asm(ptr %p0, i64 %k, ptr %p1, ptr %p2) #0 {
; CHECK-LABEL: define {{[^@]+}}@caller_inline_asm
; CHECK-SAME: (ptr [[P0:%.*]], i64 [[K:%.*]], ptr [[P1:%.*]], ptr [[P2:%.*]]) #[[ATTR2:[0-9]+]] {
-; CHECK-NEXT: [[CALL:%.*]] = call <8 x i64> @callee_inline_asm(ptr [[P0]], i64 [[K]], ptr [[P1]], ptr [[P2]])
-; CHECK-NEXT: ret <8 x i64> [[CALL]]
+; CHECK-NEXT: [[SRC_I:%.*]] = load <8 x i64>, ptr [[P0]], align 64
+; CHECK-NEXT: [[A_I:%.*]] = load <8 x i64>, ptr [[P1]], align 64
+; CHECK-NEXT: [[B_I:%.*]] = load <8 x i64>, ptr [[P2]], align 64
+; CHECK-NEXT: [[TMP1:%.*]] = call <8 x i64> asm "vpaddb\09$($3, $2, $0 {$1}", "=v,^Yk,v,v,0,~{dirflag},~{fpsr},~{flags}"(i64 [[K]], <8 x i64> [[A_I]], <8 x i64> [[B_I]], <8 x i64> [[SRC_I]])
+; CHECK-NEXT: ret <8 x i64> [[TMP1]]
;
%call = call <8 x i64> @callee_inline_asm(ptr %p0, i64 %k, ptr %p1, ptr %p2)
ret <8 x i64> %call
}
define internal <8 x i64> @callee_inline_asm(ptr %p0, i64 %k, ptr %p1, ptr %p2) #1 {
-; CHECK-LABEL: define {{[^@]+}}@callee_inline_asm
-; CHECK-SAME: (ptr [[P0:%.*]], i64 [[K:%.*]], ptr [[P1:%.*]], ptr [[P2:%.*]]) #[[ATTR3:[0-9]+]] {
-; CHECK-NEXT: [[SRC:%.*]] = load <8 x i64>, ptr [[P0]], align 64
-; CHECK-NEXT: [[A:%.*]] = load <8 x i64>, ptr [[P1]], align 64
-; CHECK-NEXT: [[B:%.*]] = load <8 x i64>, ptr [[P2]], align 64
-; CHECK-NEXT: [[TMP1:%.*]] = tail call <8 x i64> asm "vpaddb\09$($3, $2, $0 {$1}", "=v,^Yk,v,v,0,~{dirflag},~{fpsr},~{flags}"(i64 [[K]], <8 x i64> [[A]], <8 x i64> [[B]], <8 x i64> [[SRC]])
-; CHECK-NEXT: ret <8 x i64> [[TMP1]]
-;
%src = load <8 x i64>, ptr %p0, align 64
%a = load <8 x i64>, ptr %p1, align 64
%b = load <8 x i64>, ptr %p2, align 64
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can always drop the second check. I assume if a callee calls something unmatching its feature, e.g., 512-bit vector with AVX2. User would have got a warning about ABI change. Such user scenario is not ABI compatible already. We should not care it at all. Passing 512-bit vector with AVX512 feature is guaranteed not split in two ymm registers. We should not worry about it's changed by inlining.
Anyway, inline asm never has such oversized argument problem. So I'm good to start from it.
feeatures -> features |
The problem is that this doesn't just happen in user-written code, but also as a result of argument promotion for example. We do want vector arguments to be promoted, but after this has happened, inlining may no longer be safe (even if vector types are never passed by value in the original code). |
I think I have fixed the argument promotion issue by https://reviews.llvm.org/D123284 |
I don't think it (fully) addresses the issue, at least because non-clang frontends do not use min-legal-vector-width. |
The design of |
@phoebewang I just double checked to confirm that your patch does not address this issue. Here's a very simple example:
Note that this does not pass any vectors by value. Now comment out the check in areInlineCompatible and run |
…83820) When inlining across functions with different target features, we perform roughly two checks: 1. The caller features must be a superset of the callee features. 2. Calls in the callee cannot use types where the target features would change the call ABI (e.g. by changing whether something is passed in a zmm or two ymm registers). The latter check is very crude right now. The latter check currently also catches inline asm "calls". I believe that inline asm should be excluded from this check, as it is independent from the usual call ABI, and instead governed by the inline asm constraint string. Fixes llvm#67054. (cherry picked from commit e84182a)
/pull-request #84029 |
I didn't get ymm0 in test1: The argument is passed by pointer between test1 and test2, because in The only hole was |
Sorry, I missed the condition |
…83820) When inlining across functions with different target features, we perform roughly two checks: 1. The caller features must be a superset of the callee features. 2. Calls in the callee cannot use types where the target features would change the call ABI (e.g. by changing whether something is passed in a zmm or two ymm registers). The latter check is very crude right now. The latter check currently also catches inline asm "calls". I believe that inline asm should be excluded from this check, as it is independent from the usual call ABI, and instead governed by the inline asm constraint string. Fixes llvm#67054. (cherry picked from commit e84182a)
When inlining across functions with different target features, we perform roughly two checks:
The latter check currently also catches inline asm "calls". I believe that inline asm should be excluded from this check, as it is independent from the usual call ABI, and instead governed by the inline asm constraint string.
Fixes #67054.