[AArch64][SME] Fix inlining bug introduced in #78703 #79994

Merged
merged 3 commits into from
Jan 31, 2024

sdesmalen-arm
Collaborator

Calling a __arm_locally_streaming function from a function that
is not a streaming-SVE function would lead to incorrect inlining.

The issue didn't surface because the tests were not testing what
they were supposed to test.

This PR consists of two patches:

  • The first one fixes the tests so that they are testing the right things.
  • The second patch fixes the issue.
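
The effect of the second patch can be illustrated with a small, self-contained model. This is only a sketch and not LLVM's `SMEAttrs` class: `ToyAttrs`, `toyRequiresSMChange`, `toyMayInlineOld` and `toyMayInlineNew` are hypothetical names, the mode-change rule is an assumed simplification, and the ZA-related checks (`hasNewZABody`, `requiresLazySave`) are left out. It aims to show why judging a locally-streaming callee by its interface rather than by its body lets a non-streaming caller inline it.

```cpp
// Hypothetical, simplified model of the streaming-mode attributes.
// It is NOT LLVM's SMEAttrs; the names only mirror the diff for readability.
struct ToyAttrs {
  bool Enabled = false;    // models "aarch64_pstate_sm_enabled"
  bool Compatible = false; // models "aarch64_pstate_sm_compatible"
  bool Body = false;       // models "aarch64_pstate_sm_body"

  bool hasStreamingBody() const { return Body; }
  bool hasStreamingInterfaceOrBody() const { return Enabled || Body; }
  bool hasNonStreamingInterfaceAndBody() const {
    return !Enabled && !Compatible && !Body;
  }
};

// Assumed rule: a mode change may be needed unless the callee is
// streaming-compatible or both sides are known to run in the same mode.
bool toyRequiresSMChange(const ToyAttrs &Caller, const ToyAttrs &Callee) {
  if (Callee.Compatible)
    return false;
  if (Caller.hasStreamingInterfaceOrBody() && Callee.Enabled)
    return false; // both streaming
  if (Caller.hasNonStreamingInterfaceAndBody() && !Callee.Enabled)
    return false; // both non-streaming
  return true;
}

// Shape of the check before the fix: the callee is judged by its interface,
// with a special case for streaming callers of streaming-body callees.
bool toyMayInlineOld(ToyAttrs Caller, ToyAttrs Callee, bool CalleeHasOps) {
  bool Mismatch = toyRequiresSMChange(Caller, Callee) &&
                  (!Caller.hasStreamingInterfaceOrBody() ||
                   !Callee.hasStreamingBody());
  return !(Mismatch && CalleeHasOps);
}

// Shape of the check after the fix: a callee with a streaming body is
// treated as a streaming function, because its body is what gets inlined.
bool toyMayInlineNew(ToyAttrs Caller, ToyAttrs Callee, bool CalleeHasOps) {
  if (Callee.hasStreamingBody()) {
    Callee.Compatible = false;
    Callee.Enabled = true;
  }
  return !(toyRequiresSMChange(Caller, Callee) && CalleeHasOps);
}
```

Under these assumptions, `toyMayInlineOld` accepts a non-streaming caller with a locally-streaming callee, whereas `toyMayInlineNew` rejects it; a streaming caller may still inline the same callee in both versions.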

The code that tests whether two functions are compatible for inlining does a
compatibility check on their attributes and then combines that with
`&& hasPossibleIncompatibleOps(..)`.

But `hasPossibleIncompatibleOps` always returns `false` for a function that
merely makes a regular function call, which meant that many of the tests
weren't testing what they were supposed to test. By calling an intrinsic
that is sensitive to streaming mode instead (e.g. `llvm.vscale()`),
`hasPossibleIncompatibleOps` always returns `true` for these callees, making
the tests more sensible.
@llvmbot
Collaborator

llvmbot commented Jan 30, 2024

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-transforms

Author: Sander de Smalen (sdesmalen-arm)

Patch is 25.82 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79994.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+11-6)
  • (modified) llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll (+181-180)
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index d611338fc268..992b11da7eee 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -233,15 +233,20 @@ static bool hasPossibleIncompatibleOps(const Function *F) {
 
 bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
                                          const Function *Callee) const {
-  SMEAttrs CallerAttrs(*Caller);
-  SMEAttrs CalleeAttrs(*Callee);
+  SMEAttrs CallerAttrs(*Caller), CalleeAttrs(*Callee);
+
+  // When inlining, we should consider the body of the function, not the
+  // interface.
+  if (CalleeAttrs.hasStreamingBody()) {
+    CalleeAttrs.set(SMEAttrs::SM_Compatible, false);
+    CalleeAttrs.set(SMEAttrs::SM_Enabled, true);
+  }
+
   if (CalleeAttrs.hasNewZABody())
     return false;
 
   if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
-      (CallerAttrs.requiresSMChange(CalleeAttrs) &&
-       (!CallerAttrs.hasStreamingInterfaceOrBody() ||
-        !CalleeAttrs.hasStreamingBody()))) {
+      CallerAttrs.requiresSMChange(CalleeAttrs)) {
     if (hasPossibleIncompatibleOps(Callee))
       return false;
   }
@@ -4062,4 +4067,4 @@ bool AArch64TTIImpl::shouldTreatInstructionLikeSelect(const Instruction *I) {
       cast<BranchInst>(I->getNextNode())->isUnconditional())
     return true;
   return BaseT::shouldTreatInstructionLikeSelect(I);
-}
\ No newline at end of file
+}
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index d6b1f3ef45e7..25b9aad3949b 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -2,70 +2,71 @@
 ; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -mattr=+sme -S -passes=inline | FileCheck %s
 
 declare void @inlined_body() "aarch64_pstate_sm_compatible";
+declare i32 @llvm.vscale()
 
 ; Define some functions that will be called by the functions below.
 ; These just call a '...body()' function. If we see the call to one of
 ; these functions being replaced by '...body()', then we know it has been
 ; inlined.
 
-define void @normal_callee() {
-; CHECK-LABEL: define void @normal_callee
+define i32 @normal_callee() {
+; CHECK-LABEL: define i32 @normal_callee
 ; CHECK-SAME: () #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_callee() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_callee
+define i32 @streaming_callee() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_callee
 ; CHECK-SAME: () #[[ATTR2:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @locally_streaming_callee() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_callee
+define i32 @locally_streaming_callee() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_callee
 ; CHECK-SAME: () #[[ATTR3:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_compatible_callee() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_callee
+define i32 @streaming_compatible_callee() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define i32 @streaming_compatible_callee
 ; CHECK-SAME: () #[[ATTR0:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_compatible_locally_streaming_callee() "aarch64_pstate_sm_compatible" "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @streaming_compatible_locally_streaming_callee
+define i32 @streaming_compatible_locally_streaming_callee() "aarch64_pstate_sm_compatible" "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @streaming_compatible_locally_streaming_callee
 ; CHECK-SAME: () #[[ATTR4:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale()
+  ret i32 %res
 }
 
 ; Now test that inlining only happens when their streaming modes match.
@@ -85,16 +86,16 @@ entry:
 ; [ ] N  -> SC
 ; [ ] N  -> N + B
 ; [ ] N  -> SC + B
-define void @normal_caller_normal_callee_inline() {
-; CHECK-LABEL: define void @normal_caller_normal_callee_inline
+define i32 @normal_caller_normal_callee_inline() {
+; CHECK-LABEL: define i32 @normal_caller_normal_callee_inline
 ; CHECK-SAME: () #[[ATTR1]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @normal_callee()
-  ret void
+  %res = call i32 @normal_callee()
+  ret i32 %res
 }
 
 ; [ ] N  -> N
@@ -102,16 +103,16 @@ entry:
 ; [ ] N  -> SC
 ; [ ] N  -> N + B
 ; [ ] N  -> SC + B
-define void @normal_caller_streaming_callee_inline() {
-; CHECK-LABEL: define void @normal_caller_streaming_callee_inline
+define i32 @normal_caller_streaming_callee_dont_inline() {
+; CHECK-LABEL: define i32 @normal_caller_streaming_callee_dont_inline
 ; CHECK-SAME: () #[[ATTR1]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @streaming_callee()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @streaming_callee()
-  ret void
+  %res = call i32 @streaming_callee()
+  ret i32 %res
 }
 
 ; [ ] N  -> N
@@ -119,16 +120,16 @@ entry:
 ; [x] N  -> SC
 ; [ ] N  -> N + B
 ; [ ] N  -> SC + B
-define void @normal_caller_streaming_compatible_callee_inline() {
-; CHECK-LABEL: define void @normal_caller_streaming_compatible_callee_inline
+define i32 @normal_caller_streaming_compatible_callee_inline() {
+; CHECK-LABEL: define i32 @normal_caller_streaming_compatible_callee_inline
 ; CHECK-SAME: () #[[ATTR1]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @streaming_compatible_callee()
-  ret void
+  %res = call i32 @streaming_compatible_callee()
+  ret i32 %res
 }
 
 ; [ ] N  -> N
@@ -136,16 +137,16 @@ entry:
 ; [ ] N  -> SC
 ; [x] N  -> N + B
 ; [ ] N  -> SC + B
-define void @normal_caller_locally_streaming_callee_inline() {
-; CHECK-LABEL: define void @normal_caller_locally_streaming_callee_inline
+define i32 @normal_caller_locally_streaming_callee_dont_inline() {
+; CHECK-LABEL: define i32 @normal_caller_locally_streaming_callee_dont_inline
 ; CHECK-SAME: () #[[ATTR1]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @locally_streaming_callee()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @locally_streaming_callee()
-  ret void
+  %res = call i32 @locally_streaming_callee()
+  ret i32 %res
 }
 
 ; [ ] N  -> N
@@ -153,16 +154,16 @@ entry:
 ; [ ] N  -> SC
 ; [ ] N  -> N + B
 ; [x] N  -> SC + B
-define void @normal_caller_streaming_compatible_locally_streaming_callee_inline() {
-; CHECK-LABEL: define void @normal_caller_streaming_compatible_locally_streaming_callee_inline
+define i32 @normal_caller_streaming_compatible_locally_streaming_callee_dont_inline() {
+; CHECK-LABEL: define i32 @normal_caller_streaming_compatible_locally_streaming_callee_dont_inline
 ; CHECK-SAME: () #[[ATTR1]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @streaming_compatible_locally_streaming_callee()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @streaming_compatible_locally_streaming_callee()
-  ret void
+  %res = call i32 @streaming_compatible_locally_streaming_callee()
+  ret i32 %res
 }
 
 ; [x] S  -> N
@@ -170,16 +171,16 @@ entry:
 ; [ ] S  -> SC
 ; [ ] S  -> N + B
 ; [ ] S  -> SC + B
-define void @streaming_caller_normal_callee_inline() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_caller_normal_callee_inline
+define i32 @streaming_caller_normal_callee_dont_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_caller_normal_callee_dont_inline
 ; CHECK-SAME: () #[[ATTR2]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @normal_callee()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @normal_callee()
-  ret void
+  %res = call i32 @normal_callee()
+  ret i32 %res
 }
 
 ; [ ] S  -> N
@@ -187,16 +188,16 @@ entry:
 ; [ ] S  -> SC
 ; [ ] S  -> N + B
 ; [ ] S  -> SC + B
-define void @streaming_caller_streaming_callee_inline() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_caller_streaming_callee_inline
+define i32 @streaming_caller_streaming_callee_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_caller_streaming_callee_inline
 ; CHECK-SAME: () #[[ATTR2]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @streaming_callee()
-  ret void
+  %res = call i32 @streaming_callee()
+  ret i32 %res
 }
 
 ; [ ] S  -> N
@@ -204,16 +205,16 @@ entry:
 ; [x] S  -> SC
 ; [ ] S  -> N + B
 ; [ ] S  -> SC + B
-define void @streaming_caller_streaming_compatible_callee_inline() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_caller_streaming_compatible_callee_inline
+define i32 @streaming_caller_streaming_compatible_callee_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_caller_streaming_compatible_callee_inline
 ; CHECK-SAME: () #[[ATTR2]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @streaming_compatible_callee()
-  ret void
+  %res = call i32 @streaming_compatible_callee()
+  ret i32 %res
 }
 
 ; [ ] S  -> N
@@ -221,16 +222,16 @@ entry:
 ; [ ] S  -> SC
 ; [x] S  -> N + B
 ; [ ] S  -> SC + B
-define void @streaming_caller_locally_streaming_callee_inline() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_caller_locally_streaming_callee_inline
+define i32 @streaming_caller_locally_streaming_callee_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_caller_locally_streaming_callee_inline
 ; CHECK-SAME: () #[[ATTR2]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @locally_streaming_callee()
-  ret void
+  %res = call i32 @locally_streaming_callee()
+  ret i32 %res
 }
 
 ; [ ] S  -> N
@@ -238,16 +239,16 @@ entry:
 ; [ ] S  -> SC
 ; [ ] S  -> N + B
 ; [x] S  -> SC + B
-define void @streaming_caller_streaming_compatible_locally_streaming_callee_inline() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_caller_streaming_compatible_locally_streaming_callee_inline
+define i32 @streaming_caller_streaming_compatible_locally_streaming_callee_inline() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_caller_streaming_compatible_locally_streaming_callee_inline
 ; CHECK-SAME: () #[[ATTR2]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @streaming_compatible_locally_streaming_callee()
-  ret void
+  %res = call i32 @streaming_compatible_locally_streaming_callee()
+  ret i32 %res
 }
 
 ; [x] N + B -> N
@@ -255,16 +256,16 @@ entry:
 ; [ ] N + B -> SC
 ; [ ] N + B -> N + B
 ; [ ] N + B -> SC + B
-define void @locally_streaming_caller_normal_callee_inline() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_caller_normal_callee_inline
+define i32 @locally_streaming_caller_normal_callee_dont_inline() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_caller_normal_callee_dont_inline
 ; CHECK-SAME: () #[[ATTR3]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @normal_callee()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @normal_callee()
-  ret void
+  %res = call i32 @normal_callee()
+  ret i32 %res
 }
 
 ; [ ] N + B -> N
@@ -272,16 +273,16 @@ entry:
 ; [ ] N + B -> SC
 ; [ ] N + B -> N + B
 ; [ ] N + B -> SC + B
-define void @locally_streaming_caller_streaming_callee_inline() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_caller_streaming_callee_inline
+define i32 @locally_streaming_caller_streaming_callee_inline() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_caller_streaming_callee_inline
 ; CHECK-SAME: () #[[ATTR3]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @streaming_callee()
-  ret void
+  %res = call i32 @streaming_callee()
+  ret i32 %res
 }
 
 ; [ ] N + B -> N
@@ -289,16 +290,16 @@ entry:
 ; [x] N + B -> SC
 ; [ ] N + B -> N + B
 ; [ ] N + B -> SC + B
-define void @locally_streaming_caller_streaming_compatible_callee_inline() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_caller_streaming_compatible_callee_inline
+define i32 @locally_streaming_caller_streaming_compatible_callee_inline() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_caller_streaming_compatible_callee_inline
 ; CHECK-SAME: () #[[ATTR3]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @streaming_compatible_callee()
-  ret void
+  %res = call i32 @streaming_compatible_callee()
+  ret i32 %res
 }
 
 ; [ ] N + B -> N
@@ -306,16 +307,16 @@ entry:
 ; [ ] N + B -> SC
 ; [x] N + B -> N + B
 ; [ ] N + B -> SC + B
-define void @locally_streaming_caller_locally_streaming_callee_inline() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_caller_locally_streaming_callee_inline
+define i32 @locally_streaming_caller_locally_streaming_callee_inline() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_caller_locally_streaming_callee_inline
 ; CHECK-SAME: () #[[ATTR3]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @locally_streaming_callee()
-  ret void
+  %res = call i32 @locally_streaming_callee()
+  ret i32 %res
 }
 
 ; [ ] N + B -> N
@@ -323,16 +324,16 @@ entry:
 ; [ ] N + B -> SC
 ; [ ] N + B -> N + B
 ; [x] N + B -> SC + B
-define void @locally_streaming_caller_streaming_compatible_locally_streaming_callee_inline() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_caller_streaming_compatible_locally_streaming_callee_inline
+define i32 @locally_streaming_caller_streaming_compatible_locally_streaming_callee_inline() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_caller_streaming_compatible_locally_streaming_callee_inline
 ; CHECK-SAME: () #[[ATTR3]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @streaming_compatible_locally_streaming_callee()
-  ret void
+  %res = call i32 @streaming_compatible_locally_streaming_callee()
+  ret i32 %res
 }
 
 ; [x] SC -> N
@@ -340,16 +341,16 @@ entry:
 ; [ ] SC -> SC
 ; [ ] SC -> N + B
 ; [ ] SC -> SC + B
-define void @streaming_compatible_caller_normal_callee_inline() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_caller_normal_callee_inline
+define i32 @streaming_compatible_caller_normal_callee_dont_inline() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define i32 @streaming_compatible_caller_normal_callee_dont_inline
 ; CHECK-SAME: () #[[ATTR0]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @normal_callee()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @normal_callee()
-  ret void
+  %res = call i32 @normal_callee()
+  ret i32 %res
 }
 
 ; [ ] SC -> N
@@ -357,16 +358,16 @@ entry:
 ; [ ] SC -> SC
 ; [ ] SC -> N + B
 ; [ ] SC -> SC + B
-define void @streaming_compatible_caller_streaming_callee_inline() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_caller_streaming_callee_inline
+define i32 @streaming_compatible_caller_streaming_callee_dont_inline() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define i32 @streaming_compatible_caller_streaming_callee_dont_inline
 ; CHECK-SAME: () #[[ATTR0]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @streaming_callee()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @streaming_callee()
-  ret void
+  %res = call i32 @streaming_callee()
+  ret i32 %res
 }
 
 ; [ ] SC -> N
@@ -374,16 +375,16 @@ entry:
 ; [x] SC -> SC
 ; [ ] SC -> N + B
 ; [ ] SC -> SC + B
-define void @streaming_compatible_caller_streaming_compatible_callee_inline() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_caller_streaming_compatible_callee_inline
+define i32 @streaming_compatible_caller_streaming_compatible_callee_inline() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define i32 @streaming_compatible_caller_streaming_compatible_callee_inline
 ; CHECK-SAME: () #[[ATTR0]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES_I:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:    ret i32 [[RES_I]]
 ;
 entry:
-  call void @streaming_compatible_callee()
-  ret void
+  %res = call i32 @streaming_compatible_callee()
+  ret i32 %res
 }
 
 ; [ ] SC -> N
@@ -391,16 +392,16 @@ entry:
 ; [ ] SC -> SC
 ; [x] SC -> N + B
 ; [ ] SC -> SC + B
-define void @streaming_compatible_caller_locally_streaming_callee_inline() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_caller_locally_streaming_callee_inline
+define i32 @streaming_compatible_caller_locally_streaming_callee_dont_inline() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define i32 @streaming_compatible_caller_locally_streaming_callee_dont_inline
 ; CHECK-SAME: () #[[ATTR0]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    call void @inlined_body()
-; CHECK-NEXT:    ret void
+; CHECK-NEXT:    [[RES:%.*]] = call i32 @locally_streaming_callee()
+; CHECK-NEXT:    ret i32 [[RES]]
 ;
 entry:
-  call void @locally_streaming_cal...
[truncated]

@@ -2,70 +2,71 @@
; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -mattr=+sme -S -passes=inline | FileCheck %s

declare void @inlined_body() "aarch64_pstate_sm_compatible";
declare i32 @llvm.vscale()

; Define some functions that will be called by the functions below.
; These just call a '...body()' function. If we see the call to one of
; these functions being replaced by '...body()', then we know it has been
; inlined.
Contributor

Does this comment need to be updated now that @inlined_body() has been replaced with @llvm.vscale.i32()?

Collaborator Author

Good spot, I've rephrased it.

@@ -2,70 +2,71 @@
; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -mattr=+sme -S -passes=inline | FileCheck %s

declare void @inlined_body() "aarch64_pstate_sm_compatible";
Contributor

Can inlined_body() be removed now?

Collaborator Author

It can, thanks!

@sdesmalen-arm sdesmalen-arm merged commit 3abf55a into llvm:main Jan 31, 2024
4 checks passed
@sdesmalen-arm sdesmalen-arm deleted the sme-fix-invalid-inlining branch January 31, 2024 11:38
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Jan 31, 2024
Calling a `__arm_locally_streaming` function from a function that
is not a streaming-SVE function would lead to incorrect inlining.

The issue didn't surface because the tests were not testing what
they were supposed to test.

(cherry picked from commit 3abf55a)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Feb 1, 2024
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
@pointhex pointhex mentioned this pull request May 7, 2024