[ARM] Armv8-R does not require fp64 or neon. #88287

chrisnc · 2024-04-10T16:09:12Z

This was addressed for AArch64 here, but the same applies to ARM.

Move the enablement of neon+fp64 to -mcpu=cortex-r52, which optionally supports these features.

github-actions · 2024-04-10T16:09:30Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2024-04-10T16:10:01Z

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-arm

Author: Chris Copeland (chrisnc)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/88287.diff

14 Files Affected:

(modified) clang/test/Preprocessor/arm-target-features.c (+1-1)
(modified) llvm/include/llvm/TargetParser/ARMTargetParser.def (+1-1)
(modified) llvm/lib/Target/ARM/ARM.td (+3-3)
(modified) llvm/test/Analysis/CostModel/ARM/arith.ll (+1-1)
(modified) llvm/test/Analysis/CostModel/ARM/cast.ll (+2-2)
(modified) llvm/test/Analysis/CostModel/ARM/cast_ldst.ll (+2-2)
(modified) llvm/test/Analysis/CostModel/ARM/cmps.ll (+2-2)
(modified) llvm/test/Analysis/CostModel/ARM/divrem.ll (+1-1)
(modified) llvm/test/CodeGen/ARM/build-attributes.ll (+2-2)
(modified) llvm/test/CodeGen/ARM/fpconv.ll (+2-2)
(modified) llvm/test/CodeGen/ARM/half.ll (+2-2)
(modified) llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll (+1-1)
(modified) llvm/test/CodeGen/ARM/useaa.ll (+2-2)
(modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+1-1)

diff --git a/clang/test/Preprocessor/arm-target-features.c b/clang/test/Preprocessor/arm-target-features.c
index 236c9f2479b705..5805e9e0bf0b50 100644
--- a/clang/test/Preprocessor/arm-target-features.c
+++ b/clang/test/Preprocessor/arm-target-features.c
@@ -96,7 +96,7 @@
 // CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FEATURE_CRC32 1
 // CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FEATURE_DIRECTED_ROUNDING 1
 // CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FEATURE_NUMERIC_MAXMIN 1
-// CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FP 0xe
+// CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FP 0x4
 
 // RUN: %clang -target armv7a-none-linux-gnu -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V7 %s
 // CHECK-V7: #define __ARMEL__ 1
diff --git a/llvm/include/llvm/TargetParser/ARMTargetParser.def b/llvm/include/llvm/TargetParser/ARMTargetParser.def
index b821d224d7a82c..dee0f383b4b234 100644
--- a/llvm/include/llvm/TargetParser/ARMTargetParser.def
+++ b/llvm/include/llvm/TargetParser/ARMTargetParser.def
@@ -329,7 +329,7 @@ ARM_CPU_NAME("cortex-r7", ARMV7R, FK_VFPV3_D16_FP16, false,
              (ARM::AEK_MP | ARM::AEK_HWDIVARM))
 ARM_CPU_NAME("cortex-r8", ARMV7R, FK_VFPV3_D16_FP16, false,
              (ARM::AEK_MP | ARM::AEK_HWDIVARM))
-ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, true, ARM::AEK_NONE)
+ARM_CPU_NAME("cortex-r52", ARMV8R, FK_FPV5_SP_D16, true, ARM::AEK_NONE)
 ARM_CPU_NAME("sc300", ARMV7M, FK_NONE, false, ARM::AEK_NONE)
 ARM_CPU_NAME("cortex-m3", ARMV7M, FK_NONE, true, ARM::AEK_NONE)
 ARM_CPU_NAME("cortex-m4", ARMV7EM, FK_FPV4_SP_D16, true, ARM::AEK_NONE)
diff --git a/llvm/lib/Target/ARM/ARM.td b/llvm/lib/Target/ARM/ARM.td
index 66596dbda83c95..6ee8c0540e6488 100644
--- a/llvm/lib/Target/ARM/ARM.td
+++ b/llvm/lib/Target/ARM/ARM.td
@@ -1167,9 +1167,7 @@ def ARMv8r    : Architecture<"armv8-r",   "ARMv8r",   [HasV8Ops,
                                                        FeatureDSP,
                                                        FeatureCRC,
                                                        FeatureMP,
-                                                       FeatureVirtualization,
-                                                       FeatureFPARMv8,
-                                                       FeatureNEON]>;
+                                                       FeatureVirtualization]>;
 
 def ARMv8mBaseline : Architecture<"armv8-m.base", "ARMv8mBaseline",
                                                       [HasV8MBaselineOps,
@@ -1726,6 +1724,8 @@ def : ProcNoItin<"kryo",                                [ARMv8a, ProcKryo,
                                                          FeatureCRC]>;
 
 def : ProcessorModel<"cortex-r52", CortexR52Model,      [ARMv8r, ProcR52,
+                                                         FeatureFPARMv8_D16_SP,
+                                                         FeatureFP16,
                                                          FeatureUseMISched,
                                                          FeatureFPAO]>;
 
diff --git a/llvm/test/Analysis/CostModel/ARM/arith.ll b/llvm/test/Analysis/CostModel/ARM/arith.ll
index 3a137a5af36664..8f173596c3b9a0 100644
--- a/llvm/test/Analysis/CostModel/ARM/arith.ll
+++ b/llvm/test/Analysis/CostModel/ARM/arith.ll
@@ -4,7 +4,7 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve,+mve4beat < %s | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-MVE4
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R
 ; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=thumbv8.1m.main -mattr=+mve < %s | FileCheck %s --check-prefix=CHECK-MVE-SIZE
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
diff --git a/llvm/test/Analysis/CostModel/ARM/cast.ll b/llvm/test/Analysis/CostModel/ARM/cast.ll
index 60addd3077ed14..ae0d2347ec8be0 100644
--- a/llvm/test/Analysis/CostModel/ARM/cast.ll
+++ b/llvm/test/Analysis/CostModel/ARM/cast.ll
@@ -3,11 +3,11 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-RECIP
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-SIZE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
diff --git a/llvm/test/Analysis/CostModel/ARM/cast_ldst.ll b/llvm/test/Analysis/CostModel/ARM/cast_ldst.ll
index db700eb3baeefe..4a2f9a25dc152f 100644
--- a/llvm/test/Analysis/CostModel/ARM/cast_ldst.ll
+++ b/llvm/test/Analysis/CostModel/ARM/cast_ldst.ll
@@ -3,11 +3,11 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-RECIP
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-SIZE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
diff --git a/llvm/test/Analysis/CostModel/ARM/cmps.ll b/llvm/test/Analysis/CostModel/ARM/cmps.ll
index 7f89f521e77cbc..184b7076d02bea 100644
--- a/llvm/test/Analysis/CostModel/ARM/cmps.ll
+++ b/llvm/test/Analysis/CostModel/ARM/cmps.ll
@@ -2,11 +2,11 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-RECIP
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-SIZE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
diff --git a/llvm/test/Analysis/CostModel/ARM/divrem.ll b/llvm/test/Analysis/CostModel/ARM/divrem.ll
index b582a61c2a0fc8..9aba39327f6601 100644
--- a/llvm/test/Analysis/CostModel/ARM/divrem.ll
+++ b/llvm/test/Analysis/CostModel/ARM/divrem.ll
@@ -3,7 +3,7 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8,+fp16 < %s | FileCheck %s --check-prefix=CHECK-V8R
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
diff --git a/llvm/test/CodeGen/ARM/build-attributes.ll b/llvm/test/CodeGen/ARM/build-attributes.ll
index e8c83b92f3f947..2812bf81acf98c 100644
--- a/llvm/test/CodeGen/ARM/build-attributes.ll
+++ b/llvm/test/CodeGen/ARM/build-attributes.ll
@@ -221,8 +221,8 @@
 
 ; ARMv8-R
 ; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 -mattr=-vfp2sp,-fp16 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-NOFPU
-; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 -mattr=-neon,-fp64,-d32 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-SP
-; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-NEON
+; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-SP
+; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 -mattr=+neon,+fp-armv8 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-NEON
 
 ; ARMv8-M
 ; RUN: llc < %s -mtriple=thumbv8-none-none-eabi -mcpu=cortex-m23 | FileCheck %s --check-prefix=STRICT-ALIGN
diff --git a/llvm/test/CodeGen/ARM/fpconv.ll b/llvm/test/CodeGen/ARM/fpconv.ll
index 929da5f18c813e..5dfc2a6a4629b1 100644
--- a/llvm/test/CodeGen/ARM/fpconv.ll
+++ b/llvm/test/CodeGen/ARM/fpconv.ll
@@ -1,7 +1,7 @@
 ; RUN: llc -mtriple=arm-eabi -mattr=+vfp2 %s -o - | FileCheck %s --check-prefix=CHECK-VFP
 ; RUN: llc -mtriple=arm-apple-darwin %s -o - | FileCheck %s
-; RUN: llc -mtriple=armv8r-none-none-eabi %s -o - | FileCheck %s --check-prefix=CHECK-VFP
-; RUN: llc -mtriple=armv8r-none-none-eabi -mattr=-fp64 %s -o - | FileCheck %s --check-prefix=CHECK-VFP-SP
+; RUN: llc -mtriple=armv8r-none-none-eabi -mattr=+neon,+fp-armv8 %s -o - | FileCheck %s --check-prefix=CHECK-VFP
+; RUN: llc -mtriple=armv8r-none-none-eabi -mattr=+fp-armv8,-fp64,-d32 %s -o - | FileCheck %s --check-prefix=CHECK-VFP-SP
 
 define float @f1(double %x) {
 ;CHECK-VFP-LABEL: f1:
diff --git a/llvm/test/CodeGen/ARM/half.ll b/llvm/test/CodeGen/ARM/half.ll
index 9b53dc77f22739..050b400539b4dd 100644
--- a/llvm/test/CodeGen/ARM/half.ll
+++ b/llvm/test/CodeGen/ARM/half.ll
@@ -1,8 +1,8 @@
 ; RUN: llc < %s -mtriple=thumbv7-apple-ios7.0 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-OLD
 ; RUN: llc < %s -mtriple=thumbv7s-apple-ios7.0 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-F16
 ; RUN: llc < %s -mtriple=thumbv8-apple-ios7.0 | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8
-; RUN: llc < %s -mtriple=armv8r-none-none-eabi | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8
-; RUN: llc < %s -mtriple=armv8r-none-none-eabi -mattr=-fp64 | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8-SP
+; RUN: llc < %s -mtriple=armv8r-none-none-eabi -mattr=+fp-armv8,+fp16 | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8
+; RUN: llc < %s -mtriple=armv8r-none-none-eabi -mattr=+fp-armv8,+fp16,-fp64 | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8-SP
 ; RUN: llc < %s -mtriple=armv8.1m-none-none-eabi -mattr=+fp-armv8 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-V8
 ; RUN: llc < %s -mtriple=armv8.1m-none-none-eabi -mattr=+fp-armv8,-fp64 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-V8-SP
 ; RUN: llc < %s -mtriple=armv8.1m-none-none-eabi -mattr=+mve.fp,+fp64 | FileCheck %s --check-prefix=CHECK-V8
diff --git a/llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll b/llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll
index 8ee88fd18f3c81..245780e6836ea2 100644
--- a/llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll
+++ b/llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll
@@ -16,7 +16,7 @@ entry:
 
 declare void @llvm.va_start(ptr)
 
-attributes #0 = { "target-cpu"="cortex-r52" "target-features"="-fp64"  }
+attributes #0 = { "target-cpu"="cortex-r52" }
 
 ; Ensures that the machine scheduler does not move accessing the upper
 ; 32 bits of the double to before actually storing it to memory
diff --git a/llvm/test/CodeGen/ARM/useaa.ll b/llvm/test/CodeGen/ARM/useaa.ll
index f8207a1056e3b4..bbf2bb043d3f31 100644
--- a/llvm/test/CodeGen/ARM/useaa.ll
+++ b/llvm/test/CodeGen/ARM/useaa.ll
@@ -1,7 +1,7 @@
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-r52 | FileCheck %s --check-prefix=CHECK --check-prefix=USEAA
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-r52 -mattr=+neon,+fp-armv8 | FileCheck %s --check-prefix=CHECK --check-prefix=USEAA
 ; RUN: llc < %s -mtriple=armv7m-eabi -mcpu=cortex-m4 | FileCheck %s --check-prefix=CHECK --check-prefix=USEAA
 ; RUN: llc < %s -mtriple=armv8m-eabi -mcpu=cortex-m33 | FileCheck %s --check-prefix=CHECK --check-prefix=USEAA
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=generic | FileCheck %s --check-prefix=CHECK --check-prefix=GENERIC
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=generic -mattr=+neon,+fp-armv8 | FileCheck %s --check-prefix=CHECK --check-prefix=GENERIC
 
 ; Check we use AA during codegen, so can interleave these loads/stores.
 
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 2c72a7229b5274..82c5c0d6c12175 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -351,7 +351,7 @@ INSTANTIATE_TEST_SUITE_P(
                                    ARM::AEK_MP | ARM::AEK_HWDIVARM |
                                        ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
                                    "7-R"),
-        ARMCPUTestParams<uint64_t>("cortex-r52", "armv8-r", "neon-fp-armv8",
+        ARMCPUTestParams<uint64_t>("cortex-r52", "armv8-r", "fpv5-sp-d16",
                                    ARM::AEK_NONE | ARM::AEK_CRC | ARM::AEK_MP |
                                        ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
                                        ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,

llvmbot · 2024-04-10T16:10:01Z

@llvm/pr-subscribers-clang

Author: Chris Copeland (chrisnc)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/88287.diff

14 Files Affected:

(modified) clang/test/Preprocessor/arm-target-features.c (+1-1)
(modified) llvm/include/llvm/TargetParser/ARMTargetParser.def (+1-1)
(modified) llvm/lib/Target/ARM/ARM.td (+3-3)
(modified) llvm/test/Analysis/CostModel/ARM/arith.ll (+1-1)
(modified) llvm/test/Analysis/CostModel/ARM/cast.ll (+2-2)
(modified) llvm/test/Analysis/CostModel/ARM/cast_ldst.ll (+2-2)
(modified) llvm/test/Analysis/CostModel/ARM/cmps.ll (+2-2)
(modified) llvm/test/Analysis/CostModel/ARM/divrem.ll (+1-1)
(modified) llvm/test/CodeGen/ARM/build-attributes.ll (+2-2)
(modified) llvm/test/CodeGen/ARM/fpconv.ll (+2-2)
(modified) llvm/test/CodeGen/ARM/half.ll (+2-2)
(modified) llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll (+1-1)
(modified) llvm/test/CodeGen/ARM/useaa.ll (+2-2)
(modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+1-1)

diff --git a/clang/test/Preprocessor/arm-target-features.c b/clang/test/Preprocessor/arm-target-features.c
index 236c9f2479b705..5805e9e0bf0b50 100644
--- a/clang/test/Preprocessor/arm-target-features.c
+++ b/clang/test/Preprocessor/arm-target-features.c
@@ -96,7 +96,7 @@
 // CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FEATURE_CRC32 1
 // CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FEATURE_DIRECTED_ROUNDING 1
 // CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FEATURE_NUMERIC_MAXMIN 1
-// CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FP 0xe
+// CHECK-V8R-ALLOW-FP-INSTR: #define __ARM_FP 0x4
 
 // RUN: %clang -target armv7a-none-linux-gnu -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V7 %s
 // CHECK-V7: #define __ARMEL__ 1
diff --git a/llvm/include/llvm/TargetParser/ARMTargetParser.def b/llvm/include/llvm/TargetParser/ARMTargetParser.def
index b821d224d7a82c..dee0f383b4b234 100644
--- a/llvm/include/llvm/TargetParser/ARMTargetParser.def
+++ b/llvm/include/llvm/TargetParser/ARMTargetParser.def
@@ -329,7 +329,7 @@ ARM_CPU_NAME("cortex-r7", ARMV7R, FK_VFPV3_D16_FP16, false,
              (ARM::AEK_MP | ARM::AEK_HWDIVARM))
 ARM_CPU_NAME("cortex-r8", ARMV7R, FK_VFPV3_D16_FP16, false,
              (ARM::AEK_MP | ARM::AEK_HWDIVARM))
-ARM_CPU_NAME("cortex-r52", ARMV8R, FK_NEON_FP_ARMV8, true, ARM::AEK_NONE)
+ARM_CPU_NAME("cortex-r52", ARMV8R, FK_FPV5_SP_D16, true, ARM::AEK_NONE)
 ARM_CPU_NAME("sc300", ARMV7M, FK_NONE, false, ARM::AEK_NONE)
 ARM_CPU_NAME("cortex-m3", ARMV7M, FK_NONE, true, ARM::AEK_NONE)
 ARM_CPU_NAME("cortex-m4", ARMV7EM, FK_FPV4_SP_D16, true, ARM::AEK_NONE)
diff --git a/llvm/lib/Target/ARM/ARM.td b/llvm/lib/Target/ARM/ARM.td
index 66596dbda83c95..6ee8c0540e6488 100644
--- a/llvm/lib/Target/ARM/ARM.td
+++ b/llvm/lib/Target/ARM/ARM.td
@@ -1167,9 +1167,7 @@ def ARMv8r    : Architecture<"armv8-r",   "ARMv8r",   [HasV8Ops,
                                                        FeatureDSP,
                                                        FeatureCRC,
                                                        FeatureMP,
-                                                       FeatureVirtualization,
-                                                       FeatureFPARMv8,
-                                                       FeatureNEON]>;
+                                                       FeatureVirtualization]>;
 
 def ARMv8mBaseline : Architecture<"armv8-m.base", "ARMv8mBaseline",
                                                       [HasV8MBaselineOps,
@@ -1726,6 +1724,8 @@ def : ProcNoItin<"kryo",                                [ARMv8a, ProcKryo,
                                                          FeatureCRC]>;
 
 def : ProcessorModel<"cortex-r52", CortexR52Model,      [ARMv8r, ProcR52,
+                                                         FeatureFPARMv8_D16_SP,
+                                                         FeatureFP16,
                                                          FeatureUseMISched,
                                                          FeatureFPAO]>;
 
diff --git a/llvm/test/Analysis/CostModel/ARM/arith.ll b/llvm/test/Analysis/CostModel/ARM/arith.ll
index 3a137a5af36664..8f173596c3b9a0 100644
--- a/llvm/test/Analysis/CostModel/ARM/arith.ll
+++ b/llvm/test/Analysis/CostModel/ARM/arith.ll
@@ -4,7 +4,7 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve,+mve4beat < %s | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-MVE4
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R
 ; RUN: opt -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=thumbv8.1m.main -mattr=+mve < %s | FileCheck %s --check-prefix=CHECK-MVE-SIZE
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
diff --git a/llvm/test/Analysis/CostModel/ARM/cast.ll b/llvm/test/Analysis/CostModel/ARM/cast.ll
index 60addd3077ed14..ae0d2347ec8be0 100644
--- a/llvm/test/Analysis/CostModel/ARM/cast.ll
+++ b/llvm/test/Analysis/CostModel/ARM/cast.ll
@@ -3,11 +3,11 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-RECIP
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-SIZE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
diff --git a/llvm/test/Analysis/CostModel/ARM/cast_ldst.ll b/llvm/test/Analysis/CostModel/ARM/cast_ldst.ll
index db700eb3baeefe..4a2f9a25dc152f 100644
--- a/llvm/test/Analysis/CostModel/ARM/cast_ldst.ll
+++ b/llvm/test/Analysis/CostModel/ARM/cast_ldst.ll
@@ -3,11 +3,11 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-RECIP
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-SIZE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
diff --git a/llvm/test/Analysis/CostModel/ARM/cmps.ll b/llvm/test/Analysis/CostModel/ARM/cmps.ll
index 7f89f521e77cbc..184b7076d02bea 100644
--- a/llvm/test/Analysis/CostModel/ARM/cmps.ll
+++ b/llvm/test/Analysis/CostModel/ARM/cmps.ll
@@ -2,11 +2,11 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-RECIP
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-RECIP
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN-SIZE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE-SIZE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8 < %s | FileCheck %s --check-prefix=CHECK-V8R-SIZE
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
diff --git a/llvm/test/Analysis/CostModel/ARM/divrem.ll b/llvm/test/Analysis/CostModel/ARM/divrem.ll
index b582a61c2a0fc8..9aba39327f6601 100644
--- a/llvm/test/Analysis/CostModel/ARM/divrem.ll
+++ b/llvm/test/Analysis/CostModel/ARM/divrem.ll
@@ -3,7 +3,7 @@
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8.1m.main-none-eabi -mattr=+mve.fp < %s | FileCheck %s --check-prefix=CHECK-MVE
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.main-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-MAIN
 ; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=thumbv8m.base-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8M-BASE
-; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi < %s | FileCheck %s --check-prefix=CHECK-V8R
+; RUN: opt -passes="print<cost-model>" 2>&1 -disable-output -mtriple=armv8r-none-eabi -mattr=+neon,+fp-armv8,+fp16 < %s | FileCheck %s --check-prefix=CHECK-V8R
 
 target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
 
diff --git a/llvm/test/CodeGen/ARM/build-attributes.ll b/llvm/test/CodeGen/ARM/build-attributes.ll
index e8c83b92f3f947..2812bf81acf98c 100644
--- a/llvm/test/CodeGen/ARM/build-attributes.ll
+++ b/llvm/test/CodeGen/ARM/build-attributes.ll
@@ -221,8 +221,8 @@
 
 ; ARMv8-R
 ; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 -mattr=-vfp2sp,-fp16 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-NOFPU
-; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 -mattr=-neon,-fp64,-d32 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-SP
-; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-NEON
+; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-SP
+; RUN: llc < %s -mtriple=arm-none-none-eabi -mcpu=cortex-r52 -mattr=+neon,+fp-armv8 | FileCheck %s --check-prefix=ARMv8R --check-prefix=ARMv8R-NEON
 
 ; ARMv8-M
 ; RUN: llc < %s -mtriple=thumbv8-none-none-eabi -mcpu=cortex-m23 | FileCheck %s --check-prefix=STRICT-ALIGN
diff --git a/llvm/test/CodeGen/ARM/fpconv.ll b/llvm/test/CodeGen/ARM/fpconv.ll
index 929da5f18c813e..5dfc2a6a4629b1 100644
--- a/llvm/test/CodeGen/ARM/fpconv.ll
+++ b/llvm/test/CodeGen/ARM/fpconv.ll
@@ -1,7 +1,7 @@
 ; RUN: llc -mtriple=arm-eabi -mattr=+vfp2 %s -o - | FileCheck %s --check-prefix=CHECK-VFP
 ; RUN: llc -mtriple=arm-apple-darwin %s -o - | FileCheck %s
-; RUN: llc -mtriple=armv8r-none-none-eabi %s -o - | FileCheck %s --check-prefix=CHECK-VFP
-; RUN: llc -mtriple=armv8r-none-none-eabi -mattr=-fp64 %s -o - | FileCheck %s --check-prefix=CHECK-VFP-SP
+; RUN: llc -mtriple=armv8r-none-none-eabi -mattr=+neon,+fp-armv8 %s -o - | FileCheck %s --check-prefix=CHECK-VFP
+; RUN: llc -mtriple=armv8r-none-none-eabi -mattr=+fp-armv8,-fp64,-d32 %s -o - | FileCheck %s --check-prefix=CHECK-VFP-SP
 
 define float @f1(double %x) {
 ;CHECK-VFP-LABEL: f1:
diff --git a/llvm/test/CodeGen/ARM/half.ll b/llvm/test/CodeGen/ARM/half.ll
index 9b53dc77f22739..050b400539b4dd 100644
--- a/llvm/test/CodeGen/ARM/half.ll
+++ b/llvm/test/CodeGen/ARM/half.ll
@@ -1,8 +1,8 @@
 ; RUN: llc < %s -mtriple=thumbv7-apple-ios7.0 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-OLD
 ; RUN: llc < %s -mtriple=thumbv7s-apple-ios7.0 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-F16
 ; RUN: llc < %s -mtriple=thumbv8-apple-ios7.0 | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8
-; RUN: llc < %s -mtriple=armv8r-none-none-eabi | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8
-; RUN: llc < %s -mtriple=armv8r-none-none-eabi -mattr=-fp64 | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8-SP
+; RUN: llc < %s -mtriple=armv8r-none-none-eabi -mattr=+fp-armv8,+fp16 | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8
+; RUN: llc < %s -mtriple=armv8r-none-none-eabi -mattr=+fp-armv8,+fp16,-fp64 | FileCheck %s --check-prefix=CHECK  --check-prefix=CHECK-V8-SP
 ; RUN: llc < %s -mtriple=armv8.1m-none-none-eabi -mattr=+fp-armv8 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-V8
 ; RUN: llc < %s -mtriple=armv8.1m-none-none-eabi -mattr=+fp-armv8,-fp64 | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-V8-SP
 ; RUN: llc < %s -mtriple=armv8.1m-none-none-eabi -mattr=+mve.fp,+fp64 | FileCheck %s --check-prefix=CHECK-V8
diff --git a/llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll b/llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll
index 8ee88fd18f3c81..245780e6836ea2 100644
--- a/llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll
+++ b/llvm/test/CodeGen/ARM/pr42638-VMOVRRDCombine.ll
@@ -16,7 +16,7 @@ entry:
 
 declare void @llvm.va_start(ptr)
 
-attributes #0 = { "target-cpu"="cortex-r52" "target-features"="-fp64"  }
+attributes #0 = { "target-cpu"="cortex-r52" }
 
 ; Ensures that the machine scheduler does not move accessing the upper
 ; 32 bits of the double to before actually storing it to memory
diff --git a/llvm/test/CodeGen/ARM/useaa.ll b/llvm/test/CodeGen/ARM/useaa.ll
index f8207a1056e3b4..bbf2bb043d3f31 100644
--- a/llvm/test/CodeGen/ARM/useaa.ll
+++ b/llvm/test/CodeGen/ARM/useaa.ll
@@ -1,7 +1,7 @@
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-r52 | FileCheck %s --check-prefix=CHECK --check-prefix=USEAA
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-r52 -mattr=+neon,+fp-armv8 | FileCheck %s --check-prefix=CHECK --check-prefix=USEAA
 ; RUN: llc < %s -mtriple=armv7m-eabi -mcpu=cortex-m4 | FileCheck %s --check-prefix=CHECK --check-prefix=USEAA
 ; RUN: llc < %s -mtriple=armv8m-eabi -mcpu=cortex-m33 | FileCheck %s --check-prefix=CHECK --check-prefix=USEAA
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=generic | FileCheck %s --check-prefix=CHECK --check-prefix=GENERIC
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=generic -mattr=+neon,+fp-armv8 | FileCheck %s --check-prefix=CHECK --check-prefix=GENERIC
 
 ; Check we use AA during codegen, so can interleave these loads/stores.
 
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index 2c72a7229b5274..82c5c0d6c12175 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -351,7 +351,7 @@ INSTANTIATE_TEST_SUITE_P(
                                    ARM::AEK_MP | ARM::AEK_HWDIVARM |
                                        ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,
                                    "7-R"),
-        ARMCPUTestParams<uint64_t>("cortex-r52", "armv8-r", "neon-fp-armv8",
+        ARMCPUTestParams<uint64_t>("cortex-r52", "armv8-r", "fpv5-sp-d16",
                                    ARM::AEK_NONE | ARM::AEK_CRC | ARM::AEK_MP |
                                        ARM::AEK_VIRT | ARM::AEK_HWDIVARM |
                                        ARM::AEK_HWDIVTHUMB | ARM::AEK_DSP,

chrisnc · 2024-04-18T05:12:16Z

ping @ostannard

ostannard

LGTM

davemgreen · 2024-04-18T09:37:13Z

Does this disable neon by default for cortex-r52? If so I don't think it should be doing that, it would be a break in the existing behaviour, and should at least be in the release notes.

The general rule for -mcpu options is that they should (roughly) enable the maximum set of features available, and from there can be disabled with -mfpu or +nofp options. Essentially the default is -mfpu=auto. Keeping to this keeps things consistent between cores, so long as people understand the rule. For architectures the default should be without, but as far as I understand we still default to -mfpu=auto.

chrisnc · 2024-04-18T16:25:50Z

@davemgreen yes, this does change the cortex-r52 default features as well. The problem is that the cortex-r52 is the "default" CPU for armv8r (probably because it's the only one, ignoring r52plus), so if I don't reduce the feature set of that CPU, armv8r will continue to imply features that are not true of all processors in that family. I can add this to the release notes if there's no other way to achieve what is needed here. Would removing the default CPU for armv8r be acceptable?
Regardless, this is an intentional behavior change because currently you get all neon and dp features just by using this architecture, so I can add release notes about that if necessary.

github-actions · 2024-04-19T04:38:32Z

✅ With the latest revision this PR passed the C/C++ code formatter.

chrisnc · 2024-04-19T17:52:04Z

I've updated the PR to use the proposed approach of making generic the default CPU for armv8r and let cortex-r52 have the full dp+neon suite. Let me know if this is alright. I still need to fix up a few of the tests to match this new behavior.

chrisnc · 2024-04-21T19:09:42Z

Another option is to include FeatureFPARMv8_D16_SP in ARMv8r. The R-profile supplement of the Arm manual does say that this is a minimum feature requirement (as opposed to just being a variant of the R52).

davemgreen · 2024-04-21T19:12:03Z

As far as I understand this will remove the tuning we do for cortex-r52 when using armv8r, which will mean a little less performance but the tuning features in the Arm backend are not handled as well as they could be.

Can you add release note explaining what will change? Thanks.

chrisnc · 2024-04-22T00:35:51Z

Added an item to the release notes and fixed another place where fp64+neon was being added (the target parser); now I see the expected results when using just armv8r-none-eabi (sans -mcpu=cortex-r52).

$ bin/clang --target=armv8r-none-eabihf -O3 -c -o foo.o foo.c
$ arm-none-eabi-readelf -a
Attribute Section: aeabi
File Attributes
  Tag_conformance: "2.09"
  Tag_CPU_arch: v8-R
  Tag_CPU_arch_profile: Realtime
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: FPv5/FP-D16 for ARMv8
  Tag_ABI_PCS_R9_use: V6
  Tag_ABI_PCS_GOT_use: direct
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Unused
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_HardFP_use: SP only
  Tag_ABI_VFP_args: VFP registers
  Tag_ABI_optimization_goals: Aggressive Speed
  Tag_CPU_unaligned_access: v6
  Tag_FP_HP_extension: Allowed
  Tag_ABI_FP_16bit_format: IEEE 754
  Tag_MPextension_use: Allowed
  Tag_Virtualization_use: Virtualization Extensions

davemgreen · 2024-04-24T06:17:07Z

I'm not sure I would make this change, mostly due to it potentially causing a break for existing users and making performance worse, but can see the reasoning. I am willing to defer to others if they have an opinion.

chrisnc · 2024-04-24T06:32:29Z

@davemgreen which change? Specifying -mcpu=cortex-r52 will behave the same way as before. The original manual for the R52 provided for a no-neon sp-only variant, and they exist in the wild, and this lets "architecture-generic" builds automatically support both.

One example where this comes up is in the Rust project, which recently gained armv8r support, but due to the over-spec'ing of the base armv8r in LLVM, it requires -feature negations in the default target feature list to build the core crate in a way that will support lesser r52s. Then, when users specify cortex-r52 as their target cpu, they are still left with the -feature negations overriding what the r52 should enable by default. rust-lang/rust#123159 (comment) This change will allow the feature flags and target CPUs to interact in a more predictable way as compared to other Cortex-R and -M CPUs.

chrisnc · 2024-05-02T17:40:52Z

Friendly bump. :) @ostannard @davemgreen

davemgreen · 2024-05-02T20:16:12Z

which change? Specifying -mcpu=cortex-r52 will behave the same way as before. The original manual for the R52 provided for a no-neon sp-only variant, and they exist in the wild, and this lets "architecture-generic" builds automatically support both.

I just meant the break in functionality for existing users of -march=armv8r, and the drop in performance from less R52 tuning features/scheduling and the lack of Neon. I believe this is more correct, but users in the past could still get the same results by specifying -march=armv8r -mfpu=fpv5-sp-d16. I'm hoping someone else can come along and have an opinion on it one way or another.

chrisnc · 2024-05-02T23:50:29Z

Understood. FWIW, arm-none-eabi-gcc does not enable any FPU features by default with -march=armv8-r (and will error out if you combine that with -mfloat-abi=hard), which is what the first pass of this PR went for, but I think where we landed is a decent middle ground.

smithp35 · 2024-05-03T09:37:37Z

I can chime in with an opinion but unfortunately I think it may be different to everyone elses!

This is a bit of an awkward situation as I think we have to balance several things:

Consistency between v8-R and AArch32 (ARM) and AArch64 (more consistent the better)
Consistency between clang and gcc (more consistent the better)
Historical behaviour of -march=armv8-r (how often is it used vs -mcpu)
Historical behaviour of -mcpu=cortex-r82 (keeping these stable as this is the most specific target)
Consistency with other -march and -mcpu with respect to feature enablement heuristics (helps transition between options).

As a general rule/heuristic for Arm feature enablement going forward, you will find exceptions to these rules in LLVM, particularly pre Armv8. we tend towards:

-march enables the mandatory features, but not the optional features, with specific exceptions such as SVE for Armv9 and floating point and SIMD for AArch64 Armv8
-mcpu enables the mandatory and optional features, with specific exceptions such as crypto

The Cortex-R series are intended for deeply embedded applications so it is normally possible for a user to directly target the CPU rather than the architecture. As there is only one v8-R CPU I'm fully expecting most users to use -mcpu=cortex-r82 rather than -march=armv8-r.

With these in mind. I propose:

-march=armv8-r enables the mandatory architectural features and including the mandatory minimum FPU given that all implementations of v8-R have that. It doesn't make sense to make it optional for puritys sake.
Make -march=armv8-r use a generic CPU so that we don't need to change cortex-r82. Yes this will make the performance of v8-R worse without -mtune, but I think it is preferable to keep -mcpu=cortex-r82 unaltered as that is what most people will be using.
Do not change -mcpu=cortex-r82 from its current default.

This is again an opinion based the weights that I put on the trade-off dimensions.

smithp35 · 2024-05-03T09:52:09Z

Apologies I read the comments, rather than go through the code. It looks the code has already made the v8-r imply a generic CPU.

So it looks like the code is doing what my proposal states so no objections from me and apologies for the noise.

chrisnc · 2024-05-03T14:01:16Z

@smithp35 thanks! Yes, I iterated on the specific direction here after the feedback about not changing the behavior of -mcpu=cortex-r52, but the first comment in the thread reflects the initial state, so I'll change that. I believe the cortex-r82 is AArch64-only, and so is not relevant here except w.r.t. consistency. I'm not familiar with how default CPUs are set in AArch64, but experimentally, --target=aarch64-none-elf -march=armv8-r does not generate the same code on a simple example as using --target=aarch64-none-elf -mcpu=cortex-r82 currently, so it does not appear to be a default there, unlike cortex-r52 without this change. Other than s/82/52/g, I confirm that what you've written is in line with what this change is doing.

smithp35

LGTM, thanks for the confirmation.

chrisnc · 2024-05-05T17:59:28Z

If there's no other feedback, could someone hit the merge button for me?

github-actions · 2024-05-07T10:48:47Z

@chrisnc Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested
by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as
the builds can include changes from many authors. It is not uncommon for your
change to be included in a build that fails due to someone else's changes, or
infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself.
This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features.

This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. Add a run-make test that target-cpu=cortex-r52 enables double-precision and neon.

This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture. Add a run-make test that target-cpu=cortex-r52 enables double-precision and neon.

This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture. Add a run-make test that an appropriate target-cpu enables double-precision and neon for thumbv8m.main-none-eabihf and armv8r-none-eabihf.

This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture. Add a run-make test that an appropriate target-cpu enables double-precision and neon for thumbv7em-none-eabihf and thumbv8m.main-none-eabihf.

Fix target-cpu fpu features on Armv8-R. This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture. Add a run-make test that target-cpu=cortex-r52 enables double-precision and neon. try-job: dist-aarch64-linux try-job: dist-various-1 try-job: dist-various-2 try-job: dist-x86_64-linux try-job: dist-x86_64-msvc

This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.

…ingjubilee Fix target-cpu fpu features on Armv8-R. This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.

Rollup merge of rust-lang#130295 - chrisnc:armv8r-feature-fix, r=workingjubilee Fix target-cpu fpu features on Armv8-R. This is a follow-up to rust-lang#123159, but applied to Armv8-R. This required llvm/llvm-project#88287 to work properly. Now that this change exists in rustc's llvm, we can fix Armv8-R's default fpu features. In Armv8-R's case, the default features from LLVM for floating-point are sufficient, because there is no integer-only variant of this architecture.

llvmbot added clang Clang issues not falling into any other category backend:ARM llvm:analysis labels Apr 10, 2024

chrisnc force-pushed the armv8r-no-fp-simd branch 3 times, most recently from 58f4d1a to f707f29 Compare April 14, 2024 07:33

chrisnc mentioned this pull request Apr 14, 2024

Fix target-cpu fpu features on Arm R/M-profile rust-lang/rust#123159

Merged

chrisnc force-pushed the armv8r-no-fp-simd branch from f707f29 to 9317b17 Compare April 16, 2024 13:35

ostannard approved these changes Apr 18, 2024

View reviewed changes

chrisnc force-pushed the armv8r-no-fp-simd branch 2 times, most recently from ccc3022 to a021b57 Compare April 19, 2024 04:35

chrisnc force-pushed the armv8r-no-fp-simd branch from a021b57 to 4681c30 Compare April 19, 2024 04:39

chrisnc force-pushed the armv8r-no-fp-simd branch 2 times, most recently from 65e44d6 to a4a27e9 Compare April 20, 2024 21:19

llvmbot added the clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' label Apr 20, 2024

davemgreen requested a review from smithp35 April 21, 2024 19:12

chrisnc force-pushed the armv8r-no-fp-simd branch from a4a27e9 to 575128c Compare April 22, 2024 00:29

chrisnc requested a review from ostannard April 22, 2024 00:38

chrisnc force-pushed the armv8r-no-fp-simd branch from 575128c to 46803a6 Compare April 22, 2024 00:45

smithp35 approved these changes May 3, 2024

View reviewed changes

ostannard merged commit 651bdb9 into llvm:main May 7, 2024
5 checks passed

chrisnc deleted the armv8r-no-fp-simd branch May 7, 2024 15:41

chrisnc mentioned this pull request Sep 13, 2024

Fix target-cpu fpu features on Armv8-R. rust-lang/rust#130295

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ARM] Armv8-R does not require fp64 or neon. #88287

[ARM] Armv8-R does not require fp64 or neon. #88287

chrisnc commented Apr 10, 2024 •

edited

Loading

github-actions bot commented Apr 10, 2024

llvmbot commented Apr 10, 2024 •

edited

Loading

llvmbot commented Apr 10, 2024

chrisnc commented Apr 18, 2024

ostannard left a comment

davemgreen commented Apr 18, 2024

chrisnc commented Apr 18, 2024 •

edited

Loading

github-actions bot commented Apr 19, 2024 •

edited

Loading

chrisnc commented Apr 19, 2024 •

edited

Loading

chrisnc commented Apr 21, 2024

davemgreen commented Apr 21, 2024

chrisnc commented Apr 22, 2024 •

edited

Loading

davemgreen commented Apr 24, 2024

chrisnc commented Apr 24, 2024 •

edited

Loading

chrisnc commented May 2, 2024

davemgreen commented May 2, 2024

chrisnc commented May 2, 2024

smithp35 commented May 3, 2024

smithp35 commented May 3, 2024

chrisnc commented May 3, 2024 •

edited

Loading

smithp35 left a comment

chrisnc commented May 5, 2024

github-actions bot commented May 7, 2024

[ARM] Armv8-R does not require fp64 or neon. #88287

[ARM] Armv8-R does not require fp64 or neon. #88287

Conversation

chrisnc commented Apr 10, 2024 • edited Loading

github-actions bot commented Apr 10, 2024

llvmbot commented Apr 10, 2024 • edited Loading

llvmbot commented Apr 10, 2024

chrisnc commented Apr 18, 2024

ostannard left a comment

Choose a reason for hiding this comment

davemgreen commented Apr 18, 2024

chrisnc commented Apr 18, 2024 • edited Loading

github-actions bot commented Apr 19, 2024 • edited Loading

chrisnc commented Apr 19, 2024 • edited Loading

chrisnc commented Apr 21, 2024

davemgreen commented Apr 21, 2024

chrisnc commented Apr 22, 2024 • edited Loading

davemgreen commented Apr 24, 2024

chrisnc commented Apr 24, 2024 • edited Loading

chrisnc commented May 2, 2024

davemgreen commented May 2, 2024

chrisnc commented May 2, 2024

smithp35 commented May 3, 2024

smithp35 commented May 3, 2024

chrisnc commented May 3, 2024 • edited Loading

smithp35 left a comment

Choose a reason for hiding this comment

chrisnc commented May 5, 2024

github-actions bot commented May 7, 2024

chrisnc commented Apr 10, 2024 •

edited

Loading

llvmbot commented Apr 10, 2024 •

edited

Loading

chrisnc commented Apr 18, 2024 •

edited

Loading

github-actions bot commented Apr 19, 2024 •

edited

Loading

chrisnc commented Apr 19, 2024 •

edited

Loading

chrisnc commented Apr 22, 2024 •

edited

Loading

chrisnc commented Apr 24, 2024 •

edited

Loading

chrisnc commented May 3, 2024 •

edited

Loading