[RISCV] Add canonical ISA string as Module metadata in IR. #80760
Conversation
In an LTO build, we don't set the ELF attributes to indicate what extensions were compiled with. The target CPU/Attrs in RISCVTargetMachine do not get set for an LTO build. Each function gets a target-cpu/feature attribute, but this isn't usable to set ELF attributes since we wouldn't know which function to use. We can't just pick one since it might have been compiled with an attribute like target_version. This patch adds the ISA as Module metadata so we can retrieve it in the backend. Individual translation units can still be compiled with different strings, so we need to collect the unique set when Modules are merged. The backend will need to combine the unique ISA strings to produce a single value for the ELF attributes. This will be done in a separate patch.
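For illustration, the module flag this patch emits could look like the following in textual IR. The ISA string shown here is hypothetical (the real one is whatever RISCVISAInfo produces for the TU's features), and `i32 6` assumes LLVM's AppendUnique module-flag behavior value:

```llvm
; Sketch of a module compiled with some rv64 -march; string is illustrative.
!llvm.module.flags = !{!0}

; AppendUnique flag: the value is an MDNode whose operands are the
; unique ISA strings collected across merged modules.
!0 = !{i32 6, !"riscv-arch", !1}
!1 = !{!"rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0"}
```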
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-codegen Author: Craig Topper (topperc). Patch is 49.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80760.diff 3 Files Affected:
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 36b63d78b06f83..25b8d5cae36d3a 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -67,6 +67,7 @@
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/ConvertUTF.h"
#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/RISCVISAInfo.h"
#include "llvm/Support/TimeProfiler.h"
#include "llvm/Support/xxhash.h"
#include "llvm/TargetParser/Triple.h"
@@ -1056,6 +1057,16 @@ void CodeGenModule::Release() {
llvm::LLVMContext &Ctx = TheModule.getContext();
getModule().addModuleFlag(llvm::Module::Error, "target-abi",
llvm::MDString::get(Ctx, ABIStr));
+
+ const std::vector<std::string> &Features =
+ getTarget().getTargetOpts().Features;
+ auto ParseResult =
+ llvm::RISCVISAInfo::parseFeatures(T.isRISCV64() ? 64 : 32, Features);
+ if (!errorToBool(ParseResult.takeError()))
+ getModule().addModuleFlag(
+ llvm::Module::AppendUnique, "riscv-arch",
+ llvm::MDNode::get(
+ Ctx, llvm::MDString::get(Ctx, (*ParseResult)->toString())));
}
if (CodeGenOpts.SanitizeCfiCrossDso) {
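When two translation units compiled with different -march strings are LTO-linked, the AppendUnique behavior keeps the union of the unique values. Schematically, with hypothetical ISA strings:

```llvm
; Module A carries !{!"rv64i2p1_m2p0"}; module B carries
; !{!"rv64i2p1_m2p0_c2p0"}. After linking, the merged module's
; "riscv-arch" flag holds both unique strings:
!llvm.module.flags = !{!0}
!0 = !{i32 6, !"riscv-arch", !1}
!1 = !{!"rv64i2p1_m2p0", !"rv64i2p1_m2p0_c2p0"}
```

The backend pass promised in the description would then fold these operands into a single ISA string for the ELF attributes.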
diff --git a/clang/test/CodeGen/RISCV/ntlh-intrinsics/riscv32-zihintntl.c b/clang/test/CodeGen/RISCV/ntlh-intrinsics/riscv32-zihintntl.c
index 897edbc6450af6..b11c2ca010e7ce 100644
--- a/clang/test/CodeGen/RISCV/ntlh-intrinsics/riscv32-zihintntl.c
+++ b/clang/test/CodeGen/RISCV/ntlh-intrinsics/riscv32-zihintntl.c
@@ -28,190 +28,190 @@ vint8m1_t *scvc1, *scvc2;
// clang-format off
void ntl_all_sizes() { // CHECK-LABEL: ntl_all_sizes
- uc = __riscv_ntl_load(&sc, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !5
- sc = __riscv_ntl_load(&uc, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !5
- us = __riscv_ntl_load(&ss, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !5
- ss = __riscv_ntl_load(&us, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !5
- ui = __riscv_ntl_load(&si, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !5
- si = __riscv_ntl_load(&ui, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !5
- ull = __riscv_ntl_load(&sll, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !5
- sll = __riscv_ntl_load(&ull, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !5
- h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load half{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !5
- f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load float{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !5
- d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load double{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !5
- v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !5
- v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !5
- v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !5
- *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !5
- *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !5
- *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !5
+ uc = __riscv_ntl_load(&sc, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !7
+ sc = __riscv_ntl_load(&uc, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !7
+ us = __riscv_ntl_load(&ss, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !7
+ ss = __riscv_ntl_load(&us, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !7
+ ui = __riscv_ntl_load(&si, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !7
+ si = __riscv_ntl_load(&ui, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !7
+ ull = __riscv_ntl_load(&sll, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !7
+ sll = __riscv_ntl_load(&ull, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !7
+ h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load half{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !7
+ f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load float{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !7
+ d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load double{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !7
+ v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !7
+ v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !7
+ v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !7
+ *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !7
+ *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !7
+ *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_INNERMOST_PRIVATE); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !7
- uc = __riscv_ntl_load(&sc, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !6
- sc = __riscv_ntl_load(&uc, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !6
- us = __riscv_ntl_load(&ss, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !6
- ss = __riscv_ntl_load(&us, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !6
- ui = __riscv_ntl_load(&si, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !6
- si = __riscv_ntl_load(&ui, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !6
- ull = __riscv_ntl_load(&sll, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !6
- sll = __riscv_ntl_load(&ull, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !6
- h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load half{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !6
- f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load float{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !6
- d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load double{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !6
- v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !6
- v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !6
- v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !6
- *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !6
- *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !6
- *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !6
+ uc = __riscv_ntl_load(&sc, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !8
+ sc = __riscv_ntl_load(&uc, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !8
+ us = __riscv_ntl_load(&ss, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !8
+ ss = __riscv_ntl_load(&us, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !8
+ ui = __riscv_ntl_load(&si, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !8
+ si = __riscv_ntl_load(&ui, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !8
+ ull = __riscv_ntl_load(&sll, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
+ sll = __riscv_ntl_load(&ull, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
+ h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load half{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !8
+ f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load float{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !8
+ d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load double{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
+ v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !8
+ v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !8
+ v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !8
+ *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
+ *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
+ *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
- uc = __riscv_ntl_load(&sc, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !7
- sc = __riscv_ntl_load(&uc, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !7
- us = __riscv_ntl_load(&ss, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !7
- ss = __riscv_ntl_load(&us, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !7
- ui = __riscv_ntl_load(&si, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !7
- si = __riscv_ntl_load(&ui, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !7
- ull = __riscv_ntl_load(&sll, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- sll = __riscv_ntl_load(&ull, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load half{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !7
- f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load float{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !7
- d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load double{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !7
- v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !7
- v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !7
- *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
+ uc = __riscv_ntl_load(&sc, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !9
+ sc = __riscv_ntl_load(&uc, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !9
+ us = __riscv_ntl_load(&ss, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !9
+ ss = __riscv_ntl_load(&us, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !9
+ ui = __riscv_ntl_load(&si, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !9
+ si = __riscv_ntl_load(&ui, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !9
+ ull = __riscv_ntl_load(&sll, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ sll = __riscv_ntl_load(&ull, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load half{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !9
+ f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load float{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !9
+ d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load double{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !9
+ v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !9
+ v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !9
+ *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
- uc = __riscv_ntl_load(&sc, __RISCV_NTLH_ALL); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !8
- sc = __riscv_ntl_load(&uc, __RISCV_NTLH_ALL); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !8
- us = __riscv_ntl_load(&ss, __RISCV_NTLH_ALL); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !8
- ss = __riscv_ntl_load(&us, __RISCV_NTLH_ALL); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !8
- ui = __riscv_ntl_load(&si, __RISCV_NTLH_ALL); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !8
- si = __riscv_ntl_load(&ui, __RISCV_NTLH_ALL); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !8
- ull = __riscv_ntl_load(&sll, __RISCV_NTLH_ALL); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- sll = __riscv_ntl_load(&ull, __RISCV_NTLH_ALL); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_ALL); // CHECK: load half{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !8
- f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_ALL); // CHECK: load float{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !8
- d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_ALL); // CHECK: load double{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_ALL); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !8
- v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_ALL); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !8
- v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_ALL); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !8
- *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_ALL); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_ALL); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_ALL); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
+ uc = __riscv_ntl_load(&sc, __RISCV_NTLH_ALL); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !10
+ sc = __riscv_ntl_load(&uc, __RISCV_NTLH_ALL); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !10
+ us = __riscv_ntl_load(&ss, __RISCV_NTLH_ALL); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !10
+ ss = __riscv_ntl_load(&us, __RISCV_NTLH_ALL); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !10
+ ui = __riscv_ntl_load(&si, __RISCV_NTLH_ALL); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !10
+ si = __riscv_ntl_load(&ui, __RISCV_NTLH_ALL); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !10
+ ull = __riscv_ntl_load(&sll, __RISCV_NTLH_ALL); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !10
+ sll = __riscv_ntl_load(&ull, __RISCV_NTLH_ALL); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !10
+ h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_ALL); // CHECK: load half{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !10
+ f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_ALL); // CHECK: load float{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !10
+ d1 = __riscv_ntl_load(&d2, __RISCV_NTL...
[truncated]
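The check-line churn in this test is mechanical: emitting the "riscv-arch" module flag adds two metadata nodes ahead of the ones the test matches, so every reference shifts by two. A schematic of the renumbering (node numbers and contents are illustrative, not the exact test output):

```llvm
; New nodes introduced by the patch:
!4 = !{i32 6, !"riscv-arch", !5}     ; AppendUnique module flag
!5 = !{!"rv32i2p1_zihintntl1p0"}     ; hypothetical ISA string
; The node previously matched as !4 (!nontemporal) becomes !6, and the
; four riscv-nontemporal-domain nodes move from !5-!8 to !7-!10.
```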
@llvm/pr-subscribers-backend-risc-v Author: Craig Topper (topperc)
+ *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
+ *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
+ *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_ALL_PRIVATE); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !8
- uc = __riscv_ntl_load(&sc, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !7
- sc = __riscv_ntl_load(&uc, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !7
- us = __riscv_ntl_load(&ss, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !7
- ss = __riscv_ntl_load(&us, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !7
- ui = __riscv_ntl_load(&si, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !7
- si = __riscv_ntl_load(&ui, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !7
- ull = __riscv_ntl_load(&sll, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- sll = __riscv_ntl_load(&ull, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load half{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !7
- f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load float{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !7
- d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load double{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !7
- v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !7
- v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !7
- *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
- *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !7
+ uc = __riscv_ntl_load(&sc, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !9
+ sc = __riscv_ntl_load(&uc, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !9
+ us = __riscv_ntl_load(&ss, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !9
+ ss = __riscv_ntl_load(&us, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !9
+ ui = __riscv_ntl_load(&si, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !9
+ si = __riscv_ntl_load(&ui, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !9
+ ull = __riscv_ntl_load(&sll, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ sll = __riscv_ntl_load(&ull, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load half{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !9
+ f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load float{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !9
+ d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load double{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !9
+ v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !9
+ v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !6, !riscv-nontemporal-domain !9
+ *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
+ *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_INNERMOST_SHARED); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !9
- uc = __riscv_ntl_load(&sc, __RISCV_NTLH_ALL); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !8
- sc = __riscv_ntl_load(&uc, __RISCV_NTLH_ALL); // CHECK: load i8{{.*}}align 1, !nontemporal !4, !riscv-nontemporal-domain !8
- us = __riscv_ntl_load(&ss, __RISCV_NTLH_ALL); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !8
- ss = __riscv_ntl_load(&us, __RISCV_NTLH_ALL); // CHECK: load i16{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !8
- ui = __riscv_ntl_load(&si, __RISCV_NTLH_ALL); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !8
- si = __riscv_ntl_load(&ui, __RISCV_NTLH_ALL); // CHECK: load i32{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !8
- ull = __riscv_ntl_load(&sll, __RISCV_NTLH_ALL); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- sll = __riscv_ntl_load(&ull, __RISCV_NTLH_ALL); // CHECK: load i64{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_ALL); // CHECK: load half{{.*}}align 2, !nontemporal !4, !riscv-nontemporal-domain !8
- f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_ALL); // CHECK: load float{{.*}}align 4, !nontemporal !4, !riscv-nontemporal-domain !8
- d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_ALL); // CHECK: load double{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- v4si1 = __riscv_ntl_load(&v4si2, __RISCV_NTLH_ALL); // CHECK: load <4 x i32>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !8
- v8ss1 = __riscv_ntl_load(&v8ss2, __RISCV_NTLH_ALL); // CHECK: load <8 x i16>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !8
- v16sc1 = __riscv_ntl_load(&v16sc2, __RISCV_NTLH_ALL); // CHECK: load <16 x i8>{{.*}}align 16, !nontemporal !4, !riscv-nontemporal-domain !8
- *scvi1 = __riscv_ntl_load(scvi2, __RISCV_NTLH_ALL); // CHECK: load <vscale x 2 x i32>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- *scvs1 = __riscv_ntl_load(scvs2, __RISCV_NTLH_ALL); // CHECK: load <vscale x 4 x i16>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
- *scvc1 = __riscv_ntl_load(scvc2, __RISCV_NTLH_ALL); // CHECK: load <vscale x 8 x i8>{{.*}}align 8, !nontemporal !4, !riscv-nontemporal-domain !8
+ uc = __riscv_ntl_load(&sc, __RISCV_NTLH_ALL); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !10
+ sc = __riscv_ntl_load(&uc, __RISCV_NTLH_ALL); // CHECK: load i8{{.*}}align 1, !nontemporal !6, !riscv-nontemporal-domain !10
+ us = __riscv_ntl_load(&ss, __RISCV_NTLH_ALL); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !10
+ ss = __riscv_ntl_load(&us, __RISCV_NTLH_ALL); // CHECK: load i16{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !10
+ ui = __riscv_ntl_load(&si, __RISCV_NTLH_ALL); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !10
+ si = __riscv_ntl_load(&ui, __RISCV_NTLH_ALL); // CHECK: load i32{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !10
+ ull = __riscv_ntl_load(&sll, __RISCV_NTLH_ALL); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !10
+ sll = __riscv_ntl_load(&ull, __RISCV_NTLH_ALL); // CHECK: load i64{{.*}}align 8, !nontemporal !6, !riscv-nontemporal-domain !10
+ h1 = __riscv_ntl_load(&h2, __RISCV_NTLH_ALL); // CHECK: load half{{.*}}align 2, !nontemporal !6, !riscv-nontemporal-domain !10
+ f1 = __riscv_ntl_load(&f2, __RISCV_NTLH_ALL); // CHECK: load float{{.*}}align 4, !nontemporal !6, !riscv-nontemporal-domain !10
+ d1 = __riscv_ntl_load(&d2, __RISCV_NTLH_AL...
[truncated]
Thank you for this. I expect this to be a big help w/ the various target features bugs we've seen in LTO builds.
I'm surprised only 1 test file needed the metadata updated as a result of this change.
Maybe other tests are better about using regexes to hide the numbers? |
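One way tests avoid this kind of churn is FileCheck's pattern-variable syntax, which captures whatever metadata node number the compiler assigns instead of hardcoding it. A hedged sketch in the style of the test above (the variable names `NT` and `DOM` are illustrative, not from the patch):

```c
// Capture the node numbers on first use, then reuse the captured values:
// CHECK: load i8{{.*}}align 1, !nontemporal ![[NT:[0-9]+]], !riscv-nontemporal-domain ![[DOM:[0-9]+]]
uc = __riscv_ntl_load(&sc, __RISCV_NTLH_ALL_PRIVATE);
// CHECK: load i16{{.*}}align 2, !nontemporal ![[NT]], !riscv-nontemporal-domain ![[DOM]]
us = __riscv_ntl_load(&ss, __RISCV_NTLH_ALL_PRIVATE);
```

Tests written this way only need updating when the metadata itself changes, not when unrelated additions shift the numbering.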
My thoughts on this in the past have been:
RISC-V is the most sensitive to this, but it's just a symptom of not having the general mechanism in place, and I'm not a huge fan of adding ad-hoc stuff specific to RISC-V rather than addressing the underlying problem. Other targets just care less upstream today so get away with this lack of information in practice, but that won't always be true (e.g. I believe we need this information on Morello (CHERI+AArch64 prototype) downstream, not just for CHERI-RISC-V). |
I agree that finding a generic cross-target solution to this problem is attractive...but I worry it will be hard and slow to proceed. It's possible the previous discussions just lacked someone following up with concrete patches, but I do worry we'd end up stuck in limbo vs starting with a target-specific approach that we later hope to replace with something more generic. With that in mind:
Two simple and early pieces of feedback on the patch:
We probably need it to set the ELF flags too. I just noticed we only set EF_RISCV_RVC based on C, and not Zca. Is that a bug? |
I think so; binutils sets it for both. |
(R_RISCV_ALIGN can probably fail if you don't, I don't think it's just a missing optimisation) |
Does the name "riscv-arch" give the impression that it might be the "march" string? |
I agree with @asb's framing above. Assuming this doesn't commit us to something which is hard to forward version for some reason, I support addressing this in a target specific manner for the moment. |
That's fair. I just imagine this is going to result in people forgetting about the underlying issue and adding their own patchwork fixes to work around it in their targets, so it would be a nice motivation to fix it properly. |
I share the concern about addressing the issue in a target-specific manner. However, I know a lot of users are confused by the current behavior, and this patch will significantly improve the status quo. I concur with asb's analysis. |
In an LTO build, we don't set the ELF attributes to indicate which extensions were compiled with. The target CPU/Attrs in RISCVTargetMachine do not get set for an LTO build. Each function gets a target-cpu/feature attribute, but this isn't usable to set ELF attributes since we wouldn't know which function to use. We also can't just pick one, since it might have been compiled with an attribute like target_version.
This patch adds the ISA string as Module metadata so we can retrieve it in the backend. Individual translation units can still be compiled with different strings, so we need to collect the unique set when Modules are merged.
The backend will need to combine the unique ISA strings to produce a single value for the ELF attributes. This will be done in a separate patch.
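As a sketch of the shape this could take, after linking two translation units the merged module might carry one operand per unique ISA string under a named metadata node (the metadata name and the ISA strings below are illustrative assumptions, not taken from the patch):

```llvm
; Hypothetical merged-module metadata: one operand per unique ISA string
; collected from the linked translation units.
!riscv.isa = !{!0, !1}
!0 = !{!"rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0"}
!1 = !{!"rv64i2p1_m2p0_a2p1_f2p2_d2p2_v1p0_c2p0"}
```

The backend would then walk this list and compute a single combined string when emitting the ELF attributes.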