Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm][IR] Add per-global code model attribute #72077

Merged
merged 4 commits into from
Dec 5, 2023

Conversation

heiher
Copy link
Member

@heiher heiher commented Nov 13, 2023

This adds a per-global code model attribute, which can override the target's code model to access global variables.

Link: https://discourse.llvm.org/t/how-to-best-implement-code-model-overriding-for-certain-values/71816
Link: https://discourse.llvm.org/t/rfc-add-per-global-code-model-attribute/74944

Suggested-by: Arthur Eubanks aeubanks@google.com

@heiher heiher requested a review from aeubanks November 13, 2023 02:48
@llvmbot
Copy link
Member

llvmbot commented Nov 13, 2023

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-ir

Author: hev (heiher)

Changes

This adds a per-global code model attribute, which can override the target's code model to access global variables.

Link: https://discourse.llvm.org/t/how-to-best-implement-code-model-overriding-for-certain-values/71816

Suggested-by: Arthur Eubanks <aeubanks@google.com>


Full diff: https://github.com/llvm/llvm-project/pull/72077.diff

13 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+1)
  • (modified) llvm/include/llvm/AsmParser/LLParser.h (+1)
  • (modified) llvm/include/llvm/AsmParser/LLToken.h (+1)
  • (modified) llvm/include/llvm/IR/GlobalObject.h (+21)
  • (modified) llvm/lib/AsmParser/LLLexer.cpp (+1)
  • (modified) llvm/lib/AsmParser/LLParser.cpp (+29)
  • (modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+16)
  • (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+16-2)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+21)
  • (modified) llvm/lib/IR/Globals.cpp (+12)
  • (modified) llvm/lib/IR/LLVMContextImpl.h (+3)
  • (added) llvm/test/Bitcode/global-variables-code-model.ll (+17)
  • (added) llvm/test/Bitcode/global-variables-code-model.ll.bc ()
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index ff4f769dcc0dbac..c3d708532b5b0e7 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -757,6 +757,7 @@ Syntax::
                          <global | constant> <Type> [<InitializerConstant>]
                          [, section "name"] [, partition "name"]
                          [, comdat [($name)]] [, align <Alignment>]
+                         [, code_model "model"]
                          [, no_sanitize_address] [, no_sanitize_hwaddress]
                          [, sanitize_address_dyninit] [, sanitize_memtag]
                          (, !name !N)*
diff --git a/llvm/include/llvm/AsmParser/LLParser.h b/llvm/include/llvm/AsmParser/LLParser.h
index eca908a24aac7b2..9e5968c5917f73f 100644
--- a/llvm/include/llvm/AsmParser/LLParser.h
+++ b/llvm/include/llvm/AsmParser/LLParser.h
@@ -290,6 +290,7 @@ namespace llvm {
     bool parseOptionalCallingConv(unsigned &CC);
     bool parseOptionalAlignment(MaybeAlign &Alignment,
                                 bool AllowParens = false);
+    bool parseOptionalCodeModel(CodeModel::Model &model);
     bool parseOptionalDerefAttrBytes(lltok::Kind AttrKind, uint64_t &Bytes);
     bool parseOptionalUWTableKind(UWTableKind &Kind);
     bool parseAllocKind(AllocFnKind &Kind);
diff --git a/llvm/include/llvm/AsmParser/LLToken.h b/llvm/include/llvm/AsmParser/LLToken.h
index c9dcd29b31955dc..56a8db2d68df238 100644
--- a/llvm/include/llvm/AsmParser/LLToken.h
+++ b/llvm/include/llvm/AsmParser/LLToken.h
@@ -115,6 +115,7 @@ enum Kind {
   kw_addrspace,
   kw_section,
   kw_partition,
+  kw_code_model,
   kw_alias,
   kw_ifunc,
   kw_module,
diff --git a/llvm/include/llvm/IR/GlobalObject.h b/llvm/include/llvm/IR/GlobalObject.h
index 889bd3a28e12b37..08af50191a05b1a 100644
--- a/llvm/include/llvm/IR/GlobalObject.h
+++ b/llvm/include/llvm/IR/GlobalObject.h
@@ -17,6 +17,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/GlobalValue.h"
 #include "llvm/IR/Value.h"
+#include "llvm/Support/CodeGen.h"
 #include "llvm/Support/Alignment.h"
 
 namespace llvm {
@@ -52,6 +53,7 @@ class GlobalObject : public GlobalValue {
   enum {
     LastAlignmentBit = 5,
     HasSectionHashEntryBit,
+    HasCodeModelHashEntryBit,
 
     GlobalObjectBits,
   };
@@ -124,6 +126,24 @@ class GlobalObject : public GlobalValue {
   /// appropriate default object file section.
   void setSection(StringRef S);
 
+  /// Check if this global has a custom code model.
+  ///
+  bool hasCodeModel() const {
+    return getGlobalValueSubClassData() & (1 << HasCodeModelHashEntryBit);
+  }
+
+  /// Get the custom code model of this global if it has one.
+  ///
+  /// If this global does not have a custom code model, the default small code
+  /// model will be used.
+  CodeModel::Model getCodeModel() const {
+    return hasCodeModel() ? getCodeModelImpl() : CodeModel::Small;
+  }
+
+  /// Change the code model for this global.
+  ///
+  void setCodeModel(CodeModel::Model M);
+
   bool hasComdat() const { return getComdat() != nullptr; }
   const Comdat *getComdat() const { return ObjComdat; }
   Comdat *getComdat() { return ObjComdat; }
@@ -170,6 +190,7 @@ class GlobalObject : public GlobalValue {
   }
 
   StringRef getSectionImpl() const;
+  CodeModel::Model getCodeModelImpl() const;
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/AsmParser/LLLexer.cpp b/llvm/lib/AsmParser/LLLexer.cpp
index da9e9f4a3c9833b..3b96f0e4fe866bd 100644
--- a/llvm/lib/AsmParser/LLLexer.cpp
+++ b/llvm/lib/AsmParser/LLLexer.cpp
@@ -570,6 +570,7 @@ lltok::Kind LLLexer::LexIdentifier() {
   KEYWORD(addrspace);
   KEYWORD(section);
   KEYWORD(partition);
+  KEYWORD(code_model);
   KEYWORD(alias);
   KEYWORD(ifunc);
   KEYWORD(module);
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 9940bfb15d1979e..784f2e71c726a67 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -1286,6 +1286,11 @@ bool LLParser::parseGlobal(const std::string &Name, LocTy NameLoc,
         return true;
       if (Alignment)
         GV->setAlignment(*Alignment);
+    } else if (Lex.getKind() == lltok::kw_code_model) {
+      CodeModel::Model CodeModel;
+      if (parseOptionalCodeModel(CodeModel))
+        return true;
+      GV->setCodeModel(CodeModel);
     } else if (Lex.getKind() == lltok::MetadataVar) {
       if (parseGlobalObjectMetadataAttachment(*GV))
         return true;
@@ -2166,6 +2171,30 @@ bool LLParser::parseOptionalAlignment(MaybeAlign &Alignment, bool AllowParens) {
   return false;
 }
 
+/// parseOptionalCodeModel
+///   ::= /* empty */
+///   ::= 'code_model' "large"
+bool LLParser::parseOptionalCodeModel(CodeModel::Model &model) {
+  Lex.Lex();
+  auto StrVal = Lex.getStrVal();
+  auto ErrMsg = "expected global code model string";
+  if (StrVal == "tiny")
+    model = CodeModel::Tiny;
+  else if (StrVal == "small")
+    model = CodeModel::Small;
+  else if (StrVal == "kernel")
+    model = CodeModel::Kernel;
+  else if (StrVal == "medium")
+    model = CodeModel::Medium;
+  else if (StrVal == "large")
+    model = CodeModel::Large;
+  else
+    return tokError(ErrMsg);
+  if (parseToken(lltok::StringConstant, ErrMsg))
+    return true;
+  return false;
+}
+
 /// parseOptionalDerefAttrBytes
 ///   ::= /* empty */
 ///   ::= AttrKind '(' 4 ')'
diff --git a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
index 76431e883b8d96d..740692894e39f01 100644
--- a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -1144,6 +1144,17 @@ static bool getDecodedDSOLocal(unsigned Val) {
   }
 }
 
+static CodeModel::Model getDecodedCodeModel(unsigned Val) {
+  switch(Val) {
+  case 1: return CodeModel::Tiny;
+  default: // Map unknown values to small.
+  case 2: return CodeModel::Small;
+  case 3: return CodeModel::Kernel;
+  case 4: return CodeModel::Medium;
+  case 5: return CodeModel::Large;
+  }
+}
+
 static GlobalVariable::ThreadLocalMode getDecodedThreadLocalMode(unsigned Val) {
   switch (Val) {
     case 0: return GlobalVariable::NotThreadLocal;
@@ -3809,6 +3820,7 @@ Error BitcodeReader::parseGlobalVarRecord(ArrayRef<uint64_t> Record) {
   // dllstorageclass, comdat, attributes, preemption specifier,
   // partition strtab offset, partition strtab size] (name in VST)
   // v2: [strtab_offset, strtab_size, v1]
+  // v3: [v2, code_model]
   StringRef Name;
   std::tie(Name, Record) = readNameFromStrtab(Record);
 
@@ -3917,6 +3929,10 @@ Error BitcodeReader::parseGlobalVarRecord(ArrayRef<uint64_t> Record) {
     NewGV->setSanitizerMetadata(Meta);
   }
 
+  if (Record.size() > 17 && Record[17]) {
+    NewGV->setCodeModel(getDecodedCodeModel(Record[17]));
+  }
+
   return Error::success();
 }
 
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index d16b5c7781c2413..44f4b9e2dcf6aaf 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -1212,6 +1212,19 @@ static unsigned getEncodedUnnamedAddr(const GlobalValue &GV) {
   llvm_unreachable("Invalid unnamed_addr");
 }
 
+static unsigned getEncodedCodeModel(const GlobalVariable &GV) {
+  if (!GV.hasCodeModel())
+    return 0;
+  switch (GV.getCodeModel()) {
+  case CodeModel::Tiny:   return 1;
+  case CodeModel::Small:  return 2;
+  case CodeModel::Kernel: return 3;
+  case CodeModel::Medium: return 4;
+  case CodeModel::Large:  return 5;
+  }
+  llvm_unreachable("Invalid code model");
+}
+
 size_t ModuleBitcodeWriter::addToStrtab(StringRef Str) {
   if (GenerateHash)
     Hasher.update(Str);
@@ -1404,7 +1417,7 @@ void ModuleBitcodeWriter::writeModuleInfo() {
     // GLOBALVAR: [strtab offset, strtab size, type, isconst, initid,
     //             linkage, alignment, section, visibility, threadlocal,
     //             unnamed_addr, externally_initialized, dllstorageclass,
-    //             comdat, attributes, DSO_Local, GlobalSanitizer]
+    //             comdat, attributes, DSO_Local, GlobalSanitizer, code_model]
     Vals.push_back(addToStrtab(GV.getName()));
     Vals.push_back(GV.getName().size());
     Vals.push_back(VE.getTypeID(GV.getValueType()));
@@ -1421,7 +1434,7 @@ void ModuleBitcodeWriter::writeModuleInfo() {
         GV.isExternallyInitialized() ||
         GV.getDLLStorageClass() != GlobalValue::DefaultStorageClass ||
         GV.hasComdat() || GV.hasAttributes() || GV.isDSOLocal() ||
-        GV.hasPartition() || GV.hasSanitizerMetadata()) {
+        GV.hasPartition() || GV.hasSanitizerMetadata() || GV.hasCodeModel()) {
       Vals.push_back(getEncodedVisibility(GV));
       Vals.push_back(getEncodedThreadLocalMode(GV));
       Vals.push_back(getEncodedUnnamedAddr(GV));
@@ -1439,6 +1452,7 @@ void ModuleBitcodeWriter::writeModuleInfo() {
       Vals.push_back((GV.hasSanitizerMetadata() ? serializeSanitizerMetadata(
                                                       GV.getSanitizerMetadata())
                                                 : 0));
+      Vals.push_back(getEncodedCodeModel(GV));
     } else {
       AbbrevToUse = SimpleGVarAbbrev;
     }
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index 6d66b34423949fb..afc4ba1f2392b21 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -3647,6 +3647,27 @@ void AssemblyWriter::printGlobal(const GlobalVariable *GV) {
     printEscapedString(GV->getPartition(), Out);
     Out << '"';
   }
+  if (GV->hasCodeModel()) {
+    Out << ", code_model \"";
+    switch (GV->getCodeModel()) {
+    case CodeModel::Tiny:
+      printEscapedString("tiny", Out);
+      break;
+    case CodeModel::Small:
+      printEscapedString("small", Out);
+      break;
+    case CodeModel::Kernel:
+      printEscapedString("kernel", Out);
+      break;
+    case CodeModel::Medium:
+      printEscapedString("medium", Out);
+      break;
+    case CodeModel::Large:
+      printEscapedString("large", Out);
+      break;
+    }
+    Out << '"';
+  }
 
   using SanitizerMetadata = llvm::GlobalValue::SanitizerMetadata;
   if (GV->hasSanitizerMetadata()) {
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 7bd4503a689e4ae..3fc338fc767678d 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -139,6 +139,8 @@ void GlobalObject::copyAttributesFrom(const GlobalObject *Src) {
   GlobalValue::copyAttributesFrom(Src);
   setAlignment(Src->getAlign());
   setSection(Src->getSection());
+  if (Src->hasCodeModel())
+    setCodeModel(Src->getCodeModel());
 }
 
 std::string GlobalValue::getGlobalIdentifier(StringRef Name,
@@ -263,6 +265,16 @@ void GlobalObject::setSection(StringRef S) {
   setGlobalObjectFlag(HasSectionHashEntryBit, !S.empty());
 }
 
+void GlobalObject::setCodeModel(CodeModel::Model M) {
+  getContext().pImpl->GlobalObjectCodeModels[this] = M;
+  setGlobalObjectFlag(HasCodeModelHashEntryBit, true);
+}
+
+CodeModel::Model GlobalObject::getCodeModelImpl() const {
+  assert(hasCodeModel());
+  return getContext().pImpl->GlobalObjectCodeModels[this];
+}
+
 bool GlobalValue::isNobuiltinFnDef() const {
   const Function *F = dyn_cast<Function>(this);
   if (!F || F->empty())
diff --git a/llvm/lib/IR/LLVMContextImpl.h b/llvm/lib/IR/LLVMContextImpl.h
index ebc444fcb6896e9..44602ea10142e98 100644
--- a/llvm/lib/IR/LLVMContextImpl.h
+++ b/llvm/lib/IR/LLVMContextImpl.h
@@ -1570,6 +1570,9 @@ class LLVMContextImpl {
   /// Collection of per-GlobalObject sections used in this context.
   DenseMap<const GlobalObject *, StringRef> GlobalObjectSections;
 
+  /// Collection of per-GlobalObject code models used in this context.
+  DenseMap<const GlobalObject *, CodeModel::Model> GlobalObjectCodeModels;
+
   /// Collection of per-GlobalValue partitions used in this context.
   DenseMap<const GlobalValue *, StringRef> GlobalValuePartitions;
 
diff --git a/llvm/test/Bitcode/global-variables-code-model.ll b/llvm/test/Bitcode/global-variables-code-model.ll
new file mode 100644
index 000000000000000..f8cae4475028510
--- /dev/null
+++ b/llvm/test/Bitcode/global-variables-code-model.ll
@@ -0,0 +1,17 @@
+; RUN:  llvm-dis < %s.bc| FileCheck %s
+; RUN:  verify-uselistorder < %s.bc
+
+@tiny.var = global i32 1, code_model "tiny"
+; CHECK: @tiny.var = global i32 1, code_model "tiny"
+
+@small.var = global i32 1, code_model "small"
+; CHECK: @small.var = global i32 1, code_model "small"
+
+@kernel.var = global i32 1, code_model "kernel"
+; CHECK: @kernel.var = global i32 1, code_model "kernel"
+
+@medium.var = global i32 1, code_model "medium"
+; CHECK: @medium.var = global i32 1, code_model "medium"
+
+@large.var = global i32 1, code_model "large"
+; CHECK: @large.var = global i32 1, code_model "large"
diff --git a/llvm/test/Bitcode/global-variables-code-model.ll.bc b/llvm/test/Bitcode/global-variables-code-model.ll.bc
new file mode 100644
index 000000000000000..c63ce041784a839
Binary files /dev/null and b/llvm/test/Bitcode/global-variables-code-model.ll.bc differ

@heiher
Copy link
Member Author

heiher commented Nov 13, 2023

Part 2/3: #72078
Part 3/3: #72079

Copy link

github-actions bot commented Nov 13, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 6da4ecdf9285225ccc8fa4441b7e9f65e8f4f49c 9508441cb787ed60301aa0dc6cdbefdd3071ae30 -- llvm/include/llvm/AsmParser/LLParser.h llvm/include/llvm/AsmParser/LLToken.h llvm/include/llvm/IR/GlobalObject.h llvm/include/llvm/IR/GlobalVariable.h llvm/lib/AsmParser/LLLexer.cpp llvm/lib/AsmParser/LLParser.cpp llvm/lib/Bitcode/Reader/BitcodeReader.cpp llvm/lib/Bitcode/Writer/BitcodeWriter.cpp llvm/lib/IR/AsmWriter.cpp llvm/lib/IR/Globals.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 784f2e71c7..fa55bc5be4 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -2071,7 +2071,9 @@ bool LLParser::parseOptionalCallingConv(unsigned &CC) {
     break;
   case lltok::kw_amdgpu_kernel:  CC = CallingConv::AMDGPU_KERNEL; break;
   case lltok::kw_tailcc:         CC = CallingConv::Tail; break;
-  case lltok::kw_m68k_rtdcc:     CC = CallingConv::M68k_RTD; break;
+  case lltok::kw_m68k_rtdcc:
+    CC = CallingConv::M68k_RTD;
+    break;
   case lltok::kw_cc: {
       Lex.Lex();
       return parseUInt32(CC);
@@ -3977,7 +3979,8 @@ bool LLParser::parseValID(ValID &ID, PerFunctionState *PFS, Type *ExpectedTy) {
     unsigned Flags = 0;
     if (NUW)   Flags |= OverflowingBinaryOperator::NoUnsignedWrap;
     if (NSW)   Flags |= OverflowingBinaryOperator::NoSignedWrap;
-    if (Exact) Flags |= PossiblyExactOperator::IsExact;
+    if (Exact)
+      Flags |= PossiblyExactOperator::IsExact;
     ID.ConstantVal = ConstantExpr::get(Opc, Val0, Val1, Flags);
     ID.Kind = ValID::t_Constant;
     return false;
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index 5c64827e49..8a0d91a6c4 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -306,7 +306,9 @@ static void PrintCallingConv(unsigned cc, raw_ostream &Out) {
   case CallingConv::PreserveAll:   Out << "preserve_allcc"; break;
   case CallingConv::CXX_FAST_TLS:  Out << "cxx_fast_tlscc"; break;
   case CallingConv::GHC:           Out << "ghccc"; break;
-  case CallingConv::Tail:          Out << "tailcc"; break;
+  case CallingConv::Tail:
+    Out << "tailcc";
+    break;
   case CallingConv::CFGuard_Check: Out << "cfguard_checkcc"; break;
   case CallingConv::X86_StdCall:   Out << "x86_stdcallcc"; break;
   case CallingConv::X86_FastCall:  Out << "x86_fastcallcc"; break;

@heiher
Copy link
Member Author

heiher commented Nov 13, 2023

cc @xen0n @xry111

@aeubanks
Copy link
Contributor

I was just about to send out an RFC for #69638 :)

Can you explain (or link to docs) how the different code models for LoongArch work and a per-global code model would change that?
For example, for x86-64, data/text (GlobalVariable/Function) can either be small or large depending on the code model (small PIC data can be accessed with 32-bit PC-relative relocations, large data must be accessed with 64-bit relocations), so to override that we don't actually need an entire code-model attribute, we just need a force-large or force-small attribute to cover all cases.

@heiher
Copy link
Member Author

heiher commented Nov 14, 2023

@aeubanks Thank you!

For linux kernel module percpu case of LoongArch:

static __attribute__((section(".data..percpu" ""))) int pcpu;

int fun(void)
{
    return pcpu;
}

After static linking, a small code model code is generated to access the pcpu variable, which can fully access the pcpu variable in the same link unit. The corner issue with kernel modules is that after dynamic linking, the actual location of the pcpu variables will be so far away from the code that the default code model (small) doesn't work. This only happens for percpu variables, changing the code model for variables instead of all will help reduce overhead.

Code model (default/small):

@pcpu= external dso_local global i32, align 4

define dso_local signext i32 @fun() #0 {
; CHECK-LABEL: fun:
; CHECK:       # %bb.0:
; CHECK-NEXT:    pcalau12i $a0, %pc_hi20(pcpu)
; CHECK-NEXT:    addi.d $a0, $a0, %pc_lo12(pcpu)
; CHECK-NEXT:    ld.w $a0, $a0, 0
; CHECK-NEXT:    ret
  %1 = load i32, ptr @pcpu, align 4
  ret i32 %1
}

Code model (large):

@pcpu = external dso_local global i32, code_model "large", align 4

define dso_local signext i32 @fun() #0 {
; CHECK-LABEL: fun:
; CHECK:       # %bb.0:
; CHECK-NEXT:    pcalau12i $a0, %pc_hi20(pcpu)
; CHECK-NEXT:    addi.d $a1, $zero, %pc_lo12(pcpu)
; CHECK-NEXT:    lu32i.d $a1, %pc64_lo20(pcpu)
; CHECK-NEXT:    lu52i.d $a1, $a1, %pc64_hi12(pcpu)
; CHECK-NEXT:    add.d $a0, $a1, $a0
; CHECK-NEXT:    ld.w $a0, $a0, 0
; CHECK-NEXT:    ret
  %1 = load i32, ptr @pcpu, align 4
  ret i32 %1
}

I think your original suggestion has better generality, which provides more choices for code models. This allows us to choose a code model that's just right without having to worry about more overhead.

@aeubanks
Copy link
Contributor

This should definitely have an RFC sent to discourse

The annoying thing is that the x86-64 medium/large code model doesn't necessarily mean that all globals compiled like that are near/far, they mean that globals over a certain size are considered far (the linker places them farther from text to mitigate 32-bit relocation pressure). The way I want to use this feature is to unconditionally mark a global as far regardless of its size, so specifying a specific code model is not quite right for my use case. But maybe for x86-64 we can just say that an explicit "small" code model means the global is unconditionally near and an explicit "large" code model means the global is unconditionally far, even if that's not exactly what the x86-64 code models mean.

Do you need this for functions as well as global variables? For x86-64 theoretically this is useful for both, but in practice we'll probably only use this for global variables.

llvm/include/llvm/IR/GlobalObject.h Outdated Show resolved Hide resolved
llvm/lib/IR/Globals.cpp Outdated Show resolved Hide resolved
llvm/lib/Bitcode/Reader/BitcodeReader.cpp Outdated Show resolved Hide resolved
@heiher
Copy link
Member Author

heiher commented Nov 15, 2023

This should definitely have an RFC sent to discourse

https://discourse.llvm.org/t/rfc-add-per-global-code-model-attribute/74944

The annoying thing is that the x86-64 medium/large code model doesn't necessarily mean that all globals compiled like that are near/far, they mean that globals over a certain size are considered far (the linker places them farther from text to mitigate 32-bit relocation pressure). The way I want to use this feature is to unconditionally mark a global as far regardless of its size, so specifying a specific code model is not quite right for my use case. But maybe for x86-64 we can just say that an explicit "small" code model means the global is unconditionally near and an explicit "large" code model means the global is unconditionally far, even if that's not exactly what the x86-64 code models mean.

Yeah. From the GCC document, the code model of x86-64 has two intentions, one is to change the ability to access the address space of the program. The other is to change the position according to the size of the symbol object, which may not be all architecture. The ability to access the address space is the need for this.

Do you need this for functions as well as global variables? For x86-64 theoretically this is useful for both, but in practice we'll probably only use this for global variables.

Just only global variable for LoongArch. When kernel module dynamic linking, PLT will be created for function calls.

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I keep forgetting to click "submit review"...)

llvm/include/llvm/IR/GlobalObject.h Outdated Show resolved Hide resolved
llvm/test/Bitcode/global-variables-code-model.ll Outdated Show resolved Hide resolved
llvm/lib/IR/AsmWriter.cpp Outdated Show resolved Hide resolved
This adds a per-global code model attribute, which can override
the target's code model to access global variables.

Link: https://discourse.llvm.org/t/how-to-best-implement-code-model-overriding-for-certain-values/71816
Link: https://discourse.llvm.org/t/rfc-add-per-global-code-model-attribute/74944

Suggested-by: Arthur Eubanks <aeubanks@google.com>
Signed-off-by: WANG Rui <wangrui@loongson.cn>
@heiher
Copy link
Member Author

heiher commented Nov 18, 2023

@aeubanks Thanks for your comments. All done.

@SixWeining
Copy link
Contributor

Hi @heiher, you can push additional "fixup" commits to address review comments instead of force pushing. These multiple commits in one PR will be squashed before getting merged.
https://llvm.org/docs/GitHub.html#updating-pull-requests

@heiher
Copy link
Member Author

heiher commented Nov 18, 2023

Hi @heiher, you can push additional "fixup" commits to address review comments instead of force pushing. These multiple commits in one PR will be squashed before getting merged. https://llvm.org/docs/GitHub.html#updating-pull-requests

Okay. Thanks for the tip.

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the late review, almost looks good, just a couple of comments

llvm/docs/LangRef.rst Outdated Show resolved Hide resolved
llvm/lib/Bitcode/Reader/BitcodeReader.cpp Show resolved Hide resolved
Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one comment, otherwise lgtm

llvm/lib/Bitcode/Reader/BitcodeReader.cpp Show resolved Hide resolved
llvm/docs/LangRef.rst Outdated Show resolved Hide resolved
@nikic
Copy link
Contributor

nikic commented Dec 2, 2023

Can you please add a test that shows the code model is correctly copied if e.g. GlobalOpt performs a transform on the global?

I'd also like some clarity what kind of globals support code_model. You placed the APIs on GlobalObject, but the AsmParser support looks like this is only actually supported on GlobalVariable?

Either these APIs should be on GlobalVariable, or code_model should also be supported on Functions.

@heiher
Copy link
Member Author

heiher commented Dec 3, 2023

Thanks for the comments.

Can you please add a test that shows the code model is correctly copied if e.g. GlobalOpt performs a transform on the global?

Done.

I'd also like some clarity what kind of globals support code_model. You placed the APIs on GlobalObject, but the AsmParser support looks like this is only actually supported on GlobalVariable?

Either these APIs should be on GlobalVariable, or code_model should also be supported on Functions.

Just global variable. Moved to GlobalVariable.

@heiher heiher merged commit a8874cf into llvm:main Dec 5, 2023
4 of 5 checks passed
@heiher heiher deleted the model-attr-llvm branch December 5, 2023 01:43
aeubanks added a commit to aeubanks/llvm-project that referenced this pull request Dec 5, 2023
Using the GlobalVariable code_model property added in llvm#72077.

code_model = "small" means the global should be treated as small regardless of the TargetMachine code model.
code_model = "large" means the global should be treated as large regardless of the TargetMachine code model.

Inferring small/large based on a known section name still takes precedence for correctness.

The intention is to use this for globals that are accessed very infrequently but also take up a lot of space in the binary to mitigate relocation overflows. Prime examples are globals that go in "__llvm_prf_names" for coverage/PGO instrumented builds and "asan_globals" for ASan builds.
aeubanks added a commit that referenced this pull request Dec 5, 2023
…74498)

Using the GlobalVariable code_model property added in #72077.

code_model = "small" means the global should be treated as small
regardless of the TargetMachine code model.
code_model = "large" means the global should be treated as large
regardless of the TargetMachine code model.

Inferring small/large based on a known section name still takes
precedence for correctness.

The intention is to use this for globals that are accessed very
infrequently but also take up a lot of space in the binary to mitigate
relocation overflows. Prime examples are globals that go in
"__llvm_prf_names" for coverage/PGO instrumented builds and
"asan_globals" for ASan builds.
@SixWeining
Copy link
Contributor

@heiher The release/18.x branch will be created next week, could you mention this change in llvm's release notes?

@heiher
Copy link
Member Author

heiher commented Jan 19, 2024

@heiher The release/18.x branch will be created next week, could you mention this change in llvm's release notes?

👌 PR: #78664 Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants