[LLD][COFF] Allow overriding EC alias symbols with lazy archive symbols #113283

Merged Oct 23, 2024 (1 commit)

@cjacek cjacek commented Oct 22, 2024

On ARM64EC, external function calls emit a pair of weak anti-dependency aliases: func to #func, and #func to the func guess exit thunk (instead of the single undefined func symbol that would be emitted on other targets). Allow such aliases to be overridden by lazy archive symbols, just as we would for undefined symbols.

@cjacek
Contributor Author

cjacek commented Oct 22, 2024

This PR allows the use of archive symbols from actual ARM64EC objects. I will also
create a separate PR for a related issue where multiple archives may conflict with
each other. As usual, I tried to keep it compatible with MSVC. I believe that the
approach is close enough, but figuring out the exact details of MSVC behavior has
proven to be difficult in this case. While testing various corner cases, I encountered
some internal linker errors and instances where the results were not quite right.
Nevertheless, I can confirm that anti-dependency symbols are treated differently
with respect to their interaction with archive symbols.

I explored various approaches to this, such as tracking an "EC alias" flag in symbols,
but it added complexity with little to no benefits. The approach in this and the next
PR has led to simpler logic that should be equally robust in practice.

@llvmbot
Member

llvmbot commented Oct 22, 2024

@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-lld-coff

Author: Jacek Caban (cjacek)

Changes

On ARM64EC, calls to external functions do not emit undefined symbols. Instead, they emit a pair of weak anti-dependency aliases: func to #func, and #func to the func guess exit thunk. This change allows such aliases to be overridden by lazy archive symbols, similar to how we handle undefined symbols.


Full diff: https://github.com/llvm/llvm-project/pull/113283.diff

6 Files Affected:

  • (modified) lld/COFF/InputFiles.cpp (+29-6)
  • (modified) lld/COFF/InputFiles.h (+1-1)
  • (modified) lld/COFF/SymbolTable.cpp (+6-4)
  • (modified) lld/COFF/SymbolTable.h (+1-1)
  • (modified) lld/COFF/Symbols.h (+4)
  • (modified) lld/test/COFF/arm64ec-lib.test (+85)
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 292c3bfc1eaa9d..fdbc1de4beaf32 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -455,11 +455,34 @@ void ObjFile::initializeSymbols() {
     COFFSymbolRef coffSym = check(coffObj->getSymbol(i));
     bool prevailingComdat;
     if (coffSym.isUndefined()) {
-      symbols[i] = createUndefined(coffSym);
+      symbols[i] = createUndefined(coffSym, false);
     } else if (coffSym.isWeakExternal()) {
-      symbols[i] = createUndefined(coffSym);
-      weakAliases.emplace_back(symbols[i],
-                               coffSym.getAux<coff_aux_weak_external>());
+      auto aux = coffSym.getAux<coff_aux_weak_external>();
+      bool overrideLazy = true;
+
+      // On ARM64EC, external functions don't emit undefined symbols. Instead,
+      // they emit a pair of weak-dependency aliases: func to #func and
+      // #func to the func guess exit thunk. Allow such aliases to be overridden
+      // by lazy archive symbols, just as we would for undefined symbols.
+      if (isArm64EC(getMachineType()) &&
+          aux->Characteristics == IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY) {
+        COFFSymbolRef targetSym = check(coffObj->getSymbol(aux->TagIndex));
+        if (!targetSym.isAnyUndefined()) {
+          // If the target is defined, it may be either a guess exit thunk or
+          // the actual implementation. If it's the latter, consider the alias
+          // to be part of the implementation and override potential lazy archive
+          // symbols.
+          StringRef targetName = check(coffObj->getSymbolName(targetSym));
+          StringRef name = check(coffObj->getSymbolName(coffSym));
+          std::optional<std::string> mangledName =
+              getArm64ECMangledFunctionName(name);
+          overrideLazy = mangledName == targetName;
+        } else {
+          overrideLazy = false;
+        }
+      }
+      symbols[i] = createUndefined(coffSym, overrideLazy);
+      weakAliases.emplace_back(symbols[i], aux);
     } else if (std::optional<Symbol *> optSym =
                    createDefined(coffSym, comdatDefs, prevailingComdat)) {
       symbols[i] = *optSym;
@@ -508,9 +531,9 @@ void ObjFile::initializeSymbols() {
   decltype(sparseChunks)().swap(sparseChunks);
 }
 
-Symbol *ObjFile::createUndefined(COFFSymbolRef sym) {
+Symbol *ObjFile::createUndefined(COFFSymbolRef sym, bool overrideLazy) {
   StringRef name = check(coffObj->getSymbolName(sym));
-  return ctx.symtab.addUndefined(name, this, sym.isWeakExternal());
+  return ctx.symtab.addUndefined(name, this, overrideLazy);
 }
 
 static const coff_aux_section_definition *findSectionDef(COFFObjectFile *obj,
diff --git a/lld/COFF/InputFiles.h b/lld/COFF/InputFiles.h
index a20b097cbe04af..77f7e298166eec 100644
--- a/lld/COFF/InputFiles.h
+++ b/lld/COFF/InputFiles.h
@@ -272,7 +272,7 @@ class ObjFile : public InputFile {
                     &comdatDefs,
                 bool &prevailingComdat);
   Symbol *createRegular(COFFSymbolRef sym);
-  Symbol *createUndefined(COFFSymbolRef sym);
+  Symbol *createUndefined(COFFSymbolRef sym, bool overrideLazy);
 
   std::unique_ptr<COFFObjectFile> coffObj;
 
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 230ae74dfb21d0..435b3bf4dbab80 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -620,9 +620,9 @@ void SymbolTable::initializeECThunks() {
 }
 
 Symbol *SymbolTable::addUndefined(StringRef name, InputFile *f,
-                                  bool isWeakAlias) {
+                                  bool overrideLazy) {
   auto [s, wasInserted] = insert(name, f);
-  if (wasInserted || (s->isLazy() && isWeakAlias)) {
+  if (wasInserted || (s->isLazy() && overrideLazy)) {
     replaceSymbol<Undefined>(s, name);
     return s;
   }
@@ -639,7 +639,8 @@ void SymbolTable::addLazyArchive(ArchiveFile *f, const Archive::Symbol &sym) {
     return;
   }
   auto *u = dyn_cast<Undefined>(s);
-  if (!u || u->weakAlias || s->pendingArchiveLoad)
+  if (!u || (u->weakAlias && !u->isECAlias(ctx.config.machine)) ||
+      s->pendingArchiveLoad)
     return;
   s->pendingArchiveLoad = true;
   f->addMember(sym);
@@ -653,7 +654,8 @@ void SymbolTable::addLazyObject(InputFile *f, StringRef n) {
     return;
   }
   auto *u = dyn_cast<Undefined>(s);
-  if (!u || u->weakAlias || s->pendingArchiveLoad)
+  if (!u || (u->weakAlias && !u->isECAlias(ctx.config.machine)) ||
+      s->pendingArchiveLoad)
     return;
   s->pendingArchiveLoad = true;
   f->lazy = false;
diff --git a/lld/COFF/SymbolTable.h b/lld/COFF/SymbolTable.h
index dab03afde3f987..1d9e908b8b9918 100644
--- a/lld/COFF/SymbolTable.h
+++ b/lld/COFF/SymbolTable.h
@@ -91,7 +91,7 @@ class SymbolTable {
   Symbol *addSynthetic(StringRef n, Chunk *c);
   Symbol *addAbsolute(StringRef n, uint64_t va);
 
-  Symbol *addUndefined(StringRef name, InputFile *f, bool isWeakAlias);
+  Symbol *addUndefined(StringRef name, InputFile *f, bool overrideLazy);
   void addLazyArchive(ArchiveFile *f, const Archive::Symbol &sym);
   void addLazyObject(InputFile *f, StringRef n);
   void addLazyDLLSymbol(DLLFile *f, DLLFile::Symbol *sym, StringRef n);
diff --git a/lld/COFF/Symbols.h b/lld/COFF/Symbols.h
index ff84ff8ad7b28b..203a542466c68e 100644
--- a/lld/COFF/Symbols.h
+++ b/lld/COFF/Symbols.h
@@ -353,6 +353,10 @@ class Undefined : public Symbol {
     isAntiDep = antiDep;
   }
 
+  bool isECAlias(MachineTypes machine) const {
+    return weakAlias && isAntiDep && isArm64EC(machine);
+  }
+
   // If this symbol is external weak, replace this object with aliased symbol.
   bool resolveWeakAlias();
 };
diff --git a/lld/test/COFF/arm64ec-lib.test b/lld/test/COFF/arm64ec-lib.test
index a26c098547fdbe..78b9f326aab893 100644
--- a/lld/test/COFF/arm64ec-lib.test
+++ b/lld/test/COFF/arm64ec-lib.test
@@ -7,10 +7,16 @@ RUN: llvm-mc -filetype=obj -triple=aarch64-windows nsymref.s -o nsymref-aarch64.
 RUN: llvm-mc -filetype=obj -triple=arm64ec-windows sym.s -o sym-arm64ec.obj
 RUN: llvm-mc -filetype=obj -triple=x86_64-windows sym.s -o sym-x86_64.obj
 RUN: llvm-mc -filetype=obj -triple=aarch64-windows nsym.s -o nsym-aarch64.obj
+RUN: llvm-mc -filetype=obj -triple=arm64ec-windows ref-alias.s -o ref-alias.obj
+RUN: llvm-mc -filetype=obj -triple=arm64ec-windows ref-thunk.s -o ref-thunk.obj
+RUN: llvm-mc -filetype=obj -triple=arm64ec-windows func.s -o func.obj
+RUN: llvm-mc -filetype=obj -triple=x86_64-windows func-x86_64.s -o func-x86_64.obj
 RUN: llvm-mc -filetype=obj -triple=arm64ec-windows %S/Inputs/loadconfig-arm64ec.s -o loadconfig-arm64ec.obj
 
 RUN: llvm-lib -machine:arm64ec -out:sym-arm64ec.lib sym-arm64ec.obj nsym-aarch64.obj
 RUN: llvm-lib -machine:amd64 -out:sym-x86_64.lib sym-x86_64.obj
+RUN: llvm-lib -machine:arm64ec -out:func.lib func.obj
+RUN: llvm-lib -machine:arm64ec -out:func-x86_64.lib func-x86_64.obj
 
 Verify that a symbol can be referenced from ECSYMBOLS.
 RUN: lld-link -machine:arm64ec -dll -noentry -out:test.dll symref-arm64ec.obj sym-arm64ec.lib loadconfig-arm64ec.obj
@@ -26,6 +32,49 @@ RUN: not lld-link -machine:arm64ec -dll -noentry -out:test-err.dll nsymref-arm64
 RUN:              FileCheck --check-prefix=ERR %s
 ERR: error: undefined symbol: nsym
 
+Verify that a library symbol can be referenced, even if its name conflicts with an anti-dependency alias.
+RUN: lld-link -machine:arm64ec -dll -noentry -out:ref-alias-1.dll ref-alias.obj func.lib loadconfig-arm64ec.obj
+RUN: llvm-objdump -d ref-alias-1.dll | FileCheck -check-prefix=DISASM %s
+DISASM:      0000000180001000 <.text>:
+DISASM-NEXT: 180001000: d65f03c0     ret
+DISASM-EMPTY:
+
+RUN: llvm-readobj --hex-dump=.test ref-alias-1.dll | FileCheck -check-prefix=TESTSEC %s
+TESTSEC: 0x180004000 00100000
+
+The same test, but with a different input order.
+RUN: lld-link -machine:arm64ec -dll -noentry -out:ref-alias-2.dll func.lib ref-alias.obj loadconfig-arm64ec.obj
+RUN: llvm-objdump -d ref-alias-2.dll | FileCheck -check-prefix=DISASM %s
+RUN: llvm-readobj --hex-dump=.test ref-alias-2.dll | FileCheck -check-prefix=TESTSEC %s
+
+Verify that when an anti-dependency to a guess exit thunk is present, it is overridden by an archive symbol.
+RUN: lld-link -machine:arm64ec -dll -noentry -out:ref-thunk-1.dll ref-thunk.obj func.lib loadconfig-arm64ec.obj
+RUN: llvm-objdump -d ref-thunk-1.dll | FileCheck -check-prefix=DISASM %s
+RUN: llvm-readobj --hex-dump=.test ref-thunk-1.dll | FileCheck -check-prefix=TESTSEC %s
+
+The same test, but with a different input order.
+RUN: lld-link -machine:arm64ec -dll -noentry -out:ref-thunk-2.dll func.lib ref-thunk.obj loadconfig-arm64ec.obj
+RUN: llvm-objdump -d ref-thunk-2.dll | FileCheck -check-prefix=DISASM %s
+RUN: llvm-readobj --hex-dump=.test ref-thunk-2.dll | FileCheck -check-prefix=TESTSEC %s
+
+Test linking against an x86_64 library (which uses a demangled function name).
+RUN: lld-link -machine:arm64ec -dll -noentry -out:ref-x86-1.dll ref-thunk.obj func-x86_64.lib loadconfig-arm64ec.obj
+RUN: llvm-objdump -d ref-x86-1.dll | FileCheck -check-prefix=DISASM-X86 %s
+RUN: llvm-readobj --hex-dump=.test ref-x86-1.dll | FileCheck -check-prefix=TESTSEC %s
+
+DISASM-X86:      0000000180001000 <.text>:
+DISASM-X86-NEXT: 180001000: c3                           retq
+
+The same test, but with a different input order.
+RUN: lld-link -machine:arm64ec -dll -noentry -out:ref-x86-2.dll func-x86_64.lib ref-thunk.obj loadconfig-arm64ec.obj
+RUN: llvm-objdump -d ref-x86-2.dll | FileCheck -check-prefix=DISASM-X86 %s
+RUN: llvm-readobj --hex-dump=.test ref-x86-2.dll | FileCheck -check-prefix=TESTSEC %s
+
+A similar test using -start-lib for linking.
+RUN: lld-link -machine:arm64ec -dll -noentry -out:start-lib-1.dll ref-thunk.obj -start-lib func.obj -end-lib loadconfig-arm64ec.obj
+RUN: llvm-objdump -d start-lib-1.dll | FileCheck -check-prefix=DISASM %s
+RUN: llvm-readobj --hex-dump=.test start-lib-1.dll | FileCheck -check-prefix=TESTSEC %s
+
 #--- symref.s
     .data
     .rva sym
@@ -45,3 +94,39 @@ sym:
      .globl nsym
 nsym:
      .word 0
+
+#--- ref-alias.s
+    .weak_anti_dep func
+.set func,"#func"
+
+    .section .test, "r"
+    .rva func
+
+#--- ref-thunk.s
+    .weak_anti_dep func
+.set func, "#func"
+    .weak_anti_dep "#func"
+.set "#func", thunksym
+
+    .section .test, "r"
+    .rva func
+
+    .section .thnk,"xr",discard,thunksym
+thunksym:
+    mov w0, #2
+    ret
+
+#--- func.s
+    .text
+    .globl "#func"
+"#func":
+    ret
+
+    .weak_anti_dep func
+.set func,"#func"
+
+#--- func-x86_64.s
+    .text
+    .globl func
+func:
+    ret

@llvmbot
Member

llvmbot commented Oct 22, 2024

@llvm/pr-subscribers-lld

github-actions bot commented Oct 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Member

@mstorsjo mstorsjo left a comment

I think this looks fine, but I don't quite follow some of the explanations in a comment and in the commit message.

auto aux = coffSym.getAux<coff_aux_weak_external>();
bool overrideLazy = true;

// On ARM64EC, external functions don't emit undefined symbols. Instead,
Member

Can you elaborate, here and in the commit message, on what you mean by external functions emitting things? Outside of ARM64EC, where would external functions emit undefined symbols? Do you mean when a function calls an external function? (Instinctively, the only thing I would expect a function to emit is a defined symbol.)

Contributor Author

Sure, I will adjust the commit, but here is a long version. Let’s walk through an example:

$ cat test.c
extern void func(void);
void caller(void) { func(); }

On a typical target, func would be an undefined symbol:

$ clang -c test.c -target aarch64-windows
$ llvm-readobj --symbols test.o
...
  Symbol {
    Name: func
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Null (0x0)
    StorageClass: External (0x2)
    AuxSymbolCount: 0
  }   
...

However, that’s not the case on ARM64EC:

$ clang -c test.c -target arm64ec-windows
$ llvm-readobj --symbols test.o
...
  Symbol {
    Name: #func$exit_thunk
    Value: 0
    Section: .wowthk$aa (7)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: External (0x2)
    AuxSymbolCount: 0
  } 
...
  Symbol {
    Name: #func
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Null (0x0)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #func$exit_thunk (23)
      Search: AntiDependency (0x4)
    }
  } 
...
  Symbol {
    Name: func
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Null (0x0)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #func (41)
      Search: AntiDependency (0x4)
    }
  }

The reason for this is that the compiler doesn't know whether the callee will be an x86_64 function (in which case it will define the func symbol) or ARM64EC (in which case it will define the #func symbol). This approach works seamlessly when func is defined in another object file.
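
The naming convention described above can be illustrated with a minimal sketch. It assumes plain C function names only, where the ARM64EC-mangled name is the original name with a `#` prefix. The helpers `mangleArm64ECSimple` and `definedSymbolFor` are hypothetical names used for illustration; they are not LLVM's `getArm64ECMangledFunctionName`, which also handles C++ names.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: mangle a plain C function name for ARM64EC by adding
// a '#' prefix. Returns std::nullopt for empty names, names that already
// start with '#', and C++ names (starting with '?'), which this sketch does
// not attempt to handle.
std::optional<std::string> mangleArm64ECSimple(std::string_view name) {
  if (name.empty() || name.front() == '#' || name.front() == '?')
    return std::nullopt;
  return "#" + std::string(name);
}

// Hypothetical helper: the symbol name a callee's object file would define,
// depending on whether the callee is compiled as ARM64EC or as x86_64.
std::string definedSymbolFor(std::string_view name, bool calleeIsArm64EC) {
  assert(!name.empty() && "symbol name must not be empty");
  if (calleeIsArm64EC) {
    if (auto mangled = mangleArm64ECSimple(name))
      return *mangled;
  }
  return std::string(name);
}
```

Since the caller cannot know which of the two names will end up defined, it has to reference both, which is what the alias pair achieves.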

However, the usual rule is that weak externals (including anti-dependency symbols) take precedence over archive symbols. This would prevent both func and #func from being resolved to archive symbols. ARM64EC changes these rules to handle this scenario.
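
As a rough illustration of that precedence rule and the ARM64EC exception, here is a toy symbol table. It is a sketch under simplifying assumptions, not LLD's implementation: `ToySymbol`, `ToySymbolTable`, `addReference`, and `offerArchiveSymbol` are hypothetical names, and object references are assumed to be added before any archive symbol is offered.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a linker symbol; the names are illustrative, not LLD's.
struct ToySymbol {
  enum class Kind { Undefined, Lazy };
  Kind kind = Kind::Undefined;
  bool weakAlias = false;   // undefined symbol that carries a weak alias
  bool antiDep = false;     // the alias uses the anti-dependency search mode
  bool pendingLoad = false; // an archive member was already requested
};

struct ToySymbolTable {
  bool arm64ec = false;
  std::map<std::string, ToySymbol> symbols;

  // Records a reference seen in an object file. For simplicity, this sketch
  // assumes references are added before any archive symbol is offered.
  void addReference(const std::string &name, bool weakAlias, bool antiDep) {
    assert(!(antiDep && !weakAlias) && "anti-dependency implies a weak alias");
    ToySymbol &s = symbols[name];
    s.kind = ToySymbol::Kind::Undefined;
    s.weakAlias = weakAlias;
    s.antiDep = antiDep;
  }

  // Offers an archive member that defines `name`. Returns true when the
  // member should be loaded to resolve the symbol.
  bool offerArchiveSymbol(const std::string &name) {
    auto [it, inserted] = symbols.try_emplace(name);
    ToySymbol &s = it->second;
    if (inserted) {
      s.kind = ToySymbol::Kind::Lazy; // nothing has referenced it yet
      return false;
    }
    if (s.kind != ToySymbol::Kind::Undefined || s.pendingLoad)
      return false;
    bool ecAlias = s.weakAlias && s.antiDep && arm64ec;
    if (s.weakAlias && !ecAlias)
      return false; // the usual rule: the weak alias takes precedence
    s.pendingLoad = true;
    return true;
  }
};
```

With `arm64ec` set, an anti-dependency alias behaves like a plain undefined symbol when an archive offers a definition; on other targets the alias still wins.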

Additionally, function definitions also emit an anti-dependency symbol. Using caller from the same example:

$ llvm-readobj --symbols test.o
...
  Symbol {
    Name: #caller
    Value: 0
    Section: .text (4)
    BaseType: Null (0x0)
    ComplexType: Function (0x2)
    StorageClass: External (0x2)
    AuxSymbolCount: 0
  } 
...
  Symbol {
    Name: caller
    Value: 0
    Section: IMAGE_SYM_UNDEFINED (0)
    BaseType: Null (0x0)
    ComplexType: Null (0x0)
    StorageClass: WeakExternal (0x69)
    AuxSymbolCount: 1
    AuxWeakExternal {
      Linked: #caller (8)
      Search: AntiDependency (0x4)
    }
  } 
...

In addition to the defined #caller symbol, caller is defined as an alias (hybrid patchable functions would behave differently here). This alias is part of the implementation and should take precedence over archive symbols.
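
The distinction between the two kinds of alias can be sketched as a small predicate. This is a simplified model of the decision, assuming plain C names where the mangled form is `#` plus the name; `AliasInfo` and `shouldOverrideLazy` are hypothetical names, not part of LLD.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified description of a weak external symbol.
struct AliasInfo {
  std::string name;       // alias name, e.g. "caller"
  std::string targetName; // alias target, e.g. "#caller"
  bool antiDependency;    // alias uses the anti-dependency search mode
  bool targetDefined;     // target is defined in the same object file
};

// Returns true when the alias should take precedence over a lazy archive
// symbol of the same name, and false when the archive symbol may replace it.
bool shouldOverrideLazy(const AliasInfo &alias, bool arm64ec) {
  assert(!alias.name.empty() && "alias name must not be empty");
  // Ordinary weak aliases keep overriding lazy symbols.
  if (!arm64ec || !alias.antiDependency)
    return true;
  // An alias to an undefined target (func -> #func in a caller) is only a
  // reference, so an archive symbol may resolve it.
  if (!alias.targetDefined)
    return false;
  // A defined target is either a guess exit thunk or the implementation.
  // Only "name -> #name" is treated as part of the implementation.
  return "#" + alias.name == alias.targetName;
}
```

For the symbols in the example above, `caller` to `#caller` overrides lazy archive symbols, while `func` to `#func` and `#func` to `#func$exit_thunk` do not.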

Weak aliases are not included in the archive index on any target. Therefore, if I were to create an archive containing the above test.o, the index would only include the #func symbol (this part is important for the other PR).

On ARM64EC, external functions do not emit undefined symbols. Instead,
they emit a pair of weak-dependency aliases: `func` to `#func`, and
`#func` to the func guess exit thunk. This change allows such aliases
to be overridden by lazy archive symbols, similar to how we handle
undefined symbols.
@cjacek
Contributor Author

cjacek commented Oct 22, 2024

v2: Adjusted the comment and added an additional test for the alias that is considered part of the implementation.

@cjacek cjacek merged commit 9b88792 into llvm:main Oct 23, 2024
8 checks passed
@cjacek cjacek deleted the ec-alias branch October 23, 2024 10:43
@llvm-ci
Collaborator

llvm-ci commented Oct 23, 2024

LLVM Buildbot has detected a new failure on builder lldb-remote-linux-ubuntu running on as-builder-9 while building lld at step 16 "test-check-lldb-api".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/195/builds/71

Here is the relevant piece of the build log for the reference
Step 16 (test-check-lldb-api) failure: Test just built components: check-lldb-api completed (failure)
...
PASS: lldb-api :: types/TestIntegerType.py (1199 of 1208)
PASS: lldb-api :: types/TestRecursiveTypes.py (1200 of 1208)
PASS: lldb-api :: types/TestIntegerTypeExpr.py (1201 of 1208)
PASS: lldb-api :: types/TestShortType.py (1202 of 1208)
PASS: lldb-api :: types/TestLongTypes.py (1203 of 1208)
PASS: lldb-api :: types/TestShortTypeExpr.py (1204 of 1208)
PASS: lldb-api :: types/TestLongTypesExpr.py (1205 of 1208)
PASS: lldb-api :: tools/lldb-server/TestNonStop.py (1206 of 1208)
PASS: lldb-api :: tools/lldb-server/TestLldbGdbServer.py (1207 of 1208)
UNRESOLVED: lldb-api :: tools/lldb-server/TestGdbRemote_vCont.py (1208 of 1208)
******************** TEST 'lldb-api :: tools/lldb-server/TestGdbRemote_vCont.py' FAILED ********************
Script:
--
/usr/bin/python3.12 /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin --libcxx-include-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include/c++/v1 --libcxx-include-target-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include/aarch64-unknown-linux-gnu/c++/v1 --libcxx-library-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib/aarch64-unknown-linux-gnu --arch aarch64 --build-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin/lldb --compiler /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang --dsymutil /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin/dsymutil --make /usr/bin/make --llvm-tools-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin --lldb-obj-root /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/tools/lldb --lldb-libs-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib --platform-url connect://jetson-agx-2198.lab.llvm.org:1234 --platform-working-dir /home/ubuntu/lldb-tests --sysroot /mnt/fs/jetson-agx-ubuntu --env ARCH_CFLAGS=-mcpu=cortex-a78 --platform-name remote-linux 
/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/test/API/tools/lldb-server -p TestGdbRemote_vCont.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision 9b88792291c6441aae7c66c046a9460984ddc7d2)
  clang revision 9b88792291c6441aae7c66c046a9460984ddc7d2
  llvm revision 9b88792291c6441aae7c66c046a9460984ddc7d2
Setting up remote platform 'remote-linux'
Connecting to remote platform 'remote-linux' at 'connect://jetson-agx-2198.lab.llvm.org:1234'...
Connected.
Setting remote platform working directory to '/home/ubuntu/lldb-tests'...
Skipping the following test categories: ['dsym', 'gmodules', 'debugserver', 'objc', 'lldb-dap']
connect to debug monitor on port 13134 failed, attempt #1 of 20
connect to debug monitor on port 13711 failed, attempt #2 of 20
connect to debug monitor on port 15205 failed, attempt #3 of 20
connect to debug monitor on port 12117 failed, attempt #4 of 20
connect to debug monitor on port 14733 failed, attempt #5 of 20
connect to debug monitor on port 12117 failed, attempt #6 of 20
connect to debug monitor on port 13195 failed, attempt #7 of 20
connect to debug monitor on port 12704 failed, attempt #8 of 20
connect to debug monitor on port 15115 failed, attempt #9 of 20
connect to debug monitor on port 14351 failed, attempt #10 of 20
connect to debug monitor on port 14181 failed, attempt #11 of 20
connect to debug monitor on port 14463 failed, attempt #12 of 20
connect to debug monitor on port 15606 failed, attempt #13 of 20
connect to debug monitor on port 13187 failed, attempt #14 of 20
connect to debug monitor on port 14128 failed, attempt #15 of 20
connect to debug monitor on port 15303 failed, attempt #16 of 20
connect to debug monitor on port 15683 failed, attempt #17 of 20
connect to debug monitor on port 13320 failed, attempt #18 of 20
connect to debug monitor on port 13072 failed, attempt #19 of 20
connect to debug monitor on port 14686 failed, attempt #20 of 20

--

@frobtech frobtech mentioned this pull request Oct 25, 2024
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…ls (llvm#113283)

On ARM64EC, external function calls emit a pair of weak-dependency
aliases: `func` to `#func` and `#func` to the `func` guess exit thunk
(instead of a single undefined `func` symbol, which would be emitted on
other targets). Allow such aliases to be overridden by lazy archive
symbols, just as we would for undefined symbols.