Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write linker script for lld to enable gc-sections #84493

Merged
merged 1 commit into from
Apr 8, 2023

Conversation

am11
Copy link
Member

@am11 am11 commented Apr 7, 2023

The gc-sections (size) optimization was disabled for llvm linker because we were getting the following link errors with
dotnet publish -p:PublishAot=true -p:LinkerFlavor=lld

  Generating native code
ld.lld : error : undefined symbol: __start___modules [/exe1/exe1.csproj]
  >>> referenced by main.cpp:182 (/__w/1/s/src/coreclr/nativeaot/Bootstrap/main.cpp:182)
  >>>               main.cpp.o:(main) in archive libbootstrapper.a
  >>> referenced by main.cpp:182 (/__w/1/s/src/coreclr/nativeaot/Bootstrap/main.cpp:182)
  >>>               main.cpp.o:(main) in archive libbootstrapper.a
  >>> the encapsulation symbol needs to be retained under --gc-sections properly; consider -z nostart-stop-gc (see https://lld.llvm.org/ELF/start-stop-gc)
  
ld.lld : error : undefined symbol: __stop___modules [/exe1/exe1.csproj]
  >>> referenced by main.cpp:182 (/__w/1/s/src/coreclr/nativeaot/Bootstrap/main.cpp:182)
  >>>               main.cpp.o:(main) in archive libbootstrapper.a
  >>> referenced by main.cpp:182 (/__w/1/s/src/coreclr/nativeaot/Bootstrap/main.cpp:182)
  >>>               main.cpp.o:(main) in archive libbootstrapper.a
clang : error : linker command failed with exit code 1 (use -v to see invocation) [/exe1/exe1.csproj]

lld eagerly drops this section while the default linker, bfd, sees it as effective and retains it. The differences are described in this article: https://maskray.me/blog/2020-12-19-lld-and-gnu-linker-incompatibilities. The weak sections are discarded by lld's --gc-sections which are needed by live-sections and we don't get the DT_NEEDED.

There are (at least) three ways to fix this:

  • Explicitly specify the section we want to keep in linker script.
    • This is what the current state of this PR is bringing.
    • Not a neatest solution but it is better than the current situation (gc-sections opt. completely disabled).
  • Annotate section definition __attribute__((retain,used,section(__modules))).
  • Emit DT_NEEDED for __modules section to get the formal treatment.
    • This would be a nicest solution for range of linkers in the wild, but so far, I haven't figured out how to achieve that (maybe some dummy usage in main will do the trick or some other __attribute?). If anyone else has insights, feel free to chime in.

With -p:LinkerFlavor=lld and -p:StripSymbols=true, there is a noticeable difference in size of published binary due to gc-sections optimization:

--- main
+++ pr

- 1922376 (1.9M)
+ 1807688 (1.8M)

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Apr 7, 2023
@ghost
Copy link

ghost commented Apr 7, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

The gc-sections (size) optimization was disabled for llvm linker because we were getting the following link errors with
dotnet publish -p:PublishAot=true -p:LinkerFlavor=lld

  Generating native code
ld.lld : error : undefined symbol: __start___modules [/exe1/exe1.csproj]
  >>> referenced by main.cpp:182 (/__w/1/s/src/coreclr/nativeaot/Bootstrap/main.cpp:182)
  >>>               main.cpp.o:(main) in archive libbootstrapper.a
  >>> referenced by main.cpp:182 (/__w/1/s/src/coreclr/nativeaot/Bootstrap/main.cpp:182)
  >>>               main.cpp.o:(main) in archive libbootstrapper.a
  >>> the encapsulation symbol needs to be retained under --gc-sections properly; consider -z nostart-stop-gc (see https://lld.llvm.org/ELF/start-stop-gc)
  
ld.lld : error : undefined symbol: __stop___modules [/exe1/exe1.csproj]
  >>> referenced by main.cpp:182 (/__w/1/s/src/coreclr/nativeaot/Bootstrap/main.cpp:182)
  >>>               main.cpp.o:(main) in archive libbootstrapper.a
  >>> referenced by main.cpp:182 (/__w/1/s/src/coreclr/nativeaot/Bootstrap/main.cpp:182)
  >>>               main.cpp.o:(main) in archive libbootstrapper.a
clang : error : linker command failed with exit code 1 (use -v to see invocation) [/exe1/exe1.csproj]

lld eagerly drops this section while the default linker, bfd, sees it as effective and retains it. The differences are described in this article: https://maskray.me/blog/2020-12-19-lld-and-gnu-linker-incompatibilities. The weak sections are discarded by lld's --gc-sections which are needed by live-sections and we don't get the DT_NEEDED.

There are (at least) three ways to fix this:

  • Explicitly specify the section we want to keep in linker script.
    • This is what the current state of this PR is bringing.
    • Not a neatest solution but it is better than the current situation (gc-sections opt. completely disabled).
  • Annotate section definition __attribute__((retain,used,section(__modules))).
  • Emit DT_NEEDED for __modules section to get the formal treatment.
    • This would be a nicest solution for range of linkers in the wild, but so far, I haven't figured out how to achieve that (maybe some dummy usage in main will do the trick or some other __attribute?). If anyone else has insights, feel free to chime in.

With -p:LinkerFlavor=lld and -p:StripSymbols=true, there is a noticeable difference in size of published binary due to gc-sections optimization:

--- main
+++ pr

- 1922376 (1.9M)
+ 1807688 (1.8M)

TODO: revert the change in Microsoft.NETCore.Native.Unix.targets which is temporary; to exercise CI tests against lld.

Author: am11
Assignees: -
Labels:

community-contribution, area-NativeAOT-coreclr

Milestone: -

@am11 am11 changed the title Feature/nativeaot/build integration lld Write linker script for lld to enable gc-sections Apr 7, 2023
@am11 am11 force-pushed the feature/nativeaot/build-integration-lld branch from eb2ec8f to 15a45c9 Compare April 7, 2023 19:56
@am11 am11 force-pushed the feature/nativeaot/build-integration-lld branch from 15a45c9 to 853420d Compare April 7, 2023 22:31
@am11 am11 marked this pull request as ready for review April 7, 2023 22:32
@am11 am11 requested a review from MichalStrehovsky as a code owner April 7, 2023 22:32
@am11 am11 requested a review from jkotas April 7, 2023 22:33
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you!

@jkotas jkotas merged commit 3ab663e into dotnet:main Apr 8, 2023
@sbomer
Copy link
Member

sbomer commented Apr 10, 2023

@am11 it looks like the OVERWRITE_SECTIONS command wasn't available until LLVM 13, so it's causing build failures in #84148. What is the minimum version of clang that we support when building NativeAot apps? The instructions of https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/#prerequisites say to install clang on ubuntu 18.04, which will give clang 6. But the 18.04 repos have up to clang-10 available.

sbomer added a commit to sbomer/runtime that referenced this pull request Apr 10, 2023
@am11 am11 deleted the feature/nativeaot/build-integration-lld branch April 11, 2023 03:41
@am11
Copy link
Member Author

am11 commented Apr 11, 2023

The instructions of https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/#prerequisites say to install clang on ubuntu 18.04, which will give clang 6. But the 18.04 repos have up to clang-10 available.

Clang and lld are different things. clang is a compiler driver and lld is a linker. The documentation you are pointing to talks about the minimum version of clang and not the linker. We have only ever supported the "default" linker bfd on linux until d0fcd62 which added lld support the first time with "experimental" status in mind. We still use default platform linker in the CI.

#84148 is switching to linker to non-default against the platform convention, which is not recommended. So we shouldn't be doing that in first place because a) lld is not a default linker on top 10 linux distros b) testing builds CI with "different than everybody" linker is wasteful and c) lld eventually regresses the binary size.

@sbomer
Copy link
Member

sbomer commented Apr 11, 2023

@am11 I understand - I'm just suggesting we decide on minimum llvm versions since LinkerFlavor=lld won't work with this change unless you install the llvm 13+ lld. Maybe the min version can be different depending on choice of linker, if we don't care to support lld as widely.

We've been trying to move from binutils to llvm-based tools in our builds. This both reduces dependencies in our build images, and makes it easier to cross-compile, which is the approach we're using for the linux builds going forward. Using lld simplifies cross-compilation by not requiring a separate linker variant for each target architecture, and this already works with ILCompiler (thanks for adding that!).

I understand that bfd is the default for ILCompiler, and is what customers will end up using in most cases. From @MichalStrehovsky it sounds like we have additional testing in the dotnet/performance repo that should help us catch size regressions with bfd, but it may also be a good idea to add non-cross-compilation tests in dotnet/runtime that use bfd. Either way, I think we should decide on the minimum version of llvm that we support (both with the default settings and when using lld, if that requires a different version).

c) lld eventually regresses the binary size.

Is there still a significant size regression with gc-sections? I plan to move to llvm 16 ASAP which should work fine with your OVERWRITE_SECTIONS fix.

@am11
Copy link
Member Author

am11 commented Apr 11, 2023

I'm just suggesting we decide on minimum llvm versions since LinkerFlavor=lld won't work with this change unless you install the llvm 13+ lld.

This linker script is not needed for lower version as all sections are retained with lld 12 and below. (as much as I dislike branching on version checks but since feature-detection would be too complex here) We can add the condition if we want to support older than v13: am11@04fb39b

@Thefrank, is llvm < 13 scenario important enough for FreeBSD? Ports seem to have all versions of toolchain 10 to 15.

FWIW, llvm 7 was the first version to support lld and gcc 9 was the first version to support fuse-ld=lld.

Is there still a significant size regression with gc-sections?

On linux-arm64 with latest daily build (includes this change):

--- bfd
+++ lld

# size in bytes

# dotnet new console (~2.2 K diff)
- 1496072
+ 1498192

# dotnet new api -aot (~41 K diff)
- 12743672
+ 12785208

The difference is not fixed. It grows with the size of application.

@Thefrank
Copy link
Contributor

@am11 should be no need for a special check or case for older llvm on FreeBSD

explanation: the current "meta" port (what you get when you type pkg install llvm) is for llvm15 for all in-support versions of FreeBSD. The default system linker has been lld for a long time.

My CI system uses "old supported" which has been llvm10 and functional until Sunday. Moving forward I will start using llvm15

sbomer added a commit to sbomer/runtime that referenced this pull request Apr 17, 2023
…tnet#84493)""

This reverts the revert of commit 3ab663e.
The revert of that commit was included in 887c043.
sbomer added a commit that referenced this pull request Apr 17, 2023
* Revert "Revert "Write linker script for lld to enable gc-sections (#84493)""

This reverts the revert of commit 3ab663e.
The revert of that commit was included in 887c043.

* Lower the lld version requirement

* Adjust dwarfdump baseline

---------

Co-authored-by: Adeel <3840695+am11@users.noreply.github.com>
@ghost ghost locked as resolved and limited conversation to collaborators May 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-NativeAOT-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants