-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NativeAOT] Enable -dead_strip
linker optimization by default on Apple platforms
#103039
Conversation
@MichalStrehovsky @jkotas @rolfbjarne @filipnavara Could you please provide your opinion on this? |
If I understand We can express this as a COMDAT section, but as far as I know, it's not expressible in |
I consider removing That said, I am glad that you are looking into it, summarizing it, and trying to find a solution. One thing that comes to mind is, did you try to mark all the sections produced by object writer with |
Mach-O doesn't have the notion of COMDAT sections. It entirely depends on the subsections defined by symbols and linking based on that. The closest equivalent is the That said, the atoms/subsections don't normally get reordered (*). Dependencies between atoms/subsections can be modeled with no-op relocations. We currently don't do this modeling, but it may be reasonable thing to do if there's more than one defined symbol with different offset here. (*) It is not guaranteed but there are rules to it and last time I checked we were not hitting anything that would cause a re-order. |
At the end of the day, anything that stops linker's discretion to remove things or move things around would work. We do not want any dead stripping on the object file - the object file is not a classic object file (bunch of .c files compiled to .o in a vacuum, not knowing about other .c files that are part of the app); the .o file generated by ILC is a result of whole program analysis and we already removed everything that's dead. I would consider it a compiler bug if the linker is legitimately able to remove something. I can't think of anything emitted by the compiler that we could mark as dead strippable and get a benefit from it. |
That's true as long as the ILC's output is the entire executable, but not if the output is a native library that can be consumed by other native libraries (or even an executable exporting symbols consumed by other native libraries). For instance: public class MyManagedLogic {
[UnmanagedCallersOnly (EntryPoint = "MyManagedFunction")]
public static int MyManagedFunction () => 42;
} If no other native code calls |
If I understand the problem here correctly we are looking for a way to keep the symbol and its prefix together.
Q1: @filipnavara do you have experience with |
No, but it looks intriguing. It's fairly recent addition to the Mach-O file format (~3 years ago) so presumably it's less likely to be considered a legacy feature. |
At minimum the method is going to be referenced from stack trace metadata tables that will keep it rooted no matter what. And even if we manage to remove that method, it's transitive closure will likely still be unstrippable unless it only ever calls non-generic static methods. There's a bunch of bookkeeping for reflection, type loader, etc. that is in form of hashtables and tables that a native linker has no chance to remove and restructure. The possible upswing is very small. The possible downswing is pretty big (damaging the MethodTable prefix will lead to crashes in GC that may or may not be close to where the problem is, similar for the statics, and I don't have an accounting of other places that may assume a linker is not going to disassemble entire sections). |
A small update regarding:
I marked all section except Expand to show the header and load commands for the Mach-o object file of a console app
|
src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/MachObjectWriter.cs
Outdated
Show resolved
Hide resolved
src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/MachObjectWriter.cs
Outdated
Show resolved
Hide resolved
/azp run runtime-ioslike |
No pipelines are associated with this pull request. |
Azure Pipelines successfully started running 1 pipeline(s). |
MH_SUBSECTIONS_VIA_SYMBOLS
and enable -dead_strip
by default on Apple platforms -dead_strip
linker optimization by default on Apple platforms
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
67490d1
to
1f2628a
Compare
dotnet#103039 (comment) found that we're generating some orphaned `MethodTable`s. They seem to be coming from here. Wondering if anything breaks if we just delete this.
@MichalStrehovsky is there any other concern regarding these changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dotnet-policy-service rerun |
Q: will this also be backported to dotnet 8 ? If so, in which version could we expect this? |
Description
This PR:
-dead_strip
linker optimization by default.Background
When Xcode 15 got released we started hitting issues with NAOT executables as the new linker was producing incorrect unwind tables causing crashes during GC or on first stack walk.
We got this fixed by marking the ILC output and our libs with
.subsections_via_symbols
in:In the meantime we started using managed object writer in .NET9 but we still mark our object Mach-O files with
.subsections_via_symbols
:runtime/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ObjectWriter/MachObjectWriter.cs
Line 197 in 9fca0c3
Regarding the
-dead_strip
flag we disabled it for iOS platforms as the build was crashing the old Apple linker: #88032Current state
The
.subsections_via_symbols
fixes solved us some problems but unfortunately they did not seem to completely solve the compatibility with the new linker as from: #97745 we started noticing similar problems which in the end resulted with falling back to using the old linkerld_classic
#98726 (backport: #97856)@filipnavara also opened a ticket to Apple regarding the new linker issues which still have not been resolved.
Additionally, our customers started reporting startup crashes when linker optimizations switches are enabled (
-dead_strip
and-flto
):RhpNewArray
when linked with -dead_strip #96663Dead stripping and the
.subsections_via_symbols
flagApple's linking is based on atoms - indivisible chunks of code or data which is different to traditional section-based linkers. This means that linker operates at a finer grain but it also means that it gives the linker freedom to break up the sections and reorganize their subsections to achieve better optimization. This is particularly useful for dead code stripping (ref: https://github.com/apple-oss-distributions/ld64/blob/main/doc/design/linker.html). To enable an object file to be split into atoms and qualify it for dead stripping by the linker, it needs to be marked with a
.subsections_via_symbols
flag (ref: https://opensource.apple.com/source/cctools/cctools-622.5.1/RelNotes/CompilerTools.html).However, NAOT codegen/runtime has many assumptions about the layout of the generated code/data which shouldn't be altered once object file is produced. For this reason we need to mark all symbols as non-deadstrippable to avoid problems with the platform linker.
Click here to expand the case study used to asses the behaviour of dead stripping on a macOS console app
Case study: NAOT hello world console app with
-dead_strip
enabled on macOS./build.sh clr+clr.aot+libs+packs -c Release
dotnet new console
-p:IlcDehydrate=false
the app then crashes in:StartupCodeHelpers.InitializeModules
because the module count is 0 as the whole__modules
section gets stripped__modules
section withS_ATTR_NO_DEAD_STRIP
the app will crash in:StartupCodeHelpers.InitializeStatics
due to stripping and reordering of the GCStatic MethodTables which causesRhAllocateNewObject
allocation to fail asblockAddr
points to a nonexistent location.NOTE: Kudos to @anatawa12 as the first two problems were called out in: #96743
Questions
Size savings
The savings are coming from dead stripping other native dependencies so are not related to the size of the managed code.
Fixes: #96663
PS Manually tested that both
-dead_strip
and-flto
are now working properly