Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(ilc/mach-o): add workaround or fix for problem with dead_strip #96743

Conversation

anatawa12
Copy link
Contributor

@anatawa12 anatawa12 commented Jan 10, 2024

Adds workaround (or fix) for #96663

I think it might be better to backport to .net 8

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jan 10, 2024
@ghost
Copy link

ghost commented Jan 10, 2024

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Adds workaround (or fix) for #96663

Author: anatawa12
Assignees: -
Labels:

area-NativeAOT-coreclr

Milestone: -

@@ -347,6 +347,10 @@ private protected override void CreateSection(ObjectNodeSection section, string
{
".dotnet_eh_table" => S_ATTR_DEBUG,
".eh_frame" => S_ATTR_LIVE_SUPPORT | S_ATTR_STRIP_STATIC_SYMS | S_ATTR_NO_TOC,
// workaround problem with dead_strip
// see https://github.com/dotnet/runtime/issues/96663
"hydrated" => S_ATTR_NO_DEAD_STRIP,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hydration target section is essentially BSS, or uninitalized data. The failures in the CI are from the conflicting definition of uninitialized data (ie. no actual data are in the file, just size) and S_ATTR_NO_DEAD_STRIP.

The linker cannot strip something that's not there in the first place, so this is invalid combination.

Copy link
Contributor Author

@anatawa12 anatawa12 Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm... In my mac, without S_ATTR_NO_DEAD_STRIP on the hydrated section, another memory error occurs and it is looks hydrated data is not correct.
For example, HasComponentSize for vtable of TypeManagerHandle[] returns false, which violates this assertion.

I found several symbols in the hydrated section are removed and size of hydrated section is changed so I assumed the stripping symbols and spaces in the hydrated will cause problem with __dehydrated_data .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely familiar with the format of dehydrated data. Out of curiosity, did you try running the build with IlcDehydrate=false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you try running the build with IlcDehydrate=false?

I just tried with IlcDehydrate=false and it works.

Copy link
Member

@filipnavara filipnavara Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the linker doesn't know precisely where the dehydrated data start and end, and how the relocation table at the end of the data creates the dependencies that should not be stripped. This is non-trivial problem, unfortunately. Maybe we can mark the dehydrated data as non-dead-stripable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a little hard to mark dehydrated data as non-dead-stripable since it's inside const section shared with other data.
However, referencing dehydrated data from C code as a global variable to not strip dehydrated data doesn't fix problem.

(Actually, I first found dehydrated data is stripped so I tried referencing __dehydrated and __Modules symbol. This fixes that hydrated section is not initialized. however, it looks hydrated section is not restored correctly and then, I found hydrated section size is changed and symbols are stripped. that's why I added S_ATTR_NO_DEAD_STRIP flag. Finally, I added S_ATTR_NO_DEAD_STRIP to _modules as a alternative for referencing __dehydrated and __Modules symbol.)

// workaround problem with dead_strip
// see https://github.com/dotnet/runtime/issues/96663
"hydrated" => S_ATTR_NO_DEAD_STRIP,
"__modules" => S_ATTR_NO_DEAD_STRIP,
Copy link
Member

@filipnavara filipnavara Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the attribute does make sense here (for the __modules section).

@ivanpovazan
Copy link
Member

@filipnavara I bumped into this rather old documentation https://developer.apple.com/library/archive/documentation/Performance/Conceptual/CodeFootprint/Articles/CompilerOptions.html#//apple_ref/doc/uid/20001861-CJBJFIDD stating the following:

To enable dead-code stripping from the command line, pass the -dead_strip option to ld. You should also pass the -gfull option to GCC to generate a complete set of debugging symbols for your code. The linker uses this extra debugging information to dead strip the executable.

I know it refers to GCC, but still was wondering whether you had any experience with using -gfull instead of -g and if it helped in anyway? LLVM reference is not really useful: https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-gfull

@filipnavara
Copy link
Member

filipnavara commented Jan 16, 2024

I know it refers to GCC, but still was wondering whether you had any experience with using -gfull instead of -g and if it helped in anyway?

No, I have never heard of -gfull. The document was last updated in 2006 and it predates the introduction of ld64, the second generation linker that was used for many years now (first version of ld64 was published around 2005).

ld64 uses a very different strategy compared to traditional linkers. Traditional linkers use sections as the base "unbreakable" unit and they combine them together. The Mach-O format is, however, limited to 255 sections. ld64 deals with the limit by splitting each section into a smaller unit called "atom" which has its own list of fixups/relocations, and dependencies. These atomic units are then recomposed into sections in the final executable. Generally speaking, the split is driven by the symbol table and many different options. For example, the "subsections-via-symbols" option tells the linker that each method is fully contained within one continuous block inside a section. The linker then introduces relationships between the atoms by interpreting the relocations, and global flags (eg. lack of "subsections-via-symbols" is modelled as all methods within a section being dependent on each other and thus making the whole section unbreakable). Artificial relationships between atoms can be introduced by adding no-op relocations. For a more complete description, I recommend reading the official design document for the linker.

In the context of the dehydrated data the format seems to be a data block followed by a relocation table. We should ensure that it's treated by the linker as single atom, or at worst a dependent set of atoms (by introducing the no-op relocations). The basic strategy the linker uses is to split the section by symbols in the symbol table. We should verify if we get the expected atoms or not. This can be done either by manually inspecting the otool/objdump output and checking what symbols intersect with the dehydrated data, or optionally the zld linker can be compiled and debugged to get quite accurate approximation of what the official Apple linker does.

@jeffschwMSFT jeffschwMSFT requested a review from agocke February 26, 2024 21:05
@agocke
Copy link
Member

agocke commented Feb 27, 2024

It sounds like this PR still needs some work. I'm going to mark it draft for now (or until someone manages to figure out how to prevent the dehydrated section from being stripped).

@agocke agocke marked this pull request as draft February 27, 2024 18:45
Copy link
Contributor

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-NativeAOT-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants