Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ObjWriter: Place unboxing section along other TEXT sections #97960

Closed
wants to merge 3 commits into from

Conversation

filipnavara
Copy link
Member

@filipnavara filipnavara commented Feb 5, 2024

Ref: #97745 (comment)

The new Apple linker generates unsorted garbage unwinding data for the __unbox section in the final executable. Apparently reordering the sections in the object file to place it alongside other TEXT sections helps.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Feb 5, 2024
@ghost
Copy link

ghost commented Feb 5, 2024

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Ref: #97745 (comment)

The new Apple linker generates unsorted garbage unwinding data for the __unbox section in the final executable. Apparently reordering the sections in the object file to place it along other TEXT sections helps.

Author: filipnavara
Assignees: -
Labels:

community-contribution, area-NativeAOT-coreclr

Milestone: -

@filipnavara filipnavara added the os-mac-os-x macOS aka OSX label Feb 5, 2024
@filipnavara
Copy link
Member Author

Unfortunately, I have no reliable way to test it and reason about it since the new linker is closed source. The change itself should be rather harmless. Thoughts?

@jkotas
Copy link
Member

jkotas commented Feb 20, 2024

I think we can give this workaround a try. If it works, it is better than the workaround that we have pushed into .NET 8 servicing.

I would do it for Apple platforms only and added a comment with a link to the issue that it is working around.

@filipnavara
Copy link
Member Author

If it works, it is better than the workaround that we have pushed into .NET 8 servicing.

I'm still waiting if Apple provides a better guidance than to use -ld_classic.

I would do it for Apple platforms only and added a comment with a link to the issue that it is working around.

Honestly, I don't think there's a point in limiting this to Apple platforms. We already pre-create the other TEXT sections, so this only brings the last one into consistency. For any reasonable linker it should make no difference and I tried this with lld (LLVM), ld (Binutils), sold, ld64 (both the "Classic" and "Prime" versions), and Microsoft link.

@filipnavara filipnavara marked this pull request as ready for review February 20, 2024 17:15
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Copy link
Member

@VSadov VSadov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's hope this works!

@VSadov
Copy link
Member

VSadov commented Feb 20, 2024

I will run some tests with the changes locally - just to be sure.

@VSadov
Copy link
Member

VSadov commented Feb 20, 2024

ILC works!

But a few tests fail to build with linker failures like:

  ld: Undefined symbols:
    Thread::IsCurrentThreadInCooperativeMode(), referenced from:
        ObjCMarshalNative::RegisterBeginEndCallback(void*) in libRuntime.WorkstationGC.a[56](interoplibinterface_objc.cpp.o)
  clang: error: linker command failed with exit code 1 (use -v to see invocation)

Not sure if this is some unrelated existing issue, or caused by the change, or just another bug in Prime linker that we only see now.

This is for ./build.sh clr+libs+libs.tests -rc checked -lc release /p:TestNativeAot=true -test

@jkotas
Copy link
Member

jkotas commented Feb 20, 2024

But a few tests fail to build with linker failures like:

This is: #98266

@VSadov
Copy link
Member

VSadov commented Feb 20, 2024

./build.sh clr+libs+libs.tests -rc release -lc release /p:TestNativeAot=true -test

had a lot more progress, but eventually failed with

  0  0x1048a6f2c  __assert_rtn + 72
  1  0x1047c7970  ld::Fixup::setAddend(std::__1::variant<long long, ld::Fixup::LargeAddendIndex>) + 104
  2  0x104818b48  ld::DynamicAtom::addFixup(ld::FixupBuilder) + 1384
  3  0x1047d6928  ld::KindToRelocs::addFixupDiffFromSymbolNums(ld::AtomPlacement const&, ld::DynamicAtom*, unsigned int, unsigned int, unsigned int, long long) const + 456
  4  0x1047e9de0  ld::InputFiles::SliceParser::parseObjectFile(mach_o::Header const*) const + 6364
  5  0x1047fa404  ld::InputFiles::parseAllFiles(void (ld::AtomFile const*) block_pointer)::$_7::operator()(unsigned long, ld::FileInfo const&) const + 420
  6  0x181bb4950  _dispatch_client_callout2 + 20
  7  0x181bc7ba0  _dispatch_apply_invoke + 176
  8  0x181bb4910  _dispatch_client_callout + 20
  9  0x181bc63cc  _dispatch_root_queue_drain + 864
  10  0x181bc6a04  _dispatch_worker_thread2 + 156
  11  0x181d620d8  _pthread_wqthread + 228
  ld: Assertion failed: (index == _addend), function setAddend, file Fixup.cpp, line 131.
  clang: error: linker command failed with exit code 1 (use -v to see invocation)

let me sync up with the branch and try again. Maybe it is the same reason as in checked

@VSadov
Copy link
Member

VSadov commented Feb 20, 2024

now checked fails with:

  Process terminated. Assertion failed.
     at System.Diagnostics.DebugProvider.Fail(String, String) + 0x30
     at System.Diagnostics.Debug.Fail(String, String) + 0x44
     at ILCompiler.DependencyAnalysis.ExternalReferencesTableNode.GetIndex(ISymbolNode, Int32) + 0x58
     at ILCompiler.DependencyAnalysis.TypeMetadataMapNode.GetData(NodeFactory, Boolean) + 0x274
     at ILCompiler.ObjectWriter.ObjectWriter.EmitObject(String, IReadOnlyCollection`1, IObjectDumper, Logger) + 0x5b4
     at ILCompiler.ObjectWriter.ObjectWriter.EmitObject(String, IReadOnlyCollection`1, NodeFactory, ObjectWritingOptions, IObjectDumper, Logger) + 0x168
     at ILCompiler.Compilation.ILCompiler.ICompilation.Compile(String, ObjectDumper) + 0x80
     at ILCompiler.Program.Run() + 0x2460
     at ILCompiler.ILCompilerRootCommand.<>c__DisplayClass236_0.<.ctor>b__0(ParseResult result) + 0x330
     at System.CommandLine.Invocation.InvocationPipeline.Invoke(ParseResult) + 0x60
     at System.CommandLine.ParseResult.Invoke() + 0x44
     at ILCompiler.Program.Main(String[] args) + 0xe0
     

not sure which test that was.

It made quite a lot of progress, I expected the build to pass.
I'll see what we have in release, and then try checked again (after carefully cleaning up things - to be really sure nothing got stuck from previous build ).

@filipnavara
Copy link
Member Author

Thanks for running the additional tests. I didn’t try much more than the NativeAOT smoke tests and self-compiling ILC.

I have no details from Apple about the root cause so this is a bit of a shot in the dark. It’s a relatively harmless change and seems to help at least in some cases. Worst case we can always backport your -ld_classic workaround but at least this should not make things worse than they already are.

@VSadov
Copy link
Member

VSadov commented Feb 20, 2024

It’s a relatively harmless change and seems to help at least in some cases.

It definitely helps and looks like an incremental improvement.
There could simply be more than one issue and avoiding one exposes us to another.

@VSadov
Copy link
Member

VSadov commented Feb 20, 2024

release (as in ./build.sh clr+libs+libs.tests -rc release -lc release /p:TestNativeAot=true -test) had a few failures that look almost random.

Some are in ILC, some in the linker.
Perhaps some uninitialized data in the linker causes nondeterminism.

EXEC : error : Object reference not set to an instance of an object. [/Users/vs/Hosting01/runtime/src/libraries/Microsoft.Extensions.DependencyModel/tests/Microsoft.Extensions.DependencyModel.Tests.csproj::TargetFramework=net9.0]
  System.NullReferenceException: Object reference not set to an instance of an object.
     at ILCompiler.DependencyAnalysis.GenericVarianceDetails.Equals(Object) + 0x2fd8
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependenciesImpl(DependencyNodeCore`1) + 0x48
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependencies(DependencyNodeCore`1) + 0x3c
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ProcessMarkStack() + 0xd0
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ComputeMarkedNodes() + 0x40
     at ILCompiler.ILScanner.ILCompiler.IILScanner.Scan() + 0x20
     at ILCompiler.Program.<Run>g__RunScanner|4_0(Program.<>c__DisplayClass4_0&) + 0x1c4
     at ILCompiler.Program.Run() + 0x22a8
     at ILCompiler.ILCompilerRootCommand.<>c__DisplayClass236_0.<.ctor>b__0(ParseResult result) + 0x330
/Users/vs/Hosting01/runtime/artifacts/bin/coreclr/osx.arm64.Release/build/Microsoft.NETCore.Native.targets(310,5): error MSB3073: The command ""/Users/vs/Hosting01/runtime/artifacts/bin/coreclr/osx.arm64.Release/ilc-published/ilc" @"/Users/vs/Hosting01/runtime/artifacts/obj/Microsoft.Extensions.DependencyModel.Tests/Release/net9.0/native/Microsoft.Extensions.DependencyModel.Tests.ilc.rsp"" exited with code 1. [/Users/vs/Hosting01/runtime/src/libraries/Microsoft.Extensions.DependencyModel/tests/Microsoft.Extensions.DependencyModel.Tests.csproj::TargetFramework=net9.0]
  Process terminated. Access Violation: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. The application will be terminated since this platform does not support throwing an AccessViolationException.
     at System.RuntimeExceptionHelpers.FailFast(String, Exception, String, RhFailFastReason, IntPtr, IntPtr) + 0x190
     at System.RuntimeExceptionHelpers.GetRuntimeException(ExceptionIDs) + 0x2cc
     at System.Runtime.EH.GetClasslibException(ExceptionIDs, IntPtr) + 0x34
     at System.Runtime.EH.RhThrowHwEx(UInt32, EH.ExInfo&) + 0xd4
     at ILCompiler.DependencyAnalysis.GenericVarianceDetails.Equals(Object) + 0x2c50
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependenciesImpl(DependencyNodeCore`1) + 0x48
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependencies(DependencyNodeCore`1) + 0x3c
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ProcessMarkStack() + 0xd0
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ComputeMarkedNodes() + 0x40
     at ILCompiler.ILScanner.ILCompiler.IILScanner.Scan() + 0x20
     at ILCompiler.Program.<Run>g__RunScanner|4_0(Program.<>c__DisplayClass4_0&) + 0x1c4
     at ILCompiler.Program.Run() + 0x22a8
     at ILCompiler.ILCompilerRootCommand.<>c__DisplayClass236_0.<.ctor>b__0(ParseResult result) + 0x330
     at System.CommandLine.Invocation.InvocationPipeline.Invoke(ParseResult) + 0x60
     at System.CommandLine.ParseResult.Invoke() + 0x44
     at ILCompiler.Program.Main(String[] args) + 0xe0
  /var/folders/4b/qx0g987x3zxdd_dwd553n7wr0000gn/T/MSBuildTempvs/tmpbfaa1a415e4e48dcbe61d0f08f795d85.exec.cmd: line 2: 81772 Abort trap: 6           "/Users/vs/Hosting01/runtime/artifacts/bin/coreclr/osx.arm64.Release/ilc-published/ilc" @"/Users/vs/Hosting01/runtime/artifacts/obj/System.Composition.Runtime.Tests/Release/net9.0/native/System.Composition.Runtime.Tests.ilc.rsp"
  0  0x1029aef2c  __assert_rtn + 72
  1  0x1028cf970  ld::Fixup::setAddend(std::__1::variant<long long, ld::Fixup::LargeAddendIndex>) + 104
  2  0x102920b48  ld::DynamicAtom::addFixup(ld::FixupBuilder) + 1384
  3  0x1028de928  ld::KindToRelocs::addFixupDiffFromSymbolNums(ld::AtomPlacement const&, ld::DynamicAtom*, unsigned int, unsigned int, unsigned int, long long) const + 456
  4  0x1028f1de0  ld::InputFiles::SliceParser::parseObjectFile(mach_o::Header const*) const + 6364
  5  0x102902404  ld::InputFiles::parseAllFiles(void (ld::AtomFile const*) block_pointer)::$_7::operator()(unsigned long, ld::FileInfo const&) const + 420
  6  0x181bb4950  _dispatch_client_callout2 + 20
  7  0x181bc7ba0  _dispatch_apply_invoke + 176
  8  0x181bb4910  _dispatch_client_callout + 20
  9  0x181bc63cc  _dispatch_root_queue_drain + 864
  10  0x181bc6a04  _dispatch_worker_thread2 + 156
  11  0x181d620d8  _pthread_wqthread + 228
  ld: Assertion failed: (index == _addend), function setAddend, file Fixup.cpp, line 131.
  clang: error: linker command failed with exit code 1 (use -v to see invocation)

@VSadov
Copy link
Member

VSadov commented Feb 20, 2024

This looks like GC hole in ILC:


EXEC : error : Object reference not set to an instance of an object. [/Users/vs/Hosting01/runtime/src/libraries/System.Runtime.Caching/tests/System.Runtime.Caching.Tests.csproj::TargetFramework=net9.0]
  System.NullReferenceException: Object reference not set to an instance of an object.
     at System.Runtime.TypeCast.IsInstanceOfAny_NoCacheLookup(MethodTable*, Object) + 0x1c
     at System.Runtime.TypeCast.StelemRef_Helper_NoCacheLookup(Object&, MethodTable*, Object) + 0x20
     at System.Runtime.TypeCast.StelemRef_Helper(Object&, MethodTable*, Object) + 0xc8
     at ILCompiler.DependencyAnalysis.GenericVarianceDetails.Equals(Object) + 0x12d8
     at ILCompiler.DependencyAnalysis.DehydratableObjectNode.GetData(NodeFactory, Boolean) + 0x38
     at ILCompiler.ObjectWriter.ObjectWriter.EmitObject(String, IReadOnlyCollection`1, IObjectDumper, Logger) + 0x5b4
     at ILCompiler.ObjectWriter.ObjectWriter.EmitObject(String, IReadOnlyCollection`1, NodeFactory, ObjectWritingOptions, IObjectDumper, Logger) + 0x168
     at ILCompiler.Compilation.ILCompiler.ICompilation.Compile(String, ObjectDumper) + 0x80
     at ILCompiler.Program.Run() + 0x25f4
     at ILCompiler.ILCompilerRootCommand.<>c__DisplayClass236_0.<.ctor>b__0(ParseResult result) + 0x330

@VSadov
Copy link
Member

VSadov commented Feb 20, 2024

I think, if the change is otherwise uncontroversial, we can merge it.
Sadly, it does not look like it is sufficient though.

@filipnavara
Copy link
Member Author

Sadly, it does not look like it is sufficient though.

Thanks again for the thorough test. I may try to investigate more but I don't think I will come up with anything in the short term.

(I still think this change is worth merging even if it doesn't resolve all the issues.)

@akoeplinger
Copy link
Member

FYI the -ld_classic workaround was ported to main: #98726

@github-actions github-actions bot locked and limited conversation to collaborators Apr 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-NativeAOT-coreclr community-contribution Indicates that the PR has been added by a community member os-mac-os-x macOS aka OSX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants