.NET 7 osx-arm64 single-file crashing with sigsegv #67062
Tagging subscribers to this area: @agocke, @vitek-karas, @VSadov
Issue Details
Description
Latest build of .NET 7 published single-file app is crashing on execution.
Reproduction Steps
# installation
mkdir ~/.dotnet7
curl -sSL https://aka.ms/dotnet/7.0.1xx/daily/dotnet-sdk-osx-arm64.tar.gz | tar xzf - -C ~/.dotnet7
# publish a new app as a self-contained, single-file app
~/.dotnet7/dotnet new console -n testapp1
cd testapp1
cat > NuGet.config << EOF
<configuration>
<packageSources>
<add key="dotnet7" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet7/nuget/v3/index.json" />
</packageSources>
</configuration>
EOF
~/.dotnet7/dotnet publish --use-current-runtime -p:PublishSingleFile=true --self-contained -c Release
# run the published app
bin/Release/net7.0/osx-arm64/publish/testapp1
Expected behavior
Displays
Hello, World!
Actual behavior
zsh: segmentation fault bin/Release/net7.0/osx-arm64/publish/testapp1
Regression?
Yes, it works with .NET 6.
Known Workarounds
Publish as self-contained, without -p:PublishSingleFile=true.
Configuration
Daily build
% strings ~/.dotnet7/dotnet | grep '@(#)'
@(#)Version 7.0.22.17106 @Commit: ce813882f4061459dc62b63acb75add040f1f603
Other information
I tried debugging it with native symbols (of release singlefilehost), the clrstack looks like this:
% lldb bin/Release/net7.0/osx-arm64/publish/testapp1
Added Microsoft public symbol server
(lldb) target create "bin/Release/net7.0/osx-arm64/publish/testapp1"
Current executable set to '/Users/am11/projects/testapp1/bin/Release/net7.0/osx-arm64/publish/testapp1' (arm64).
(lldb) r
Process 22685 launched: '/Users/am11/projects/testapp1/bin/Release/net7.0/osx-arm64/publish/testapp1' (arm64)
Process 22685 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x00000001000b4c78 testapp1`DictionaryLayout::FindToken(MethodTable*, LoaderAllocator*, int, SigBuilder*, unsigned char*, DictionaryEntrySignatureSource, CORINFO_RUNTIME_LOOKUP*, unsigned short*) + 84
testapp1`DictionaryLayout::FindToken:
-> 0x1000b4c78 <+84>: ldr w8, [x22]
0x1000b4c7c <+88>: tst w8, #0x30
0x1000b4c80 <+92>: cset w9, eq
0x1000b4c84 <+96>: orr w8, w9, w8, lsr #31
Target 0: (testapp1) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x00000001000b4c78 testapp1`DictionaryLayout::FindToken(MethodTable*, LoaderAllocator*, int, SigBuilder*, unsigned char*, DictionaryEntrySignatureSource, CORINFO_RUNTIME_LOOKUP*, unsigned short*) + 84
frame #1: 0x000000010010bf00 testapp1`ProcessDynamicDictionaryLookup(TransitionBlock*, Module*, Module*, unsigned char, unsigned char const*, unsigned char const*, CORINFO_RUNTIME_LOOKUP*, unsigned int*) + 932
frame #2: 0x000000010010c290 testapp1`DynamicHelperFixup(TransitionBlock*, unsigned long*, unsigned int, Module*, CORCOMPILE_FIXUP_BLOB_KIND*, TypeHandle*, MethodDesc**, FieldDesc**) + 408
frame #3: 0x000000010010d2d0 testapp1`DynamicHelperWorker + 232
frame #4: 0x00000001002ed34c testapp1`DelayLoad_Helper_FakeProlog + 92
frame #5: 0x0000000176a93760
frame #6: 0x0000000176aa86b0
frame #7: 0x00000001766badc4
frame #8: 0x00000001002ed830 testapp1`CallDescrWorkerInternal + 132
frame #9: 0x0000000100162eb4 testapp1`MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) + 852
frame #10: 0x000000010008df44 testapp1`CorHost2::CreateAppDomainWithManager(char16_t const*, unsigned int, char16_t const*, char16_t const*, int, char16_t const**, char16_t const**, unsigned int*) + 620
frame #11: 0x0000000100572334 testapp1`coreclr_initialize + 784
frame #12: 0x000000010001fb70 testapp1`coreclr_t::create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*, char const*, coreclr_property_bag_t const&, std::__1::unique_ptr<coreclr_t, std::__1::default_delete<coreclr_t> >&) + 420
frame #13: 0x000000010002c998 testapp1`(anonymous namespace)::create_coreclr() + 432
frame #14: 0x000000010002c46c testapp1`corehost_main + 160
frame #15: 0x000000010000d5c8 testapp1`fx_muxer_t::handle_exec_host_command(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, host_startup_info_t const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::unordered_map<known_options, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, known_options_hash, std::__1::equal_to<known_options>, std::__1::allocator<std::__1::pair<known_options const, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > > > const&, int, char const**, int, host_mode_t, bool, char*, int, int*) + 1328
frame #16: 0x000000010000c6a4 testapp1`fx_muxer_t::execute(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, int, char const**, host_startup_info_t const&, char*, int, int*) + 860
frame #17: 0x00000001000091c0 testapp1`hostfxr_main_bundle_startupinfo + 196
frame #18: 0x000000010004c818 testapp1`exe_start(int, char const**) + 1124
frame #19: 0x000000010004caf4 testapp1`main + 152
frame #20: 0x00000001043610f4 dyld`start + 520
(lldb) clrstack -f
OS Thread Id: 0x30e100 (1)
Child SP IP Call Site
000000016FDFE1A0 00000001000B4C78 testapp1!DictionaryLayout::FindToken(MethodTable*, LoaderAllocator*, int, SigBuilder*, unsigned char*, DictionaryEntrySignatureSource, CORINFO_RUNTIME_LOOKUP*, unsigned short*) + 84
000000016FDFE230 000000010010BF00 testapp1!ProcessDynamicDictionaryLookup(TransitionBlock*, Module*, Module*, unsigned char, unsigned char const*, unsigned char const*, CORINFO_RUNTIME_LOOKUP*, unsigned int*) + 932
000000016FDFE290 000000010010C290 testapp1!DynamicHelperFixup(TransitionBlock*, unsigned long*, unsigned int, Module*, CORCOMPILE_FIXUP_BLOB_KIND*, TypeHandle*, MethodDesc**, FieldDesc**) + 408
000000016FDFE610 000000010010D2D0 testapp1!DynamicHelperWorker + 232
000000016FDFE6A0 [DynamicHelperFrame: 000000016fdfe6a0]
000000016FDFE730 00000001002ED34C testapp1!DelayLoad_Helper_FakeProlog + 92
000000016FDFE860 0000000176AC3760 System.Private.CoreLib.dll!System.Collections.Generic.HashSet`1[[System.__Canon, System.Private.CoreLib]].CheckUniqueAndUnfoundElements(System.Collections.Generic.IEnumerable`1<System.__Canon>, Boolean) + 112 [/_/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/HashSet.cs @ 1436]
000000016FDFE910 0000000176AD86B0 System.Private.CoreLib.dll!System.Collections.Generic.Dictionary`2[[System.__Canon, System.Private.CoreLib],[System.IntPtr, System.Private.CoreLib]].TryGetValue(System.__Canon, IntPtr ByRef) + 32 [/_/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/Dictionary.cs @ 1108]
000000016FDFE930 00000001766EADC4 System.Private.CoreLib.dll!System.AppContext.Setup(Char**, Char**, Int32) + 84 [/_/src/libraries/System.Private.CoreLib/src/System/AppContext.cs @ 136]
FFFFFFFFFFFFFFFF 0000000176AD86B0
FFFFFFFFFFFFFFFF 00000001766EADC4
FFFFFFFFFFFFFFFF 00000001002ED830 testapp1!CallDescrWorkerInternal + 132
000000016FDFE9B0 0000000100162EB4 testapp1!MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) + 852
000000016FDFEC20 000000010008DF44 testapp1!CorHost2::CreateAppDomainWithManager(char16_t const*, unsigned int, char16_t const*, char16_t const*, int, char16_t const**, char16_t const**, unsigned int*) + 620
000000016FDFEE20 0000000100572334 testapp1!coreclr_initialize + 784
000000016FDFEEE0 000000010001FB70 testapp1!coreclr_t::create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*, char const*, coreclr_property_bag_t const&, std::__1::unique_ptr<coreclr_t, std::__1::default_delete<coreclr_t> >&) + 420
000000016FDFEFF0 000000010002C998 testapp1!(anonymous namespace)::create_coreclr() + 432
000000016FDFF060 000000010002C46C testapp1!corehost_main + 160
000000016FDFF1B0 000000010000D5C8 testapp1!fx_muxer_t::handle_exec_host_command(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, host_startup_info_t const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::unordered_map<known_options, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, known_options_hash, std::__1::equal_to<known_options>, std::__1::allocator<std::__1::pair<known_options const, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > > > const&, int, char const**, int, host_mode_t, bool, char*, int, int*) + 1328
000000016FDFF310 000000010000C6A4 testapp1!fx_muxer_t::execute(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, int, char const**, host_startup_info_t const&, char*, int, int*) + 860
000000016FDFF420 00000001000091C0 testapp1!hostfxr_main_bundle_startupinfo + 196
000000016FDFF4D0 000000010004C818 testapp1!exe_start(int, char const**) + 1124
000000016FDFF600 000000010004CAF4 testapp1!main + 152
000000016FDFF660 00000001043610F4 dyld!start + 520
|
It broke in @jkoritzinsky, I have bisected the commits and found that the first commit (since .NET 6 release) which fails single-file app on osx-arm64 is 24e7a4a (it was working until the previous commit c87e932). With debug build, it fails an assertion:
I have debugged a bit and noticed that after this line (which does not fail): runtime/src/coreclr/vm/dllimport.cpp Line 2750 in 24e7a4a
|
I have run another git-bisect session, this time marking
If they are not related in terms of root-cause, then fixing 2 first will bring it back to state of 1. |
@jkotas, (I can create a separate issue for 2 if needed) it looks like the issue is with the meta signature of runtime/src/coreclr/vm/siginfo.cpp Line 4281 in 9b3b937
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 10.1
frame #0: 0x00000001002a8a10 testapp1`MetaSig::CompareMethodSigs(pSignature1="", cSig1=5, pModule1=0x00000001764c0000, pSubst1=0x0000000000000000, pSignature2=" \U00000001\U0000000e\U0000001d\U00000003\a \U00000003\U00000001\U0000001d\U00000003\b\b2\U00000001", cSig2=5, pModule2=0x00000001764c0000, pSubst2=0x0000000000000000, skipReturnTypeSig=NO, pVisited=0x0000000000000000) at siginfo.cpp:4281:17
4278 (cSig1 == cSig2) &&
4279 (pSubst1 == NULL) &&
4280 (pSubst2 == NULL) &&
-> 4281 (memcmp(pSig1, pSig2, cSig1) == 0))
4282 {
4283 return TRUE;
4284 }
Target 0: (testapp1) stopped.
(lldb) p (int)memcmp(pSig1, pSig2, cSig1)
(int) $300 = -32
(lldb) p cSig1
(DWORD) $301 = 5
(lldb) memory read -s1 -fu -c5 pSig1 --force
0x100e0429e: 0
0x100e0429f: 1
0x100e042a0: 14
0x100e042a1: 29
0x100e042a2: 3
(lldb) memory read -s1 -fu -c5 pSig2 --force
0x108684d18: 32
0x108684d19: 1
0x108684d1a: 14
0x108684d1b: 29
0x108684d1c: 3
If I jump the PC to line 4283 and continue, the same 32 vs. 0 issue shows up for other string methods. For the non-string methods (like |
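To make the mismatch in the two dumps easier to see, here is a minimal shell sketch that compares the byte lists copied from the `memory read` output above. The helper names `first_diff` and `has_this` are hypothetical and not part of the runtime. The interpretation of the bytes is an editorial reading based on the ECMA-335 method signature encoding, not something stated in the thread: the first byte carries the calling-convention flags, where 0x20 (decimal 32) is HASTHIS, and the remaining bytes decode as one parameter, a `string` return type (0x0e), and a `char[]` parameter (0x1d, 0x03). Under that reading, the two signatures differ only in whether the instance-method flag is set.

```shell
#!/bin/sh
# Byte values copied from the two `memory read` dumps above (decimal).
sig1="0 1 14 29 3"
sig2="32 1 14 29 3"

# first_diff A B: print the first index where two space-separated byte
# lists differ, similar to how memcmp stops at the first mismatch.
# Assumes both lists have the same length (cSig1 == cSig2 in the dump).
first_diff() {
    rest=$2
    i=0
    for x in $1; do
        y=${rest%% *}      # head of the second list
        rest=${rest#* }    # drop the head
        if [ "$x" -ne "$y" ]; then
            echo "index $i: $x vs $y (diff $((x - y)))"
            return 1
        fi
        i=$((i + 1))
    done
    echo "equal"
}

# has_this BYTE: report whether the HASTHIS bit (32, i.e. 0x20) is set in
# the calling-convention byte that starts a method signature.
has_this() {
    if [ $(( $1 & 32 )) -ne 0 ]; then
        echo "HASTHIS"
    else
        echo "no HASTHIS"
    fi
}

first_diff "$sig1" "$sig2" || true
has_this "${sig1%% *}"
has_this "${sig2%% *}"
```

The difference of -32 at index 0 lines up with the value lldb printed for `memcmp`, although the C standard only guarantees the sign of that result, not its magnitude.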
Neither of the two failure modes makes sense. I think that the problem is likely a bad C++ codegen or something low-level like that. |
Maybe mismatching bits - like a new |
Yeah, I agree. This looks like mismatched bits. |
@VSadov will it be fixed in the next preview? |
When I try the scenario with the latest daily build, the bits appear to match, but R2R is broken.
It looks like R2R is broken in singlefile on OSX. BTW, when targeting I will continue investigating. |
The build that I picked up is:
|
Single file tests were added to outerloop test pipeline in 7677f7d, and removed in f29ba20#diff-e2e027b9777fc35f4a8243db97ce50f7dac99b3cee9465c5325d283c34d2d872L655 for cost saving. I think those are good tests to validate with frequent runtime changes and we should bring them back with |
I "think" we have an E2E test in the SDK repo (didn't check to be sure) - unfortunately I know that SDK or installer repo doesn't run tests on osx-arm64 either. |
It looks like we sometimes see PE sections overlapping in memory. This is either a loader bug or a crossgen bug. Most likely crossgen. |
Same error with dotnet 6.0 on M1:
thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x580000ead2800051)
Fix: export COMPlus_ZapDisable=1 |
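Exporting the variable changes every process launched from that shell afterwards. A narrower variant, sketched below, sets it for a single command only. The function name `run_with_zap_disabled` is hypothetical. As I understand it, `COMPlus_ZapDisable=1` makes the runtime ignore precompiled (ReadyToRun) code and JIT-compile methods instead, which would fit the earlier observation that R2R is broken in single-file apps. Treat this as a workaround with a likely startup cost rather than a fix.

```shell
#!/bin/sh
# run_with_zap_disabled CMD [ARGS...]: run one command with
# COMPlus_ZapDisable=1 set only for that process, leaving the calling
# shell's environment untouched.
run_with_zap_disabled() {
    env COMPlus_ZapDisable=1 "$@"
}

# Example (path taken from the thread; adjust to your own publish output):
# run_with_zap_disabled /private/tmp/test/bin/Release/net6.0/osx-arm64/publish/test
```

Because the variable is scoped to one invocation, the same shell can still run the app without it to check whether the crash reproduces.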
Pretty sure it was working fine with .NET 6 in March, without disabling zap. It is perhaps a recent regression? I haven't tested with latest patch version. |
Here are the outputs:
→ dotnet --version
→ uname -a
→ cat Program.cs
log("Hello, World!");
→ dotnet publish --use-current-runtime -p:PublishSingleFile=true --self-contained -c Release
Determining projects to restore...
→ /private/tmp/test/bin/Release/net6.0/osx-arm64/publish/test |
@am11 can we re-open this for v6? |
There is a separate issue for 6.0 - #69923 |
Description
Latest build of .NET 7 published single-file app is crashing on execution.
Reproduction Steps
Expected behavior
Displays
Hello, World!
Actual behavior
Regression?
Yes, it works with .NET 6.
Known Workarounds
Publish as self-contained, without -p:PublishSingleFile=true.
Configuration
Daily build
Other information
I tried debugging it with native symbols (of release singlefilehost), the clrstack looks like this: