-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate the DAC table on Windows with MSVC linker directives instead of resource injection #68065
Conversation
Today it still loads dynamically, so we need to fix that or delay when SOS loads.
…ile scenario. Things are still broken, but they're at least loading correctly now.
Tagging subscribers to this area: @tommcdon Issue DetailsCurrently in main, we generate the DAC table with a combination of a few tools:
This process has a few problems:
This PR changes this portion of the build process to use a combination of MSVC pragmas and linker comments to compute the DAC table at compile time. This provides a few improvements:
This design has one problem:
This PR also removes the windows-only "dac table header". I figured that since this wasn't used on Linux, that we didn't find enough use out of it to make it worthwhile to port over. I've validated locally with this change that I can use cc: @dotnet/runtime-infrastructure for the infra changes cc: @dotnet/dotnet-diag for general review
|
@jkoritzinsky I'd work with @hoyosjs to run some of the more extensive testing we have for debugging scenarios. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Some of your linker shenanigans were new to me, but if the tests are working then I'm happy to assume they work as intended : ) Thats great that we can drop the custom tools and build targets.
Do we have yml files which invoked the managed dac tools subset explicitly before the native build? If yes, we could probably remove such steps now? |
The steps in the YAML that built DacTableGen also now generate the version files for the native build via Arcade (along with other things like native SourceLink manifests), so we can't remove them. Theoretically we could remove them for the non-official build, but that would require adding extra logic to the YAML, so it doesn't feel worth it to me at the moment. |
@@ -6885,129 +6883,6 @@ bool ClrDataAccess::TargetConsistencyAssertsEnabled() | |||
// | |||
HRESULT ClrDataAccess::VerifyDlls() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we are always going to say CoreCLR and DAC match? The DAC table used to have the bits to verify the CLR matched, granted - now you don't embed anything, but it's
- If we still care about making sure bits match, we need a new sort of signature. Maybe a global embedded on both that's derived from the build version that the DAC can verify against?
- If not, we should probably remove this function entirely. It doesn't make sense to just return S_OK.
Thoughts @dotnet/dotnet-diag ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're already assuming that they match on non-Windows, so if we are introducing a new signature mechanism, we should make sure it works for all platforms we support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup - I am happy to have it be a wide check
// Allocate a buffer for the memory that we'll read out of the target image. | ||
// We can allocate as much space as the target symbol name as we gracefully handle reading | ||
// inaccessable memory. | ||
std::unique_ptr<char[]> namePointer(new char[symbolNameLength]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gets linked into the DAC. I know we've historically avoided the STL there, I am generally ok (even supportive) with it in this layer as it doesn't touch runtime allocators. However, I don't have all the context of why we avoid them. Is usage in layers like this OK @jkotas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only decided to use the STL as we're using it in the ELF reader in the same folder.
ElfReader::TryLookupSymbol(std::string symbolName, uint64_t* symbolOffset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point it caused a bug that looked like:
(lldb) bt
* thread #2, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
* frame #0: 0x00000001804a0e68 libsystem_kernel.dylib`__pthread_kill + 8
frame #1: 0x00000001804d343c libsystem_pthread.dylib`pthread_kill + 292
frame #2: 0x000000018041b460 libsystem_c.dylib`abort + 104
frame #3: 0x0000000180303fb4 libsystem_malloc.dylib`malloc_vreport + 560
frame #4: 0x00000001803075fc libsystem_malloc.dylib`malloc_report + 64
frame #5: 0x00000001802f66f4 libsystem_malloc.dylib`free + 516
frame #6: 0x000000013b89eee0 libmscordaccore.dylib`operator delete(void*) [inlined] ClrFree(p=<unavailable>) at clrhost_nodependencies.cpp:304:5 [opt]
frame #7: 0x000000013b89eea8 libmscordaccore.dylib`operator delete(p=<unavailable>) at clrhost_nodependencies.cpp:398 [opt]
frame #8: 0x000000013b843930 libmscordaccore.dylib`MachOModule::TryLookupSymbol(char const*, unsigned long long*) [inlined] std::__1::_DeallocateCaller::__do_call(__ptr=0x000000013b2873a0) at new:334:12 [opt]
frame #9: 0x000000013b843928 libmscordaccore.dylib`MachOModule::TryLookupSymbol(char const*, unsigned long long*) [inlined] std::__1::_DeallocateCaller::__do_deallocate_handle_size(__ptr=0x000000013b2873a0, __size=<unavailable>) at new:292 [opt]
frame #10: 0x000000013b843928 libmscordaccore.dylib`MachOModule::TryLookupSymbol(char const*, unsigned long long*) [inlined] std::__1::_DeallocateCaller::__do_deallocate_handle_size_align(__ptr=0x000000013b2873a0, __size=<unavailable>, __align=<unavailable>) at new:262 [opt]
frame #11: 0x000000013b843928 libmscordaccore.dylib`MachOModule::TryLookupSymbol(char const*, unsigned long long*) [inlined] std::__1::__libcpp_deallocate(__ptr=0x000000013b2873a0, __size=<unavailable>, __align=<unavailable>) at new:340 [opt]
frame #12: 0x000000013b843928 libmscordaccore.dylib`MachOModule::TryLookupSymbol(char const*, unsigned long long*) [inlined] std::__1::allocator<char>::deallocate(this=<unavailable>, __p="coreclr_create_delegate", __n=<unavailable>) at memory:1673 [opt]
frame #13: 0x000000013b843928 libmscordaccore.dylib`MachOModule::TryLookupSymbol(char const*, unsigned long long*) [inlined] std::__1::allocator_traits<std::__1::allocator<char> >::deallocate(__a=<unavailable>, __p="coreclr_create_delegate", __n=<unavailable>) at memory:1408 [opt]
frame #14: 0x000000013b843928 libmscordaccore.dylib`MachOModule::TryLookupSymbol(char const*, unsigned long long*) [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::~basic_string(this="coreclr_create_delegate") at string:2198 [opt]
frame #15: 0x000000013b843928 libmscordaccore.dylib`MachOModule::TryLookupSymbol(char const*, unsigned long long*) [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::~basic_string(this="coreclr_create_delegate") at string:2193 [opt]
frame #16: 0x000000013b843928 libmscordaccore.dylib`MachOModule::TryLookupSymbol(this=0x000000016fdfe100, symbolName="g_dacTable", symbolValue=0x000000016fdfe0f8) at machoreader.cpp:142 [opt]
frame #17: 0x000000013b84350c libmscordaccore.dylib`::TryGetSymbol(dataTarget=0x000000013b287380, baseAddress=4345102336, symbolName="g_dacTable", symbolAddress=0x000000016fdfe1e0) at machoreader.cpp:65:16 [opt]
frame #18: 0x000000013b7d7ad4 libmscordaccore.dylib`ClrDataAccess::Initialize() [inlined] GetDacTableAddress(dataTarget=<unavailable>, baseAddress=<unavailable>, dacTableAddress=0x000000016fdfe1e0) at daccess.cpp:7246:10 [opt]
frame #19: 0x000000013b7d7ac4 libmscordaccore.dylib`ClrDataAccess::Initialize() [inlined] ClrDataAccess::GetDacGlobals(this=0x0000000121a46a10) at daccess.cpp:7260 [opt]
frame #20: 0x000000013b7d7abc libmscordaccore.dylib`ClrDataAccess::Initialize(this=0x0000000121a46a10) at daccess.cpp:5607 [opt]
frame #21: 0x000000013b7da3d4 libmscordaccore.dylib`::CLRDataAccessCreateInstance(pLegacyTarget=0x000000013b287340, pClrDataAccess=0x000000016fdfe278) at daccess.cpp:7492:28 [opt]
frame #22: 0x000000013b7da468 libmscordaccore.dylib`::CLRDataCreateInstance(iid=0x000000013b452048, pLegacyTarget=<unavailable>, iface=0x000000013b285028) at daccess.cpp:7524:18 [opt]
frame #23: 0x000000013b426c08 libsos.dylib`Runtime::GetClrDataProcess(this=0x000000013b284fd0, ppClrDataProcess=0x000000013b5bb598) at runtimeimpl.cpp:425:22
frame #24: 0x000000013b41cbec libsos.dylib`LoadClrDebugDll() at util.cpp:4063:30
frame #25: 0x000000013b3bc2a0 libsos.dylib`::IP2MD(client=0x000000012194c800, args="0x0000000280f64c1c ") at strike.cpp:274:5
frame #26: 0x000000013843f7f4 libsosplugin.dylib`sosCommand::DoExecute(this=0x00000001377b19a0, debugger=<unavailable>, arguments=0x00000001387042a8, result=0x000000016fdfede8) at soscommand.cpp:69:30
frame #27: 0x00000001001ccf6c LLDB`CommandPluginInterfaceImplementation::DoExecute(lldb_private::Args&, lldb_private::CommandReturnObject&) + 196
frame #28: 0x00000001004dd974 LLDB`lldb_private::CommandObjectParsed::Execute(char const*, lldb_private::CommandReturnObject&) + 404
frame #29: 0x00000001004d4db8 LLDB`lldb_private::CommandInterpreter::HandleCommand(char const*, lldb_private::LazyBool, lldb_private::CommandReturnObject&, lldb_private::ExecutionContext*, bool, bool) + 2264
frame #30: 0x00000001004d8ccc LLDB`lldb_private::CommandInterpreter::IOHandlerInputComplete(lldb_private::IOHandler&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&) + 616
frame #31: 0x00000001004295f4 LLDB`lldb_private::IOHandlerEditline::Run() + 316
frame #32: 0x0000000100412110 LLDB`lldb_private::Debugger::RunIOHandlers() + 140
frame #33: 0x00000001004d9cac LLDB`lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) + 160
frame #34: 0x0000000100206640 LLDB`lldb::SBDebugger::RunCommandInterpreter(bool, bool) + 192
frame #35: 0x00000001000043d8 lldb`Driver::MainLoop() + 2452
frame #36: 0x0000000100005b3c lldb`main + 1880
frame #37: 0x00000001804f1450 libdyld.dylib`start + 4
But that was fixed here: https://github.com/dotnet/runtime/pull/55945/files.
As long as this symbol is only present in the DAC we should be fine for this particular usage at least...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that we avoided STL in the runtime because it made the underlying data structure opaque and thus hard to create out-of-process DAC compilation that could parse the same structure. However for data structures/code that only exists in the debugger process I think the avoidance of STL may have been historical convention rather than any explicit need. It is possible there were also concerns around binary size (DAC statically links the CRT) or portability but I expect you could confirm the new binary isn't substantially larger than before and it still runs on all the platforms we need it to. In short, it sounds fine to me as long as we do a tiny bit of due diligence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of std::unique_ptr<T>
should be just fine. The bigger concern is std::shared_ptr<T>
that tends to pull in the concrt with MSVC and increases binary size in a profound way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more reason for avoiding STL in the runtime has been collisions with CoreCLR exception handling and CoreCLR PAL on Unix. This does not apply here since this code is Windows specific.
I have intentionally removed all usages of selectany from the runtime in the past: see #39532. So it would be unfortunate to reintroduce it. |
…(so once per target) instead of dac-library specific (once per coreclr build version). We need to read the values once per process to ensure that child processes with ASLR enabled (which as a result will have the DAC table entries point to different addresses) don't break debugging later processes.
I've validated that the diagnostics tests have the same results (same failures as main) with a runtime built from this PR. As soon as CI is finished, I'm going to merge this in. |
Some of these are pretty fun c++ tricks - thanks for this. Looks like it should make a fair amount of things a little less sensitive to PDB fun that we were doing before. |
Currently in main, we generate the DAC table with a combination of a few tools:
daccess.h
header to generate a file that we input into a custom tool, along with the PDB for a previously-builtcoreclr.dll
.daccess.h
.coreclr.dll
as a Win32 resourcesinglefilehost.exe
executable and related PDB.This process has a few problems:
This PR changes this portion of the build process to use a combination of MSVC pragmas and linker comments to compute the DAC table at compile time.
This provides a few improvements:
daccess.h
header.link.exe
.This design has one problem:
__declspec(selectany)
to define the dac table, we can verify that there is not a dynamic initializer in the symbol table of thedactable.cpp
file with a script we have CMake run after thedactable.cpp.obj
file is built.This PR also removes the windows-only "dac table header". I figured that since this wasn't used on Linux, that we didn't find enough use out of it to make it worthwhile to port over.
I've validated locally with this change that I can use
!dumpmt
in WinDBG using SOS and that I can hit breakpoints and step through code with VS managed debugging.cc: @dotnet/runtime-infrastructure for the infra changes
cc: @dotnet/dotnet-diag for general review
cc: @AaronRobinsonMSFT for review of the signatures from interoplib that I had to duplicate in
dactable.cpp
.