Discussion: ObjWriter in C# #77178

Closed
Tracked by #96511
filipnavara opened this issue Oct 18, 2022 · 40 comments · Fixed by #95876

Labels: area-NativeAOT-coreclr, User Story

@filipnavara
Member

filipnavara commented Oct 18, 2022

Let me start with a bit of background. Last year I wrote a library for manipulating MachO object files in C# called Melanzana. A few weeks ago, I started making some changes to the ObjWriter code and found it to be quite suboptimal in terms of performance. As an exercise, I wrote a prototype of an ObjWriter replacement in C# for MachO files based on my library. I later extended it to emit DWARF debugging information and ELF files as well through the LibObjectFile library. Obviously, the code is not production ready and it is not on par with the current ObjWriter library, but I wanted to gauge whether there would be interest in carrying this experiment forward.

What works?

  • Producing MachO object files for osx-x64 and osx-arm64 targets
  • Producing ELF files for linux-x64 and linux-arm64 targets
  • Producing COFF files for win-x64 and win-arm64 targets
  • DWARF debugging information for types, methods, variables, and line numbers
  • CodeView debugging information for types, methods, variables, and line numbers

What doesn't work?

  • No shared symbol (COMDAT) support for ELF
  • No ARM32 and X86 support (NativeAOT doesn't properly support them anyway)

The obvious advantage of the approach is that the object writing libraries (whether it is LibObjectFile or Melanzana) are closer to the data model that the ILCompiler emits. That makes it more efficient at producing the raw section data, relocations, and symbol tables. The LLVM-based ObjWriter has high overhead (15%-30% of the whole compilation process in my tests). Some of the overhead can be reduced (eg. switching sections is expensive) but it usually comes at the cost of writing at least some part of the code in C#.

The disadvantage is that this is a lot of code and two external library references. Essentially, it's trading one dependency for another. While MachO and ELF formats are already part of the initial experiment the COFF one is not (all Windows targets).

There's also a middle way. Parts of the experiment can be reused to feed the data more efficiently into the current ObjWriter. For example, the unwinding sections (__eh_frame, __compact_unwind) could be produced completely in managed code. That would avoid a lot of section-switching overhead with minimal impact on portability (it can be used only for specific targets).

The experiment also serves as a good testbed to quickly evaluate some space savings. Some little things I noticed:

  • Non-primary LSDA frames contain a relative pointer to the main LSDA. This can easily be generated without symbols and relocations.
  • Currently MachO produces __eh_frame data with 8-byte PC relative pointers. The linker can handle 4-byte ones as well so they can be used (as already done for ELF).
  • ELF produces PLT relocations even within the same section. This seems to inhibit some optimizations and it's not recommended.
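
To make the __eh_frame size point concrete, here is a minimal sketch of how a PC-relative pointer shrinks when its encoding changes from 8-byte to 4-byte signed data. The DW_EH_PE_* values are the standard DWARF EH pointer-encoding constants; the helper name and the addresses are hypothetical and are not taken from the prototype.

```python
import struct

# Standard DWARF EH pointer-encoding constants.
DW_EH_PE_sdata4 = 0x0B  # signed 4-byte value
DW_EH_PE_sdata8 = 0x0C  # signed 8-byte value
DW_EH_PE_pcrel = 0x10   # value is relative to the address of the field itself


def encode_pcrel(target: int, field_address: int, encoding: int) -> bytes:
    """Encode `target` relative to the address of the field being written."""
    delta = target - field_address
    fmt = {DW_EH_PE_sdata4: "<i", DW_EH_PE_sdata8: "<q"}[encoding & 0x0F]
    # struct.pack raises struct.error if delta does not fit the chosen width.
    return struct.pack(fmt, delta)


# Hypothetical addresses, chosen only for illustration.
wide = encode_pcrel(0x1000, 0x2000, DW_EH_PE_pcrel | DW_EH_PE_sdata8)
narrow = encode_pcrel(0x1000, 0x2000, DW_EH_PE_pcrel | DW_EH_PE_sdata4)
```

Each pointer written with the narrower encoding takes half the space, as long as the distance between the frame data and the code fits in 32 bits.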
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Oct 18, 2022
@filipnavara
Member Author

cc @MichalStrehovsky @kant2002

@kant2002
Contributor

My dislike of ObjWriter comes from a crash that I diagnosed in dotnet/runtimelab#1316 (comment): the whole thing is memory hungry during compilation. NativeAOT already has a rough experience, so adding an occasional OOM does not help.

@filipnavara
Member Author

filipnavara commented Oct 18, 2022

The memory usage could be improved significantly. In fact, aside from the DWARF part, it likely uses way less memory already. There is potential for even bigger savings since I currently write the section data into MemoryStreams but I could easily reuse the existing arrays and just wrap them in a custom Stream type that iterates over them...

FWIW I am perfectly fine with ditching this as a hack week project but I am also fine salvaging the good parts if it makes things fast(er) or improves the source build story.

@am11
Member

am11 commented Oct 18, 2022

+1 for the idea and spirit! 🙂

> No DWARF5 yet

FWIW, HP libunwind (the one used in CoreCLR PAL) also does not parse DWARF 4 and 5 correctly yet.

> it usually comes at the cost of writing at least some part of the code in C#.

Agreed. One less dependency is also a positive for source-build. Moreover, continuous maintenance of a dependency has a measurable cost, and working across repos using NuGet transport is another con of ObjectWriter.

ps - prior to ObjectWriter, there used to be an AsmWriter written in C#: dotnet/corert@ad6e1ba.

@filipnavara
Member Author

> > No DWARF5 yet
>
> FWIW, HP libunwind (the one used in CoreCLR PAL) also does not parse DWARF 4 and 5 correctly yet.

The current ObjWriter DWARF 5 support is basically just writing a different version number in the header. It doesn't use any of the new features and it's essentially useless. I could easily replicate that but there's really no point except for checking a checkbox.

The main benefit of DWARF 5 seems to be reducing the number of relocations. That's also a non-issue for Mach-O since no relocations are necessary. Apple's linker doesn't copy DWARF to the final executable; instead it produces debug maps that reference the original DWARF data in the .o files. They can be collected by dsymutil into a bundled format but there's still no need for relocations.

@MichalStrehovsky
Member

Cc @markples @BrianBohe for thoughts on removing the LLVM dependency

@filipnavara
Member Author

I pushed a bit more code to the branch. Now it can produce ELF and COFF too. There are still feature gaps but it's much closer to supporting all the platforms that NativeAOT can target today.

@filipnavara
Member Author

I added CodeView debugging support. That brings it close to full parity with the old code on the supported platforms. It still needs cleanup, comments, and fixing some register mappings.

@filipnavara
Member Author

filipnavara commented Nov 2, 2022

I pushed a version with the memory usage optimization mentioned in #77178 (comment). A non-scientific benchmark with Stopwatch for generating COFF+CodeView currently looks like this:

> ilc "@artifacts\tests\coreclr\obj\windows.x64.Debug\Managed\nativeaot\SmokeTests\Exceptions\Exceptions\native\Exceptions.ilc.rsp"
Emitting object file took 00:00:00.9877349
> $Env:DOTNET_USE_LLVM_OBJWRITER=1
> ilc "@artifacts\tests\coreclr\obj\windows.x64.Debug\Managed\nativeaot\SmokeTests\Exceptions\Exceptions\native\Exceptions.ilc.rsp"
Emitting object file took 00:00:03.5289337

@agocke
Member

agocke commented Nov 3, 2022

Hey Filip, we're still thinking about the replacement. Compile time improvement would be nice, but I don't think it's very high priority since users can mostly use the JIT version for day-to-day development. Size-on-disk improvements are much more interesting.

Overall, I'm not sure whether we feel a managed version is more maintainable than LLVM, and we may have some other uses for LLVM anyway, so we might end up wanting to keep the code around regardless. We'll just need some more time to talk to some other engineers in the org who have more experience and would be better informed about the cost of maintaining LLVM objwriter.

Regardless, the work you've done is great! Very impressive.

@filipnavara
Member Author

I appreciate the feedback.

I don't think the maintainability is significantly worse (or better) than the current ObjWriter.
It would be possible to drop external library dependencies altogether, if that makes any difference. It's just how the experiment evolved. That would simplify source builds a bit.

The speed improvements could be significant but it doesn't necessarily scale with size. I built tests in the whole repository and sometimes it saves 10 seconds, other times it saves 3. It makes a difference but it's not the be all and end all.

@kant2002
Contributor

kant2002 commented Nov 4, 2022

@agocke what about the memory usage of the LLVM version? For now, if you take an app of significant size, or one with long namespace names, you can very easily hit OOM with 16 GB of RAM. That makes using NativeAOT problematic in CI environments. Just a couple of days ago a question about that appeared on Gitter.

Last time I looked at ObjWriter, I did not find a reasonable way to make it less memory hungry.

@agocke
Member

agocke commented Nov 4, 2022

Yeah, that's another good benefit, but I'm not sure it's decisive. I'm not sure how much of that could be solved by spinning up fewer ILC instances per core, or limiting memory available for the GC.

If we end up doing enough work with LLVM otherwise (e.g., in Mono) it might be worth it to just try to improve LLVM instead. It kinda depends on what dependencies our partners are taking. We'll have to wait for them to give a view into their planning and priorities.

@kant2002
Contributor

kant2002 commented Nov 4, 2022

The memory problem is mostly in libObjWriter; ILC (the parts in C#) is mostly fine with memory.
I am not sure the libObjWriter issues are easily solvable.

  1. ILC compiles all code.
  2. ILC is about to write the obj file, and there is no parallelization here.

The process that triggers the memory consumption works approximately as follows:

  1. All symbol names are collected into a large dictionary. Space station has 2M strings.
  2. After collection, it is sorted and things are rearranged. That is done to produce consistent IDs for the symbols. This is what triggers the OOM.
  3. These IDs are used for encoding symbol locations, etc.

I have hinted at the locations here dotnet/runtimelab#1316 (comment) in case my explanation is confusing. When I looked at it I was very motivated to solve it, but did not find an easy way to do it. Maybe my lack of knowledge shows.

Anyway, I hear what you are saying about wanting to preserve LLVM. So if this can be solved by fixing LLVM, I'm fine with that. I just have not seen an obvious way to do that.

@filipnavara
Member Author

filipnavara commented Nov 4, 2022

The memory usage is smaller in the managed version although I didn't quite measure by how much.

It still builds the object file in memory but that's an implementation detail. For section content it uses a Stream wrapper around a list of buffers. These buffers are the same buffers returned by ObjectNode.GetData and there's no further copying or slicing done on them. However, since the code works with Stream (plus a few specific, optional optimizations) it would be trivial to use temporary files or memory-mapped files if the output were too large to meaningfully fit into memory. That would be much harder to achieve with the LLVM ObjWriter code.
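
The idea of a stream over a list of existing buffers can be sketched as follows. This is a Python illustration of the general technique, not the prototype's C# code, and the class name `ChunkedReader` is made up for the example. The buffers are never concatenated; bytes are copied only into the caller's destination buffer.

```python
import io


class ChunkedReader(io.RawIOBase):
    """Read-only stream over a list of existing buffers, without joining them."""

    def __init__(self, chunks):
        super().__init__()
        # Keep views onto the caller's buffers; empty ones are skipped.
        self._chunks = [memoryview(c).cast("B") for c in chunks if len(c) > 0]
        self._index = 0   # chunk currently being read
        self._offset = 0  # position inside that chunk

    def readable(self):
        return True

    def readinto(self, out):
        out = memoryview(out).cast("B")
        written = 0
        while written < len(out) and self._index < len(self._chunks):
            chunk = self._chunks[self._index]
            n = min(len(out) - written, len(chunk) - self._offset)
            out[written:written + n] = chunk[self._offset:self._offset + n]
            written += n
            self._offset += n
            if self._offset == len(chunk):
                # Current chunk exhausted; move on to the next one.
                self._index += 1
                self._offset = 0
        return written  # 0 signals end of stream


data = ChunkedReader([b"ab", b"", b"cde"]).read()
```

Because consumers only see a stream, the same writing code could be pointed at a temporary file or a memory-mapped file instead, which is the flexibility described above.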

The LLVM ObjWriter currently slices the buffer from ObjectNode.GetData into tiny chunks to emit relocations, unwind codes, and debug line information. In the managed version all this can be easily avoided. Relocations that need to modify an addend in the data do so in-place. They are then collected into a list which is already close to the output format (except for symbolic names not being resolved to symbol indexes yet; that needs to be done later - the symbol table may have a sorting requirement and can only be emitted once all symbols have been discovered). Unwind sections are built directly in their own separate memory stream in the respective format (DWARF EH or PDATA/XDATA). No slicing is necessary since we know the code offsets and we can bake them directly into the output format. That avoids the need to convert CFI code offsets into arbitrary code slicing, then having the ObjWriter emit the CFI codes at "current code offset", and reconstructing it back into the form that we had in the first place. A similar optimization applies to the debug line information.

Unlike LLVM we don't represent the relocations as symbolic expressions and we never go back to rewrite any of the data. Everything is generated in append-only fashion and then at the end merged into the final object file.
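
A minimal sketch of the append-only relocation model described above, under the assumption of a 4-byte little-endian addend field and a symbol table sorted by name. The type and function names, the relocation kind value, and the sorting rule are all hypothetical; real formats have their own ordering requirements.

```python
import struct
from dataclasses import dataclass


@dataclass
class Relocation:
    offset: int       # where in the section the fix-up applies
    symbol_name: str  # still symbolic; resolved to an index at the very end
    kind: int         # format-specific relocation type (hypothetical value)


def add_relocation(section, relocs, offset, symbol_name, kind, addend):
    """Write the addend in-place into the section data and record the relocation."""
    struct.pack_into("<i", section, offset, addend)
    relocs.append(Relocation(offset, symbol_name, kind))


def resolve(relocs, symbol_names):
    """Assign indexes once every symbol is known, then emit (offset, index, kind)."""
    index_of = {name: i for i, name in enumerate(sorted(set(symbol_names)))}
    return [(r.offset, index_of[r.symbol_name], r.kind) for r in relocs]


section = bytearray(8)
relocs = []
add_relocation(section, relocs, 4, "foo", kind=1, addend=-4)
add_relocation(section, relocs, 0, "bar", kind=1, addend=0)
resolved = resolve(relocs, ["foo", "bar", "baz"])
```

Nothing already written is revisited: section bytes and relocation records are only appended, and the single deferred step is mapping names to indexes.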

@am11
Member

am11 commented Nov 4, 2022

> If we end up doing enough work with LLVM otherwise (e.g., in Mono) it might be worth it to just try to improve LLVM instead.

Mono's LLVM-AOT uses LLVM in a very different way than NativeAOT. Mono emits LLVM IR and then lets the toolchain do its thing through various layers (conversion, lowering, optimization, codegen etc.). ObjWriter, in contrast, is a wrapper around a bunch of unexposed assembler APIs in llvm-project (the final/semi-final layer just before the object creation). So NativeAOT depends on a very small, private, surface area of llvm-project.

If someone knows how ObjWriter works, they cannot use that particular knowledge to help troubleshoot issues with mono LLVM-AOT, and vice versa.

@agocke
Member

agocke commented Feb 13, 2023

I'm going to leave this one open, but move it into the Future milestone, because I think there's a good chance that we stick with the objwriter for 8.0, since we have so much other work to do, and consider switching to a different implementation later.

@agocke agocke removed the untriaged New issue has not been triaged by the area owner label Feb 13, 2023
@agocke agocke added this to the Future milestone Feb 13, 2023
@filipnavara
Member Author

I keep maintaining the branch even if I don't always push the up-to-date version to GitHub. I'll likely do another rebase once the NativeAOT/iOS build changes land. I understand it may not be a priority at the moment.

@TIHan
Contributor

TIHan commented Sep 8, 2023

@filipnavara, we are considering looking at this. In your opinion, what is the state of the work?

@filipnavara
Member Author

I need to revive the experiment and push the latest bits. There were some upstream fixes to DWARF generation which are not reflected in my code (aside from the obvious breakage after rebase).

Overall I think the approach is viable if maintainers decide to go with it. I would likely rewrite the DWARF debug emitting code to mirror the one in current ObjWriter precisely.

@filipnavara
Member Author

(I will be unavailable for the next week but I am happy to provide more details after I return from vacation. I definitely have enough time during the .NET 9 timeframe to push this project forward. It all depends on the willingness to agree on the approach.)

@TIHan
Contributor

TIHan commented Sep 8, 2023

Thank you for your quick response!

@agocke and I had a long discussion, and it feels worthwhile to evaluate your work. My plan was to try to pull your work down, try to re-base on latest main and tinker with it.

@filipnavara
Member Author

filipnavara commented Sep 8, 2023

I do have a slightly more up-to-date rebased version that I didn't push. Unfortunately I have it on a computer which I won't be able to access until I return from vacation. At the time I did the rebase, the changes were quite minimal (a few tweaks were made to the managed ObjWriter code and some section code was slightly rearranged).

@filipnavara
Member Author

filipnavara commented Sep 10, 2023

I managed to push the partially rebased version: https://github.com/filipnavara/runtime/pull/new/objwriter3

As previously stated, I am out of office until September 19th, which also means I don't have access to any of the test machines. The rebased version is thus entirely untested, and I opted not to overwrite the original branch.

Notably, the following changes to ObjWriter in main are not ported yet:

There may also be issues with the completely new linker in Xcode 15, so beware.

@filipnavara
Member Author

The objwriter3 branch linked in comment above passes the NativeAOT smoke tests on my M1 MacBook:

Time [secs] | Total | Passed | Failed | Skipped | Assembly Execution Summary
============================================================================
      0.081 |     1 |      1 |      0 |       0 | nativeaot.CustomMain.XUnitWrapper.dll
      0.054 |     1 |      1 |      0 |       0 | nativeaot.GenerateUnmanagedEntryPoints.XUnitWrapper.dll
     10.500 |    13 |     13 |      0 |       0 | nativeaot.SmokeTests.XUnitWrapper.dll
----------------------------------------------------------------------------
     10.635 |    15 |     15 |      0 |       0 | (total)

@TIHan
Contributor

TIHan commented Sep 11, 2023

That's pretty amazing @filipnavara . You got this working and re-based so fast.

@TIHan
Contributor

TIHan commented Sep 18, 2023

@filipnavara - whenever you feel comfortable, it would be really interesting to see a Draft PR of your work so we can run it through CI.

@filipnavara
Member Author

filipnavara commented Sep 21, 2023

> whenever you feel comfortable, it would be really interesting to see a Draft PR of your work so we can run it through CI.

I expect to do it on the weekend. I still want to run some of the tests locally to ensure that the rebase didn't break anything substantial.

@filipnavara
Member Author

filipnavara commented Sep 23, 2023

> I expect to do it on the weekend.

JFYI it may be delayed by a few days. Apple released Xcode 15 with the new (broken) linker in the meantime. I'm focusing on fixing the breakage now because it will likely need to be resolved for .NET 8 and the schedule is tight.

@TIHan
Contributor

TIHan commented Sep 25, 2023

No worries @filipnavara , take your time and no rush.

@tmds
Member

tmds commented Sep 28, 2023

Does this issue affect how we enable NativeAOT with source-build for .NET 9 (dotnet/source-build#1215)?
Or is it orthogonal, and should we continue to look at integrating the llvm-project?

cc @MichaelSimons @premun @omajid @ashnaga

@filipnavara
Member Author

If we commit to doing the ObjWriter in C# then the LLVM dependency for NativeAOT could be completely removed. The current prototype still offers an escape hatch to use the LLVM ObjWriter but that's mainly for testing and can be removed for production.

@omajid
Member

omajid commented Sep 28, 2023

AFAIK, rewriting ObjWriter in C# means we no longer need the llvm-dependency for NativeAOT.

llvm-project would remain a dependency for cross-compiling .NET applications to, say, apple platforms, and so is likely to be added to the VMR for .NET 9 anyway, even if source-build consumers don't need it.

@tmds
Member

tmds commented Sep 28, 2023

> If we commit to doing the ObjWriter in C# then the LLVM dependency for NativeAOT could be completely removed.

Can we commit to this for .NET 9?

It would be preferable if we can avoid building the llvm-project when source-building .NET for a distro.

@am11
Member

am11 commented Sep 28, 2023

> Can we commit to this for .NET 9?

I imagine it depends on the success of this project. It is probably too early to commit. However, @filipnavara's progress is looking promising, so currently the likelihood is high.

Judging from the current implementation, the DOTNET_USE_LLVM_OBJWRITER switch might stay for a while. Let's wait and see how it shapes up.

@agocke agocke modified the milestones: Future, 9.0.0 Oct 30, 2023
@agocke agocke added the User Story A single user-facing feature. Can be grouped under an epic. label Oct 30, 2023
@tmds
Member

tmds commented Nov 17, 2023

We would like to have NativeAOT support included with .NET 9 source-build (after missing the boat with .NET 8).

For source-building, the preferable implementation is the ObjWriter based one over the llvm-project one.

In the next couple of months, we should know how things work out with the ObjWriter. When we do the integration work (ObjWriter or llvm-project based) for source-build in February/March, then around the May preview NativeAOT should be part of the source-build builds.

@filipnavara
Member Author

My goal is to have the ObjWriter PR ready for review before the end of the year. I am basically done with all the big things in the TODO so I will start cleaning up the rough edges and getting it ready. I'll need to discuss the options with @TIHan and @agocke regarding the two external libraries that are currently added as dependencies. They don't necessarily have to be dependencies in the final version but there are some factors that could swing the decision one way or the other (e.g., reusing them in other dotnet changes that are not directly related).

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Dec 11, 2023
MichalStrehovsky pushed a commit that referenced this issue Jan 8, 2024
This is a reimplementation of the NativeAOT ObjWriter in pure C# instead of depending on LLVM. It implements Mach-O, ELF, and COFF object file emitters with DWARF and CodeView debugging information. Only x64 and arm64 targets are implemented to cover officially supported platforms (win-x86 code is present and lightly tested; linux-x86 code is present and incomplete, and only serves as a test bed for emitting 32-bit ELF files if we ever need that).

Original object writer code is still present and can be used by setting the `DOTNET_USE_LLVM_OBJWRITER=1` environment variable.

**Thanks to @am11 for helping with testing and debugging this, @xoofx for making LibObjectFile which helped kickstart this project, @PaulusParssinen for tips about branchless U/SLEB128 size calculation, and all the people on the .NET team who helped push this forward!**

Fixes #77178
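
For readers unfamiliar with the LEB128 size calculation mentioned in the commit message, here is a small sketch of the unsigned case only. It is an illustration of the general idea (computing the encoded length from the bit length instead of looping), not the code that was merged; the function names are hypothetical.

```python
def uleb128_encode(value: int) -> bytes:
    """Reference ULEB128 encoder: 7 bits per byte, high bit marks continuation."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, continuation bit clear
            return bytes(out)


def uleb128_size(value: int) -> int:
    """Encoded length without a loop: ceil(bit_length / 7); zero takes one byte."""
    # `value | 1` makes zero behave like a 1-bit number.
    return ((value | 1).bit_length() + 6) // 7
```

Knowing the size up front lets a writer reserve space for length-prefixed DWARF structures without encoding a value twice.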
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jan 8, 2024
@am11
Member

am11 commented Jan 8, 2024

Congrats! 🎉

@Thefrank, FYI - AFAIK, ELF sections don't differ between Linux object files and those of FreeBSD; the main thing that differs across the various ELF OSes is the E_IDENT OS ABI magic bits, which will come into play if/when a full linker is implemented in C# #92705 (comment).
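
As a small illustration of the OS ABI point, the sketch below builds the 16-byte e_ident array for a 64-bit little-endian ELF file and shows that switching the target OS changes only the EI_OSABI byte. The constants are the standard ELF values; the helper `make_e_ident` is hypothetical, and which OS ABI value a given toolchain actually emits is not something this thread specifies.

```python
ELFMAG = b"\x7fELF"   # magic bytes at e_ident[0..3]
ELFCLASS64 = 2        # e_ident[4]: 64-bit objects
ELFDATA2LSB = 1       # e_ident[5]: little-endian
EV_CURRENT = 1        # e_ident[6]: ELF version
EI_OSABI = 7          # index of the OS ABI byte

ELFOSABI_NONE = 0     # System V / unspecified
ELFOSABI_LINUX = 3
ELFOSABI_FREEBSD = 9


def make_e_ident(osabi: int) -> bytes:
    """Build the 16-byte identification array for a 64-bit little-endian ELF."""
    ident = bytearray(16)  # remaining bytes stay zero (ABI version, padding)
    ident[0:4] = ELFMAG
    ident[4] = ELFCLASS64
    ident[5] = ELFDATA2LSB
    ident[6] = EV_CURRENT
    ident[EI_OSABI] = osabi
    return bytes(ident)


sysv = make_e_ident(ELFOSABI_NONE)
freebsd = make_e_ident(ELFOSABI_FREEBSD)
differing = [i for i in range(16) if sysv[i] != freebsd[i]]
```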

@Thefrank
Contributor

Thefrank commented Jan 8, 2024

This is cool! Congratulations!

@am11 Short: Correct

Long: Still correct with more info

The notes section is similar but has some things that are FreeBSD specific but not required like NT_FREEBSD_FEATURE_CTL. The elfctl program can add/remove/modify that note to give the OS a better idea of the security depends on the ELF. It is also the "notes" section.

A quick scan over https://man.freebsd.org/cgi/man.cgi?elf(5) shows it to be very close to, if not exactly the same in most places as, https://man7.org/linux/man-pages/man5/elf.5.html. One note from the Linux manpage:

(Note: the *BSD terminology is a bit different. There, Elf64_Half is twice as large as Elf32_Half, and Elf64Quarter is used for uint16_t. In order to avoid confusion these types are replaced by explicit ones in the below.)

@github-actions github-actions bot locked and limited conversation to collaborators Feb 8, 2024