Siegfried Pammer edited this page Jul 18, 2018 · 19 revisions

The srm branch of ILSpy is, as the name suggests, a port of ILSpy and its underlying decompiler and disassembler engine to System.Reflection.Metadata [1], the metadata library provided by Microsoft that is already used in Roslyn and Visual Studio. This change will be part of the upcoming ILSpy 4.0 release.

The main reasons for this step were:

  • Writing portable PDB files (currently in preview): ILSpy 4.0 can create portable PDB files from any assembly that can be successfully decompiled. These PDB files have the decompiled source code embedded, so all you need to do is store the PDB in a location where the Visual Studio debugger can find it.

  • Stability: Mono.Cecil does not provide direct access to the IL bytes; it only exposes a list of Instruction objects. This causes a large number of memory allocations, and if there are any garbage bytes in the IL stream, parsing fails with an exception. It was also not possible to parse the IL bytes without Cecil internally accessing the debug information, which made our analyzers fail on assemblies with invalid debug metadata. With System.Reflection.Metadata we can now parse and interpret the IL bytes independently. This is a great improvement for both the analyzers and the disassembler.

  • Thread Safety: Mono.Cecil's original design was not thread safe: various classes were initializing their contents on-demand, and were using a shared BinaryReader instance. For ILSpy we patched Cecil to make these accesses thread-safe (by locking the underlying shared state), and these patches eventually made it into upstream Cecil. However, lately we've once again noticed occasional issues due to concurrent accesses, particularly with the access to debug symbols. System.Reflection.Metadata has a different design that avoids shared mutable state, so we can use it from multiple threads without locking overhead.

  • Memory usage and performance: Mono.Cecil has a very bulky object model in terms of memory usage. Our type system implementation was originally written for SharpDevelop, where we only needed the type system for the public API of referenced assemblies. So to reduce memory usage in SharpDevelop, we decided to copy all needed information from the Cecil objects to the unresolved type system. In ILSpy we never did this "full copy", because we needed to keep the Cecil objects around anyway to get the IL instructions. However, we still had the "unresolved type system" layer in the codebase, even though it was no longer all that useful. The System.Reflection.Metadata API needs less memory, because it only allocates a few caches and provides direct view-like access to the bytes in the metadata streams and tables, without ever creating a copy of the memory. So in ILSpy 4.0, we decided to eliminate the extra "unresolved type system" layer: the decompiler type system now directly uses System.Reflection.Metadata. Overall, ILSpy now tends to use 30% less memory than before (of course, the exact numbers vary dramatically depending on the loaded assemblies and the actions performed in ILSpy).

Frequently Asked Questions

  • Why didn't you use library X (e.g. dnlib)? We intentionally decided to use the official library used by the official .NET tooling. This makes us more or less "bug-compatible": the assemblies ILSpy can accept are the same ones that Visual Studio and related tools accept. Note that we deliberately do not venture into "overcoming security measures intended to make decompiling impossible".

  • SRM is very low-level and hard to use. How do I ...? We have a high-level type system (ICSharpCode.Decompiler.TypeSystem) wrapped around SRM. Any time you need Resolve(), that is a sure sign you should be working with the type system and not with SRM directly, because only the type system provides a resolve implementation.

    Note: there are different options to configure the type system. The ILSpy UI is using it essentially in "single-assembly-mode", so it can't resolve across assembly boundaries. You can create a full type system via new DecompilerTypeSystem(...).

    Use typeSystem.MainModule.GetDefinition() to get the ITypeDefinition for a TypeDefinitionHandle. Then you can use the GetAllBaseTypeDefinitions() extension method to walk the list of base types.
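
    As a rough illustration of the two calls above, the following sketch builds a full type system for one assembly and lists the base types of every type defined in it. The constructor arguments for PEFile, UniversalAssemblyResolver and DecompilerTypeSystem are assumptions and may differ between package versions; BaseTypeWalker and DescribeBaseTypes are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Reflection.Metadata;
using ICSharpCode.Decompiler.Metadata;
using ICSharpCode.Decompiler.TypeSystem;

// Hypothetical helper, not part of ICSharpCode.Decompiler.
static class BaseTypeWalker
{
    // Returns one line per type definition: "Full.Name : Base1, Base2, ...".
    public static List<string> DescribeBaseTypes(string assemblyPath)
    {
        // Assumed constructor shapes; verify them against your package version.
        var module = new PEFile(assemblyPath);
        var resolver = new UniversalAssemblyResolver(assemblyPath, false, null);
        var typeSystem = new DecompilerTypeSystem(module, resolver);

        var lines = new List<string>();
        foreach (TypeDefinitionHandle handle in module.Metadata.TypeDefinitions)
        {
            // Map the SRM handle to the high-level type system entity.
            ITypeDefinition type = typeSystem.MainModule.GetDefinition(handle);
            if (type == null)
                continue;

            // Walk the base types through the type system instead of raw metadata.
            IEnumerable<string> bases = type.GetAllBaseTypeDefinitions().Select(b => b.FullName);
            lines.Add($"{type.FullName} : {string.Join(", ", bases)}");
        }
        return lines;
    }
}
```

    Because the resolver is passed in, base types that live in referenced assemblies can be resolved as well, which is not possible in the "single-assembly-mode" mentioned above.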

  • Why is there no Body property on IMethod? We do not load method bodies into the type system, because it is metadata-only. Method bodies are stored in a different part of the binary. However, it is very easy to get to the MethodBodyBlock from IMethod:

public MethodBodyBlock GetMethodBody(IMethod method)
{
	// Methods without a body or without a metadata token have no IL to return.
	if (!method.HasBody || method.MetadataToken.IsNil)
		return null;
	var module = method.ParentModule.PEFile;
	// Look up the method definition to find the RVA of its body in the PE image.
	var md = module.Metadata.GetMethodDefinition((MethodDefinitionHandle)method.MetadataToken);
	return module.Reader.GetMethodBody(md.RelativeVirtualAddress);
}

Note that there is no object model for instructions. You will have to do the byte parsing yourself. However, there are some extension / helper methods in ICSharpCode.Decompiler.Disassembler.ILParser.
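
A minimal sketch of such byte parsing, assuming that ILParser offers DecodeOpCode and SkipOperand as extension methods on BlobReader (check the names against your package version). OpCodeLister and ListOpCodes are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using System.Reflection.Metadata;
using ICSharpCode.Decompiler.Disassembler;

// Hypothetical helper, not part of ICSharpCode.Decompiler.
static class OpCodeLister
{
    // Returns the IL offset and opcode of every instruction in the body.
    public static List<(int Offset, ILOpCode OpCode)> ListOpCodes(MethodBodyBlock body)
    {
        var result = new List<(int Offset, ILOpCode OpCode)>();
        BlobReader reader = body.GetILReader();

        while (reader.RemainingBytes > 0)
        {
            int offset = reader.Offset;
            // Assumed ILParser helpers: read the opcode, then skip its operand bytes.
            ILOpCode opCode = reader.DecodeOpCode();
            reader.SkipOperand(opCode);
            result.Add((offset, opCode));
        }
        return result;
    }
}
```

The body passed in can come from a helper like GetMethodBody above; the loop only relies on the reader advancing after each call.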

  • How do I access instructions, local variables and try-blocks? After getting the SRM MethodBodyBlock (as described above), you can access the IL byte stream via MethodBodyBlock.GetILReader(). There are GetILBytes() and GetILContent() as well, but GetILReader() is the better choice, because it returns a BlobReader and avoids copying the bytes into a managed byte array. The aforementioned ILParser defines extension methods on BlobReader. Keep in mind that BlobReader is a value type, so changes are only reflected back to the caller if it is passed by reference. The extension methods defined in ILParser are safe to use, because they are declared with this ref BlobReader.

    Local variables can be retrieved by decoding the MethodBodyBlock.LocalSignature (for example, by using dts.MainModule.DecodeLocalSignature). Exception regions can be accessed through MethodBodyBlock.ExceptionRegions.

    Note that MethodBodyBlock.Size includes the header bytes and exception regions. If you are only interested in the byte length of the IL instructions, you can use MethodBodyBlock.GetILReader().Length.
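
    The points above can be sketched with System.Reflection.Metadata alone. The example below opens an assembly, shows that a BlobReader only advances for the caller when passed by reference, and prints the body size, the IL length and the exception regions of the first few methods. MethodBodyInspector, PeekByValue, ReadByRef and the maxMethods parameter are hypothetical names chosen for this illustration.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper for illustration only.
static class MethodBodyInspector
{
    // Receives a copy of the reader: the caller's reader does not advance.
    static byte PeekByValue(BlobReader reader) => reader.ReadByte();

    // Receives the caller's reader by reference: the caller's reader advances.
    static byte ReadByRef(ref BlobReader reader) => reader.ReadByte();

    // Prints details for up to maxMethods IL method bodies; returns how many were shown.
    public static int Inspect(string assemblyPath, int maxMethods = 5)
    {
        using var stream = File.OpenRead(assemblyPath);
        using var pe = new PEReader(stream);
        MetadataReader metadata = pe.GetMetadataReader();
        int shown = 0;

        foreach (MethodDefinitionHandle handle in metadata.MethodDefinitions)
        {
            MethodDefinition md = metadata.GetMethodDefinition(handle);
            bool isIL = (md.ImplAttributes & MethodImplAttributes.CodeTypeMask) == MethodImplAttributes.IL;
            if (md.RelativeVirtualAddress == 0 || !isIL)
                continue; // no IL body to read (e.g. abstract, extern or native)

            MethodBodyBlock body = pe.GetMethodBody(md.RelativeVirtualAddress);
            BlobReader il = body.GetILReader();
            if (il.RemainingBytes == 0)
                continue;

            PeekByValue(il);
            int offsetAfterCopy = il.Offset; // unchanged, the copy was advanced
            ReadByRef(ref il);
            int offsetAfterRef = il.Offset;  // advanced by one byte

            Console.WriteLine($"{metadata.GetString(md.Name)}: Size={body.Size}, IL length={il.Length}, " +
                $"has locals={!body.LocalSignature.IsNil}, " +
                $"offset after by-value={offsetAfterCopy}, after by-ref={offsetAfterRef}");

            foreach (ExceptionRegion region in body.ExceptionRegions)
            {
                Console.WriteLine($"  {region.Kind}: try at {region.TryOffset} (+{region.TryLength}), " +
                    $"handler at {region.HandlerOffset} (+{region.HandlerLength})");
            }

            if (++shown >= maxMethods)
                break;
        }
        return shown;
    }
}
```

    Decoding the local variable types themselves requires a signature decoder, which is why the sketch only reports whether a local signature is present.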

Summary of all the changes, aka Who Moved My Cheese?

For ILSpy users

You should (hopefully) not be negatively affected by this change. All functionality provided by ILSpy should either stay the same or improve. If you come across anything that worked in a previous version of ILSpy but no longer works in version 4.0, please open an issue. Thank you!

For users of the ICSharpCode.Decompiler nuget

  • CSharpDecompiler:
    • now uses SRM's EntityHandle in all Decompile* methods. See *** for examples.
    • ConvertType removed: Use the TypeSystem to retrieve the IEntity instance and use TypeSystemAstBuilder to create the AstNode/AstType.
  • IDecompilerTypeSystem:
    • Integrated with the type system: DecompilerTypeSystem now implements ICompilation.
    • GetCecil replaced by IEntity.MetadataToken: Each entity now stores its metadata token.
    • New functionality:
      • dts.MainModule.GetDefinition() to retrieve type system entities for metadata tokens (definitions only).
      • dts.MainModule.Resolve* methods to retrieve type system entities for metadata tokens (both defs+refs).
      • dts.MainModule.Decode* methods to decode stand-alone signatures.
      • dts.MainModule.PEFile: Retrieves the PEFile instance that was used to construct the type system.
      • dts.MainModule.PEFile.Metadata - the SRM MetadataReader of the main assembly, which provides access to metadata tables. You can use IEntity.MetadataToken to retrieve metadata info. If you need to access an entity from a referenced assembly, you should use IEntity.ParentModule.PEFile.Metadata to access the corresponding MetadataReader.
  • UniversalAssemblyResolver:
    • LoadMainModule removed: the IAssemblyResolver instance now needs to be constructed independently from the CSharpDecompiler. The IAssemblyResolver should be passed to the CSharpDecompiler constructor.
  • NEW IDocumentationProvider interface can be used to provide XML documentation comments.
  • NEW IDebugInfoProvider interface can be used to provide debug info to the decompiler.
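
The handle-based workflow described in this list might look roughly as follows. This is a sketch under assumptions: the constructor signatures of PEFile, UniversalAssemblyResolver and CSharpDecompiler, and the DecompileAsString overload taking a handle, should be checked against the package version in use. DecompileByHandle and DecompileFirstType are hypothetical names used only for this example.

```csharp
using System.Reflection.Metadata;
using ICSharpCode.Decompiler;
using ICSharpCode.Decompiler.CSharp;
using ICSharpCode.Decompiler.Metadata;

// Hypothetical helper, not part of ICSharpCode.Decompiler.
static class DecompileByHandle
{
    // Decompiles the first type definition other than <Module>; returns null if none exists.
    public static string DecompileFirstType(string assemblyPath)
    {
        // The resolver is now constructed independently and handed to the decompiler.
        // Assumed constructor shapes; verify them against your package version.
        var module = new PEFile(assemblyPath);
        var resolver = new UniversalAssemblyResolver(assemblyPath, false, null);
        var decompiler = new CSharpDecompiler(module, resolver, new DecompilerSettings());

        MetadataReader metadata = module.Metadata;
        foreach (TypeDefinitionHandle handle in metadata.TypeDefinitions)
        {
            TypeDefinition td = metadata.GetTypeDefinition(handle);
            if (metadata.GetString(td.Name) == "<Module>")
                continue; // skip the global module pseudo-type

            // Decompile* methods accept SRM handles instead of Cecil objects.
            return decompiler.DecompileAsString(handle);
        }
        return null;
    }
}
```

For an entity obtained from the type system, IEntity.MetadataToken supplies the handle to pass to the decompiler.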

For AddIn developers

  • Language API and Tree View now use the Decompiler TypeSystem.
  • NEW Analyzer API
    • AnalyzerTreeView.Instance.Analyze(IEntity) is now publicly accessible.
    • IAnalyzer and ExportAnalyzerAttribute allow plugins to add new analyzers.

[1] https://github.com/dotnet/corefx/tree/master/src/System.Reflection.Metadata