The srm branch of ILSpy is, as the name suggests, a port of ILSpy and the underlying decompiler and disassembler engine to System.Reflection.Metadata [1], a library provided by Microsoft that is already used in the Roslyn framework and Visual Studio. This change will be part of the upcoming ILSpy 4.0 release.
The main reasons for this step were:
- Writing portable PDB files (currently in preview): ILSpy 4.0 comes with the ability to create portable PDB files from any assembly that can be successfully decompiled. These PDB files have the decompiled source code embedded, so all you need to do is store the PDB somewhere the Visual Studio debugger can find it.
- Stability: Mono.Cecil does not provide direct access to the IL bytes; it only provides a list of Instruction objects. This is costly in terms of memory allocations, and if there are any garbage bytes in the IL stream, Cecil fails with an exception. It was also not possible to parse the IL bytes without Cecil internally accessing the debug information, which made our analyzers fail on assemblies with invalid debug metadata. With System.Reflection.Metadata we can now parse and interpret the IL bytes independently. This is a great improvement for both the analyzers and the disassembler.
- Thread safety: Mono.Cecil's original design was not thread-safe: various classes initialized their contents on demand and used a shared BinaryReader instance. For ILSpy we patched Cecil to make these accesses thread-safe (by locking the underlying shared state), and these patches eventually made it into upstream Cecil. However, lately we've once again noticed occasional issues due to concurrent accesses, particularly with the access to debug symbols. System.Reflection.Metadata has a different design that avoids shared mutable state, so we can use it from multiple threads without locking overhead.
- Memory usage and performance: Mono.Cecil has a very bulky object model in terms of memory usage. Our type system implementation was originally written for SharpDevelop, where we only needed the type system for the public API of referenced assemblies. So to reduce memory usage in SharpDevelop, we decided to copy all needed information from the Cecil objects to the unresolved type system. In ILSpy we never did this "full copy", because we needed to keep the Cecil objects around anyway to get the IL instructions. However, we still had the "unresolved type system" layer in the codebase, even though it was no longer all that useful. The System.Reflection.Metadata API needs less memory, because it only allocates a few caches and provides direct view-like access to the bytes in the metadata streams and tables, without ever creating a copy of the memory. So in ILSpy 4.0, we decided to eliminate the extra "unresolved type system" layer: the decompiler type system now directly uses System.Reflection.Metadata. Overall, ILSpy now tends to use 30% less memory than before (of course, the exact numbers vary dramatically depending on the loaded assemblies and the actions performed in ILSpy).
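To illustrate the "view-like access" mentioned above, here is a minimal sketch that lists the type names of an assembly using only System.Reflection.Metadata. The class and method names (MetadataListing, ListTypeNames) are made up for this example and are not part of ILSpy; the sketch also assumes the running assembly exists as a file on disk (Assembly.Location is empty for single-file deployments).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper for illustration only; not an ILSpy API.
public static class MetadataListing
{
    public static List<string> ListTypeNames(string assemblyPath)
    {
        var names = new List<string>();
        // PEReader owns the stream and disposes it together with itself.
        using var pe = new PEReader(File.OpenRead(assemblyPath));
        MetadataReader md = pe.GetMetadataReader();

        // Handles and TypeDefinition are small structs that point into the
        // metadata tables; strings are only materialized by GetString().
        foreach (TypeDefinitionHandle handle in md.TypeDefinitions)
        {
            TypeDefinition type = md.GetTypeDefinition(handle);
            string ns = md.GetString(type.Namespace);
            string name = md.GetString(type.Name);
            names.Add(ns.Length == 0 ? name : ns + "." + name);
        }
        return names;
    }

    public static void Main()
    {
        // Assumption: inspect the assembly that contains this example.
        string path = typeof(MetadataListing).Assembly.Location;
        foreach (string name in ListTypeNames(path))
            Console.WriteLine(name);
    }
}
```

Nothing in this loop builds an object graph for the whole assembly; each row is read on demand from the metadata tables.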
- We intentionally decided to use the official library used by the official .NET tooling. This makes us more or less "bug-compatible" with Visual Studio and related tools when it comes to which assemblies ILSpy can accept. Note that, on purpose, we do not venture into "overcoming security measures intended to make decompiling impossible".
- We have a high-level type system (ICSharpCode.Decompiler.TypeSystem) wrapped around SRM. Anytime you need Resolve(), that's a sure sign you should be working with the type system and not SRM directly (only the type system provides a resolve implementation).

  Note: there are different options to configure the type system. The ILSpy UI is using it essentially in "single-assembly mode", so it can't resolve across assembly boundaries. You can create a full type system via new DecompilerTypeSystem(...).

  Use typeSystem.MainModule.GetDefinition() to get the ITypeDefinition for a TypeDefinitionHandle. Then you can use the GetAllBaseTypeDefinitions() extension method to walk the list of base types.
- We do not load method bodies into the type system, because it is metadata-only. Method bodies are stored in a different part of the binary. However, it is very easy to get to the MethodBodyBlock from an IMethod:
public MethodBodyBlock GetMethodBody(IMethod method)
{
    // Methods without a body or without a metadata token have no IL to return.
    if (!method.HasBody || method.MetadataToken.IsNil)
        return null;
    var module = method.ParentModule.PEFile;
    // Look up the method definition row to find the RVA of the method body.
    var md = module.Metadata.GetMethodDefinition((MethodDefinitionHandle)method.MetadataToken);
    return module.Reader.GetMethodBody(md.RelativeVirtualAddress);
}
Note that there is no object model for instructions. You will have to do the byte parsing yourself. However, there are some extension and helper methods in ICSharpCode.Decompiler.Disassembler.ILParser.
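To give an idea of what "doing the byte parsing yourself" involves, the following is a rough sketch of an opcode walker that uses only System.Reflection.Metadata and the opcode table from System.Reflection.Emit.OpCodes. It is not how ILParser is implemented; IlWalker, Disassemble and Add are hypothetical names for this example, and the sketch assumes the inspected assembly exists as a file on disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Reflection.Emit;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;
using MetadataTokens = System.Reflection.Metadata.Ecma335.MetadataTokens;

// Hypothetical helper for illustration only; not an ILSpy API.
public static class IlWalker
{
    // Opcode value -> OpCode, built from the public fields of OpCodes.
    static readonly Dictionary<ushort, OpCode> Table = BuildTable();

    static Dictionary<ushort, OpCode> BuildTable()
    {
        var table = new Dictionary<ushort, OpCode>();
        foreach (FieldInfo f in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
            if (f.GetValue(null) is OpCode op)
                table[unchecked((ushort)op.Value)] = op;
        return table;
    }

    // Operand size in bytes for every operand type except InlineSwitch.
    static int FixedOperandSize(OperandType type)
    {
        switch (type)
        {
            case OperandType.InlineNone:
                return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar:
                return 1;
            case OperandType.InlineVar:
                return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR:
                return 8;
            default:
                return 4; // tokens, int32, float32, long branch targets
        }
    }

    public static List<string> Disassemble(MethodInfo method)
    {
        var lines = new List<string>();
        using var pe = new PEReader(File.OpenRead(method.Module.Assembly.Location));
        MetadataReader md = pe.GetMetadataReader();
        var handle = (MethodDefinitionHandle)MetadataTokens.EntityHandle(method.MetadataToken);
        MethodBodyBlock body = pe.GetMethodBody(md.GetMethodDefinition(handle).RelativeVirtualAddress);

        BlobReader il = body.GetILReader();
        while (il.RemainingBytes > 0)
        {
            int offset = il.Offset;
            ushort key = il.ReadByte();
            if (key == 0xFE) // first byte of a two-byte opcode
                key = (ushort)(0xFE00 | il.ReadByte());
            if (!Table.TryGetValue(key, out OpCode op))
                throw new BadImageFormatException($"Unknown opcode 0x{key:X} at IL offset {offset}");

            // switch: a uint32 count followed by that many int32 targets.
            int operandSize = op.OperandType == OperandType.InlineSwitch
                ? checked(il.ReadInt32() * 4)
                : FixedOperandSize(op.OperandType);
            il.Offset += operandSize; // skip the operand without decoding it
            lines.Add($"IL_{offset:X4}: {op.Name}");
        }
        return lines;
    }

    public static int Add(int a, int b) => a + b;

    public static void Main()
    {
        foreach (string line in Disassemble(typeof(IlWalker).GetMethod(nameof(Add))))
            Console.WriteLine(line);
    }
}
```

The sketch only skips operands; a real disassembler would decode them (branch targets, metadata tokens, and so on), which is where the ILParser helpers come in.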
- After getting the SRM MethodBodyBlock (as described above), you can access the IL byte stream via MethodBodyBlock.GetILReader(). There are GetILBytes() and GetILContent() as well, but GetILReader() is the better choice, because it returns a BlobReader and avoids copying the bytes into a managed byte array. The aforementioned ILParser defines extension methods on BlobReader. Keep in mind that BlobReader is a value type, so changes are only reflected back to the caller if it is passed by ref. The extension methods defined in ILParser are safe to use, because they are declared as this ref BlobReader.

  Local variables can be retrieved by decoding the MethodBodyBlock.LocalSignature (for example, by using dts.MainModule.DecodeLocalSignature). Exception regions can be accessed through MethodBodyBlock.ExceptionRegions.

  Note that MethodBodyBlock.Size includes the header bytes and exception regions. If you are only interested in the byte length of the IL instructions, you can use MethodBodyBlock.GetILReader().Length.
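The value-type behaviour of BlobReader and the difference between the body size and the IL length can be seen in a small experiment. This is only a sketch using plain System.Reflection.Metadata; BlobReaderDemo and its members are hypothetical names, and the code assumes the running assembly exists as a file on disk.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;
using MetadataTokens = System.Reflection.Metadata.Ecma335.MetadataTokens;

// Hypothetical demo class for illustration only; not an ILSpy API.
public static class BlobReaderDemo
{
    // Receives a copy: advancing it does not move the caller's reader.
    static void SkipByteByValue(BlobReader reader) => reader.ReadByte();

    // Receives a reference: the caller's reader advances.
    static void SkipByteByRef(ref BlobReader reader) => reader.ReadByte();

    public static (int OffsetAfterByValue, int OffsetAfterByRef, int IlLength, int BodySize) Run()
    {
        // Inspect the body of this very method.
        MethodInfo self = typeof(BlobReaderDemo).GetMethod(nameof(Run));
        using var pe = new PEReader(File.OpenRead(self.Module.Assembly.Location));
        MetadataReader md = pe.GetMetadataReader();
        var handle = (MethodDefinitionHandle)MetadataTokens.EntityHandle(self.MetadataToken);
        MethodBodyBlock body = pe.GetMethodBody(md.GetMethodDefinition(handle).RelativeVirtualAddress);

        BlobReader il = body.GetILReader();
        SkipByteByValue(il);
        int offsetAfterByValue = il.Offset; // still 0: only the copy moved
        SkipByteByRef(ref il);
        int offsetAfterByRef = il.Offset;   // now 1

        // body.Size also counts the header (and any exception regions),
        // so it is larger than the length of the IL stream alone.
        return (offsetAfterByValue, offsetAfterByRef, il.Length, body.Size);
    }

    public static void Main()
    {
        var r = Run();
        Console.WriteLine($"by value: {r.OffsetAfterByValue}, by ref: {r.OffsetAfterByRef}");
        Console.WriteLine($"IL length: {r.IlLength}, body size: {r.BodySize}");
    }
}
```

This is why helper methods that consume bytes from a BlobReader should take it by ref, as the ILParser extension methods do.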
You will (hopefully) not be affected in a bad way by this change. All functionality provided by ILSpy should either stay the same or be improved. If you come across anything that worked in a previous version of ILSpy but no longer works in version 4.0, please open an issue. Thank you!
- CSharpDecompiler:
  - now uses SRM's EntityHandle in all Decompile* methods. See *** for examples.
  - ConvertType removed: Use the TypeSystem to retrieve the IEntity instance and use TypeSystemAstBuilder to create the AstNode/AstType.
- IDecompilerTypeSystem:
  - Integrated with the type system: DecompilerTypeSystem now implements ICompilation.
  - GetCecil replaced by IEntity.MetadataToken: Each entity now stores its metadata token.
  - New functionality:
    - dts.MainModule.GetDefinition() to retrieve type system entities for metadata tokens (definitions only).
    - dts.MainModule.Resolve* methods to retrieve type system entities for metadata tokens (both definitions and references).
    - dts.MainModule.Decode* methods to decode stand-alone signatures.
    - dts.MainModule.PEFile: Retrieves the PEFile instance that was used to construct the type system.
    - dts.MainModule.PEFile.Metadata: the SRM MetadataReader of the main assembly, which provides access to metadata tables. You can use IEntity.MetadataToken to retrieve metadata info. If you need to access an entity from a referenced assembly, you should use IEntity.ParentModule.PEFile.Metadata to access the corresponding MetadataReader.
- UniversalAssemblyResolver:
  - LoadMainModule removed: the IAssemblyResolver instance now needs to be constructed independently from the CSharpDecompiler. The IAssemblyResolver should be passed to the CSharpDecompiler constructor.
- NEW: The IDocumentationProvider interface can be used to provide XML documentation comments.
- NEW: The IDebugInfoProvider interface can be used to provide debug info to the decompiler.
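Since entities are now identified by metadata tokens, it can help to see how SRM handles and raw 32-bit tokens relate. The sketch below uses only System.Reflection.Metadata.Ecma335.MetadataTokens and assumes, as the cast in the GetMethodBody snippet suggests, that the token is exposed as an SRM EntityHandle. TokenDemo and its methods are hypothetical names for this example.

```csharp
using System;
using System.Reflection.Metadata;
using System.Reflection.Metadata.Ecma335;

// Hypothetical demo class for illustration only; not an ILSpy API.
public static class TokenDemo
{
    // Handle -> raw token (table index in the top byte, row number below).
    public static int ToToken(EntityHandle handle) => MetadataTokens.GetToken(handle);

    // Raw token -> handle, e.g. for a token taken from a log or from reflection.
    public static EntityHandle FromToken(int token) => MetadataTokens.EntityHandle(token);

    public static void Main()
    {
        // Row 1 of the MethodDef table (table index 0x06).
        MethodDefinitionHandle method = MetadataTokens.MethodDefinitionHandle(1);

        int token = ToToken(method);           // implicit conversion to EntityHandle
        Console.WriteLine($"0x{token:X8}");    // prints 0x06000001

        EntityHandle roundTripped = FromToken(token);
        Console.WriteLine(roundTripped.Kind);  // prints MethodDefinition

        // Narrowing back to a specific handle type needs an explicit cast.
        Console.WriteLine((MethodDefinitionHandle)roundTripped == method); // prints True
    }
}
```

Checking Kind before casting avoids an InvalidCastException when a token might refer to a different table, for example a member reference instead of a method definition.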
- Language API and Tree View now use the Decompiler TypeSystem.
- NEW: Analyzer API
  - AnalyzerTreeView.Instance.Analyze(IEntity) is now publicly accessible.
  - IAnalyzer and ExportAnalyzerAttribute should enable plugins to add new analyzers.
-
[1] https://github.com/dotnet/corefx/tree/master/src/System.Reflection.Metadata