From ed9e1cbd4486ec0797d7a6c08fa65ce5af261279 Mon Sep 17 00:00:00 2001
From: Arlo Siemsen
Date: Tue, 24 Mar 2020 15:34:03 -0700
Subject: [PATCH] Add section describing source file checksums in debug info

---
 src/debugging-support-in-rustc.md | 43 ++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/src/debugging-support-in-rustc.md b/src/debugging-support-in-rustc.md
index a11edba78..912ceffae 100644
--- a/src/debugging-support-in-rustc.md
+++ b/src/debugging-support-in-rustc.md
@@ -268,6 +268,45 @@ Focus is to let macros decide what to do. This can be achieved by having some ki
 that lets the macro tell the compiler where the line marker should be. This affects where you set
 the breakpoints and what happens when you step it.
 
+## Source file checksums in debug info
+
+Both DWARF and CodeView (PDB) support embedding a cryptographic hash of each source file that
+contributed to the associated binary.
+
+The cryptographic hash can be used by a debugger to verify that the source file matches the
+executable. If the source file does not match, the debugger can provide a warning to the user.
+
+The hash can also be used to prove that a given source file has not been modified since it was
+used to compile an executable. Because MD5 and SHA1 both have demonstrated vulnerabilities,
+using SHA256 is recommended for this application.
+
+The Rust compiler stores the hash for each source file in the corresponding `SourceFile` in
+the `SourceMap`. The hashes of input files to external crates are stored in `rlib` metadata.
+
+A default hashing algorithm is set in the target specification. This allows the target to
+specify the best hash available, since not all targets support all hash algorithms.
+
+The hashing algorithm for a target can also be overridden with the `-Z source-file-checksum=`
+command-line option.
+
+#### DWARF 5
+DWARF version 5 supports embedding an MD5 hash to validate the source file version in use.
+DWARF 5 - Section 6.2.4.1 opcode DW_LNCT_MD5
+
+#### LLVM
+LLVM IR supports MD5 and SHA1 (and SHA256 in LLVM 11+) source file checksums in the DIFile node.
+
+[LLVM DIFile documentation](https://llvm.org/docs/LangRef.html#difile)
+
+#### Microsoft Visual C++ Compiler /ZH option
+The MSVC compiler supports embedding MD5, SHA1, or SHA256 hashes in the PDB using the `/ZH`
+compiler option.
+
+[MSVC /ZH documentation](https://docs.microsoft.com/en-us/cpp/build/reference/zh)
+
+#### Clang
+Clang always embeds an MD5 checksum, though this does not appear in documentation.
+
 ## Future work
 
 #### Name mangling changes
@@ -295,10 +334,6 @@ They implement just the expression language but they also add some extensions li
 convenience variables. Therefore, if you are taking this route then you not only need to do this
 bridge but may have to add some mode to let the compiler understand some extensions.
 
-#### Windows debugging (PDB) is missing
-
-This is a complete unknown.
-
 [Tom Tromey discusses debugging support in rustc]: https://www.youtube.com/watch?v=elBxMRSNYr4
 [Debugging the Compiler]: compiler-debugging.md
 [debugger or debugging tool]: https://en.wikipedia.org/wiki/Debugger