The goal is to include enough information in the PDB so that a consumer could fully reconstruct the original compilation based on the information stored in the PDB, the assembly metadata, symbol and source servers and any private keys used to build the assembly.
We assume the binaries were built with /deterministic and Source Link. All dependencies of the compilation must have been published to the symbol server.
There are several scenarios that could use this feature, and each might need slightly different information in the PDB. One of the main scenarios is being able to run Roslyn analyzers on the source of 3rd party OSS libraries. As long as the libraries were built under the above assumptions, it should be possible to recreate and analyze the original compilation without rebuilding the library from its source repository and injecting the analyzer into that build.
The information required to create a compilation comprises:
source files
Available via Source Link, or embedded in the PDB.
metadata references
The assembly metadata contains only assembly identities, which are not sufficient to locate the exact binaries that were referenced.
Proposal: Include an id of the assembly that can be used to look the assembly up on the symbol server.
Symbol servers currently index PE files using a key that combines the timestamp and the image size (SizeOfImage) found in the PE headers.
When the assembly is built with /deterministic, the timestamp is a 4-byte slice of the assembly content hash.
Depending on how this feature is used, this kind of id might be considered too weak. If so, we should consider updating symbol uploaders to index PE files with an MVID in addition to the existing key.
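To make the existing lookup key concrete, here is a minimal sketch that reads the two fields from a PE file and concatenates them. The function name `pe_index_key` is hypothetical, and the exact formatting (eight hex digits for the timestamp, unpadded hex for the size, lower case) is an assumption about the symbol server convention rather than something stated in this proposal.

```python
import struct


def pe_index_key(pe_bytes: bytes) -> str:
    """Build a timestamp+size lookup key from raw PE file bytes.

    Assumed key layout: 8 hex digits of TimeDateStamp followed by
    SizeOfImage in unpadded hex, both lower case.
    """
    # The DOS header stores the offset of the "PE\0\0" signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file")

    # COFF header layout: Machine (2), NumberOfSections (2), TimeDateStamp (4), ...
    coff = pe_offset + 4
    (timestamp,) = struct.unpack_from("<I", pe_bytes, coff + 4)

    # The optional header follows the 20-byte COFF header.
    # SizeOfImage sits at offset 56 in both the PE32 and PE32+ layouts.
    optional = coff + 20
    (size_of_image,) = struct.unpack_from("<I", pe_bytes, optional + 56)

    return f"{timestamp:08x}{size_of_image:x}"
```

Because a deterministic build fills the timestamp with only 32 bits of the content hash, two different builds of the same assembly can in principle collide on this key, which is the weakness that indexing by MVID would address.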
analyzer references
Although analyzers do not affect the compiler output files, it might be useful for some scenarios to store the list of analyzer references in the same way that metadata references would be stored.
For example, as a proof that certain rules were validated during the build and do not need to be validated again.
If we include analyzer references, we would need to include additional files as well. Including the content of these files might bloat the PDB, so we would probably want to use Source Link for those files that are checked into the repo. This would, however, require an update to Source Link to account for these files when building a list of source-control-untracked files.
All assemblies that analyzers depend on would also need to be captured.
The added value doesn't seem worth the effort.
source generator references
Source generator references would be included in the same format as metadata references.
Since all source files that source generators may generate are embedded in the PDB, while the files that are the inputs to the generators are not (at least not currently), it doesn't seem useful to store source generator references in the PDB.
resources
Resources are stored in the PE file. It should be possible to extract them from there and include them in the new compilation.
primitive value options
Options that have primitive values can be easily serialized to a string and included in the PDB.
Note that PathMap must be excluded in order for the PDB content to stay deterministic. The information in the map is not needed when reconstructing the compilation since the paths extracted from the DLL/PDB have already been mapped.
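As a rough illustration of this item, the sketch below turns a set of primitive-valued options into a stable string and drops PathMap. The function name `serialize_options`, the `name=value` line format, and the option names other than PathMap are hypothetical; the proposal does not specify a storage format here.

```python
def serialize_options(options: dict) -> str:
    """Serialize primitive-valued options to a deterministic string.

    Assumed format: one "name=value" pair per line, sorted by name so the
    output does not depend on the order in which options were supplied.
    """
    # PathMap holds machine-specific paths, so it is left out to keep the
    # PDB content deterministic across build machines.
    excluded = {"PathMap"}

    parts = []
    for name in sorted(options):
        if name in excluded:
            continue
        value = options[name]
        # Booleans are written in lower case; other primitives use str().
        text = ("true" if value else "false") if isinstance(value, bool) else str(value)
        parts.append(f"{name}={text}")
    return "\n".join(parts)
```

Sorting by name is one simple way to keep the serialized form identical for identical option sets, which matters if the string ends up inside a deterministic PDB.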
default source encoding
The encoding used to interpret content of source files that do not declare their encoding via BOM.
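The role of the default encoding can be sketched as follows: a consumer first checks for a BOM and only falls back to the recorded default when none is present. The function name `decode_source` is hypothetical, and this sketch handles only UTF-8 and UTF-16 BOMs; it is not a description of how the compiler itself detects encodings.

```python
import codecs

# BOMs recognized by this sketch, paired with the codec used for the payload.
# UTF-32 is deliberately not handled here.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_source(data: bytes, default_encoding: str) -> str:
    """Decode source bytes, preferring a BOM over the recorded default."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            # The file declares its own encoding; strip the BOM and decode.
            return data[len(bom):].decode(name)
    # No BOM: the text can only be interpreted with the default encoding
    # that was in effect during the original build.
    return data.decode(default_encoding)
```

Without the recorded default, a file such as `b"\xe9"` is ambiguous: it decodes to a valid character under windows-1252 but is invalid UTF-8, so the reconstructed compilation could see different source text than the original build did.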
signing keys
The public key is included in the AssemblyDef. The private key is not, for obvious reasons, and has to be supplied externally.
…#44373)
Add new information for portable pdbs to help reconstruct the same compilation if source is available. Added spec file with detailed description of design.
Implements #41395
the compiler version
The version of the compiler used to build the assembly. We might also need the version of the CLR and the zlib library in order to reproduce the build outputs bit-for-bit. See https://github.com/dotnet/roslyn/blob/master/docs/compilers/Deterministic%20Inputs.md.