Include compiler switches in the PDB #41395

Closed
tmat opened this issue Feb 4, 2020 · 1 comment

tmat commented Feb 4, 2020

The goal is to include enough information in the PDB so that a consumer could fully reconstruct the original compilation based on the information stored in the PDB, the assembly metadata, symbol and source servers and any private keys used to build the assembly.

We assume the binaries were built with /deterministic and Source Link. All dependencies of the compilation must have been published to the symbol server.

There are several scenarios that could use this feature, and each might need slightly different information in the PDB. One of the main scenarios is being able to run Roslyn analyzers on the source of 3rd party OSS libraries. As long as the libraries were built under the above assumptions, it should be possible to recreate and analyze the original compilation without rebuilding the library from its source repository and injecting the analyzer into that build.

Information required to create a compilation comprises:

source files

Available via Source Link, or embedded in the PDB.

metadata references

The assembly metadata only contains assembly identities, which is not sufficient.

Proposal: Include an id of the assembly that can be used to look the assembly up on the symbol server.

Symbol servers currently index PE files using a key that combines the timestamp and file size found in the PE COFF header.

When the assembly is built with /deterministic, the timestamp is a 4-byte slice of the assembly content hash.

Depending on how this feature is used, this kind of id might be considered too weak. If so, we should consider updating symbol uploaders to index PE files with an MVID in addition to the existing key.
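To make the existing lookup key concrete, here is a minimal Python sketch that reads the two fields from an in-memory PE image and joins them. It assumes that "file size" refers to the SizeOfImage field of the optional header, and that the key is the timestamp as eight uppercase hex digits followed by the size in unpadded lowercase hex. The function name `pe_symbol_key` is made up for illustration, and the exact formatting should be checked against the symbol server's key conventions.

```python
import struct


def pe_symbol_key(image: bytes) -> str:
    """Return '<TimeDateStamp><SizeOfImage>' for a PE image held in memory.

    Hypothetical helper; the formatting is an assumption, not a spec.
    """
    # The DOS header stores the file offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image")

    # The COFF header follows the 4-byte signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", image, pe_offset + 8)

    # The optional header follows the 20-byte COFF header. SizeOfImage sits
    # at offset 56 within it for both PE32 and PE32+ images.
    (size_of_image,) = struct.unpack_from("<I", image, pe_offset + 24 + 56)

    return f"{timestamp:08X}{size_of_image:x}"
```

For a deterministic build the timestamp part is derived from the content hash rather than the clock, so only 32 bits of hash plus the image size distinguish one build from another. That is the weakness the MVID-based index would address.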

analyzer references

Although analyzers do not affect the compiler output files, it might be useful for some scenarios to store the list of analyzer references in the same way that metadata references would be stored.

For example, as a proof that certain rules were validated during the build and do not need to be validated again.

If we include analyzer references, we would need to include additional files as well. Including the content of these files might bloat the PDB, so we would probably want to use Source Link for those files that are checked into the repo. This would, however, require an update to Source Link to account for these files when building a list of source-control-untracked files.

All assemblies that analyzers depend on would also need to be captured.

The added value doesn't seem worth the effort.

source generator references

Source generator references would be included in the same format as metadata references.

Since all source files that source generators may generate are embedded in the PDB while the files that are the inputs to the generators are not (at least not currently), it doesn't seem useful to store source generator references in the PDB.

resources

Resources are stored in the PE file. It should be possible to extract them from there and include them in the new compilation.

primitive value options

Options that have primitive values can be easily serialized to a string and included in the PDB.

Note that PathMap must be excluded in order for the PDB content to stay deterministic. The information in the map is not needed when reconstructing the compilation since the paths extracted from the DLL/PDB have already been mapped.
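As an illustration only, the following Python sketch shows one way primitive options could be flattened into a deterministic string while skipping PathMap. The option names, the `key=value;...` layout, and the function `serialize_options` are all assumptions made for this example. The proposal does not define a storage format.

```python
def serialize_options(options: dict) -> str:
    """Flatten primitive-valued options into a stable 'key=value;...' string.

    Hypothetical format, used here only to illustrate the idea.
    """
    # PathMap is left out so that the PDB content does not depend on the
    # build machine's directory layout.
    excluded = {"pathmap"}

    parts = []
    # Sorting the names makes the output independent of insertion order.
    for name in sorted(options):
        if name.lower() in excluded:
            continue
        value = options[name]
        if isinstance(value, bool):
            text = "true" if value else "false"
        else:
            text = str(value)
        parts.append(f"{name}={text}")
    return ";".join(parts)
```

Sorting the keys and dropping PathMap means two builds of the same sources from different checkout directories would produce the same string.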

default source encoding

The encoding used to interpret content of source files that do not declare their encoding via BOM.
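The Python sketch below shows why this value has to be recorded: a file without a BOM decodes differently depending on the default encoding. It is a simplified illustration, not the compiler's actual detection logic, and `decode_source` is a hypothetical helper.

```python
import codecs

# UTF-32 BOMs are checked before UTF-16 because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_source(data: bytes, default_encoding: str) -> str:
    """Decode source bytes, using the BOM if present, else the default."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # The BOM declares the encoding; strip it before decoding.
            return data[len(bom):].decode(encoding)
    # No BOM: the result depends entirely on the configured default.
    return data.decode(default_encoding)
```

The same BOM-less bytes can yield different source text under different defaults, which would change string literals and therefore the compiled output.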

signing keys

The public key is included in the AssemblyDef. The private key is not, for obvious reasons, and has to be supplied externally.

the compiler version

The version of the compiler used to build the assembly. We might also need the version of the CLR and the zlib library in order to reproduce the build outputs bit-for-bit. See https://github.com/dotnet/roslyn/blob/master/docs/compilers/Deterministic%20Inputs.md.


tmat commented Feb 4, 2020

@tmat tmat added this to the 16.7 milestone Mar 5, 2020
@vatsalyaagrawal vatsalyaagrawal modified the milestones: 16.7, Backlog Apr 29, 2020
ryzngard added a commit that referenced this issue Jun 8, 2020
…#44373)

Add new information for portable pdbs to help reconstruct the same compilation if source is available. Added spec file with detailed description of design.

Implements #41395
@ryzngard ryzngard closed this as completed Jun 8, 2020
@vatsalyaagrawal vatsalyaagrawal modified the milestones: Backlog, 16.7 Jul 23, 2020