More cleanup
jaredpar committed Oct 13, 2021
1 parent d01314c commit cbf0e20
Showing 2 changed files with 22 additions and 17 deletions.
37 changes: 21 additions & 16 deletions determinism.md
@@ -1,6 +1,8 @@
GetDeterministicKey
===

**This is meant to be the text of the issue I will eventually file for the API**

## Background and Motivation
The `Compilation` type is fully deterministic meaning that given the same inputs
(`SyntaxTree`, `CompilationOptions`, etc ...) it will produce the same output. By
@@ -14,7 +16,8 @@ Customers have to resort to hand written comparisons which requires a fairly
intimate knowledge of the compiler (for example knowing what does and does not
impact determinism). Such solutions are not version tolerant; every time the
compiler adds a new property that impacts determinism the solution must be
updated.
updated. Even when proper equality checks are in place this does not help
distributed computing where equality must be decided across different processes.

The motivation here is to provide an API that returns a string based key for a
given `Compilation` such that for two equivalent `Compilation` instances the
@@ -29,7 +32,12 @@ namespace Microsoft.CodeAnalysis
{
public class Compilation
{
+ public string GetDeterministicKey(DeterministicKeyOptions options = default)
+ public string GetDeterministicKey(
+ ImmutableArray<AdditionalText> additionalTexts = default,
+ ImmutableArray<DiagnosticAnalyzer> analyzers = default,
+ ImmutableArray<ISourceGenerator> generators = default,
+ EmitOptions? emitOptions = null,
+ DeterministicKeyOptions options = DeterministicKeyOptions.Default)
}

internal enum DeterministicKeyOptions
@@ -54,19 +62,21 @@ namespace Microsoft.CodeAnalysis
}
```

The return of `GetDeterministicKey` is an opaque string that represents a merkle
tree of the `Compilation` contents. Two `Compilation` which produce different
output, diagnostics or binaries, will have different strings returned for this
function.
The return of `GetDeterministicKey` is an opaque string that fully represents
the contents of the `Compilation`. Two `Compilation` instances which produce
different output, diagnostics or binaries, will have different strings returned
for this function.

The return of `GetDeterministicKey` can, and by default will, change between versions
of the compiler. That is true of both the content of the string as well as the
underlying format. Consumers should not take any dependency on the content of this string
other than it being an effective hash of the `Compilation` it came from.
underlying format. The content must change because part of the input to compilation
is the version of the compiler. The format will change as desired by the
implementation. Consumers should not take any dependency on the format of this
string other than it being an effective hash of the `Compilation` it came from.

The merkle tree returned here is not a minimal tree, or specified to any depth
(it's opaque). The content can be compressed further by running through a hashing
function such as SHA-256 to get a minimal hash.
The string returned here will be human readable and visually diffable. It will
not be a minimal representation though. The content can, and is expected to be,
compressed further with a hashing function such as SHA-256.
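
As a rough illustration of that compression step, the sketch below reduces a key string to a fixed-size SHA-256 digest. `DeterministicKeyHasher` and `Compress` are hypothetical names, not part of the proposed API, and the UTF-8 encoding and upper-case hex output are assumptions made for the example. It targets `net5.0` or later.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the proposed API surface.
public static class DeterministicKeyHasher
{
    // Reduces the (potentially large) key string to a 64 character hex digest.
    // Assumption: the key is encoded as UTF-8 before hashing; any encoding
    // works as long as every participant uses the same one.
    public static string Compress(string deterministicKey)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(deterministicKey);
        byte[] hash = SHA256.HashData(bytes);
        return Convert.ToHexString(hash);
    }
}
```

Two processes that each compute this digest from their own key string can compare the short digests with ordinal string equality instead of exchanging the full human readable key.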

For example here is the proposed return for the following `net5.0` program:

Expand Down Expand Up @@ -115,10 +125,6 @@ System.Console.WriteLine("Hello World");

The full output can be seen [here](https://gist.github.com/jaredpar/654d84f64de2d728685a7d4ccde944e7)

Note: I'm unsure if "merkle tree" is the best term here. It's not a precise merkle
tree because it does have non-hash leafs. But this term is used for other formats
like a git tree that also don't have pure hash values in the leafs.

## Usage Examples

### Output caching
@@ -161,6 +167,5 @@ How does this compare to analogous APIs in other ecosystems and libraries?
Determinism is hard

## Work Remaining
- Need to consider `EmitOptions` in the output of `GetDeterministicKey`


2 changes: 1 addition & 1 deletion src/Compilers/Core/Portable/PublicAPI.Unshipped.txt
@@ -2,7 +2,7 @@
abstract Microsoft.CodeAnalysis.SyntaxTree.GetLineMappings(System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) -> System.Collections.Generic.IEnumerable<Microsoft.CodeAnalysis.LineMapping>!
const Microsoft.CodeAnalysis.WellKnownMemberNames.PrintMembersMethodName = "PrintMembers" -> string!
Microsoft.CodeAnalysis.Compilation.EmitDifference(Microsoft.CodeAnalysis.Emit.EmitBaseline! baseline, System.Collections.Generic.IEnumerable<Microsoft.CodeAnalysis.Emit.SemanticEdit>! edits, System.Func<Microsoft.CodeAnalysis.ISymbol!, bool>! isAddedSymbol, System.IO.Stream! metadataStream, System.IO.Stream! ilStream, System.IO.Stream! pdbStream, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) -> Microsoft.CodeAnalysis.Emit.EmitDifferenceResult!
Microsoft.CodeAnalysis.Compilation.GetDeterministicKey(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.AdditionalText!> additionalTexts = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.AdditionalText!>), System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Diagnostics.DiagnosticAnalyzer!> analyzers = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Diagnostics.DiagnosticAnalyzer!>), System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.ISourceGenerator!> generators = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.ISourceGenerator!>), Microsoft.CodeAnalysis.Emit.EmitOptions? emitOptions = null, Microsoft.CodeAnalysis.DeterministicKeyOptions deterministicKeyOptions = Microsoft.CodeAnalysis.DeterministicKeyOptions.Default) -> string!
Microsoft.CodeAnalysis.Compilation.GetDeterministicKey(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.AdditionalText!> additionalTexts = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.AdditionalText!>), System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Diagnostics.DiagnosticAnalyzer!> analyzers = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.Diagnostics.DiagnosticAnalyzer!>), System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.ISourceGenerator!> generators = default(System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.ISourceGenerator!>), Microsoft.CodeAnalysis.Emit.EmitOptions? emitOptions = null, Microsoft.CodeAnalysis.DeterministicKeyOptions options = Microsoft.CodeAnalysis.DeterministicKeyOptions.Default) -> string!
Microsoft.CodeAnalysis.DeterministicKeyOptions
Microsoft.CodeAnalysis.DeterministicKeyOptions.Default = 0 -> Microsoft.CodeAnalysis.DeterministicKeyOptions
Microsoft.CodeAnalysis.DeterministicKeyOptions.IgnorePaths = 1 -> Microsoft.CodeAnalysis.DeterministicKeyOptions