Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API for getting a deterministic key for a Compilation #57134

Merged
merged 55 commits into from
Jan 25, 2022
Merged

Conversation

jaredpar
Copy link
Member

@jaredpar jaredpar commented Oct 13, 2021

The Compilation type is fully deterministic meaning that given the same inputs
(SyntaxTree, CompilationOptions, etc ...) it will produce the same output. By
that it means it will produce byte for byte equivalent binaries and the same
diagnostics. This is extremely valuable in infrastructure because it allows
for caching and opens the door for efficient distributed processing.

At the moment this is hard to leverage because consumers can't determine if
two Compilation instances are equivalent for the purposes of determinism.
Customers have to resort to hand written comparisons which requires a fairly
intimate knowledge of the compiler (for example knowing what does and does not
impact determinism). Such solutions are not version tolerant; every time the
compiler adds a new property that impacts determinism the solution must be
updated. Even when proper equality checks are in place this does not help
distributed computing where equality must be decided across different processes.

The motivation here is to provide an API that returns a string based key for a
given Compilation such that for two equivalent Compilation instances the
key will also be equivalent. This will allow customers who want to leverage
deterministic caching and communication to have a simple, and transmittable,
way to full describe the Compilation they are processing.

Remaining items:

  • File issue tracking API review
  • Write lots more tests once general structure of change is agreed on

Follow up Items:

@RikkiGibson
Copy link
Contributor

I still haven't reviewed this. I am still planning to.

@jaredpar
Copy link
Member Author

@RikkiGibson

I still haven't reviewed this. I am still planning to.

I hope your feedback is deterministic so it would be the same now as when I originally filed this PR

😄

Copy link
Contributor

@RikkiGibson RikkiGibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general approach seems fine. Had a lot of small questions. Very cool progress.

determinism.md Outdated Show resolved Hide resolved
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh. pending changes ae the worst.

determinism.md Outdated
+ public string GetDeterministicKey(
+ ImmutableArray<AdditionalText> additionalTexts = default,
+ ImmutableArray<DiagnosticAnalyzer> analyzers = default,
+ ImmutableArray<ISourceGenerator> generators = default,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this slightly worries me as this implies the analyzers/generators are instantiated. that may not be desirable. @jasonmalinowski for thoughts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the representation you were expecting to see here?

determinism.md Outdated Show resolved Hide resolved
determinism.md Outdated Show resolved Hide resolved
determinism.md Outdated

Note: this scenario does require more work as the `Compilation` objects created
in the IDE and Compiler differ in subtle ways. Having this function though would
be motivation to clean these up such that they are equal.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to the recent discussion about how IncrementalSourceGneration is still slow if we require a compilation to be ready up front, we should consider what fixed costs any APIs we are creating mandate, and if we can avoid them. We've already talked about moving this off Compilation (or having a alternative non-Compilation entrypoint) to avoid teh cost of createing hte compilation. We should also consider how we can avoid any costs involved making the subpieces we build the key off of. Some things seems unavoidable (like reading in a source text). HOwever others may be avoidable if they can be determined from some more primitive piece of data. For example, instead of SGs and Analyzer instances, it would be beneficial on the IDE side if just hte PERef alone was sufficient to make that determination.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we have addressed this with the changes you added to my PR (thank you!). This is an area I want to keep a close eye on though particularly during API review. If we see any more I am happy to continue pushing on the perf aspect here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. i'm happy if we have SyntaxTreeKey.

determinism.md Outdated Show resolved Hide resolved
determinism.md Outdated Show resolved Hide resolved
writer.WriteObjectStart();
WriteFileName(writer, "fileName", syntaxTree.FilePath, options);
writer.WriteKey("text");
WriteSourceText(writer, syntaxTree.GetText());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need cancellation token here.

writeSubsystemVersion(writer, options.SubsystemVersion);
writer.Write("fileAlignment", options.FileAlignment);
writer.Write("highEntropyVirtualAddressSpace", options.HighEntropyVirtualAddressSpace);
writer.Write("baseAddress", options.BaseAddress.ToString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this likely needs passing in invariant culture here. we should likely do an audit of every ToString to make sure it is valid.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to "G" which should round trip across cultures. I'm verifying that now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope I was wrong about this.

public void Write<T>(string key, T value) where T : struct, Enum
{
WriteKey(key);
Write(value.ToString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm on the fence about this. i feel like perhaps for enums we just emit the integral value. the primary concern i have is uncertainty if the runtime always gives the same string value (for things like flags enums) when you call tostring across launches.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The [Flags] case should be safe as the runtime guarantees the behavior there. From the doc comments

If the FlagsAttribute is applied and there is a combination of one or more named constants equal to the value of this instance, then the return value is a string containing a delimiter-separated list of the names of the constants.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't 100% clear to me that the constants will always be listed in the same order, but it would be surprising if the order were ever different for a given value.

For Each pair In basicOptions.PreprocessorSymbols
writer.WriteObjectStart()
writer.WriteKey(pair.Key)
writer.Write(pair.Value.ToString())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting. this is .ToString on an aribtrary object. Do we know what forms this will take and if this is safe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this enumerates a dictionary. that may have different results across rules of hte same program as i don't believe .net guarantees deterministic order. Consider sorting things like this to ensure we're ok.

internal TestableFileSystem FileSystem { get; }
internal BuildPaths BuildPaths { get; }

internal TestableCompiler(CommonCompiler compiler, TestableFileSystem fileSystem, BuildPaths buildPaths)
Copy link
Member

@jcouv jcouv Jan 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fileSystem

Why are we passing in a fileSystem if it has to be the same as compiler.FileSystem? Can't we just use compiler.FileSystem then, and remove the extra parameter? #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with the property is that it attaches after command line arguments are parsed where the ctor parameter has a chance to thread it through. It's long been a problem with this setup and one that I'm slowly working my waay trhough.

Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks (iteration 54)

@jaredpar jaredpar enabled auto-merge (squash) January 24, 2022 23:25
Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks (iteration 55)

@jaredpar jaredpar merged commit df86b32 into dotnet:main Jan 25, 2022
@ghost ghost added this to the Next milestone Jan 25, 2022
@RikkiGibson RikkiGibson modified the milestones: Next, 17.2.P1 Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API proposal: Compilation.GetDeterministicKey
7 participants