Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Determinism for NativeAOT #73931

Closed
Tracked by #1632
agocke opened this issue Oct 6, 2021 · 4 comments · Fixed by #93600
Closed
Tracked by #1632

Determinism for NativeAOT #73931

agocke opened this issue Oct 6, 2021 · 4 comments · Fixed by #93600

Comments

@agocke
Copy link
Member

agocke commented Oct 6, 2021

Deterministic binaries should be supported for NativeAOT and we should have tests.

In short, the feature is that identical inputs to the exact same compiler toolchain should produce bit-for-bit identical results.

MichalStrehovsky referenced this issue in MichalStrehovsky/runtime Mar 24, 2022
I was trying to repro dotnet#66191 and assumed determinism is the root cause. I used `CoreRT_DeterminismSeed` to perturb the dependency node evaluation and it found differences.

Now, I don't think the differences really matter. We already seem to be deterministic in the multithreaded compilation (the outputs are always the same no matter how many times I try).

`CoreRT_DeterminismSeed` perturbs the order in which node dependencies are evaluated and that exposes more ordering issues than we practically have.

In normal multithreaded compilation we grab nodes in different order from the `NodeFactory` as part of method compilation, but the order in which those nodes are evaluated is still fixed. This means the nodes we grabbed are still getting marked in a deterministic order and we get stable ordering even without an explicit sort in this spot.

But maybe as part of https://github.com/dotnet/runtimelab/issues/1631 we should enable `CoreRT_Determinism` in our testing and for that we need this extra sorting.
jkotas referenced this issue Mar 25, 2022
I was trying to repro #66191 and assumed determinism is the root cause. I used `CoreRT_DeterminismSeed` to perturb the dependency node evaluation and it found differences.

Now, I don't think the differences really matter. We already seem to be deterministic in the multithreaded compilation (the outputs are always the same no matter how many times I try).

`CoreRT_DeterminismSeed` perturbs the order in which node dependencies are evaluated and that exposes more ordering issues than we practically have.

In normal multithreaded compilation we grab nodes in different order from the `NodeFactory` as part of method compilation, but the order in which those nodes are evaluated is still fixed. This means the nodes we grabbed are still getting marked in a deterministic order and we get stable ordering even without an explicit sort in this spot.

But maybe as part of https://github.com/dotnet/runtimelab/issues/1631 we should enable `CoreRT_Determinism` in our testing and for that we need this extra sorting.
radekdoulik referenced this issue in radekdoulik/runtime Mar 30, 2022
I was trying to repro dotnet#66191 and assumed determinism is the root cause. I used `CoreRT_DeterminismSeed` to perturb the dependency node evaluation and it found differences.

Now, I don't think the differences really matter. We already seem to be deterministic in the multithreaded compilation (the outputs are always the same no matter how many times I try).

`CoreRT_DeterminismSeed` perturbs the order in which node dependencies are evaluated and that exposes more ordering issues than we practically have.

In normal multithreaded compilation we grab nodes in different order from the `NodeFactory` as part of method compilation, but the order in which those nodes are evaluated is still fixed. This means the nodes we grabbed are still getting marked in a deterministic order and we get stable ordering even without an explicit sort in this spot.

But maybe as part of https://github.com/dotnet/runtimelab/issues/1631 we should enable `CoreRT_Determinism` in our testing and for that we need this extra sorting.
@MichalStrehovsky MichalStrehovsky transferred this issue from dotnet/runtimelab Aug 15, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Aug 15, 2022
@MichalStrehovsky MichalStrehovsky added this to the 8.0.0 milestone Aug 15, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Aug 15, 2022
@MichalStrehovsky
Copy link
Member

We produce deterministic object files, but the linker then messes it all up with timestamps and GUIDs, so there's well understood diffs in the final output. The VC++ team has been dragging their feet adding support for this: https://developercommunity.visualstudio.com/t/support-for-cls-experimentaldeterministic-becoming/426033

Experimentally adding /EXPERIMENTAL:DETERMINISTIC to the linker command line produces even more diffs so there's probably a secret handshake with cl that we'll need to reproduce.

I'm reluctant adding undocumented link.exe switches to our targets. We need to revisit in 8.0 and we likely need a link.exe feature. I'm starting an email thread.

@MichalStrehovsky
Copy link
Member

The diffs are possible to patch out. A tool like this could be used: https://github.com/jasonwhite/ducible (Haven't actually tried)

@MichalStrehovsky
Copy link
Member

It was pointed out to me by someone over email that when link.exe is passed a res file, it will run cvtres to generate an obj file with a random name and the random name messes up some of the determinism. So we might need to fix #79634 to help the determinism. We're still limited by what link.exe can do though.

MichalStrehovsky added a commit that referenced this issue Jul 11, 2023
Fixes #79634.

To generate Win32 resources, we'd first use an MSBuild task to dump Win32 resources from the input IL module into a `.res` file. We'd then pass the `.res` file to link.exe. Link.exe would then invoke cvtres.exe to convert the `.res` file to `.obj`. Link.exe then links the `.obj` into the final executable.

Skip all of this and instead generate the correct object data in the compiler directly. I'm reusing Win32 resource emission logic from crossgen2. It already mostly does the right thing - we just needed to split the `.rsrc` section into `.rsrc$01` and `.rsrc$02` that link.exe expects. The first part has the data directory. The second part has the actual resource data.

Contributes to #73931. The cvtres step was actually non-deterministic because it creates an object file in TEMP and embeds this temporary name into the object.
@agocke
Copy link
Member Author

agocke commented Jul 18, 2023

Moving to future as we're fundamentally constrained on Windows

@agocke agocke modified the milestones: 8.0.0, Future Jul 18, 2023
MichalStrehovsky added a commit to MichalStrehovsky/runtime that referenced this issue Oct 17, 2023
Resolves dotnet#73931. This checks that the result of parallel build is the same as the result of single-threaded build.

I'd like to add some more code to this so that there's more junk to compile but don't have a good idea of what to add. `XmlSerializer`? Suggestions welcome.
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Oct 17, 2023
MichalStrehovsky added a commit that referenced this issue Nov 10, 2023
Resolves #73931. This checks that the result of parallel build is the same as the result of single-threaded build.
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Nov 10, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Dec 10, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants