Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document the plan for fixing torn state compiler #1

Closed
wants to merge 10 commits into from

Conversation

jaredpar
Copy link
Owner

This is a very rough draft

documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
Instead of `<PackageDownload>` the toolset package could be installed via `<PackageReference>`. This is how the existing Microsoft.Net.Compilers.Toolset package works today. This has a number of disadvantages:

1. PackageReferences are not side effect free and have been shown to cause issues in complex builds. For example this package did not work by default in the Visual Studio or Bing builds. Deploying this at scale to all our customers will almost ceratinly cause issues.
2. This approach does not solve all scenarios. The mechanics of restore mean that the compiler is not swapped until part way through the build. Build operations that happen earlier, such as legacy WPF, will still use the Visual Studio compiler and hence experience analyzer compatibility issues.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build operations that happen earlier, such as legacy WPF, will still use the Visual Studio compiler and hence experience analyzer compatibility issues.

I'd love to learn more about this - WPF builds things before Restore?!

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KirillOsenkov knows the details of why this occurs. I don't fully understand but I trust that he is correct on this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's not that it builds before restore, it's that (older copies of) the WPF task/targets don't include restore outputs and thus aren't affected by restore. dotnet/wpf#9112 has more details if you were feeling too cheerful today.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a mess. There can be three modes:

  1. non-SDK (legacy) projects targeting desktop framework and using legacy WPF,
  2. SDK-style projects targeting desktop framework and legacy WPF, and
  3. SDK-style projects targeting modern .NET and modern open-source WPF (UseWPF=true).

Only the third (modern) mode has the fixes to see PackageReferences and restore outputs. The first two import the targets from C:\Windows :(

documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
jaredpar and others added 2 commits June 20, 2024 20:24
Co-authored-by: Fred Silberberg <fred@silberberg.xyz>
@jaredpar
Copy link
Owner Author

Incorporated feedback and pushed another draft.


This will be moved to an issue when this PR is opened against dotnet/sdk proper:

- [ ]: Flow the Micosoft.Net.Compilers.Toolset.Framework package to the .NET SDK

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step is already done, right?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I kept it here on a hedge that we may have to change the setup of it / ship a new package. If the doc ends up in this current state then will delete this when I file the issue.

documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
Instead of `<PackageDownload>` the toolset package could be installed via `<PackageReference>`. This is how the existing Microsoft.Net.Compilers.Toolset package works today. This has a number of disadvantages:

1. PackageReferences are not side effect free and have been shown to cause issues in complex builds. For example this package did not work by default in the Visual Studio or Bing builds. Deploying this at scale to all our customers will almost ceratinly cause issues.
2. This approach does not solve all scenarios. The mechanics of restore mean that the compiler is not swapped until part way through the build. Build operations that happen earlier, such as legacy WPF, will still use the Visual Studio compiler and hence experience analyzer compatibility issues.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's not that it builds before restore, it's that (older copies of) the WPF task/targets don't include restore outputs and thus aren't affected by restore. dotnet/wpf#9112 has more details if you were feeling too cheerful today.

documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Show resolved Hide resolved
Instead of downloading a .NET Framework Roslyn in a torn state, the SDK could just use the .NET Core Roslyn it comes with. That would have virtually no real compatibility issues. This has a number of disadvantages:

1. Changes the process name from VBCSCompiler to `dotnet`. That seems insignificant but the process name is important as it's referenced in a lot of build scripts. This would be a subtle breaking change that we'd have to work with customers on.
2. This would make Visual Studio dependent on the servicing lifetime of the .NET Core runtime in the SDK. That has a very different lifetime and could lead to surprising outcomes down the road. For example if policy removed out of support runtimes from machines it would break the Visual Studio build.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is the harder part. How much worse is it than SDK lifecycle though? Is it even possible for the policy to remove stale runtimes that were installed as part of the SDK while leaving the SDK?

(Don't mean to push on this, though, I agree it's probably not the right direction for now)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it even possible for the policy to remove stale runtimes that were installed as part of the SDK while leaving the SDK?

It is my understanding that this is exactly what MS IT does today. Brought that up in several meetings and have not been challenged on it. Would love to be wrong but do not think I am 😦

- [ ]: The contents will include a README.md stating the package is **not** supported for direct user consumption.
- [ ]: The package will follow the versioning scheme of the .NET SDK.
- [ ]: The package will be unlisted (ideal but not a hard requirement)
- [ ]: Change the Sdk.targets file to have copies of the following three `<UsingTasks>` from [Microsoft.Common.tasks][microsoft-common-tasks]. Having a copy in Sdk.targets means that resetting `$(RoslynTargetsPath)` during build will change the chosen compiler.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

documentation/general/torn-sdk.md Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
generators\Microsoft.CodeAnalysis.Razor.Compiler.SourceGenerators.dll' references version '4.9.0.0' of the compiler, which is newer
than the currently running version '4.8.0.0'.

To address this issue we are going to separate Visual Studio and the .NET SDK in build scenarios. That will change our matrix to the following:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should include data to back up that this is a common problem. Here's the data I have from back in December:
image

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this telemetry include CI runs?


The .NET SDK will start producing a package named Microsoft.NETSdk.Compiler.Roslyn. This will contain a .NET Framework version of the Roslyn compiler that matches .NET Core version. This will be published as a part of the .NET SDK release process meaning it's available for all publicly available .NET SDKs.

The .NET SDK has the capability to [detect a torn state][pr-detect-torn-state]. When this is detected the matching Microsoft.NETSDK.Compiler.Roslyn package for this version of the .NET SDK will be downloaded via `<PackageDownload>`. Then the `$(RoslynTargetsPath)` will be reset to point into the package contents. This will cause MSBuild to load the Roslyn compiler from that location vs. the Visual Studio location.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to worry about a malicious user overriding RoslynTargetsPath to their own evil version of the compiler?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Any code which has the ability to override that property today could override the compiler in many other ways. For example you could change CscToolPath to change the compiler being used.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way of thinking about it is that if malicious code has the ability to control MSBuild properties, there are many existing ways that they could hijack the compiler (or other tooling). The change here makes it more centralized but that just makes having a fully functional compiler easier, it doesn't change the attack surface


This approach enables us to take actions like NGEN or R2R analyzers inside of Visual Studio. This is a long standing request from the Visual Studio perf team but very hard to satisfy in our current

This approach is already taken by [the Razor generator][code-razor-vs-load]. This work was done for other reliability reasons but turned out working well for this scenario. There is initial upfront work to get the Visual Studio copy loading in design time builds but it has virtually zero maintanence cost.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say virtually zero maintenance cost. Is that because Razor is already inserting into VS so including their generator was low marginal cost. What about teams like aspnet and windows desktop that don't have separate insertions today? Wouldn't this be adding additional insertion management to them separate from the SDK insertion?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that because Razor is already inserting into VS so including their generator was low marginal cost

Basically yes. Once we had done the work insert the generator it just happens automatically as part of the Razor insertion.

What about teams like aspnet and windows desktop that don't have separate insertions today?

My expectation is that we'll settle on a design where .NET SDK insertions end up inserting all of the inbox analyzers. The copy used in the .NET SDK is the same that will be used in VS. Hence the most natural way to handle this is to insert these at the time the .NET SDK inserts into VS. That has the added bonus that we get VS perf runs on the experience as a whole.


This also means these analyzers can vastly simplify their development model by always targeting the latest version of Roslyn. There is no more need to multi-target because the version of the compiler the analyzer will be used with is known at ship time for all core build scenarios.

This does mean that our design time experience can differ from our command line experience when customers are in a torn state. Specifically it's possible, even likely in some cases, that the diagonstics produced by design time builds will differ from command line builds. That is a trade off that we are willing to make for reliability. Customers who wish to have a consistent experience between design time should not operate in a torn state.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to detect this situation? My main concern with this approach is customer confusion where they see CLI warnings but not in VS or vice versa and end up thinking something is wrong with our tooling. No idea how to detect that and warn them since we shouldn't know they are doing a CLI build separately from VS.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to detect this situation?

Going forward we should assume that if you're using .NET SDK style projects in a torn state then design time build and cli build will differ. That is essentially our stated policy. If the experiences are the same it's a coincidence, not a design.

jaredpar and others added 3 commits June 23, 2024 20:00
Co-authored-by: Rainer Sigwald <raines@microsoft.com>
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
documentation/general/torn-sdk.md Outdated Show resolved Hide resolved
jaredpar and others added 3 commits June 24, 2024 07:48
@jaredpar jaredpar closed this Jun 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants