ReadyToRun images are not as efficient for .NET Core 3 as NGen for .NET Framework #13339
Copied from WPF issue #1478 |
|
@fadimounir You can find published samples here: Results that I get on my test machine:
|
Do we have any progress or some workaround to this? |
Related issue: dotnet/fsharp#9061 |
@AlexChuev is it possible for you to run another test using the lastest .NET 5 Preview? There is a very recent blog post about performance as well (https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-5/). Although it doesn't mention any improvements to ReadyToRun other than size reduction, there is a lot of changes in JIT though. |
@elachlan here are my results for the same test app converted to .NET 5 and targeting x64:
.NET 5 is slower than .NET Core 3 in my tests, for which I have no explanation so far. Note that I had to use a different machine than previously, so the total numbers are a bit different. |
@AlexChuev thanks for the tests. Not so great that we have gone backwards again. @stephentoub Your article was a great rundown of the performance improvements and was much appreciated. There was a comment on it that Preview 7 was going to have more improvements for AOT/R2R. Is that the intention? Based on Alex's tests, .NET 5 looks like it's 155% worse than .NET Framework and 20% worse than .NET Core 3. Thanks! |
Thanks.
Which comment? |
Sorry, someone made a comment, in the comments. Not you specifically. I imagine there will be some sort of effort before release, just wondering if that was the aim. |
I see, thanks. @jeffschwMSFT may be able to provide more information. |
Adding @mangod9 and @jkotas. We are aware that there are performance gaps between r2r and ngen. One of the larger positive factors of r2r is the less fragile nature of its images, which comes with a performance trade-off. The way that we like to approach these problems is with real-world examples that we can take a look at and tune. |
Compiling a small program with the F# compiler is a good example. A significant amount of time is spent in JIT with R2R. |
@AlexChuev might be able to provide a couple of example apps. Devexpress has a myriad of demo applications which would provide great examples of functional apps. @jeffschwMSFT is there a plan to revisit r2r performance before .NET 5 release? or are you after specific test scenarios? |
assume these perf numbers are for certain UI specific scenarios? |
@mangod9 Yes, AlexChuev's example is at https://github.com/AlexChuev/ReadyToRunPerformanceTest @AlexChuev is it possible for you to update your repo to include the changes for .NET 5? |
It might be worthwhile to try crossgen2 composite functionality available in 5 to check if it improves this scenario. I tried to get a composite built for the repro, but looks like we have an issue we need to fix for that. Will create a separate issue for that. |
@elachlan sure, updated |
@jeffschwMSFT @mangod9 I'll take a quick look at this. |
cc: @dotnet/crossgen-contrib |
@jkotas @davidwrighton
That's the bulk of the regression. I validated that the number of assemblies loaded is the same (though obviously different versions). I assume the number of types is roughly the same also, given it is the same scenario. Is this enough of a breadcrumb? I can share the profiles with you if you'd like to take a look. |
@billwert Thanks for collecting the profiles. Could you please share them? |
@MichalStrehovsky pointed out that since WPF implies mc++, cg2 wouldn't support it currently. We will continue to investigate the 3.1->5.0 regression. |
Last fall, I built up 3/4 of the logic necessary to see a view of the type system load events in a useful way in PerfView. I'll see if I can resurrect that work, and build up an understanding of what types we're actually working with here, and if there are significant differences from 3.1. |
I see non-trivial amount of time spent in the new covariant return checks:
|
I tested it with crossgen2 in .NET 5 RC2 on my machine but got an exception at runtime:
|
Is there a setting to let R2R build slow tier1 code instead of just quick tier0? I don't want tiered compilation (TC) to use either additional CPU or additional memory, because the server app is hosted some hundred times on the same server (strict tenant isolation by IIS app pools). |
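For readers with the same question, here is a minimal sketch of one knob that exists today: the documented runtime setting `System.Runtime.TieredCompilation` in an app's `runtimeconfig.json`. The helper name `disable_tiered_compilation` and the sample JSON are hypothetical, and nothing in this thread confirms that turning tiering off gives the desired result for the multi-tenant scenario. It also does not change how the ReadyToRun code itself was compiled.

```python
import json


def disable_tiered_compilation(runtimeconfig_text):
    """Return runtimeconfig JSON text with tiered compilation switched off.

    Only the System.Runtime.TieredCompilation property is touched;
    everything else in the file is preserved.
    """
    config = json.loads(runtimeconfig_text)
    # Create runtimeOptions/configProperties if the file does not have them yet.
    props = config.setdefault("runtimeOptions", {}).setdefault("configProperties", {})
    props["System.Runtime.TieredCompilation"] = False
    return json.dumps(config, indent=2)


# Hypothetical, minimal runtimeconfig.json content used only for illustration.
sample = '{"runtimeOptions": {"tfm": "net5.0"}}'
patched = disable_tiered_compilation(sample)
print(patched)
```

The same setting can also be expressed as an MSBuild property or an environment variable; the JSON form is used here only because it is easy to patch per deployed app.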
crossgen has been removed in favor of crossgen2 in 6. We will continue to optimize R2R for .net 7. |
I have copied @AlexChuev's repo and modified it for winforms.
I am looking into this as I am in the process of moving a .NET Framework winforms app to .NET 6. Start-up time and runtime performance are very important for customer experience in this scenario. Edit: added composite result. first run for composite was the worst performance. But after relaunching it was very quick. |
I have updated the winforms version to .NET 7. It looks like there are slight regressions in .NET 7. @AlexChuev, could you confirm with your test?
|
@mangod9 This displays a regression in r2r composite on .NET7. Perfview regression report:
The Perfview Diff:
|
@elachlan out of curiosity, can you try with |
@AlexChuev is it possible to get the JIT team a devexpress nuget feed for this test? @EgorBo, thanks for getting back to me. I will test this and get back to you here. |
@EgorBo setting that environment variable did bring the performance back in line with composite R2R. When you scale it to the whole app. Regression Report:
I narrowed the time range in Perfview to start from when the button is clicked, for a true test of the performance whilst running. This is with the env variable. Regression Report:
Here is the Exc% values greater than 5%
CC: @AlexChuev |
just to confirm is the regression with WriteXorExecute enabled only with composite? @tyrlek @janvorli |
The measurement directly above was with The other measurement in the post I tagged you, that did not have the env variable |
I am not sure if my issue is related to this one, but I'll leave it here just in case. We have a WPF app on .NET Framework 4.8. I measure JIT time during startup with the JetBrains dotTrace profiler. I noticed that with ngen our app has 3x less JIT time during startup compared to R2R images generated by ngenR2R.exe. Here are the measurements
ngenR2R.exe tool comes from here https://www.nuget.org/packages/Microsoft.DotNet.Framework.NativeImageCompiler |
Problem Impact Estimate Further Information Thanks, |
@dennis-garavsky can you raise this in the winforms repo for better visibility? |
CC: @KlausLoeffelmann @merriemcgaw You might be interested in the designer perf issues mentioned above. |
Thanks @elachlan! We've addressed a bunch of serialization performance issues with VS in the most recent 17.8 Preview builds. @dennis-garavsky can your team see if these improvements are helping the situation? |
Hello @merriemcgaw, @elachlan and others Our main point is that DesignToolsServer.exe is reloaded each time after we make changes to the code and rebuild the project. In an empty winforms project, it takes 3-4 seconds. In a project with DevExpress assemblies, it takes 8-10 seconds. At the same time, this most likely occurs because our package has a very large number of dlls that need to be loaded in a process. That is why we expect to speed up loading time with Ready2Run technology. |
The DesignToolsServer.exe is expected to reload every time the project opens and again when there are certain types of changes to the project. @Shyam-Gupta is our domain expert there. We are actively looking for ways to make improvements in loading of the process. We have a lot of people off for summer holidays. When everyone is back we will explore bumping up the priority on these improvements (in the next couple weeks). |
Should @ekalchev open an issue in winforms repo? |
As per the current architecture, |
@danmoseley, @Shyam-Gupta I don't have winforms issue. You probably wanted to tag someone else? |
May I ask for an update here? I'm deeply grateful for the numerous performance optimizations made by the dotnet team in .NET 8. However, I still encounter issues with startup performance. Over the past six months, I've assisted many colleagues in migrating their projects from .NET Framework to .NET Core, with some even upgrading directly to .NET 8. Unfortunately, the common feedback I've received is that startup after moving to .NET Core is slower than on the original .NET Framework. My concern is that despite years of optimization, from the perspective of desktop application developers, startup performance has not yet returned to the level of .NET Framework. This could dampen the enthusiasm of desktop application developers. I believe there are still two main factors affecting the startup performance of .NET Core applications. The first is that although ReadyToRun can speed up the process, it leads to larger DLL sizes, which in turn results in longer file IO times. The second is that .NET Core requires loading a large number of DLL files during startup, and it's difficult to share them in the same way as .NET Framework's GAC, resulting in significant IO pressure, especially on HDDs. What's worse, as more and more users report that Windows Update damages the .NET Core environment, an increasing number of desktop developers are forced to choose self-contained publishing instead of framework-dependent deployment. As a result, these applications cannot benefit from the acceleration that shared DLLs bring during cold startup. I understand that this is a challenging task, but I look forward to seeing better startup performance optimizations in .NET in the future, to make desktop developers happy. |
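The two IO factors named above (larger DLLs and many DLLs) are easy to quantify for a given publish output. The following is a rough sketch, not a profiling tool: `total_dll_bytes` is a hypothetical helper that walks a publish folder and reports how many DLLs it contains and their combined size, so that a ReadyToRun publish can be compared with a non-ReadyToRun one.

```python
import os


def total_dll_bytes(publish_dir):
    """Return (dll_count, total_size_in_bytes) for all .dll files under publish_dir."""
    count, total = 0, 0
    for root, _dirs, files in os.walk(publish_dir):
        for name in files:
            # Case-insensitive match, since Windows paths may use .DLL
            if name.lower().endswith(".dll"):
                count += 1
                total += os.path.getsize(os.path.join(root, name))
    return count, total
```

Calling it on two publish folders (paths are whatever your build produces) and comparing the results shows how much extra data a cold start has to read. It says nothing about how much of each file is actually touched at startup, which only a trace can show.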
@lindexi Are those projects that are using WPF? My understanding is that an important portion of WPF cannot (currently) be R2R'd because they are mixed-mode assemblies (managed and native code), because they are written in C++/CLI. In the past this was due to |
@AlexChuev commented on Thu Aug 29 2019
Problem description:
Since Ngen.exe and Native Image Task are not available in .NET Core 3, ReadyToRun images are the only way to reduce the application startup time and avoid delays caused by JIT compilation. However, our tests show that for the same WPF application that initializes three times faster after using Ngen.exe, ReadyToRun images provide an improvement of only 25%.
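To put the two figures on the same scale, here is a back-of-the-envelope sketch. It is not a measurement. It assumes "three times faster" means one third of the JIT-only startup time and that 25% means the time drops to 75% of the JIT-only baseline; both readings are assumptions, and the baseline is normalized to 1.0.

```python
def relative_startup(speedup_factor=None, reduction=None):
    """Return startup time relative to a JIT-only baseline of 1.0."""
    if speedup_factor is not None:
        return 1.0 / speedup_factor
    return 1.0 - reduction


ngen = relative_startup(speedup_factor=3)  # assumed reading: 1/3 of the baseline
r2r = relative_startup(reduction=0.25)     # assumed reading: 75% of the baseline

gap = r2r / ngen
print(f"NGen: {ngen:.2f}, R2R: {r2r:.2f}, R2R/NGen: {gap:.2f}")
```

Under these assumptions, the ReadyToRun build takes a bit more than twice as long to initialize as the NGen build.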
This is a huge hit for our users, since with many libraries and theme resources, JIT compilation is one of the main factors affecting load times. For some real-life applications, the use of Ngen.exe allowed our users to shave up to 6 seconds (half) off the initial start time.
Minimal repro:
You can find samples for .NET Core 3 and .NET Framework here: https://github.com/AlexChuev/ReadyToRunPerformanceTest
These samples use DevExpress WPF assemblies to demonstrate how Ngen.exe and ReadyToRun images affect projects with many classes and theme resources. If you need published apps or other samples for your tests, please let me know.
P.S. I'm posting this in the WPF repo because the difference between Ngen.exe and ReadyToRun images highly affects WPF applications. Normally, the JIT compiler processes classes and methods only when they are needed for the program execution. However, when WPF loads theme resources for a control, this causes all classes referenced in these resources (including classes referenced in currently unused or invisible parts) to be processed by the JIT compiler. In addition, static constructors of many WPF classes contain the DependencyProperty registration code that may cause even more classes and methods to be JITted.
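The effect described in the P.S. can be illustrated with a toy model. This is only a sketch of the reasoning, not of how the runtime or WPF is implemented: `ToyJit`, the class names, and the dependency table are all hypothetical. A class is "compiled" the first time it is touched, its static constructor touches further classes, and loading a theme touches every class the theme references.

```python
class ToyJit:
    """Toy model: a class is 'compiled' the first time it is touched."""

    def __init__(self, static_ctor_deps):
        # class name -> classes its static constructor touches (hypothetical data)
        self.static_ctor_deps = static_ctor_deps
        self.compiled = []

    def touch(self, cls):
        if cls in self.compiled:
            return
        self.compiled.append(cls)
        # Running the static constructor pulls in further classes.
        for dep in self.static_ctor_deps.get(cls, []):
            self.touch(dep)

    def load_theme(self, referenced_classes):
        # Loading a theme touches every class it references, visible or not.
        for cls in referenced_classes:
            self.touch(cls)


deps = {"Grid": ["Column", "Row"], "Ribbon": ["RibbonPage"], "Column": ["Editor"]}
theme = ["Button", "Grid", "Ribbon"]

lazy = ToyJit(deps)
lazy.touch("Button")      # only what the program actually runs

eager = ToyJit(deps)
eager.load_theme(theme)   # the theme load drags in everything it references

print(len(lazy.compiled), len(eager.compiled))
```

In the model, the on-demand path compiles a single class while the theme load compiles every class reachable from the theme, which is why precompiled images matter more for applications with many theme resources.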
@rladuca commented on Thu Aug 29 2019
@fadimounir Where should we file a companion bug on .NET?
@fadimounir commented on Thu Aug 29 2019
@rladuca Sure you can file a companion bug on dotnet/coreclr. We'd be interested in diagnosing this more, although one thing to keep in mind is that R2R by design principle will always be a bit slower than fragile native images, because they are version resilient (they do not have the same fragility as old ngen images).
@AlexChuev That would be very helpful
cc @jkotas FYI