-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release/9.0] Fix perf problems found in investigation of issue #107728 #107817
Conversation
…ClassInitThrowing` didn't check to see if the class had been initialized before taking a lock - `EnsureTlsIndexAllocated` didn't check if the Tls index had been allocated before setting the flag via an expensive Interlocked call to indicate that it had been allocated - And finally `JIT_GetNonGCThreadStaticBaseOptimized` and `JIT_GetGCThreadStaticBaseOptimized` were missing the fast paths which avoided even calling those apis at all. Perf with a small benchmark which does complex multithreaded work... | Runtime | Time | | ---- | ---- | | .NET 8 | 00.9414682 s | | .NET 9 before this fix | 22.8079382 | | .NET 9 with this fix | 00.2004539 | Fixes #107728
Tagging subscribers to this area: @mangod9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved. please get a code review. when ready this can be merged
Thanks, guys - that was quick! |
Is this fix available in .NET 9 RC2? If yes, I guess there is another regression (~ 15%) that I see in one of my tests. |
yes this should be included in RC2. Is it now a different test which is showing a regression? |
Yes. I don't have an isolated sample, so here is how to reproduce this:
|
For the sake of clarity, the test isn't about DB performance - it's mainly about Fusion caching logic. |
Backport of #107806 to release/9.0
/cc @davidwrighton
Customer Impact
This impacts customers who have multithreaded applications which use a number of apis which need the statics of a type to be fully initialized. The effect is that there is slowdown, especially in the case of multiple threads attempting to run code at once which requires the class initializer on types to have been run. The customer who discovered this was using unoptimized code which accessed thread statics, which was probably the biggest performance problem that was exposed by this bug. It was filed as #107728
Regression
Regression was introduced with PR #99183
Testing
The fix was verified by the production of a bespoke test case which stressed the multithreaded performance of unoptimized jitted code accessing thread statics. This is not a scenario which is part of our performance benchmarks due to a general lack of support in benchmark.net for test whether or not tests cause thread contention. Single threaded performance was effected by the bug, but the impact was minimal and below the noise floor of our general purpose microbenchmark tester.
Risk
Low risk, this change mostly works by checking flags early instead of after taking locks. The flag checking in this case is also entirely using volatile loads with barriers, so the memory ordering risks are minimal.
IMPORTANT: If this backport is for a servicing release, please verify that:
The PR target branch is
release/X.0-staging
, notrelease/X.0
.If the change touches code that ships in a NuGet package, you have added the necessary package authoring and gotten it explicitly reviewed.