Contention scheduling actions in HashedWheelTimerScheduler #7130

Closed
raypurchasett opened this issue Mar 26, 2024 · 7 comments · Fixed by #7144

Comments

@raypurchasett
Contributor

Version Information
Version of Akka.NET? 1.5.18
Which Akka.NET Modules? Akka.Core

Describe the bug
It seems that the HashedWheelTimerScheduler used on .NET 6 and later introduces significant contention/blocking when registering many actions. We have thousands of actors that schedule reminders to stop themselves, and this often happens at the same time. When I target .NET 5 we don't see this issue. We initially noticed this because our thread pool count went through the roof.

See this gist:
https://gist.github.com/raypurchasett/605ee8a0beeeafa6834b9daa6c960a8e

We've seen this running on Windows (Rider), Docker Desktop (Windows), and EKS.

@Aaronontheweb
Member

cc @Arkatufus who designed this feature.

I believe this comes from our decision to re-model the HashedWheelTimerScheduler around .NET 6's new PeriodicTimer mechanism, whereas anything running on an older runtime uses our original dedicated-thread-based HashedWheelTimerScheduler. Do you have any profiler data you can share with us, @raypurchasett?

Aaronontheweb added this to the 1.5.19 milestone Apr 4, 2024

@Arkatufus
Contributor

There shouldn't be any extra thread allocation with PeriodicTimer, since it's just a single thread scheduled to invoke a callback every x milliseconds. Every other aspect of the scheduler is the same: all scheduled actions are stored inside a bucket that gets invoked when their time comes.
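
For readers unfamiliar with the bucket idea mentioned above, here is a minimal, language-neutral sketch of a hashed wheel in Python. It is not Akka.NET's implementation: the class name `HashedWheel`, its methods, and the wheel size are hypothetical, and ticks are advanced by hand rather than by a timer thread or PeriodicTimer. It only illustrates how actions can wait in a bucket until their tick comes up.

```python
from typing import Callable, List


class HashedWheel:
    """Toy hashed wheel (hypothetical, for illustration only)."""

    def __init__(self, wheel_size: int = 8) -> None:
        self.wheel_size = wheel_size
        self.current_tick = 0
        # One bucket per wheel slot; each entry is [remaining_rounds, action].
        self.buckets: List[list] = [[] for _ in range(wheel_size)]

    def schedule(self, delay_ticks: int, action: Callable[[], None]) -> None:
        """Register an action to run delay_ticks ticks from now."""
        if delay_ticks < 1:
            raise ValueError("delay_ticks must be at least 1")
        due_tick = self.current_tick + delay_ticks
        # Number of times the bucket is visited before the action is due.
        rounds = (delay_ticks - 1) // self.wheel_size
        self.buckets[due_tick % self.wheel_size].append([rounds, action])

    def tick(self) -> None:
        """Advance one tick and run every action that is due in that bucket."""
        self.current_tick += 1
        index = self.current_tick % self.wheel_size
        # Detach the bucket first so actions may safely schedule new work.
        bucket, self.buckets[index] = self.buckets[index], []
        for entry in bucket:
            if entry[0] == 0:
                entry[1]()  # due now: invoke the action
            else:
                entry[0] -= 1  # not yet: wait for another full rotation
                self.buckets[index].append(entry)


if __name__ == "__main__":
    wheel = HashedWheel(wheel_size=4)
    wheel.schedule(1, lambda: print("fires on tick 1"))
    wheel.schedule(6, lambda: print("fires on tick 6"))
    for _ in range(6):
        wheel.tick()
```

In a real scheduler many threads register actions concurrently while a single timer loop drains the buckets, so how registrations are handed to that loop is where contention of the kind reported in this issue can arise. The sketch deliberately leaves that part out.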

@Aaronontheweb
Member

@raypurchasett so @Arkatufus was able to reproduce your result - I wrote a little script to do this as well.

.NET Core App 3.1

C:\Program Files\dotnet\sdk\8.0.104\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.EolTargetFrameworks.targets(32,5): warning NETSDK1138: The target framework 'netcoreapp3.1' is out of support and will not receive security updates in the future. Please refer to https://aka.ms/dotnet-core-support for more information about the support policy. [E:\Repositories\repros\HighSchedulerCpu\HighSchedulerCpu.csproj::TargetFramework=netcoreapp3.1]
C:\Program Files\dotnet\sdk\8.0.104\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.EolTargetFrameworks.targets(32,5): warning NETSDK1138: The target framework 'netcoreapp3.1' is out of support and will not receive security updates in the future. Please refer to https://aka.ms/dotnet-core-support for more information about the support policy. [E:\Repositories\repros\HighSchedulerCpu\HighSchedulerCpu.csproj::TargetFramework=netcoreapp3.1]
E:\Repositories\repros\HighSchedulerCpu\Program.cs(29,9): warning CS8602: Dereference of a possibly null reference. [E:\Repositories\repros\HighSchedulerCpu\HighSchedulerCpu.csproj::TargetFramework=netcoreapp3.1]
Waiting to start
Starting
[INFO][04/11/2024 02:45:33.369Z][Thread 0009][akka://System/user/1] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0005][akka://System/user/0] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0007][akka://System/user/3] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0016][akka://System/user/11] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0017][akka://System/user/12] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0013][akka://System/user/8] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0019][akka://System/user/9] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0010][akka://System/user/7] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0006][akka://System/user/2] 10
[INFO][04/11/2024 02:45:33.369Z][Thread 0012][akka://System/user/6] 10
[INFO][04/11/2024 02:45:33.370Z][Thread 0008][akka://System/user/4] 10
[INFO][04/11/2024 02:45:33.370Z][Thread 0011][akka://System/user/5] 10
[INFO][04/11/2024 02:45:33.370Z][Thread 0018][akka://System/user/13] 10
[INFO][04/11/2024 02:45:33.370Z][Thread 0014][akka://System/user/14] 10
[INFO][04/11/2024 02:45:33.370Z][Thread 0015][akka://System/user/10] 10
[INFO][04/11/2024 02:45:33.415Z][Thread 0012][akka://System/user/20] 0

.NET 8.0

C:\Program Files\dotnet\sdk\8.0.104\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.EolTargetFrameworks.targets(32,5): warning NETSDK1138: The target framework 'netcoreapp3.1' is out of support and will not receive security updates in the future. Please refer to https://aka.ms/dotnet-core-support for more information about the support policy. [E:\Repositories\repros\HighSchedulerCpu\HighSchedulerCpu.csproj::TargetFramework=netcoreapp3.1]
Waiting to start
Starting
[INFO][04/11/2024 02:44:59.214Z][Thread 0010][akka://System/user/3] 6345
[INFO][04/11/2024 02:44:59.214Z][Thread 0009][akka://System/user/1] 6344
[INFO][04/11/2024 02:44:59.214Z][Thread 0024][akka://System/user/6] 4052
[INFO][04/11/2024 02:44:59.214Z][Thread 0021][akka://System/user/39] 6344
[INFO][04/11/2024 02:44:59.214Z][Thread 0029][akka://System/user/8] 0
[INFO][04/11/2024 02:44:59.214Z][Thread 0017][akka://System/user/37] 6344
[INFO][04/11/2024 02:44:59.214Z][Thread 0011][akka://System/user/2] 6345
[INFO][04/11/2024 02:44:59.214Z][Thread 0013][akka://System/user/5] 6344
[INFO][04/11/2024 02:44:59.214Z][Thread 0018][akka://System/user/164] 6344
[INFO][04/11/2024 02:44:59.214Z][Thread 0020][akka://System/user/87] 6344
[INFO][04/11/2024 02:44:59.214Z][Thread 0022][akka://System/user/40] 5574
[INFO][04/11/2024 02:44:59.214Z][Thread 0016][akka://System/user/36] 6345
[INFO][04/11/2024 02:44:59.214Z][Thread 0026][akka://System/user/88] 2023
[INFO][04/11/2024 02:44:59.214Z][Thread 0007][akka://System/user/0] 6344
[INFO][04/11/2024 02:44:59.214Z][Thread 0008][akka://System/user/34] 6345
[INFO][04/11/2024 02:44:59.214Z][Thread 0023][akka://System/user/41] 5065
[INFO][04/11/2024 02:44:59.214Z][Thread 0027][akka://System/user/42] 1516
[INFO][04/11/2024 02:44:59.214Z][Thread 0028][akka://System/user/7] 509
[INFO][04/11/2024 02:44:59.214Z][Thread 0012][akka://System/user/4] 6345
[INFO][04/11/2024 02:44:59.214Z][Thread 0015][akka://System/user/35] 6345
[INFO][04/11/2024 02:44:59.214Z][Thread 0014][akka://System/user/86] 6344
[INFO][04/11/2024 02:44:59.214Z][Thread 0025][akka://System/user/4999] 3014
[INFO][04/11/2024 02:44:59.214Z][Thread 0019][akka://System/user/38] 6344
[INFO][04/11/2024 02:44:59.239Z][Thread 0021][akka://System/user/43] 0
[INFO][04/11/2024 02:44:59.239Z][Thread 0014][akka://System/user/85] 0
[INFO][04/11/2024 02:44:59.239Z][Thread 0020][akka://System/user/10] 0

I'm not getting any useful contention data back from dotnet-counters, but I do occasionally see deadlocks at startup, which @Arkatufus was also able to reproduce. I'm going to test this with his fix real quick.

@Aaronontheweb
Member

My gist with the PowerShell script: https://gist.github.com/Aaronontheweb/fb1ac0b9577abebe422f3976f9c1d63f

@Aaronontheweb
Member

Looks like @Arkatufus's solution in #7144 fixes this issue:

.NET 8.0

C:\Program Files\dotnet\sdk\8.0.104\Sdks\Microsoft.NET.Sdk\targets\Microsoft.NET.EolTargetFrameworks.targets(32,5): warning NETSDK1138: The target framework 'netcoreapp3.1' is out of support and will not receive security updates in the future. Please refer to https://aka.ms/dotnet-core-support for more information about the support policy. [E:\Repositories\repros\HighSchedulerCpu\HighSchedulerCpu.csproj::TargetFramework=netcoreapp3.1]
E:\Repositories\repros\HighSchedulerCpu\Program.cs(29,9): warning CS8602: Dereference of a possibly null reference. [E:\Repositories\repros\HighSchedulerCpu\HighSchedulerCpu.csproj::TargetFramework=net8.0]
Waiting to start
Starting
[INFO][04/11/2024 03:08:48.715Z][Thread 0025][akka://System/user/12] 2
[INFO][04/11/2024 03:08:48.716Z][Thread 0009][akka://System/user/2] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0018][akka://System/user/9] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0030][akka://System/user/81] 1
[INFO][04/11/2024 03:08:48.716Z][Thread 0027][akka://System/user/80] 2
[INFO][04/11/2024 03:08:48.716Z][Thread 0007][akka://System/user/1] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0022][akka://System/user/78] 2
[INFO][04/11/2024 03:08:48.716Z][Thread 0036][akka://System/user/738] 0
[INFO][04/11/2024 03:08:48.716Z][Thread 0033][akka://System/user/82] 1
[INFO][04/11/2024 03:08:48.716Z][Thread 0017][akka://System/user/148] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0013][akka://System/user/5] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0034][akka://System/user/185] 1
[INFO][04/11/2024 03:08:48.716Z][Thread 0020][akka://System/user/182] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0037][akka://System/user/15] 0
[INFO][04/11/2024 03:08:48.716Z][Thread 0023][akka://System/user/11] 2
[INFO][04/11/2024 03:08:48.716Z][Thread 0031][akka://System/user/184] 1
[INFO][04/11/2024 03:08:48.716Z][Thread 0035][akka://System/user/14] 0
[INFO][04/11/2024 03:08:48.716Z][Thread 0032][akka://System/user/13] 1
[INFO][04/11/2024 03:08:48.716Z][Thread 0014][akka://System/user/6] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0011][akka://System/user/3] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0015][akka://System/user/7] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0021][akka://System/user/149] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0024][akka://System/user/79] 2
[INFO][04/11/2024 03:08:48.716Z][Thread 0008][akka://System/user/0] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0010][akka://System/user/4] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0028][akka://System/user/183] 1
[INFO][04/11/2024 03:08:48.716Z][Thread 0012][akka://System/user/77] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0026][akka://System/user/708] 0
[INFO][04/11/2024 03:08:48.716Z][Thread 0019][akka://System/user/10] 9
[INFO][04/11/2024 03:08:48.716Z][Thread 0016][akka://System/user/8] 9
[INFO][04/11/2024 03:08:48.751Z][Thread 0018][akka://System/user/707] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0032][akka://System/user/736] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0011][akka://System/user/706] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0033][akka://System/user/733] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0013][akka://System/user/737] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0007][akka://System/user/734] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0021][akka://System/user/735] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0033][akka://System/user/730] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0007][akka://System/user/729] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0013][akka://System/user/700] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0010][akka://System/user/732] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0033][akka://System/user/728] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0007][akka://System/user/727] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0013][akka://System/user/699] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0017][akka://System/user/147] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0007][akka://System/user/697] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0013][akka://System/user/724] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0033][akka://System/user/698] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0010][akka://System/user/725] 0
[INFO][04/11/2024 03:08:48.751Z][Thread 0030][akka://System/user/705] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0021][akka://System/user/726] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0010][akka://System/user/695] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0033][akka://System/user/723] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0013][akka://System/user/696] 0
[INFO][04/11/2024 03:08:48.752Z][Thread 0007][akka://System/user/722] 0

Aaronontheweb added a commit that referenced this issue Apr 11, 2024
…#7144)

* Add unit test

* Add contention fix

* Replace CountdownEvent with TaskCompletionSource in net6.0

* Cleanup code, revert switch to if

* Revert if block

* Fix unit test

---------

Co-authored-by: Aaron Stannard <aaron@petabridge.com>
@raypurchasett
Contributor Author

Apologies, been away for a while. Thanks for fixing.

@Aaronontheweb
Member

This change was introduced in v1.5.14. I'm going to flag v1.5.14 through v1.5.18 on NuGet as having critical bugs, so that users are prompted to upgrade.
