v2.5 Backlog: Use AutoResetEvent for backlog thread lingering #2008
Conversation
```diff
@@ -909,7 +913,7 @@ private async Task ProcessBacklogAsync()
     // TODO: vNext handoff this backlog to another primary ("can handle everything") connection
     // and remove any per-server commands. This means we need to track a bit of whether something
     // was server-endpoint-specific in PrepareToPushMessageToBridge (was the server ref null or not)
-    await ProcessBridgeBacklogAsync(); // Needs handoff
+    await ProcessBridgeBacklogAsync().ConfigureAwait(false); // Needs handoff
```
Use `.ForAwait()` instead of `.ConfigureAwait(false)`?
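For reference, `ForAwait()` in this codebase is, as far as I know, just a shorthand extension over `ConfigureAwait(false)`. A minimal sketch of that shape (the exact file and namespace are assumed):

```csharp
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

internal static class TaskExtensions
{
    // Shorthand so call sites read "await task.ForAwait();" instead of
    // repeating ConfigureAwait(false) everywhere.
    public static ConfiguredTaskAwaitable ForAwait(this Task task)
        => task.ConfigureAwait(false);

    public static ConfiguredTaskAwaitable<T> ForAwait<T>(this Task<T> task)
        => task.ConfigureAwait(false);
}
```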
The new backlog code added quite a few other awaited calls without `.ForAwait()`. Do those need to be updated too?
This isn't actually relevant either way - I was testing an allocation thing - can be removed entirely (we kick it off without a context in a thread with none). Can tidy as a follow-up though!
```csharp
/// This allows us to keep the thread around for a full flush and prevent "feathering the throttle" trying
/// to flush it. In short, we don't start and stop so many threads with a bit of linger.
/// </summary>
private readonly AutoResetEvent _backlogAutoReset = new AutoResetEvent(false);
```
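To make the linger concrete, here's a minimal sketch of how a field like this gets used (member names other than `_backlogAutoReset` are assumed, not the real StackExchange.Redis members): producers `Set()` the event when they enqueue, and the worker waits on it after draining, so one thread rides out a burst instead of a new thread starting per item.

```csharp
using System;
using System.Threading;

internal sealed class BacklogSignalSketch
{
    private readonly AutoResetEvent _backlogAutoReset = new AutoResetEvent(false);

    // Producer side: wakes at most one lingering worker; the event resets
    // itself after releasing a waiter, so signals aren't double-counted.
    public void OnBacklogItemQueued() => _backlogAutoReset.Set();

    // Worker side: linger up to 5 seconds for the next signal before
    // letting the thread end; returns false if nothing arrived.
    public bool TryLingerForMoreWork()
        => _backlogAutoReset.WaitOne(TimeSpan.FromSeconds(5));
}
```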
I don't see where this is disposed?
A few tweaks to the changes in #2008 for disposal and normalization.
Alrighty, #2008 did something exceedingly stupid: it lingered *inside the lock*, jamming the connection and backlog during the linger. Instead, this moves the thread preservation up in a much cleaner way and doesn't occupy the lock. It also adds a SpinningDown status so we can always see it properly in exceptions.
In troubleshooting these 2 tests, I realized what's happening: a really dumb placement mistake in #2008. Now, instead of lingering inside the damn lock, the loop sits outside it, a bit cleaner and higher up. The performance wins are the same, but it's a lot saner and doesn't block both the backlog and the writer for another 5 seconds. Now only the thread lingers, and it'll try to take the lock to run another pass if any work arrives in the next 5 seconds.
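In sketch form, the difference is where the 5-second wait sits relative to the lock (the lock object and drain method below are assumed names, not the real implementation):

```csharp
using System;
using System.Threading;

internal sealed class BacklogLockPlacementSketch
{
    private readonly object _backlogLock = new object();
    private readonly AutoResetEvent _backlogAutoReset = new AutoResetEvent(false);

    // Before (the #2008 mistake): the 5-second linger wait happened while
    // still holding the backlog lock, blocking the writer for the duration.
    // After (this PR): the lock is held only while draining a pass, and the
    // linger happens with the lock released.
    public void BacklogWorkerLoop()
    {
        while (true)
        {
            lock (_backlogLock) // held per drain pass only
            {
                DrainBacklog();
            }
            // Linger *outside* the lock: wait up to 5s for more work.
            if (!_backlogAutoReset.WaitOne(TimeSpan.FromSeconds(5)))
            {
                break; // no new work arrived - spin down (SpinningDown)
            }
        }
    }

    private void DrainBacklog() { /* flush whatever is queued right now */ }
}
```

(And per the review note above, the `AutoResetEvent` itself needs disposing along with its owner - part of what these tweaks address.)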
This prevents so many threads from starting/stopping as we finish flushing a backlog. In short: starting a Thread is expensive - really expensive in the grand scheme of things. Because the thread ended immediately when a backlog finished flushing, there was a decent chance it would have to start right back up to grab the next item if a backlog was triggered by the lock transfer.
The act of finishing the backlog itself was happening inside the lock, and exiting could take a moment, causing an immediate re-queue of a follow-up item. This meant lots of threads starting under parallel, high-contention load, leading to higher CPU, more thread starts, and more allocations from those thread starts.
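The churn being described, sketched with all names assumed (not the actual implementation): a dedicated thread per flush that exits the moment the queue drains, so a re-queue moments later pays the full thread-start cost again.

```csharp
using System;
using System.Threading;

internal static class PerFlushThreadSketch
{
    // Pre-change shape: one short-lived thread per backlog flush. Under
    // contention, the next enqueue lands right after drainBacklog returns
    // and pays for a brand-new Thread again - stack, bookkeeping, and the
    // object[] allocations visible in the memory profile below.
    public static void StartBacklogProcessor(Action drainBacklog)
    {
        var thread = new Thread(() => drainBacklog())
        {
            IsBackground = true,
            Name = "Redis-Backlog", // hypothetical name
        };
        thread.Start();
    }
}
```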
Here's a memory view (note the 570k `object[]` instances - those are almost entirely thread starts):

[memory profile screenshot]

Here are thread views (note the scrollbar size):

[thread timeline screenshots]
This was initially discovered while testing some of the heavy parallel scenarios vs 1.2.6. For an example scenario with heavy ops load (pictured in the profiles above):

[benchmark timings: main (v2.5.x) vs this PR]

So note that this is an improvement over 2.2.88 even with the new backlog functionality. We're not quite at 1.2.6 levels of performance, but a) we're closer, and b) a lot of things are more correct in 2.x, and there's a cost to that. I'm very happy with the wins here.
All of that is timings, but CPU usage for the same load is dramatically lower as well, though this will depend on workload.
Example code:
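(A stand-in sketch of the kind of heavy parallel ops load described - endpoint, op count, and keyspace size are assumed, not the actual benchmark parameters:)

```csharp
using System.Threading.Tasks;
using StackExchange.Redis;

// Hypothetical load generator: many concurrent async ops to force
// backlog/bridge contention.
var muxer = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var db = muxer.GetDatabase();

const int Ops = 100_000;
var tasks = new Task[Ops];
for (var i = 0; i < Ops; i++)
{
    // Heavy parallel async SETs against a small keyspace.
    tasks[i] = db.StringSetAsync($"key:{i % 1024}", i);
}
await Task.WhenAll(tasks);
```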
Anyway... yeah, this was a problem. An AutoResetEvent is the best way I can think of to solve it. Throwing this up for review - maybe we have an even better idea of how to solve it.