perf: Optimize async method allocations #328
Merged
This is a rework of a closed PR from corefx, dotnet/corefx#37254. I wrote a small essay in the initial post of that PR explaining why this is a good idea, so it's worth clicking through and reading it.
Profiling the DataAccessPerformance project, which emulates the TechEmpower fortunes benchmark, shows that some of the top allocations are state closures for async methods such as `ReadAsync` and `GetFieldValueAsync`. Investigating this, I found the code to be much more complex than I'd expected and quite confusing. To reduce the allocations I reworked the async infrastructure to use concrete classes and then cached some of those which are used repeatedly.
Before:
After:
DataAccessPerformance results:
The results show a ~5% throughput improvement where CPU time is the limiting factor, and smaller improvements where it isn't. I recommend reviewing manually, because the diff makes the large number of changes and the extraction of lambdas look particularly confusing.
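For readers unfamiliar with the pattern, here is a minimal sketch of the general idea of replacing a capturing lambda with a concrete state class plus a static delegate, and caching the state object for reuse. It is not the code in this PR: `SketchReader`, `ReadState`, `s_onReadCompleted` and `StateAllocations` are hypothetical names invented for illustration, and the "I/O" completes synchronously to keep the example self-contained.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a data reader; not the SqlClient type.
public sealed class SketchReader
{
    // Concrete state class used instead of a compiler-generated closure.
    private sealed class ReadState
    {
        public SketchReader Reader;
        public TaskCompletionSource<bool> Completion;
    }

    // Non-capturing callback stored in a static field, so the delegate is
    // allocated once rather than on every call.
    private static readonly Action<object> s_onReadCompleted = state =>
    {
        var s = (ReadState)state;
        SketchReader reader = s.Reader;
        TaskCompletionSource<bool> completion = s.Completion;

        // Clear per-call data and hand the state object back to the cache.
        s.Completion = null;
        reader._cachedState = s;

        // Report whether a row was available, then consume it.
        completion.SetResult(reader._rowsRemaining-- > 0);
    };

    private ReadState _cachedState;
    private int _rowsRemaining;

    // Counts how many state objects were allocated (for demonstration only).
    public int StateAllocations { get; private set; }

    public SketchReader(int rows) => _rowsRemaining = rows;

    public Task<bool> ReadAsync()
    {
        // Take the cached state if one is available, otherwise allocate.
        ReadState state = _cachedState;
        _cachedState = null;
        if (state == null)
        {
            state = new ReadState { Reader = this };
            StateAllocations++;
        }

        var completion = new TaskCompletionSource<bool>();
        state.Completion = completion;

        // A real reader would hand the delegate and state to the I/O layer;
        // this sketch invokes the callback immediately.
        s_onReadCompleted(state);
        return completion.Task;
    }
}
```

With this shape, repeated non-overlapping calls to `ReadAsync` on one `SketchReader` reuse a single `ReadState` instead of allocating a closure per call. The sketch still allocates a `TaskCompletionSource<bool>` each time; only the state object and delegate are shared.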