Perf: Static delegates #1060
I think you are confusing anonymous methods and expression trees, both of which can be created using lambda expressions. Anonymous methods are compiled by the C# compiler once at compile time, while expression trees can be compiled to a delegate at runtime. You can also investigate this with the WinDbg tool, which shows the method table information for a given type object. Static anonymous functions were introduced to restrict access to instance and local variables when that access is not intended by design.
Captures contribute to the size and number of hidden classes generated by the compiler. Passing values through the existing state parameter reduces or removes the need for captures. Using static lambdas allows the compiler to cache the delegate, again avoiding per-call allocations. I do know what I'm doing here. Try running a memory profile of an async test case with and without the PR and see what difference it makes.
If you are talking about the delegate object itself, does it make sense to optimize this code? Even an average connection string is probably several times bigger than a delegate and its captured state together.
Could you please share the results of your experiment?
Each connection string is allocated once and re-used. The paths through these methods allocate on each call. If you call them once you'll see a very minor improvement; if you call them thousands or millions of times, that adds up considerably. It's a question of scale. Results from running https://github.com/aspnet/DataAccessPerformance, which is an implementation of the TE Fortunes benchmark.
So a 0.9% throughput increase from some very minor changes. These changes barely impact the size of the code in comparison to the overall size of the codebase. In general the size of the code is far less relevant than its correctness and speed. If your main concern is code size at the driver level, I think your priorities are wrong.
@cmeyertons there are a number of changes in bulk copy here and they may improve the performance enough to outweigh the problem in #1048 if you want to give them a try. |
Sure, I completely agree that scaling is important, as is performance overall. It's just that even if we run a simple non-query request with a new connection, it consumes about 25 KB, and an empty delegate is 32 B. The compiler optimizes and caches the delegate where possible, whether it's static or not, and it does not necessarily create hidden classes if there is no need to capture the stack, right?
That interesting tool uses DateTime.UtcNow to measure time intervals, Task.Wait for synchronization, etc.
If you believe that there is work that can be done on the entries in that profile then please go ahead. I've already made some changes in #521 and dotnet/corefx#34393, but any additional improvements you can make are welcome. I didn't write that tool. The ASP.NET Core team did. I was pointed to it as a reliable perf measure by one of the maintainers of this library when I started working on it, back when it was System.Data.SqlClient in corefx. I use a variety of tools including profilers, benchmarks, and user-provided and synthetic scenario tests to identify functional and performance issues. Have a look through my PR history and you'll find all sorts of things. This code demonstrably and measurably produces a small performance increase. It may be a small increase, but scale is important, as we've agreed. We can go all the way down to IL and ASM but I really don't see the need to do that. Do you have any specific problem with the code change as written, other than that it slightly increases code complexity and size?
no |
Just to double-check: the following methods use lambda expressions. Please check whether they are worth making static:
- Net Core:
- SqlBulkCopy.WriteToServerInternalRestAsync
- SqlCommand.GetParameterEncryptionDataReaderAsync
- SqlConnection.OnError
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlBulkCopy.cs
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlUtil.cs
In general, the more parameters a closure captures, the less likely it is to be worth converting. Tuples hurt code clarity. We can accept some of that in exchange for better performance for users, but there is a limit somewhere, and more than 3 parameters is where I currently feel readability suffers more than users benefit. This can change (and may have done since I've been working on perf) with new information like profiling results and discussions. The changes made to convert to static delegates in this PR were not driven by profile results, so they should retain clarity as much as possible while improving performance. I think I would only reduce clarity further if I had a very strong argument that it would improve performance. Let me know what you think and whether I've gone too far in reducing clarity. There may be cases where I've converted to tuples in the past where it might be better to use closures for clarity.
Generally, I agree with keeping code legible. Apart from the number of parameters in a tuple, I think it's important to avoid using one in a medium or large block of code. If a block of code is less than 10 lines, it seems reasonable to use a tuple with more than three parameters. And it's definitely a trade-off between performance, memory allocation, and readability, which should be considered. I would say go ahead with your best judgement. Let's see if anyone has other ideas.
              );
          }
      }, ctoken); // We do not need to propagate exception, etc, from reconnect task, we just need to wait for it to finish.
      return tcs.Task;
  }
  else
  {
-     AsyncHelper.WaitForCompletion(reconnectTask, BulkCopyTimeout, () => { throw SQL.CR_ReconnectTimeout(); }, rethrowExceptions: false);
+     AsyncHelper.WaitForCompletion(reconnectTask, BulkCopyTimeout, static () => throw SQL.CR_ReconnectTimeout(), rethrowExceptions: false);
I am not quite sure, or perhaps did not understand, what benefit we get from changing this particular line to static. Isn't the purpose of a static lambda to prevent unintentional capture of local variables or instance state?
Yes, that is the purpose. There is also a useful side effect of the way the compiler implements it: because it's a static delegate, the delegate can be cached. This SharpLab example shows that you get a hidden class generated and that the Action delegate is created the first time it is needed and re-used on every call after that. So we get lazy single allocation for free with a nice syntax.
public void B()
{
    AsyncHelper.WaitForCompletion(null, 0, static () => throw SQL.CR_ReconnectTimeout(), rethrowExceptions: false);
}
is lowered to:
[Serializable]
[CompilerGenerated]
private sealed class <>c
{
    public static readonly <>c <>9 = new <>c();
    public static Action <>9__0_0;
    internal void <B>b__0_0()
    {
        throw SQL.CR_ReconnectTimeout();
    }
}
public void B()
{
    AsyncHelper.WaitForCompletion(null, 0, <>c.<>9__0_0 ?? (<>c.<>9__0_0 = new Action(<>c.<>9.<B>b__0_0)), false);
}
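For anyone who wants to observe the caching described above without reading the lowered output, here is a minimal sketch. The class and method names are hypothetical and are not part of the library or this PR. It relies on the current Roslyn lowering shown above, which is an implementation detail rather than a language guarantee: the non-capturing lambda should hand back the same delegate instance on every call, while the capturing lambda allocates a new closure and delegate each time.

```csharp
using System;

// Hypothetical helper type, for illustration only.
public static class DelegateCachingSketch
{
    // Non-capturing lambda marked static: the compiler stores the delegate
    // in a hidden static field on first use and re-uses it afterwards.
    public static Action GetStatic() => static () => { };

    // Capturing lambda: 'value' is hoisted into a closure object, so a new
    // closure and a new delegate are allocated on every call.
    public static Action GetCapturing(int value) => () => Console.WriteLine(value);

    // True when two calls return the very same delegate instance.
    public static bool StaticIsCached() =>
        ReferenceEquals(GetStatic(), GetStatic());

    // False is expected here: each call produces a fresh delegate.
    public static bool CapturingIsCached() =>
        ReferenceEquals(GetCapturing(1), GetCapturing(1));
}
```

The `static` modifier itself only forbids captures. Its value for this purpose is that a later edit cannot quietly introduce a capture and lose the caching.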
/azp run CI-Ub18-Enclave-NetStd |
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list. |
No pipelines are associated with this pull request. |
@Wraith2 Can you trigger a build by making a small change in your PR so that the tests run once more? I have tried from our side and some things are missing...
Fixed up a couple of anonymous typed lambdas. |
@Wraith2 LGTM. I am waiting for the Enclave tests to finish to see whether the failures are fixed after I cleaned up the databases. I am jumping to other PRs. Do you have any blocking PRs that need to be done before other steps come in?
…net#1060 from netcore to common file).
Now that we've got C# 9, I've audited all lambda expressions in the library and changed some of them over to static delegates, which allows the compiler to cache the delegate on first use.
I also adjusted some other call sites to use state parameters, either to reduce the size of the closure slightly or to allow the onFailure or onCancellation delegates to be static.
I tried to apply a consistent style to all the lambda usage: layout, spacing, named params, etc. It feels better to me overall, but let me know if you want a different style.
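As a rough illustration of the state-parameter pattern mentioned above, the sketch below uses the standard `Task.ContinueWith(Action<Task, object>, object)` overload instead of the library's internal `AsyncHelper` methods, whose signatures are not shown in this thread. The type and method names are hypothetical. The closure version captures two values; the state version passes them as a tuple so the lambda can be `static`.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper type, for illustration only.
public static class StateParameterSketch
{
    // Closure version: 'prefix' and 'sink' are captured, so a closure object
    // and a delegate are allocated on every call.
    public static Task WithClosure(Task task, string prefix, Action<string> sink) =>
        task.ContinueWith(t => sink(prefix + t.Status));

    // State version: the values travel through the state parameter, so the
    // lambda captures nothing and its delegate can be cached. The tuple is
    // still boxed once per call when it is converted to object.
    public static Task WithState(Task task, string prefix, Action<string> sink) =>
        task.ContinueWith(
            static (t, state) =>
            {
                var (p, s) = ((string, Action<string>))state;
                s(p + t.Status);
            },
            (prefix, sink));
}
```

Called as `StateParameterSketch.WithState(Task.CompletedTask, "status: ", Console.WriteLine).Wait();`, this should print the prefix followed by the task status. With two values the tuple stays readable; with more than about three, the readability cost discussed in the review starts to outweigh the saving.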