-
Notifications
You must be signed in to change notification settings - Fork 294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: rework TdsParserStateObject and SqlDataReader snapshots #198
Conversation
Is this already merged in corefx? |
No and it won't be going into corefx now that SqlClient there has gone into deprecated mode. See discussion at the end of dotnet/corefx#37241 |
Just want to comment on review status. I've been reviewing these changes today as I have time. The changes for points 1-3 look good; have not found any issues. It might take me some time to digest the changes for points 4 and 5 since they aren't as simple as the previous points. Obviously we want to make sure there are no regressions introduced. Thanks for your patience. |
Understandable. I know the obvious surface area works because queries work but I don't know how to be sure that things like multiple levels of plp data are functional. |
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParserStateObject.cs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the changes look good. The changes from points 4 and 5, while appearing to be very different, make sense once untangled, are obviously an improvement, and even seem a bit more understandable than the original code to me (even if they make me a little nervous).
Any update on this? I'm pretty sure that to address #245 will require touching snapshots which is going to cause merge conflicts with this PR. |
snapshot.Clear(); | ||
} | ||
_snapshot = snapshot; | ||
_snapshot.Snap(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any special reason to use extra variable snapshot
here? Can you reduce this to:
if (_snapshot is null)
{
_snapshot = Interlocked.Exchange(ref _cachedSnapshot, null) ?? new StateSnapshot();
}
else
{
_snapshot.Clear();
}
_snapshot.Snap(this);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's there so there is no intermediate state where _snapshot
is not null but is not a correctly setup snapshot. It sets it up locally and then assigns it back to the shared variable. It could be an internlocked assignment but that isn't seems necessary.
#if DEBUG | ||
_rollingPend = 0; | ||
_rollingPendCount = 0; | ||
_stateObj._lastStack = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you didn't add this intentionally, but this seems unnecessary step as stateObj
will be null
eventually. Also in above places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_laststack
is used in some DEBUG pn;y sanity checking so it's a good idea to clear it here to make sure that code stays working.
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParserStateObject.cs
Outdated
Show resolved
Hide resolved
It's coming up on 2 months now. Any feedback on what is blocking this? |
@Wraith2 These core changes always make us nervous about introducing regressions. But obviously I do understand the need for them to move performance improvements forward. I'd like to get this PR merged. Can you merge the latest changes in master to trigger a new CI build against the new public CI pipeline? |
Profiling the DataAccessPerformance project which emulates the TechEmpower fortunes benchmark shows that a lot the work done by SqlClient is spent in managing state snapshots. The data returned to the user is all string instances which are placed in a Fortune object but these aren't the dominators in the memory profile.
This PR changes the implementation of the snapshot mechanisms used by
SqlDataReader
andTdsStateParserObject
to:Keep track of a cached
SqlDataReader
snapshot object once one is created so that it can be efficiently reused. This is possible because a only a single async operation is permitted at any time. Access to the cached instance uses interlocked to take the instance so that it cannot ever be used twice and lazily returns the cleared object using standard assignment because creating a new one every now again again isn't a problem as long as it is usually reused. Under load one snapshot was created per reader and reused cleanly.Keep track of a cached
TdsStateParserObject
snapshot in a similar way toSqlDataReader
but using interlocked for both rent and return to the cache variable.Use slightly smaller data structures by compressing multiple boolean fields on
TdsParserStateObject
into a flags enumeration, this makes multiple flags copy and restore a single copy not 5. All access to the affected properties is now done through accessor functions.Introduce a small class
Snapshot.PLPData
to store any partially length prefixed data state if it is used, if it is not used the allocation object object and continual tracking of 128 bits of data it contains are avoided.Change
Snapshot.PacketData
to be a self assembling singly linked list. This removesList<PacketData>
and linkedPacketData[]
allocations when taking snapshots and allows a cached PacketData link to be retained in the snapshot since one will always be required is a snapshot is used.I also removed most of the setting of
TdsParserStateObject
variables to default values so it is now easier to tell when they are initialized to non-default values. The only exceptions are some variables which must be initialized to default because they are only touched through reflection in testing so the compiler will complain that it can't see them being set.Profile results before, note that of the top 6 lines only the 4th is a result object we actually want:
After:
I have another branch which removes lines 3 4 and 5 intervening async machinery allocations which will give some more gains
Performance results are reasonable showing 8-10% throughput increase.
The function and manual tests pass. This work had already been done and tested on the corefx branch so it's had some scrutiny there as well. Coverage I can't tell about but the fact that I'm directly seeing the types being used to take them out of the profile indicates they're definitely being touched. The DataAccessPerformance benchmark is a reasonable stress test, and I've found race conditions with it before so getting through it repeatedly successfully for minutes at a time is an indicator of stability to me.
Note: if this is ported to netfx the snapshot enum comparison in TdsParserStateObject will need to be changed from using HasFlag to manual masking. On coreclr HasFlag was optimized to a jit primitive so it produces very simple and fast assembly instructions. On netfx HasFlag performance is notoriously poor and causes boxing of the argument.