This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
SqlClient optimize SqlDataReader and TdsStateParser snapshots #37241
Profiling the DataAccessPerformance project, which emulates the TechEmpower fortunes benchmark, shows that a lot of the work done by SqlClient goes into managing state snapshots. The data returned to the user is all string instances, which are placed in a Fortune object, but these aren't the dominators in the memory profile.
This PR changes the implementation of the snapshot mechanisms used by `SqlDataReader` and `TdsParserStateObject` to:

- Keep track of a cached `SqlDataReader` snapshot object once one is created so that it can be efficiently reused. This is possible because only a single async operation is permitted at any time. Access to the cached instance uses interlocked operations to take the instance so that it can never be used twice. The cleared object is returned lazily using standard assignment, because creating a new one every now and again isn't a problem as long as it is usually reused. Under load one snapshot was created per reader and reused cleanly.
- Keep track of a cached `TdsParserStateObject` snapshot in a similar way to `SqlDataReader`, but using interlocked operations for both rent and return to the cache variable.
- Use slightly smaller data structures by compressing multiple boolean fields on `TdsParserStateObject` into a flags enumeration. This makes copying and restoring multiple flags a single copy instead of 5. All access to the affected properties is now done through accessor functions.
- Introduce a small class `Snapshot.PLPData` to store any partially length prefixed data state if it is used. If it is not used, the object allocation and the continual tracking of the 128 bits of data it contains are avoided.
- Change `Snapshot.PacketData` to be a self-assembling singly linked list. This removes the `List<PacketData>` and linked `PacketData[]` allocations when taking snapshots, and allows a cached `PacketData` link to be retained in the snapshot, since one will always be required if a snapshot is used.

I also removed most of the setting of `TdsParserStateObject` variables to default values, so it is now easier to tell when they are initialized to non-default values. The only exceptions are some variables which must be initialized to default because they are only touched through reflection in testing, so the compiler would otherwise complain that it can't see them being set.

Profile results before; green marks the result objects we actually want:
![snapshot-master](https://user-images.githubusercontent.com/13322696/56849086-170dea00-68e8-11e9-981b-3031564bc544.PNG)
After:
![snapshot-branch](https://user-images.githubusercontent.com/13322696/56849093-2725c980-68e8-11e9-876d-5bc8a6205006.PNG)
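For readers unfamiliar with the rent/return pattern used for the cached snapshots, here is a minimal sketch of the idea. It is not the code from this PR: `Snapshot`, `SnapshotCache`, and their members are hypothetical names, and using `CompareExchange` for the interlocked return is an assumption about one reasonable way to do it.

```csharp
using System.Threading;

// Hypothetical stand-in for snapshot state; not the real SqlClient type.
public sealed class Snapshot
{
    public int RowsRead;
    public void Clear() => RowsRead = 0;
}

// Hypothetical single-slot cache illustrating the two return strategies.
public sealed class SnapshotCache
{
    private Snapshot _cached;

    // Rent: atomically take the cached instance and leave null behind, so two
    // callers can never receive the same object. Allocate if the slot is empty.
    public Snapshot Rent() =>
        Interlocked.Exchange(ref _cached, null) ?? new Snapshot();

    // Lazy return via plain assignment. A racing return may overwrite another
    // instance, which only costs one extra allocation later on.
    public void Return(Snapshot snapshot)
    {
        snapshot.Clear();
        _cached = snapshot;
    }

    // Interlocked return: store the instance only if the slot is empty.
    public void ReturnInterlocked(Snapshot snapshot)
    {
        snapshot.Clear();
        Interlocked.CompareExchange(ref _cached, snapshot, null);
    }
}
```

Because the rent side always swaps the slot to null, correctness never depends on the return side; the return strategy only affects how often a fresh object has to be allocated.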
I have another branch which removes 3 of the 4 intervening async machinery allocations which will give some more gains but they're more modest and it needs a little more polish.
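To illustrate the flags change from the list above, the sketch below shows how several boolean fields can be folded into one flags enumeration so that a snapshot captures and restores them with a single copy. The enum, the flag names, and the `StateObject` class are hypothetical and are not the actual members of `TdsParserStateObject`.

```csharp
using System;

// Hypothetical flag names; the real fields in the PR may differ.
[Flags]
public enum SnapshottedFlags : byte
{
    None = 0,
    PendingData = 1 << 0,
    OpenResult = 1 << 1,
    ErrorTokenReceived = 1 << 2,
    ColumnMetadataReceived = 1 << 3,
    AttentionReceived = 1 << 4,
}

// Hypothetical state object exposing the flags through accessors.
public sealed class StateObject
{
    private SnapshottedFlags _flags;

    public bool HasPendingData
    {
        get => GetFlag(SnapshottedFlags.PendingData);
        set => SetFlag(SnapshottedFlags.PendingData, value);
    }

    public bool HasOpenResult
    {
        get => GetFlag(SnapshottedFlags.OpenResult);
        set => SetFlag(SnapshottedFlags.OpenResult, value);
    }

    private bool GetFlag(SnapshottedFlags flag) => (_flags & flag) == flag;

    private void SetFlag(SnapshottedFlags flag, bool value) =>
        _flags = value ? (_flags | flag) : (_flags & ~flag);

    // Snapshot support: every flag is captured or restored in one copy.
    public SnapshottedFlags CaptureFlags() => _flags;
    public void RestoreFlags(SnapshottedFlags flags) => _flags = flags;
}
```

Callers keep reading and writing what look like ordinary boolean properties, while the snapshot code only ever touches the single backing field.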
Performance results are good: a 17% throughput increase and halved variance in query time.
Manual and functional tests pass in native mode. DataAccessPerformance runs without problems under pure load and under profilers.
/cc area owners @afsanehr, @tarikulsabbir, @Gary-Zh, @David-Engel; people interested in perf: @divega, @roji, @saurabh500