Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

SqlClient optimize SqlDataReader and TdsStateParser snapshots #37241

Closed
wants to merge 1 commit into from

Conversation

Wraith2
Copy link
Contributor

@Wraith2 Wraith2 commented Apr 27, 2019

Profiling the DataAccessPerformance project which emulates the TechEmpower fortunes benchmark shows that a lot the work done by SqlClient is spent in managing state snapshots. The data returned to the user is all string instances which are placed in a Fortune object but these aren't the dominators in the memory profile.

This PR changes the implementation of the snapshot mechanisms used by SqlDataReader and TdsStateParserObject to:

  • Keep track of a cached SqlDataReader snapshot object once one is created so that it can be efficiently reused. This is possible because a only a single async operation is permitted at any time. Access to the cached instance uses interlocked to take the instance so that it cannot ever be used twice and lazily returns the cleared object using standard assignment because creating a new one every now again again isn't a problem as long as it is usually reused. Under load one snapshot was created per reader and reused cleanly.

  • Keep track of a cached TdsStateParserObject snapshot in a similar way to SqlDataReader but using interlocked for both rent and return to the cache variable.

  • Use slightly smaller data structures by compressing multiple boolean fields on TdsParserStateObject into a flags enumeration, this makes multiple flags copy and restore a single copy not 5. All access to the affected properties is now done through accessor functions.

  • Introduce a small class Snapshot.PLPData to store any partially length prefixed data state if it is used, if it is not used the allocation object object and continual tracking of 128 bits of data it contains are avoided.

  • Change Snapshot.PacketData to be a self assembling singly linked list. This removes List<PacketData> and linked PacketData[] allocations when taking snapshots and allows a cached PacketData link to be retained in the snapshot since one will always be required is a snapshot is used.

I also removed most of the setting of TdsParserStateObject variables to default values so it is now easier to tell when they are initialized to non-default values. The only exceptions are some variables which must be initialized to default because they are only touched through reflection in testing so the compiler will complain that it can't see them being set.

Profile results before, green are the result objects we actually want:
snapshot-master

After:
snapshot-branch

I have another branch which removes 3 of the 4 intervening async machinery allocations which will give some more gains but they're more modest and it needs a little more polish.

Performance results are good

name sync threads TPS stdev description
ado-sqlclient+async+64 async 64 59184 2175 snapshot master
ado-sqlclient+async+64 async 64 69371 1019 snapshot branch

17% throughput increase and halved variance in query time.

Manual and functional test pass in native mode. DataAccessPerfomance under pure load and profilers has no problems.
/cc area owners @afsanehr, @tarikulsabbir, @Gary-Zh , @David-Engel , people interested in perf @divega @roji @saurabh500

@karelz
Copy link
Member

karelz commented May 4, 2019

@tarikulsabbir same here ...

@AfsanehR-zz
Copy link
Contributor

Thanks @Wraith2 🎉 We will be looking in to these PRs in near future. Working on internal deadlines which will take us some time to review these PRs.

@Wraith2 Wraith2 force-pushed the sqlperf-snapshots branch from 2d18c7f to fdd847f Compare June 12, 2019 23:10
@karelz karelz added this to the 5.0 milestone Aug 3, 2019
@cheenamalhotra
Copy link
Member

@Wraith2 This PR is being tracked and will be picked up for dotnet/SqlClient. Appreciate your patience for some more time while we come back to review and merge your contributions! 🙏

@Wraith2
Copy link
Contributor Author

Wraith2 commented Aug 6, 2019

How is that going to be done exactly?

@cheenamalhotra
Copy link
Member

Our primary focus right now is publishing GA Release v1.0 for Microsoft.Data.SqlClient in August, for which we are not taking further pulls unless they are necessarily needed. Our next preview release will be a near snapshot of what's coming in GA, for which we are wrapping up activities currently.

Post GA 1.0 we will be open sourcing driver on dotnet/SqlClient and will then come back to review, port, test and merge all the remaining PRs in Microsoft.Data.SqlClient while also considering them to merge in System.Data.SqlClient (if needed). That's when we're aiming to speed up development and delivery and address PRs more actively! :)

@Wraith2 Wraith2 closed this Aug 7, 2019
@Wraith2 Wraith2 deleted the sqlperf-snapshots branch August 7, 2019 15:44
@Wraith2
Copy link
Contributor Author

Wraith2 commented Aug 7, 2019

Based on that plan I've closed all PR's relating to performance improvements for SqlClient and when the code for M.D.SqlClient is available I can rework them onto that codebase. I've left open my bug fix PR's because those are small and applicable to both versions.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants