Prepared statement performance fixes #11
1) Further speedups to prepared statement hashing 2) Caching of '?' character positions in prepared statements to speed up parameter substitution
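The idea behind item 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the driver's actual code: the class `ParameterPositionCache`, its methods, and the `@P0`-style replacement values are hypothetical, and the scan only skips '?' inside single-quoted literals (a real parser would also need to handle comments, bracketed identifiers, and similar cases).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remember where each '?' placeholder sits so that
// repeated substitutions on the same SQL text do not rescan the string.
final class ParameterPositionCache {
    // SQL text -> offsets of its '?' placeholders (assumed cache layout).
    private static final Map<String, int[]> CACHE = new ConcurrentHashMap<>();

    // Returns the cached offsets, scanning the SQL only on the first request.
    static int[] positionsFor(String sql) {
        return CACHE.computeIfAbsent(sql, ParameterPositionCache::scan);
    }

    // Simplified scan: ignores '?' that appear inside single-quoted literals.
    static int[] scan(String sql) {
        List<Integer> found = new ArrayList<>();
        boolean inLiteral = false;
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '\'') {
                inLiteral = !inLiteral;
            } else if (c == '?' && !inLiteral) {
                found.add(i);
            }
        }
        int[] out = new int[found.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = found.get(i);
        }
        return out;
    }

    // Replaces the i-th placeholder with values[i] using the cached offsets.
    static String substitute(String sql, String[] values) {
        int[] pos = positionsFor(sql);
        StringBuilder sb = new StringBuilder(sql.length() + 16);
        int last = 0;
        for (int i = 0; i < pos.length; i++) {
            sb.append(sql, last, pos[i]).append(values[i]);
            last = pos[i] + 1;
        }
        return sb.append(sql, last, sql.length()).toString();
    }
}
```

The first call for a given SQL string pays for the scan; later calls reuse the same `int[]`, which is where the saving would come from for statements that are prepared repeatedly.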
The changes to ParsedSQLMetadata.java, SQLServerParameterMetaData.java, and SQLServerPreparedStatement.java are all for the caching of parameter positions. Most of the changes to SQLServerConnection.java are also for the caching of parameter positions; the exceptions are lines 126/127 (a simple speedup related to the CityHash128 change, originally suggested by @brettwooldridge) and lines 130/131-133 (also a speedup related to the CityHash128 change).
Note that line 132 discards the top 8 bits of each character of the SQL text while converting it to a byte array. The SQL will mostly be 7-bit ASCII, and in our application it is entirely 7-bit ASCII, so for us this gives maximum hashing performance, but for some customers the SQL could contain schema/table/column names that use the full 16 bits of a UTF-16 character. For alphabetic languages that use only a 256-character codepage, discarding the top 8 bits should still be safe. For ideographic languages with larger codepages, discarding the top 8 bits could discard vital information if there are schemas/tables/columns whose names differ only in the top 8 bits. So in general, this isn't safe for all locales. Possible solutions include: doing something slower but safe for all users, providing a setting to select between fast and safe behaviors, or choosing between fast and safe behaviors depending on some locale setting.
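To make the risk concrete, here is a small sketch of how two different SQL strings can collapse to the same byte array when only the low 8 bits of each character are kept. The class name `LowByteCollisionDemo`, the helper `lowBytes`, and the two sample table names are illustrative assumptions; the conversion itself uses the same deprecated `String.getBytes(int, int, byte[], int)` overload that appears in the diff.

```java
import java.util.Arrays;

// Illustration only: shows that low-byte truncation can map distinct SQL
// strings to identical bytes, and therefore to identical hash keys.
final class LowByteCollisionDemo {
    // Copies the low-order 8 bits of each char; the high 8 bits are dropped.
    @SuppressWarnings("deprecation")
    static byte[] lowBytes(String s) {
        byte[] bytes = new byte[s.length()];
        s.getBytes(0, s.length(), bytes, 0);
        return bytes;
    }

    public static void main(String[] args) {
        // U+4E2D and U+4F2D share the low byte 0x2D and differ only in the high byte.
        String a = "SELECT * FROM \u4E2D";
        String b = "SELECT * FROM \u4F2D";
        System.out.println(a.equals(b));                               // false
        System.out.println(Arrays.equals(lowBytes(a), lowBytes(b)));   // true
    }
}
```

Because the byte arrays are equal, any hash computed over them is equal too, so a cache keyed on that hash could treat the two statements as the same one.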
Codecov Report
@@               Coverage Diff                @@
##             hashKeyChanges     #11    +/-  ##
================================================
+ Coverage           48.15%   48.18%  +0.02%
- Complexity           2574     2581      +7
================================================
  Files                 113      113
  Lines               26724    26734     +10
  Branches             4474     4476      +2
================================================
+ Hits                12870    12882     +12
+ Misses              11717    11708      -9
- Partials             2137     2144      +7
- segments = CityHash.cityHash128(s.getBytes(), 0, s.length());
+ byte[] bytes = new byte[s.length()];
+ s.getBytes(0, s.length(), bytes, 0);
+ segments = CityHash.cityHash128(bytes, 0, bytes.length);
@RDearnaley I would suggest going ahead with the slightly safer:
byte[] bytes = new byte[s.length() * 2];
for (int i = 0; i < s.length(); i++) {
final int c = s.charAt(i);
bytes[i * 2] = (byte) c;
bytes[(i * 2) + 1] = (byte)(c >> 8);
}
segments = CityHash.cityHash128(bytes, 0, bytes.length);
1. Receiving the charAt() result into an int avoids one implicit conversion, as shift operators are not supported on char directly and the (byte) c cast also likely inflates to an int before casting.
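As a quick check of the suggestion above, the loop can be wrapped in a helper and compared against the low-byte approach. This is a sketch under assumptions: the class `Utf16KeyBytes` and the method `toBytes` are hypothetical names, and the comparison with `StandardCharsets.UTF_16LE` is only there to show that, for well-formed strings, the loop writes each char low byte first, high byte second.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical wrapper around the suggested loop: two bytes per char,
// low byte first, so no bits of the SQL text are discarded.
final class Utf16KeyBytes {
    static byte[] toBytes(String s) {
        byte[] bytes = new byte[s.length() * 2];
        for (int i = 0; i < s.length(); i++) {
            final int c = s.charAt(i);
            bytes[i * 2] = (byte) c;               // low 8 bits
            bytes[(i * 2) + 1] = (byte) (c >> 8);  // high 8 bits
        }
        return bytes;
    }

    public static void main(String[] args) {
        // Same pair that collides when only the low byte is kept.
        String a = "SELECT * FROM \u4E2D";
        String b = "SELECT * FROM \u4F2D";
        System.out.println(Arrays.equals(toBytes(a), toBytes(b)));  // false
        // For well-formed strings this matches the UTF-16LE encoding.
        System.out.println(Arrays.equals(toBytes(a), a.getBytes(StandardCharsets.UTF_16LE)));  // true
    }
}
```

The cost is a byte array twice as long, so the hash has twice as much input to process; whether that matters in practice would need to be measured.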
Further speedups to prepared statement hashing
Caching of '?' character positions in prepared statements to speed up parameter substitution