Int32Type: avoid unnecessary boxing for common cases #3513
@wilbit can you show some numbers? |
@hazzik , sure! What numbers would you like to see? Memory traffic difference, performance difference, something else? |
I agree with this effort, but I believe it should go beyond just Int32Type. Every primitive type suffers from this to some extent. However, the only thing that significantly improves performance (as in CPU performance, time taken) is to not be permissive about the type returned in the reader. GetInt32 is at least twice as fast as GetValue (rs[...]), but that performance advantage is gone when preceding it with a GetFieldType. What remains is getting rid of the unnecessary boxing, which is of course very good in itself (probably more important) and the stated purpose of the PR. |
Would it be worth it to add a new option, something like That would allow for a simpler implementation of the optimization, which would be opt-in for those ensuring their database typing matches their mapping. |
I think there will be boxing anyway. What we want to achieve, if I understand correctly, is unnecessary
I think the following code should suffice:
var value = rs[name]; // object
return rs[name] switch
{
    int _ => value,
    BigInteger bi => (int) bi,
    var c => Convert.ToInt32(c)
};
@wilbit could you please check if that would improve your scenario? |
One step at a time.
This new option seems fragile to me, because different DbReaders differ in how they handle small type mismatches (like
Interesting. It should be the following:
var value = rs[name]; // object
return value switch
{
    int _ => value,
    BigInteger bi => (int) bi,
    var c => Convert.ToInt32(c)
};
to avoid one boxing, but I got your point. Yes, it will work for what is said in the PR.
So, it won't work for my scenario, but will work for the improvement in this PR. PS: after some consideration, I think your approach will also work for me (it will change nothing for the Firebird 4/BigInteger case, but I'm OK with that). |
The double boxing has been introduced by 990cd6a twenty years ago due to some deficiencies of a driver at that time. Prior to that, the code was using the typed methods of the data reader, without attempting to support unexpected types. I consider it to be a bad legacy. It is neither sound nor safe to have a type mismatch between the model property type and the underlying table column type. Or am I missing cases where that is legit? |
Yes: SQLite. It has long been on my todo list to redesign the type system to uncouple the Get and Set logic from the types themselves. Hibernate did this in, I think, v6.0. |
The SQLite driver handles SQLite lax typing itself. It does perform the required conversion itself when a typed method is used, according to deAtog in #3530. |
I like the concept, but I do not agree with the approach taken here. In my opinion, NHibernate should always call the type specific methods of the DbDataReader. If and only if that fails, should any other conversions be performed using the value of the indexer property. Any calls to Convert must specify a locale which must be configurable by the user and default to CultureInfo.InvariantCulture. Any driver agnostic conversion must not rely on the type returned from calling GetFieldType of a DbDataReader. The System.Data.SQLite driver for instance will automatically convert numeric types to a text form when inserting into a TEXT column and automatically convert from the text form to the numeric type when requested to do so. In such cases, a call to GetFieldType would return a String while calling SQLiteDataReader.GetDouble would return a double after converting the value. The underlying database driver must be given the opportunity to convert the value to the correct type if necessary prior to any other conversion. This allows the underlying driver to perform the conversion using the locale of the database, or perform any other database specific conversions. The GetDateTime method of the System.Data.SQLite driver, for instance, will attempt to convert a value stored in a TEXT, INTEGER, or NUMERIC column to a DateTime type using the DateTimeFormat, DateTimeFormatString, and DateTimeKind properties of the connection string. The indexer method of the SQLiteDataReader will not perform this conversion unless the DetectTextAffinity or DetectStringType flags are set on the Connection or Command object. These flags cause the driver to examine the values of all string columns returned from the database and negatively impact performance dramatically. The GetDateTime method does not have this problem as it only attempts conversions of requested columns. 990cd6a indicates there might be issues with the Oracle driver with regard to the type specific methods.
If those methods throw exceptions, then any subsequent conversion using the indexer property of the DbDataReader would work as expected. However, this would be extremely inefficient for the Oracle driver as exceptions are inherently slow in .NET and this would likely cause a flurry of exceptions if these methods throw an exception on every call. In my opinion, the problematic driver should have been encapsulated by a proxy that implemented the type specific functions of the DbDataReader by converting the value from the indexer method to the appropriate type in the Oracle specific way. This would eliminate the flurry of exceptions for that driver. The above approach could be taken here if the Microsoft.Data.SqlClient driver cannot perform the requested conversions without throwing exceptions for common type conversions. Such an implementation could rely on the value of calling GetFieldType of the underlying DbDataReader if necessary. |
For examples of why calling Convert without specifying a locale is wrong, see all the failing test cases I added in this pull request: #3548. The original failing case involved System.Data.SQLite's handling of DateTime, which stores DateTime values as text by default. I expanded my tests to include other common cases where a user might map a numeric property to a text column. A majority of these tests fail for the databases in the test suite here, as the database will coerce numeric and date values to text upon insert, but NHibernate is not able to correctly retrieve the value. I suspect many of the failing tests would pass if the type specific methods of the DbDataReader were called instead of using Convert. This is certainly the case for SQLite. |
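A minimal sketch of the culture concern raised in the comment above, not taken from #3548 or from NHibernate's code. `ConvertCultureSketch` and `CommaDecimalCulture` are hypothetical names; the custom culture is built by cloning `CultureInfo.InvariantCulture` so that the example does not depend on which cultures are installed on the machine.

```csharp
using System;
using System.Globalization;

public static class ConvertCultureSketch
{
    // Hypothetical culture for illustration only: ',' is the decimal
    // separator and '.' is the group separator, as in many European locales.
    public static CultureInfo CommaDecimalCulture()
    {
        var culture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
        culture.NumberFormat.NumberDecimalSeparator = ",";
        culture.NumberFormat.NumberGroupSeparator = ".";
        return culture;
    }

    // Conversion with an explicit provider: the result does not depend on
    // the thread's current culture.
    public static double ToDouble(object value, IFormatProvider provider)
    {
        return Convert.ToDouble(value, provider);
    }

    public static void Main()
    {
        object fromTextColumn = "1.5"; // e.g. a numeric value stored as text

        // With the invariant culture, '.' is the decimal separator.
        Console.WriteLine(ToDouble(fromTextColumn, CultureInfo.InvariantCulture));

        // With the comma-decimal culture, '.' is read as a group separator,
        // so the same text parses to a different number.
        Console.WriteLine(ToDouble(fromTextColumn, CommaDecimalCulture()));
    }
}
```

`Convert.ToDouble(object)` without a provider uses the current culture, so the same stored text can yield different numbers on differently configured machines; passing the provider explicitly makes the choice visible.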
@wilbit could you please update the PR? There are conflicts now. |
Sure, will do, and will apply your suggestions. |
and use the previous implementation as a fallback for the rest of the types
When can it be merged, @hazzik? |
@wilbit sorry, missed the notification about force push. It seems that async code needs to be regenerated. Could you please do it or "allow edits from contributors" on the PR? |
I tried to run "ShowBuildMenu -> Generate async code" before submitting the PR, but it does not produce any changes to commit =( I have no idea what the "Generate Async code / generate-async (pull_request_target)" task tries to commit 🤔
This feature is not available for me because my fork is not under personal GitHub account. |
Indexed properties of DbDataReader return a boxed value, which we convert to Int32 (and then box again, but that is a different story). This is a performance hit and produces memory traffic. Why? Because, for example, SqlDataReader in Microsoft.Data.SqlClient stores int (and other value types) unboxed, so reading the value as object forces SqlDataReader to box it.
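The idea discussed in this thread (use the typed getter when the column really is an `int`, and keep the previous conversion as a fallback for other types) can be sketched roughly as follows. This is an illustrative sketch, not the code merged in the PR: `Int32ReadSketch`, `ReadInt32` and `SampleReader` are hypothetical names, and `DataTableReader` is used only so the example runs without a database. Whether the typed getter actually avoids an allocation depends on the provider.

```csharp
using System;
using System.Data;
using System.Data.Common;

public static class Int32ReadSketch
{
    // Hypothetical helper: fast path for int columns, fallback otherwise.
    public static int ReadInt32(DbDataReader rs, int ordinal)
    {
        // Common case: the column type matches the mapping, so the typed
        // getter can return the value without going through object.
        if (rs.GetFieldType(ordinal) == typeof(int))
        {
            return rs.GetInt32(ordinal);
        }

        // Fallback: read through the indexer (boxed) and convert,
        // as the pre-existing permissive behavior did.
        return Convert.ToInt32(rs[ordinal]);
    }

    // Builds an in-memory reader positioned on its first row.
    // Stand-in for a real provider's reader, for demonstration only.
    public static DbDataReader SampleReader()
    {
        var table = new DataTable();
        table.Columns.Add("id", typeof(int));
        table.Columns.Add("text_id", typeof(string));
        table.Rows.Add(42, "7");

        DbDataReader reader = table.CreateDataReader();
        reader.Read();
        return reader;
    }

    public static void Main()
    {
        using (var rs = SampleReader())
        {
            Console.WriteLine(ReadInt32(rs, 0)); // int column: typed getter
            Console.WriteLine(ReadInt32(rs, 1)); // string column: fallback
        }
    }
}
```

As noted earlier in the conversation, the `GetFieldType` check costs roughly what the typed getter saves in CPU time, so the expected benefit of this shape is fewer allocations rather than faster reads.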