Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix comparison of nullable values #33757

Merged
merged 9 commits into from
May 31, 2024

Conversation

ranma42
Copy link
Contributor

@ranma42 ranma42 commented May 19, 2024

Handle comparison of nullable values

  • add a test that reproduces the issue
  • extend SqlNullabilityProcessor to handle comparisons of nullable values
  • cleanup code that was needed because of missing handling of comparisons

Fixes #33752

@ranma42 ranma42 force-pushed the fix-nullable-int-compare-int branch from a8a8641 to 859b2cd Compare May 20, 2024 13:01
@roji roji requested a review from maumar May 20, 2024 15:08
@maumar
Copy link
Contributor

maumar commented May 20, 2024

@ranma42 additional tests are needed. Specifically ones where nullable comparisons are wrapped around Not operation. In SqlNullabilityProcessor -> VisitSqlUnary -> OptimizeNonNullableNotExpression we make an assumption that for non-nullable expression inside Not we can perform a bunch of simplifications:

// !(a > b) -> a <= b
// !(a >= b) -> a < b
// !(a < b) -> a >= b
// !(a <= b) -> a > b

This is not true for operations with nullable values. We should make sure that we do the right thing and don't try to apply this "optimization", given that we add coalesce(x, false) and mark nullability of the expression as false.

@ranma42
Copy link
Contributor Author

ranma42 commented May 21, 2024

@ranma42 additional tests are needed. Specifically ones where nullable comparisons are wrapped around Not operation. In SqlNullabilityProcessor -> VisitSqlUnary -> OptimizeNonNullableNotExpression we make an assumption that for non-nullable expression inside Not we can perform a bunch of simplifications:

// !(a > b) -> a <= b
// !(a >= b) -> a < b
// !(a < b) -> a >= b
// !(a <= b) -> a > b

This is not true for operations with nullable values. We should make sure that we do the right thing and don't try to apply this "optimization", given that we add coalesce(x, false) and mark nullability of the expression as false.

I re-enabled the Negated_order_comparison_on_nullable_arguments_doesnt_get_optimized test which seems to cover exactly the risk you are describing.
Note that it only checks for the "WHERE" case, if we also want to test that the nullability processing is correct in the "SELECT" case we might want another test (OTOH optimizations in WHERE are generally going to be more aggressive than in SELECT).

Is there a way to observe code coverage using the current testing infrastructure?

@maumar
Copy link
Contributor

maumar commented May 21, 2024

@ranma42 we don't have any dedicated code coverage infra and as such we don't rely on code coverage too much. I debugged into the code and the COALESCE node blocks the optimizations I was worried about, so we should be safe here. I would add a test for projection, just in case.

There are also some test failures due to sql changes, e.g. Composite_key_join_on_groupby_aggregate_projecting_only_grouping_key

before:

SELECT [l2].[Key]
FROM [LevelOne] AS [l]
INNER JOIN (
    SELECT [l1].[Key], COALESCE(SUM([l1].[Id]), 0) AS [Sum]
    FROM (
        SELECT [l0].[Id], [l0].[Id] % 3 AS [Key]
        FROM [LevelTwo] AS [l0]
    ) AS [l1]
    GROUP BY [l1].[Key]
) AS [l2] ON [l].[Id] = [l2].[Key] AND CAST(1 AS bit) = CASE
    WHEN [l2].[Sum] > 10 THEN CAST(1 AS bit)
    ELSE CAST(0 AS bit)
END

now:

SELECT [l2].[Key]
FROM [LevelOne] AS [l]
INNER JOIN (
    SELECT [l1].[Key], COALESCE(SUM([l1].[Id]), 0) AS [Sum]
    FROM (
        SELECT [l0].[Id], [l0].[Id] % 3 AS [Key]
        FROM [LevelTwo] AS [l0]
    ) AS [l1]
    GROUP BY [l1].[Key]
) AS [l2] ON [l].[Id] = [l2].[Key] AND COALESCE(CASE
    WHEN [l2].[Sum] > 10 THEN CAST(1 AS bit)
    ELSE CAST(0 AS bit)
END, CAST(0 AS bit)) = CAST(1 AS bit)

this is not ideal since the CASE block is already non-nullable, so the COALESCE is unnecessary.

@maumar
Copy link
Contributor

maumar commented May 21, 2024

the problem is due to search condition converting visitor on sql server. What's happening in the scenario is that when going through null semantics processor the expression is (true == x.Sum > 10), that gets converted to (true == COALESCE(x.Sum > 10, false)), which makes sense.
Then, simplifying visitor kicks in and removes the redundant true == so we end up with COALESCE(x.Sum > 10, false).
And finally we go through search condition conversion which ends up with

COALESCE(CASE
    WHEN l2.Sum > 10 THEN CAST(1 AS bit)
    ELSE CAST(0 AS bit)
END, CAST(0 AS bit)) == CAST(1 AS bit)

at each step we do the right thing, but end up with worse sql, because search condition is already effectively doing the null protection in this case (but we can't know that at the time we apply the coalesce, and it's only on sql server)

@ranma42
Copy link
Contributor Author

ranma42 commented May 21, 2024

Sorry for the noise, I will check on the full SQL Server testsuite in the next runs 😇

I would eventually like the SQL Server translation to be

SELECT [l2].[Key]
FROM [LevelOne] AS [l]
INNER JOIN (
    SELECT [l1].[Key], COALESCE(SUM([l1].[Id]), 0) AS [Sum]
    FROM (
        SELECT [l0].[Id], [l0].[Id] % 3 AS [Key]
        FROM [LevelTwo] AS [l0]
    ) AS [l1]
    GROUP BY [l1].[Key]
) AS [l2] ON [l].[Id] = [l2].[Key] AND [l2].[Sum] > 10

(no need for any CASE or COALESCE; NULL and FALSE are equivalent in the join condition).

I will try and check whether it is easier to preserve the original translation in this PR or go directly for the improved one.

@ranma42 ranma42 marked this pull request as draft May 21, 2024 21:06
@maumar
Copy link
Contributor

maumar commented May 22, 2024

@ranma42 one thing to keep in mind, for join where keys are anonymous objects we actually want c# null semantics (to mimic linq to objects behavior). See #27071 (comment) for some extra context. If you decide to improve the translation, make sure to add tests that exercise this as well, i.e. join with composite keys where the key elements could end up being nulls

@ranma42
Copy link
Contributor Author

ranma42 commented May 22, 2024

@ranma42 one thing to keep in mind, for join where keys are anonymous objects we actually want c# null semantics (to mimic linq to objects behavior). See #27071 (comment) for some extra context. If you decide to improve the translation, make sure to add tests that exercise this as well, i.e. join with composite keys where the key elements could end up being nulls

Uh, i was not aware of this! In general, it is not trivial to know whether EFCore uses C# or SQL semantics 🤔 and it is even harder along the translation pipeline, as the same expression changes semantics along the way (see ExpressionType.Equal which AFAICT usually starts with C# semantics, but IIUC after the nullability processor should have SQL semantics). I guess I will have to study more on the subject. Is there any documentation I should look up besides https://www.youtube.com/watch?v=1Ld3dtnTrMw ?
I found dotnet/EntityFramework.Docs#1920 (comment) but if it's up to date, I might just have to explore the code (which so far has been fun, so that's not so bad 😅 )

I found the missing optimization that would make the query ideal (see #33776), but regardless of that that optimization I will try and investigate a bit more the pipeline around the changes I am making.

@maumar
Copy link
Contributor

maumar commented May 22, 2024

@ranma42 yeah, it is a problem with working in this area - there is a lot of tribal knowledge and edge cases that need to be taken into consideration. We don't have any up to date documentation regarding this (issue is tracked here: dotnet/EntityFramework.Docs#3692) In general, we try to mimic the c# semantics as much as we can, meaning the final result of the query should (ideally) be identical with the same query ran on Linq 2 objects (ignoring stuff we can't realistically control like case-insensitive comparisons) with EF adding null ref protection to the mix.

Good example of going to great lengths to mimic c# is 517fc18

Internally, we do two types of expression trees, regular expression tree which represents linq query (and uses c# semantics) and SqlExpression tree which represents sql query (and follows sql null semantics), but they all use ExpressionType to describe the operation.

Our usual solution is to add a lot of tests - AssertQuery infra does full result verification vs linq to objects, and they are relatively fast/cheap to run. So when in doubt - add a test, even if there is potential duplication with existing tests. Things to keep in mind:

  • filter vs projection,
  • operations wrapped in NOT,
  • making sure things work well recursively (combining multiple operations, nesting one within another),

@ranma42 ranma42 marked this pull request as ready for review May 22, 2024 08:54
@ranma42 ranma42 marked this pull request as draft May 22, 2024 08:54
@ranma42 ranma42 force-pushed the fix-nullable-int-compare-int branch from cdf1b2c to dd83431 Compare May 22, 2024 21:43
@ranma42
Copy link
Contributor Author

ranma42 commented May 23, 2024

By emitting CASE WHEN expr THEN TRUE ELSE FALSE END instead of COALESCE(expr, FALSE) I managed to fix most of the SQL Server regressions (I believe there are further opportunities for improvements of COALESCE and equivalent operations in SQL Server, but I will try to keep each PR as narrow as possible 😇 ).

The only remaining regression I observe in the current test suite is

--- a/test/EFCore.SqlServer.FunctionalTests/Query/FunkyDataQuerySqlServerTest.cs
+++ b/test/EFCore.SqlServer.FunctionalTests/Query/FunkyDataQuerySqlServerTest.cs
@@ -158,7 +158,10 @@ public override async Task String_contains_on_argument_with_wildcard_column_nega
 SELECT [f].[FirstName] AS [fn], [f0].[LastName] AS [ln]
 FROM [FunkyCustomers] AS [f]
 CROSS JOIN [FunkyCustomers] AS [f0]
-WHERE NOT ([f].[FirstName] IS NOT NULL AND [f0].[LastName] IS NOT NULL AND (CHARINDEX([f0].[LastName], [f].[FirstName]) > 0 OR [f0].[LastName] LIKE N''))
+WHERE [f].[FirstName] IS NULL OR [f0].[LastName] IS NULL OR (CASE
+    WHEN CHARINDEX([f0].[LastName], [f].[FirstName]) > 0 THEN CAST(1 AS bit)
+    ELSE CAST(0 AS bit)
+END = CAST(0 AS bit) AND [f0].[LastName] NOT LIKE N'')
 """);
     }

which I believe is related to some issues with nullability propagation. I am still investigating (and learning things about the translation pipeline 🤩).

@ranma42 ranma42 force-pushed the fix-nullable-int-compare-int branch 3 times, most recently from 9154317 to e0be248 Compare May 25, 2024 19:59
@ranma42
Copy link
Contributor Author

ranma42 commented May 25, 2024

This is now based on:

which takes care of implementing the nullability computation for SqlFunctionExpressions. Thanks to that, the diffs now should look ok.

@ranma42
Copy link
Contributor Author

ranma42 commented May 29, 2024

I rebased on main so that it includes the fixes to the PRs:

Now that the stack of PRs has resolved, this should be ready for review 🎉

@ranma42 ranma42 marked this pull request as ready for review May 29, 2024 22:33
As reported in dotnet#33752, `SELECT`ing the result of a comparison between `int?` and
`int` can trigger an exception in the shaper.
@maumar
Copy link
Contributor

maumar commented May 30, 2024

@ranma42 just the small comment update and it should be good to go!

ranma42 added 6 commits May 30, 2024 22:46
In C# an ordered comparison (<, >, <=, >=) between two nullable values always
returns a boolean value: if either operand is null, the result is false;
otherwise, the result is that of the comparison of the (non-null) values.

Fixes dotnet#33752
The additional `IS NOT NULL` check is not needed anymore since now we do null
semantics/compensation for comparison.
Several queries are now simpler as when filtering we can treat NULL and FALSE as
equivalent results (they both discard the record).
Now that the comparison null semantics has been implemented, it works as
intended.
@ranma42 ranma42 force-pushed the fix-nullable-int-compare-int branch from 89c6c78 to 6e0c36c Compare May 30, 2024 20:50
@ranma42
Copy link
Contributor Author

ranma42 commented May 30, 2024

I added the tests before seeing your edit; I think these tests might be valuable because they make it easy to spot that there are some missing optimizations.
I looked for the WHEN instr() pattern in the code and I was unable to find out which tests they are duplicating; if you point me towards those tests, I will drop the new ones and instead make sure that the SQL emitted by the Sqlite provider is asserted on the original ones.

Sorry for the comment, it was left that way from a previous revision 😅

Comment on lines +881 to +884
WHERE NOT (CASE
WHEN instr("c"."CompanyName", "c"."ContactName") > 0 THEN 1
ELSE 0
END)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally this should be:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
WHERE NOT (CASE
WHEN instr("c"."CompanyName", "c"."ContactName") > 0 THEN 1
ELSE 0
END)
WHERE CASE
WHEN instr("c"."CompanyName", "c"."ContactName") > 0 THEN 0
ELSE 1
END

i.e. we could remove the NOT by distributing it over the CASE.
Can we do better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, filed #33857. We also have #25977 around case nullability, but it's much more complicated and arguably not that useful.

@maumar
Copy link
Contributor

maumar commented May 30, 2024

@ranma42 the tests you added are valuable and I would keep them for sure. My comment was to add more tests for un-optimized string.contains (so that we can check that CASE conversion properly compensates for the fact that we no longer do null check on the argument to string.Contains). But then I noticed that we already have Optional_navigation_type_compensation_works_with_binary_and_expression, so we have the coverage. In general, don't worry about test duplication. More tests is pretty much always welcome.

@maumar
Copy link
Contributor

maumar commented May 30, 2024

Cosmos has some failures, let me take care of it, so you don't need to jump through all the hoops to set it up @ranma42

@maumar maumar merged commit b77d2f4 into dotnet:main May 31, 2024
7 checks passed
@maumar
Copy link
Contributor

maumar commented May 31, 2024

yay, big improvement! thanks again @ranma42

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exception when SELECTing (x ? A : default(int?) >= B)
3 participants