-
Notifications
You must be signed in to change notification settings - Fork 412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storages: Fix comparing null in MinMaxIndex #9373
Conversation
@@ -258,8 +258,6 @@ RSResults MinMaxIndex::checkNullableIn( | |||
{ | |||
const auto & column_nullable = static_cast<const ColumnNullable &>(*minmaxes); | |||
const auto & null_map = column_nullable.getNullMapColumn(); | |||
|
|||
RSResults results(pack_count, RSResult::SomeNull); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just move some results
to a smaller scope.
check(createGreater(attr("Int64"), null_value), RSResult::NoneNull); | ||
check(createGreaterEqual(attr("Int64"), null_value), RSResult::NoneNull); | ||
check(createLess(attr("Int64"), null_value), RSResult::AllNull); | ||
check(createLessEqual(attr("Int64"), null_value), RSResult::AllNull); | ||
check(createEqual(attr("Int64"), null_value), RSResult::NoneNull); | ||
check(createNotEqual(attr("Int64"), null_value), RSResult::AllNull); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why "less"/"lessequal"/"noteuqal" return "AllNull" but others return "NoneNull"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because Less
is transformed to not GreaterEqual
in implementation...
GreaterEqual(NULL)
will return RSResult::NoneNull
as expected.
tiflash/dbms/src/Storages/DeltaMerge/Filter/Less.h
Lines 32 to 37 in d113404
RSResults roughCheck(size_t start_pack, size_t pack_count, const RSCheckParam & param) override | |
{ | |
auto results = minMaxCheckCmp<RoughCheck::CheckGreaterEqual>(start_pack, pack_count, param, attr, value); | |
std::transform(results.begin(), results.end(), results.begin(), [](RSResult result) { return !result; }); | |
return results; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LessEqual
is transformed to not Greater
.
NotEqual
is not Equal
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better add some test case about this: https://dev.mysql.com/doc/refman/8.4/en/working-with-null.html
Two NULL values are regarded as equal in a GROUP BY.
When doing an ORDER BY, NULL values are presented first if you do ORDER BY ... ASC and last if you do ORDER BY ... DESC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
Co-authored-by: JaySon <tshent@qq.com>
Co-authored-by: JaySon <tshent@qq.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: JaySon-Huang, Lloyd-Pottiger The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
What problem does this PR solve?
Issue Number: close #9374
Problem Summary:
The filter of
select * from test.t where not (a between NULL and '2024-08-25')
will be transformed in TiFlash asnot (a between NULL and '2024-08-25')
=>not (a >= NULL and a <= '2024-08-25')
=>a < NULL or a > '2024-08-25'
.The problem is how
RSOperator
andMinMaxIndex
handlea < NULL
.As the code below,
Less
will be transformed intonot GreaterEqual
:tiflash/dbms/src/Storages/DeltaMerge/Filter/Less.h
Lines 32 to 37 in 81cd947
MinMaxIndex
executeGreaterEqual
and the result for comparing withNULL
isRSResult::None
.tiflash/dbms/src/Storages/DeltaMerge/Index/MinMaxIndex.cpp
Lines 422 to 426 in 81cd947
not RSResult::None
equals toRSResult::All
. And PR #9361 will skip filter if the filter result of a Block isRSResult::All
.The root cause is that the result of comparing any value with null is null, returning
RSResult::None
is incorrect.What is changed and how it works?
Handling null input value, in
checkCmp
:tiflash/dbms/src/Storages/DeltaMerge/Index/MinMaxIndex.cpp
Lines 422 to 426 in 81cd947
Handling null input value, in
checkIn
:tiflash/dbms/src/Storages/DeltaMerge/Index/RoughCheck.h
Lines 76 to 78 in 81cd947
For some filters like
checkCmp(null)
, the result isRSResult::NoneNull
and the block can be skipped.For some filters like
not checkCmp(null)
, the result isRSResult::AllNull
and the block will be read and filtered.In fact, the result
checkCmp(null)
andnot checkCmp(null)
should always be null. Maybe we can extendRSResult
to support this. However, comparing(>, <, >=, <=, ==, !=, in) with null is not a normal operation in MySQL. So just let it fallback toFilterTransforamAction
if neccessary.Check List
Tests
Side effects
Documentation
Release note