Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subquery NULL compatibility #3773

Closed
breezewish opened this issue Jul 17, 2017 · 7 comments · Fixed by #19620
Closed

subquery NULL compatibility #3773

breezewish opened this issue Jul 17, 2017 · 7 comments · Fixed by #19620
Assignees
Labels
challenge-program help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/P2 The issue has P2 priority. severity/critical sig/execution SIG execution type/bug The issue is confirmed as a bug.
Milestone

Comments

@breezewish
Copy link
Member

breezewish commented Jul 17, 2017

Description

MySQL:

mysql> drop table t1;
mysql> create table t1 (a varchar(20));
mysql> insert into t1 values ('A'),('BC'),('DEF');
mysql> select NULL IN (SELECT * from t1);
+----------------------------+
| NULL IN (SELECT * from t1) |
+----------------------------+
|                       NULL |
+----------------------------+
1 row in set (0.00 sec)

TiDB:

mysql> drop table t1;
mysql> create table t1 (a varchar(20));
mysql> insert into t1 values ('A'),('BC'),('DEF');
mysql> select NULL IN (SELECT * from t1);
mysql> select NULL IN (SELECT * from t1);
+----------------------------+
| NULL IN (SELECT * from t1) |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (0.00 sec)

SIG slack channel

#sig-exec

Score

  • 900

Mentor

@zz-jason zz-jason self-assigned this Jul 31, 2017
@zz-jason zz-jason added the WIP label Aug 1, 2017
@zz-jason
Copy link
Member

zz-jason commented Aug 1, 2017

another case:

drop table if exists subq;
drop table if exists t;
create table subq(a bigint);
create table t(a bigint, b bigint);
insert into subq (a) values(1), (2), (3), (null);
insert into t (a, b) values (null, 1), (2, null);
select a in (select * from subq) from t;
select b in (select * from subq) from t;
select null in (select * from subq) from t;
select 1 in (select * from subq) from t;
select 6 in (select * from subq) from t;

@zz-jason
Copy link
Member

zz-jason commented Aug 1, 2017

TiDB > select a in (select * from subq) from t;
+---------------------------+
| a in (select * from subq) |
+---------------------------+
|                      NULL |
|                         1 |
+---------------------------+
2 rows in set (0.00 sec)

this behaves the same as MySQL

TiDB > select null in (select * from subq) from t;
+------------------------------+
| null in (select * from subq) |
+------------------------------+
|                            0 |
|                            0 |
+------------------------------+
2 rows in set (0.00 sec)

TiDB > select 6 in (select * from subq) from t;
+---------------------------+
| 6 in (select * from subq) |
+---------------------------+
|                         0 |
|                         0 |
+---------------------------+
2 rows in set (0.00 sec)

in MySQL, the above two sql returns two NULL:

mysql> select null in (select * from subq) from t;
+------------------------------+
| null in (select * from subq) |
+------------------------------+
|                         NULL |
|                         NULL |
+------------------------------+
2 rows in set (0.00 sec)

mysql> select 6 in (select * from subq) from t;
+---------------------------+
| 6 in (select * from subq) |
+---------------------------+
|                      NULL |
|                      NULL |
+---------------------------+
2 rows in set (0.00 sec)

the IN sub-query is built as a semi-join in TiDB:

TiDB > explain select null in (select * from subq) from t;
+----------------+----------------+------+-------------------------------------------------+
| id             | parents        | task | operator info                                   |
+----------------+----------------+------+-------------------------------------------------+
| TableScan_9    |                | cop  | table:t, range:(-inf,+inf), keep order:false    |
| TableReader_10 | HashSemiJoin_8 | root | data:TableScan_9                                |
| TableScan_11   |                | cop  | table:subq, range:(-inf,+inf), keep order:false |
| TableReader_12 | Selection_7    | root | data:TableScan_11                               |
| Selection_7    | HashSemiJoin_8 | root | eq(null, cast(test.subq.a))                     |
| HashSemiJoin_8 |                | root | right:Selection_7, aux                          |
+----------------+----------------+------+-------------------------------------------------+
6 rows in set (0.00 sec)
TiDB > explain select 6 in (select * from subq) from t;
+----------------+----------------+------+-------------------------------------------------+
| id             | parents        | task | operator info                                   |
+----------------+----------------+------+-------------------------------------------------+
| TableScan_8    |                | cop  | table:t, range:(-inf,+inf), keep order:false    |
| TableReader_9  | HashSemiJoin_7 | root | data:TableScan_8                                |
| TableScan_10   | Selection_11   | cop  | table:subq, range:(-inf,+inf), keep order:false |
| Selection_11   |                | cop  | eq(6, test.subq.a)                              |
| TableReader_12 | HashSemiJoin_7 | root | data:Selection_11                               |
| HashSemiJoin_7 |                | root | right:TableReader_12, aux                       |
+----------------+----------------+------+-------------------------------------------------+
6 rows in set (0.01 sec)

to fix this issue, we have to:

  • do not extract on conditions composed by a column and constant to left or right table filter conditions in function extractOnCondition
  • support constant join keys

@morgo
Copy link
Contributor

morgo commented Nov 24, 2018

Confirming that this can still be reproduced:

tidb> drop table t1;
Query OK, 0 rows affected (0.02 sec)

tidb> create table t1 (a varchar(20));
Query OK, 0 rows affected (0.02 sec)

tidb> insert into t1 values ('A'),('BC'),('DEF');
Query OK, 3 rows affected (0.00 sec)

tidb> select NULL IN (SELECT * from t1);
+----------------------------+
| NULL IN (SELECT * from t1) |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (0.00 sec)

tidb> SELECT tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v2.1.0-rc.3-219-g1e0876fe8-dirty
Git Commit Hash: 1e0876fe810a832721aac52275dd2b7792fd2892
Git Branch: flush
UTC Build Time: 2018-11-24 01:12:47
GoVersion: go version go1.11 linux/amd64
Race Enabled: false
TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e
Check Table Before Drop: false
1 row in set (0.00 sec)

@wwar
Copy link

wwar commented May 3, 2020

No table required, the easiest test-case for this is:

SELECT NULL IN (SELECT NULL);

TiDB returns a bool (FALSE) versus NULL in MySQL.

@ghost
Copy link

ghost commented Aug 3, 2020

An example to show how this could manifest into bugs:

mysql> SELECT NOT NULL IN (SELECT NULL);
+---------------------------+
| NOT NULL IN (SELECT NULL) |
+---------------------------+
|                         1 |
+---------------------------+
1 row in set (0.00 sec)

This should return NULL.

@ghost ghost added the correctness label Aug 3, 2020
@SunRunAway SunRunAway assigned fzhedu and unassigned zz-jason Aug 11, 2020
@qw4990 qw4990 added the priority/P1 The issue has P1 priority. label Aug 27, 2020
@fzhedu fzhedu added priority/P2 The issue has P2 priority. and removed priority/P1 The issue has P1 priority. labels Aug 28, 2020
@SunRunAway SunRunAway added the type/bug The issue is confirmed as a bug. label Oct 19, 2020
@scsldb scsldb added this to the v4.0.13 milestone Oct 29, 2020
@lzmhhh123 lzmhhh123 added challenge-program help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed status/WIP labels Oct 30, 2020
@lzmhhh123 lzmhhh123 linked a pull request Nov 2, 2020 that will close this issue
@ti-srebot
Copy link
Contributor

ti-srebot commented Nov 12, 2020

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make Sure that 'component', and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

6. Fixed versions

v5.0.0-rc

@ti-srebot
Copy link
Contributor

( AffectedVersions ) fields are empty.
The values in ( FixedVersions ) fields are incorrect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
challenge-program help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/P2 The issue has P2 priority. severity/critical sig/execution SIG execution type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.