optimizer lacks feature of OR expansion and predicate elimination #56005

mzhang77 · 2024-09-11T03:13:43Z

Bug Report

1. Minimal reproduce step (Required)

CREATE TABLE `dt` (
  `a` bigint(20) unsigned NOT NULL,
  `pk` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `b` longblob DEFAULT NULL,
  `c` int(11) unsigned NOT NULL,
  PRIMARY KEY (`pk`)
);

CREATE TABLE `it` (
  `pk` bigint(20) unsigned NOT NULL,
  `a` varbinary(767) DEFAULT NULL,
  `b` varbinary(767) DEFAULT NULL,
  `c` varbinary(767) DEFAULT NULL,
  `d` bigint(20) DEFAULT NULL,
  `e` varbinary(767) DEFAULT NULL,
  `f` int(11) unsigned NOT NULL,
  PRIMARY KEY (`pk`) ,
  KEY `a` (`a`,`d`,`c`,`pk`),
  KEY `b` (`b`,`pk`),
  KEY `c` (`d`,`pk`),
  KEY `d` (`e`,`pk`),
  KEY `e` (`c`,`pk`),
  KEY `f` (`a`,`pk`)
);

explain SELECT
  dt.*
FROM
  it 
  LEFT JOIN dt ON it.pk = dt.pk
WHERE
  it.a = "a"
  AND (
    (
      it.a > "a"
    )
    OR (
      it.a = "a" AND it.pk > 1
    )
  )
ORDER BY
  it.pk
LIMIT
  240;

2. What did you expect to see? (Required)

There is an index KEY f (a,pk), and there is where predicate it.a = "a" AND it.pk > 1. We expect to see range:("a" 1,"a" +inf] in execution plan

3. What did you see instead (Required)

This is the full execution plan

+---------------------------------------+---------+-----------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                    | estRows | task      | access object            | operator info                                                                                                                                                                                                                                                                                      |
+---------------------------------------+---------+-----------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_12                         | 4.19    | root      |                          | test.dt.a, test.dt.pk, test.dt.b, test.dt.c                                                                                                                                                                                                    |
| └─Limit_19                            | 4.19    | root      |                          | offset:0, count:240                                                                                                                                                                                                                                                                                |
|   └─IndexHashJoin_80                  | 4.19    | root      |                          | left outer join, inner:TableReader_75, outer key:test.it.pk, inner key:test.dt.pk, equal cond:eq(test.it.pk, test.dt.pk), other cond:or(gt(test.it.a, "a"), and(eq(test.it.a, "a"), gt(test.dt.pk, 1))) |
|     ├─Limit_36(Build)                 | 3.36    | root      |                          | offset:0, count:240                                                                                                                                                                                                                                                                                |
|     │ └─IndexReader_54                | 3.37    | root      |                          | index:Limit_53                                                                                                                                                                                                                                                                                     |
|     │   └─Limit_53                    | 3.37    | cop[tikv] |                          | offset:0, count:240                                                                                                                                                                                                                                                                                |
|     │     └─Selection_52              | 3.37    | cop[tikv] |                          | or(gt(test.it.a, "a"), and(eq(test.it.a, "a"), gt(test.it.pk, 1)))                                                                                                                                                                                          |
|     │       └─IndexRangeScan_51       | 10.00   | cop[tikv] | table:it, index:f(a, pk) | range:["a","a"], keep order:true, stats:pseudo                                                                                                                                                                                                                                                     |
|     └─TableReader_75(Probe)           | 3.36    | root      |                          | data:TableRangeScan_74                                                                                                                                                                                                                                                                             |
|       └─TableRangeScan_74             | 3.36    | cop[tikv] | table:dt                 | range: decided by [test.it.pk], keep order:false, stats:pseudo                                                                                                                                                                                                                        |
+---------------------------------------+---------+-----------+--------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

We see range:["a","a"]. This is causing unnecessary additional data access and poor performance.

4. What is your TiDB version? (Required)

mysql> select @@version;
+--------------------+
| @@version          |
+--------------------+
| 8.0.11-TiDB-v7.5.1 |
+--------------------+
1 row in set (0.00 sec)

The text was updated successfully, but these errors were encountered:

mzhang77 · 2024-09-11T03:21:45Z

To process the above query. Optimizer need to first do an OR expansion:

explain SELECT * FROM
 ( SELECT
  dt.*
FROM
  it 
  LEFT JOIN dt ON it.pk = dt.pk
WHERE
  it.a = "a" AND it.a > "a"
UNION
SELECT
  dt.*
FROM
  it 
  LEFT JOIN dt ON it.pk = dt.pk
WHERE it.a = "a"
      AND it.a = "a"
      AND it.pk > 1
) tb
ORDER BY
  tb.pk
LIMIT
  240;

Then optimizer should eliminate the first branch of the OR expansion, because it.a = "a" AND it.a > "a" is always false. So this should be the final SQL and plan:

explain SELECT * FROM
 ( 
SELECT
  dt.*
FROM
  it 
  LEFT JOIN dt ON it.pk = dt.pk
WHERE it.a = "a"
      AND it.a = "a"
      AND it.pk > 1
) tb
ORDER BY
  tb.pk
LIMIT
  240;

+-------------------------------+---------+-----------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                            | estRows | task      | access object            | operator info                                                                                                                                                                |
+-------------------------------+---------+-----------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TopN_14                       | 41.67   | root      |                          | test.dt.pk, offset:0, count:240                                                                                                                                 |
| └─IndexJoin_27                | 41.67   | root      |                          | left outer join, inner:TableReader_23, outer key:test.it.pk, inner key:test.dt.pk, equal cond:eq(test.it.pk, test.dt.pk) |
|   ├─IndexReader_39(Build)     | 33.33   | root      |                          | index:IndexRangeScan_38                                                                                                                                                      |
|   │ └─IndexRangeScan_38       | 33.33   | cop[tikv] | table:it, index:f(a, pk) | range:("a" 1,"a" +inf], keep order:false, stats:pseudo                                                                                                                       |
|   └─TableReader_23(Probe)     | 11.11   | root      |                          | data:Selection_22                                                                                                                                                            |
|     └─Selection_22            | 11.11   | cop[tikv] |                          | gt(test.dt.pk, 1)                                                                                                                                               |
|       └─TableRangeScan_21     | 33.33   | cop[tikv] | table:dt                 | range: decided by [test.it.pk], keep order:false, stats:pseudo                                                                                                  |
+-------------------------------+---------+-----------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

ghazalfamilyusa · 2024-09-11T16:05:59Z

We have separate logic of predicate simplification and index range derivation. I think it can be enhancement for either one and we need to see what is the best way to do it.

ghazalfamilyusa · 2024-09-13T20:21:43Z

The current logic actually does such simplification but the issue is the different type/collation between it.a (binary) and the constant "a" which is utf8mb4_0900_ai_ci. A workaround is to replace the constants with binary constants like b'......'.
Now, in general and regardless of type/collation it.a = "a" and it.a > "a" is false and that can be done as an enhancement which is not straightforward. See if the workaround works cc @terry1purcell , @mzhang77

mzhang77 · 2024-09-14T00:22:15Z

Yes it works

mysql> explain SELECT
    ->   dt.*
    -> FROM
    ->   it 
    ->   LEFT JOIN dt ON it.pk = dt.pk
    -> WHERE
    ->   it.a = 0xaa
    ->   AND (
    ->     (
    ->       it.a > 0xaa
    ->     )
    ->     OR (
    ->       it.a = 0xaa AND it.pk > 1
    ->     )
    ->   )
    -> ORDER BY
    ->   it.pk
    -> LIMIT
    ->   240;
+-------------------------------------+---------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                  | estRows | task      | access object            | operator info                                                                                                                                                                                                 |
+-------------------------------------+---------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_12                       | 41.67   | root      |                          | test.dt.a, test.dt.pk, test.dt.b, test.dt.c                                                                                                                                                                   |
| └─Limit_19                          | 41.67   | root      |                          | offset:0, count:240                                                                                                                                                                                           |
|   └─IndexHashJoin_71                | 41.67   | root      |                          | left outer join, inner:TableReader_66, outer key:test.it.pk, inner key:test.dt.pk, equal cond:eq(test.it.pk, test.dt.pk), other cond:or(gt(test.it.a, "0xaa"), and(eq(test.it.a, "0xaa"), gt(test.dt.pk, 1))) |
|     ├─Limit_36(Build)               | 33.33   | root      |                          | offset:0, count:240                                                                                                                                                                                           |
|     │ └─IndexReader_46              | 33.33   | root      |                          | index:Limit_45                                                                                                                                                                                                |
|     │   └─Limit_45                  | 33.33   | cop[tikv] |                          | offset:0, count:240                                                                                                                                                                                           |
|     │     └─IndexRangeScan_44       | 33.33   | cop[tikv] | table:it, index:f(a, pk) | range:("\xaa" 1,"\xaa" +inf], keep order:true, stats:pseudo                                                                                                                                                   |
|     └─TableReader_66(Probe)         | 33.33   | root      |                          | data:TableRangeScan_65                                                                                                                                                                                        |
|       └─TableRangeScan_65           | 33.33   | cop[tikv] | table:dt                 | range: decided by [test.it.pk], keep order:false, stats:pseudo                                                                                                                                                |
+-------------------------------------+---------+-----------+--------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)

close #56005

mzhang77 added the type/bug The issue is confirmed as a bug. label Sep 11, 2024

hawkingrei self-assigned this Sep 11, 2024

hawkingrei added type/enhancement The issue or PR belongs to an enhancement. sig/planner SIG: Planner type/bug The issue is confirmed as a bug. and removed type/bug The issue is confirmed as a bug. type/enhancement The issue or PR belongs to an enhancement. labels Sep 11, 2024

time-and-fate added type/enhancement The issue or PR belongs to an enhancement. and removed type/bug The issue is confirmed as a bug. labels Sep 12, 2024

ghazalfamilyusa mentioned this issue Sep 19, 2024

Optimizer : Add OR pruning to predicate simplification | tidb-test=pr/2397 #56136

Merged

12 tasks

ti-chi-bot bot closed this as completed in #56136 Sep 25, 2024

ti-chi-bot bot pushed a commit that referenced this issue Sep 25, 2024

Optimizer : Add OR pruning to predicate simplification (#56136)

3cb22d6

close #56005

mzhang77 mentioned this issue Nov 26, 2024

further improvement of pagination query #57694

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

optimizer lacks feature of OR expansion and predicate elimination #56005

optimizer lacks feature of OR expansion and predicate elimination #56005

mzhang77 commented Sep 11, 2024 •

edited

Loading

mzhang77 commented Sep 11, 2024 •

edited

Loading

ghazalfamilyusa commented Sep 11, 2024

ghazalfamilyusa commented Sep 13, 2024

mzhang77 commented Sep 14, 2024

optimizer lacks feature of OR expansion and predicate elimination #56005

optimizer lacks feature of OR expansion and predicate elimination #56005

Comments

mzhang77 commented Sep 11, 2024 • edited Loading

Bug Report

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

mzhang77 commented Sep 11, 2024 • edited Loading

ghazalfamilyusa commented Sep 11, 2024

ghazalfamilyusa commented Sep 13, 2024

mzhang77 commented Sep 14, 2024

mzhang77 commented Sep 11, 2024 •

edited

Loading

mzhang77 commented Sep 11, 2024 •

edited

Loading