Allow filter condition push down to IndexScan for prefix index. #21145

coocood · 2020-11-19T09:12:31Z

Feature Request

Allow filter condition push down to IndexScan for prefix index.

mysql> create table t (a text, b text, index a_b (a(255), b(255)));
Query OK, 0 rows affected (0.01 sec)

mysql> explain select * from t where a between 'a' and 'b' and b = 'b';
+-------------------------------+---------+-----------+--------------------------+---------------------------------------------------------+
| id                            | estRows | task      | access object            | operator info                                           |
+-------------------------------+---------+-----------+--------------------------+---------------------------------------------------------+
| IndexLookUp_11                | 0.25    | root      |                          |                                                         |
| ├─IndexRangeScan_8(Build)     | 250.00  | cop[tikv] | table:t, index:a_b(a, b) | range:["a","b"], keep order:false, stats:pseudo         |
| └─Selection_10(Probe)         | 0.25    | cop[tikv] |                          | eq(test.t.b, "b"), ge(test.t.a, "a"), le(test.t.a, "b") |
|   └─TableRowIDScan_9          | 250.00  | cop[tikv] | table:t                  | keep order:false, stats:pseudo                          |
+-------------------------------+---------+-----------+--------------------------+---------------------------------------------------------+

The condition b = 'b' should be pushed down to the IndexScan phase to reduce the cost of table lookups.

We only need to make sure that the length of the value in the equality condition on b is less than the index prefix length, 255.

The ideal plan should be

mysql> explain select * from t where a between 'a' and 'b' and b = 'b';
+--------------------------+---------+-----------+--------------------------+-------------------------------------------------+
| id                       | estRows | task      | access object            | operator info                                   |
+--------------------------+---------+-----------+--------------------------+-------------------------------------------------+
| IndexReader_7            | 0.25    | root      |                          | index:Selection_6                               |
| └─Selection_6            | 0.25    | cop[tikv] |                          | eq(test.t.b, "b")                               |
|   └─IndexRangeScan_5     | 250.00  | cop[tikv] | table:t, index:a_b(a, b) | range:["a","b"], keep order:false, stats:pseudo |
+--------------------------+---------+-----------+--------------------------+-------------------------------------------------+
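
The length condition described above could be sketched roughly as follows. This is a minimal Python illustration, not TiDB planner code: the function name `can_push_eq_to_prefix_index` and its parameters are hypothetical, and lengths are counted in characters. Later comments in this thread point out that this check alone is not sufficient when new collation is enabled.

```python
def can_push_eq_to_prefix_index(const_value: str, prefix_len: int) -> bool:
    """Return True if an equality constant fits strictly inside the index prefix.

    Hypothetical helper: when the constant is shorter than the prefix length,
    the prefix index holds every character needed for the comparison.
    """
    return len(const_value) < prefix_len


# Index a_b (a(255), b(255)) from the example: the prefix length of b is 255.
print(can_push_eq_to_prefix_index("b", 255))        # short constant from the query
print(can_push_eq_to_prefix_index("x" * 300, 255))  # constant longer than the prefix
```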

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:


xuyifangreeneyes · 2022-10-21T09:24:26Z

The optimization may cause some correctness problems when new collation is enabled. For example, using #38215, we can get the following wrong result.

mysql> create table t1 (
    ->  id int not null,
    ->  city varchar(20) not null,
    ->  key (city(7),id)
    -> ) character set=utf8;
Query OK, 0 rows affected (0.03 sec)

mysql> insert into t1 values (1,'Durban North');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1 values (2,'Durban');
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1 where city = 'Durban';
+----+---------+
| id | city    |
+----+---------+
|  1 | Durban  |
|  2 | Durban  |
+----+---------+
2 rows in set (0.00 sec)

mysql> explain select * from t1 where city = 'Durban';
+------------------------+---------+-----------+--------------------------------+-----------------------------------------------------------+
| id                     | estRows | task      | access object                  | operator info                                             |
+------------------------+---------+-----------+--------------------------------+-----------------------------------------------------------+
| IndexReader_6          | 0.00    | root      |                                | index:IndexRangeScan_5                                    |
| └─IndexRangeScan_5     | 0.00    | cop[tikv] | table:t1, index:city(city, id) | range:["Durban","Durban"], keep order:false, stats:pseudo |
+------------------------+---------+-----------+--------------------------------+-----------------------------------------------------------+
2 rows in set (0.00 sec)

'Durban' and 'Durban ' are equal under the utf8mb4_bin collation. When 'Durban North' is inserted into the index, it is first cut to the prefix length, giving 'Durban ', and then trimmed to 'Durban' before being written to the index. When we apply city = 'Durban' on the index, we cannot tell whether the index key 'Durban' was trimmed or not. Therefore, we must apply city = 'Durban' again on the table.
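
The cut-then-trim behavior can be modeled in a few lines of Python. This is only a sketch of the description above, not TiDB's actual key encoding; `tidb_style_index_key` is a hypothetical name and lengths are counted in characters.

```python
def tidb_style_index_key(value: str, prefix_len: int) -> str:
    """Model the key building described above: cut to the prefix, then trim trailing spaces."""
    cut = value[:prefix_len]   # 'Durban North' -> 'Durban ' for prefix_len = 7
    return cut.rstrip(" ")     # 'Durban ' -> 'Durban'


PREFIX_LEN = 7  # from key (city(7), id)

key_long = tidb_style_index_key("Durban North", PREFIX_LEN)
key_short = tidb_style_index_key("Durban", PREFIX_LEN)

# Both rows end up with the same index key, so the index alone cannot
# tell a truncated 'Durban North' apart from a genuine 'Durban'.
print(repr(key_long), repr(key_short), key_long == key_short)
```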

xuyifangreeneyes · 2022-10-24T05:56:41Z

MySQL and TiDB handle prefix columns differently: MySQL pads the value with spaces up to the prefix length, while TiDB trims trailing spaces. In both implementations, the index alone cannot tell us whether the real value is shorter than the index prefix length. MySQL also needs a double scan for select * from t1 where city = 'Durban' and applies city = 'Durban' again on the table.
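
To make the comparison concrete, here is a small Python model of the two strategies as described in this comment. Both helper names are hypothetical, and the real storage formats of MySQL and TiDB are more involved than this; the sketch only shows that padding and trimming lose the same information.

```python
def mysql_style_key(value: str, prefix_len: int) -> str:
    """Assumed model of 'pads space to prefix length'."""
    return value[:prefix_len].ljust(prefix_len, " ")


def tidb_style_key(value: str, prefix_len: int) -> str:
    """Assumed model of 'trims space'."""
    return value[:prefix_len].rstrip(" ")


values = ("Durban", "Durban North")
for build_key in (mysql_style_key, tidb_style_key):
    # Collect the distinct keys each strategy produces for the two rows.
    keys = {build_key(v, 7) for v in values}
    print(build_key.__name__, keys, "distinct keys:", len(keys))
```

Under either model the two city values collapse to a single key, which is why the filter still has to be rechecked against the table row.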

xuyifangreeneyes · 2022-10-24T06:14:09Z

There are still two optimizations we can do:

1. col is null can be pushed down to a prefix index. See "planner: avoid double scan for index prefix col is (not) null" (#38555) for details.
2. Filter conditions on an index prefix column can be applied twice, on both the index and the table. Even if the constant is longer than the prefix column length, we can cut the constant and then apply the filter condition on the index.
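
The second idea, filtering on the index with a cut constant and then rechecking on the table, might look like the following Python sketch. It is an illustration under assumptions, not planner code: `index_key` and `select_eq` are hypothetical names, the index is modeled as a list of (key, id) pairs, and the table recheck compares values with trailing spaces ignored to mimic the 'Durban' = 'Durban ' behavior described earlier in this thread.

```python
def index_key(value: str, prefix_len: int) -> str:
    """Cut to the prefix length, then trim trailing spaces (assumed key format)."""
    return value[:prefix_len].rstrip(" ")


def select_eq(rows, const, prefix_len):
    """Evaluate city = const in two phases; returns (candidate ids, final rows)."""
    index = [(index_key(city, prefix_len), row_id) for row_id, city in rows]
    table = dict(rows)

    # Phase 1: cut the constant the same way as index keys and filter on the index.
    probe = index_key(const, prefix_len)
    candidates = [row_id for key, row_id in index if key == probe]

    # Phase 2: recheck the original condition on the full table value.
    target = const.rstrip(" ")
    result = [(rid, table[rid]) for rid in candidates if table[rid].rstrip(" ") == target]
    return candidates, result


rows = [(1, "Durban North"), (2, "Durban"), (3, "Cape Town")]
print(select_eq(rows, "Durban", 7))
print(select_eq(rows, "Durban North", 7))  # constant longer than the prefix
```

The index phase only narrows the candidates (row 3 is never looked up); correctness still comes from the table recheck.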

ref #21145

rverma-dev · 2024-03-12T10:16:50Z

Is this supported now? Can the push-down also cover greater-than and less-than conditions?

coocood added the type/feature-request label Nov 19, 2020

coocood added the sig/planner label Nov 19, 2020

winoros added the help wanted label Nov 19, 2020

xuyifangreeneyes mentioned this issue Dec 10, 2020

planner: allow filter condition pushing down to IndexScan for prefix index #21651

Closed

XuHuaiyu assigned xuyifangreeneyes Dec 11, 2020

xuyifangreeneyes mentioned this issue Sep 28, 2022

planner, util/ranger: allow filter condition pushing down to IndexScan for prefix index #38215

Closed


xuyifangreeneyes mentioned this issue Oct 19, 2022

planner: avoid double scan for index prefix col is (not) null #38555

Merged


ti-chi-bot pushed a commit that referenced this issue Oct 24, 2022

planner: avoid double scan for index prefix col is (not) null (#38555)

64051f9

ref #21145

xuyifangreeneyes mentioned this issue Oct 27, 2022

planner: add more tests for pushing IsNull to prefix index #38697

Merged


ti-chi-bot pushed a commit that referenced this issue Oct 28, 2022

planner: add more tests for pushing IsNull to prefix index (#38697)

95d177a

ref #21145
