Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

overflow during plan cost calculation leads to wrong join reorder #56704

Closed
apollodafoni opened this issue Oct 17, 2024 · 7 comments · Fixed by #56752
Closed

overflow during plan cost calculation leads to wrong join reorder #56704

apollodafoni opened this issue Oct 17, 2024 · 7 comments · Fixed by #56752
Assignees
Labels
affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.4 impact/wrong-result severity/critical sig/planner SIG: Planner type/bug The issue is confirmed as a bug.

Comments

@apollodafoni
Copy link

apollodafoni commented Oct 17, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE `lrr_test` (   `COL102` double DEFAULT NULL,   `COL1` double GENERATED ALWAYS AS (`COL102` + 10) STORED NOT NULL,   PRIMARY KEY (`COL1`) /*T![clustered_index] CLUSTERED */ ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

The following datasets can be used:

mysql> select * from lrr_test;
+-------------------------+-------------------------+
| COL102 | COL1 |
+-------------------------+-------------------------+
| -1.704648925036604e308 | -1.704648925036604e308 |
| -1.6888619680353582e308 | -1.6888619680353582e308 |
| -1.6685908644498436e308 | -1.6685908644498436e308 |
| -1.6311134967437805e308 | -1.6311134967437805e308 |
| -1.6128280680807152e308 | -1.6128280680807152e308 |
| -1.5899713947158026e308 | -1.5899713947158026e308 |
| -1.5709457594070477e308 | -1.5709457594070477e308 |
| -1.4925714566991343e308 | -1.4925714566991343e308 |
| -1.4705985087370154e308 | -1.4705985087370154e308 |
| -1.4451316666300039e308 | -1.4451316666300039e308 |
| -1.3946576985986583e308 | -1.3946576985986583e308 |
| -1.3695679630646804e308 | -1.3695679630646804e308 |
| -1.3208992137984086e308 | -1.3208992137984086e308 |
| -1.2887981369134862e308 | -1.2887981369134862e308 |
| -1.2119996449796167e308 | -1.2119996449796167e308 |
| -1.195172956104992e308 | -1.195172956104992e308 |
| -1.1929781068369925e308 | -1.1929781068369925e308 |
| -1.1746351299417647e308 | -1.1746351299417647e308 |
| -1.1237012620945195e308 | -1.1237012620945195e308 |
| -1.1223448185004882e308 | -1.1223448185004882e308 |
| -1.0974439629672084e308 | -1.0974439629672084e308 |
| -1.0657654808610821e308 | -1.0657654808610821e308 |
| -1.0582598945271716e308 | -1.0582598945271716e308 |
| -1.0565276887850733e308 | -1.0565276887850733e308 |
| -1.0416104832981696e308 | -1.0416104832981696e308 |
| -1.0368741532690337e308 | -1.0368741532690337e308 |
| -1.033521479407133e308 | -1.033521479407133e308 |
| -1.0232269544119505e308 | -1.0232269544119505e308 |
| -9.31943312515408e307 | -9.31943312515408e307 |
| -9.05107332838438e307 | -9.05107332838438e307 |
| -8.276443475796885e307 | -8.276443475796885e307 |
| -7.845086666145396e307 | -7.845086666145396e307 |
| -7.664543340054255e307 | -7.664543340054255e307 |
| -7.235369799352141e307 | -7.235369799352141e307 |
| -7.047280050755922e307 | -7.047280050755922e307 |
| -6.62205033356235e307 | -6.62205033356235e307 |
| -6.35964999739255e307 | -6.35964999739255e307 |
| -5.989391229038818e307 | -5.989391229038818e307 |
| -5.974526205854541e307 | -5.974526205854541e307 |
| -5.798684586589338e307 | -5.798684586589338e307 |
| -4.98047732376121e307 | -4.98047732376121e307 |
| -4.4623979626128605e307 | -4.4623979626128605e307 |
| -4.3248436443381234e307 | -4.3248436443381234e307 |
| -3.3391152928792773e307 | -3.3391152928792773e307 |
| -3.2694282487729395e307 | -3.2694282487729395e307 |
| -3.2461091065368577e307 | -3.2461091065368577e307 |
| -2.8613054009714654e307 | -2.8613054009714654e307 |
| -2.7176814604572905e307 | -2.7176814604572905e307 |
| -2.1301127705458223e307 | -2.1301127705458223e307 |
| -1.7280065154718344e307 | -1.7280065154718344e307 |
| -1.6743061442642827e307 | -1.6743061442642827e307 |
| -4.862812928655648e306 | -4.862812928655648e306 |
| -3.3262533560429795e305 | -3.3262533560429795e305 |
| 4.124952267435051e305 | 4.124952267435051e305 |
| 5.4576487694211726e306 | 5.4576487694211726e306 |
| 1.1237742400537221e307 | 1.1237742400537221e307 |
| 1.569984332645614e307 | 1.569984332645614e307 |
| 1.7966188405412235e307 | 1.7966188405412235e307 |
| 1.8619233341238355e307 | 1.8619233341238355e307 |
| 2.1152066540419881e307 | 2.1152066540419881e307 |
| 2.1764927570795164e307 | 2.1764927570795164e307 |
| 2.99416682762135e307 | 2.99416682762135e307 |
| 3.0545414962788647e307 | 3.0545414962788647e307 |
| 3.262967770716021e307 | 3.262967770716021e307 |
| 3.288944887183685e307 | 3.288944887183685e307 |
| 4.9025219351381e307 | 4.9025219351381e307 |
| 5.250864486081297e307 | 5.250864486081297e307 |
| 5.52054372134351e307 | 5.52054372134351e307 |
| 6.311436996747818e307 | 6.311436996747818e307 |
| 6.870852232080436e307 | 6.870852232080436e307 |
| 7.501871137935436e307 | 7.501871137935436e307 |
| 7.925709054822421e307 | 7.925709054822421e307 |
| 8.438195254661318e307 | 8.438195254661318e307 |
| 8.446731596918706e307 | 8.446731596918706e307 |
| 9.43580947190119e307 | 9.43580947190119e307 |
| 9.66735866233596e307 | 9.66735866233596e307 |
| 1.0022043827847664e308 | 1.0022043827847664e308 |
| 1.020869767928594e308 | 1.020869767928594e308 |
| 1.0327408606815872e308 | 1.0327408606815872e308 |
| 1.0402383684235906e308 | 1.0402383684235906e308 |
| 1.0690255622829305e308 | 1.0690255622829305e308 |
| 1.1623306052784659e308 | 1.1623306052784659e308 |
| 1.1906116361044565e308 | 1.1906116361044565e308 |
| 1.2221839628780758e308 | 1.2221839628780758e308 |
| 1.3112927565356536e308 | 1.3112927565356536e308 |
| 1.3307364382402157e308 | 1.3307364382402157e308 |
| 1.3646958839720612e308 | 1.3646958839720612e308 |
| 1.425066345632827e308 | 1.425066345632827e308 |
| 1.4433864261103511e308 | 1.4433864261103511e308 |
| 1.5038532858735658e308 | 1.5038532858735658e308 |
| 1.5079450808097928e308 | 1.5079450808097928e308 |
| 1.553628680980576e308 | 1.553628680980576e308 |
| 1.6241456663280369e308 | 1.6241456663280369e308 |
| 1.6295729949930798e308 | 1.6295729949930798e308 |
| 1.6328703529666413e308 | 1.6328703529666413e308 |
| 1.6832354056195887e308 | 1.6832354056195887e308 |
| 1.7017315016390902e308 | 1.7017315016390902e308 |
| 1.7134206410400048e308 | 1.7134206410400048e308 |
| 1.7240829054261275e308 | 1.7240829054261275e308 |
| 1.7257738639648862e308 | 1.7257738639648862e308 |
| 1.7262297095455299e308 | 1.7262297095455299e308 |
| 1.7905151735809062e308 | 1.7905151735809062e308 |
+-------------------------+-------------------------+
102 rows in set (0.12 sec)

first step:

select t1. col1, t2.col1 from lrr_test as t1 right join lrr_test as t2 on t1.col1 = t2.col1 where t1.col1 >=0;
explain select t1. col1, t2.col1 from lrr_test as t1 right join lrr_test as t2 on t1.col1 = t2.col1 where t1.col1 >=0;

second step:

analyze table lrr_test;
select t1. col1, t2.col1 from lrr_test as t1 right join lrr_test as t2 on t1.col1 = t2.col1 where t1.col1 >=0;
explain select t1. col1, t2.col1 from lrr_test as t1 right join lrr_test as t2 on t1.col1 = t2.col1 where t1.col1 >=0;

2. What did you expect to see? (Required)

The second step should return 49 rows!

3. What did you see instead (Required)

First step return 49 rows which is correct, and its execution plan is merge_join.
After analyze table, second step return 2401 rows which is wrong, and its execution plan is hash_join.

+-----------------------------+---------+-----------+---------------+----------------------------------+
| id                          | estRows | task      | access object | operator info                    |
+-----------------------------+---------+-----------+---------------+----------------------------------+
| HashJoin_9                  | NaN     | root      |               | CARTESIAN inner join             |
| ├─TableReader_14(Build)     | 49.00   | root      |               | data:TableRangeScan_13           |
| │ └─TableRangeScan_13       | 49.00   | cop[tikv] | table:t2      | range:[0,+inf], keep order:false |
| └─TableReader_12(Probe)     | 49.00   | root      |               | data:TableRangeScan_11           |
|   └─TableRangeScan_11       | 49.00   | cop[tikv] | table:t1      | range:[0,+inf], keep order:false |
+-----------------------------+---------+-----------+---------------+----------------------------------+
5 rows in set (0.04 sec)

Some returned result fragments:

| 1.7905151735809062e308 | 6.311436996747818e307 |
| 1.7905151735809062e308 | 5.52054372134351e307 |
| 1.7905151735809062e308 | 5.250864486081297e307 |
| 1.7905151735809062e308 | 4.9025219351381e307 |
| 1.7905151735809062e308 | 3.288944887183685e307 |
| 1.7905151735809062e308 | 3.262967770716021e307 |
| 1.7905151735809062e308 | 3.0545414962788647e307 |
| 1.7905151735809062e308 | 2.99416682762135e307 |
| 1.7905151735809062e308 | 2.1764927570795164e307 |
| 1.7905151735809062e308 | 2.1152066540419881e307 |
| 1.7905151735809062e308 | 1.8619233341238355e307 |
| 1.7905151735809062e308 | 1.7966188405412235e307 |
| 1.7905151735809062e308 | 1.569984332645614e307 |
| 1.7905151735809062e308 | 1.1237742400537221e307 |
| 1.7905151735809062e308 | 5.4576487694211726e306 |
| 1.7905151735809062e308 | 4.124952267435051e305 |
+------------------------+------------------------+
2401 rows in set (0.15 sec)

4. What is your TiDB version? (Required)

Release Version: v8.4.0
Edition: Community
Git Commit Hash: 1832b27
Git Branch: HEAD
UTC Build Time: 2024-10-15 03:47:37
GoVersion: go1.23.2
Race Enabled: false
Check Table Before Drop: false
Store: tikv

@apollodafoni apollodafoni added the type/bug The issue is confirmed as a bug. label Oct 17, 2024
@apollodafoni
Copy link
Author

/severity moderate
/component planner
/impact wrong-result

Copy link

ti-chi-bot bot commented Oct 17, 2024

@apollodafoni: The label(s) component/planner cannot be applied, because the repository doesn't have them.

In response to this:

/severity moderate
/component planner
/impact wrong-result

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@apollodafoni
Copy link
Author

inner join also return wrong result : select t1.col1, t2.col1 from lrr_test as t1 inner join lrr_test as t2 on t1.col1 = t2.col1 where t1.col1 != 3.538547054120427e307

|  1.7905151735809062e308 | -1.6311134967437805e308 |
|  1.7905151735809062e308 | -1.6685908644498436e308 |
|  1.7905151735809062e308 | -1.6888619680353582e308 |
|  1.7905151735809062e308 |  -1.704648925036604e308 |
+-------------------------+-------------------------+
10404 rows in set (0.36 sec)

@apollodafoni
Copy link
Author

left join also return wrong result : select t1. col1, t2.col1 from lrr_test as t1 left join lrr_test as t2 on t1.col1 = t2.col1 where t1.col1 <= 0;

| -3.3262533560429795e305 | -1.5899713947158026e308 |
| -3.3262533560429795e305 | -1.6128280680807152e308 |
| -3.3262533560429795e305 | -1.6311134967437805e308 |
| -3.3262533560429795e305 | -1.6685908644498436e308 |
| -3.3262533560429795e305 | -1.6888619680353582e308 |
| -3.3262533560429795e305 |  -1.704648925036604e308 |
+-------------------------+-------------------------+
2809 rows in set (0.28 sec)

@apollodafoni apollodafoni changed the title right join may return error result when using hash_join right join may return error result when using HashJoin Oct 18, 2024
@winoros winoros added sig/execution SIG execution and removed sig/execution SIG execution labels Oct 18, 2024
@winoros
Copy link
Member

winoros commented Oct 18, 2024

@apollodafoni Can you change the table data into INSERT INTO or csv? It's hard to use.
And I could not reproduce the problem currently.

@windtalker
Copy link
Contributor

windtalker commented Oct 21, 2024

@winoros I can reproduce this bug. I think there is 2 unexpected thing:

  1. the hash join use CARTESIAN inner join, but it actually has an equal join condition
mysql> explain select t1. col1, t2.col1 from lrr_test as t1  join lrr_test as t2 on t1.col1 = t2.col1 where t1.col1 >=0;
+-----------------------------+---------+-----------+---------------+----------------------------------+
| id                          | estRows | task      | access object | operator info                    |
+-----------------------------+---------+-----------+---------------+----------------------------------+
| HashJoin_10                 | NaN     | root      |               | CARTESIAN inner join             |
| ├─TableReader_15(Build)     | 49.00   | root      |               | data:TableRangeScan_14           |
| │ └─TableRangeScan_14       | 49.00   | cop[tikv] | table:t2      | range:[0,+inf], keep order:false |
| └─TableReader_13(Probe)     | 49.00   | root      |               | data:TableRangeScan_12           |
|   └─TableRangeScan_12       | 49.00   | cop[tikv] | table:t1      | range:[0,+inf], keep order:false |
+-----------------------------+---------+-----------+---------------+----------------------------------+
  1. The worst thing is the join condition is lost, you can see in the plan HashJoin_10 does not have other condtion, this makes the join result wrong.

@winoros winoros changed the title right join may return error result when using HashJoin overflow during plan cost calculation leads to wrong join reorder Oct 21, 2024
@winoros
Copy link
Member

winoros commented Oct 21, 2024

NaN is got because the data of the table is too wide(It has values near MinFlaot64 and values near MaxFloat64), the histogram width of the column exceeds MaxFloat64, becoming +inf, then leading to NaN.

@winoros winoros added affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. and removed may-affects-8.1 labels Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.4 impact/wrong-result severity/critical sig/planner SIG: Planner type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants