Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lightning] Estimate rowid might overflow when import data and auto fill int column. #28776

Closed
fubinzh opened this issue Oct 13, 2021 · 9 comments · Fixed by #34288
Closed

[lightning] Estimate rowid might overflow when import data and auto fill int column. #28776

fubinzh opened this issue Oct 13, 2021 · 9 comments · Fixed by #34288
Assignees
Labels
affects-6.0 affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.2 affects-6.3 affects-6.4 affects-6.5 This bug affects the 6.5.x(LTS) versions. component/lightning This issue is related to Lightning of TiDB. severity/major sig/migrate type/bug The issue is confirmed as a bug.

Comments

@fubinzh
Copy link

fubinzh commented Oct 13, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

This issue was created according to ONCALL-3504

Using lightning to import data, before lightning import, create the schema in TiDB manually and adding an extra column: id_auto_incr int(11) NOT NULL AUTO_INCREMENT, during import, lightning will estimate the rowid and auto fill this column (as this column doesn't exists in data source), and rowid might overflow in some cases and result in data inconsistency.

2. What did you expect to see? (Required)

Lightning import should be successful and data consistency should be ensured.

3. What did you see instead (Required)

Lighting import failed with checksum mistch.

4. What is your TiDB version? (Required)

Lightning: 4.0.14

@fubinzh fubinzh added type/bug The issue is confirmed as a bug. severity/major sig/migrate component/lightning This issue is related to Lightning of TiDB. labels Oct 13, 2021
@glorv
Copy link
Contributor

glorv commented Oct 13, 2021

@kennytm What do you think, seems do this check in precheck is not easy? Can we only check in chunkRestore?🤔

@kennytm
Copy link
Contributor

kennytm commented Oct 14, 2021

given that we have max-error now i think we should turn the default SQL mode back to strict_trans_table to catch the 32-bit overflow.

@jebter jebter added the affects-4.0 This bug affects 4.0.x versions. label Jan 11, 2022
@niubell
Copy link
Contributor

niubell commented Apr 20, 2022

/assign buchuitoudegou

@niubell
Copy link
Contributor

niubell commented Apr 20, 2022

/unassign glorv

@buchuitoudegou
Copy link
Contributor

buchuitoudegou commented Apr 26, 2022

There are other issues when filling auto_increment field with lightning.

test:

  • source csv file: 312535480 rows with only one column a ranging from 0~312535479.
  • create table with auto_increment id
  • lightning restore

result:

mysql> select count(*) from test use index(id);
+-----------+
| count(*)  |
+-----------+
| 312535480 |
+-----------+
1 row in set (15.22 sec)

mysql> select count(*) from test;
+-----------+
| count(*)  |
+-----------+
| 312535480 |
+-----------+
1 row in set (15.06 sec)

mysql> select * from test order by id asc limit 10;
+----+---+
| id | a |
+----+---+
|  1 | 0 |
|  2 | 1 |
|  3 | 2 |
|  4 | 3 |
|  5 | 4 |
|  6 | 5 |
|  7 | 6 |
|  8 | 7 |
|  9 | 8 |
| 10 | 9 |
+----+---+
10 rows in set (0.01 sec)

mysql> select * from test order by id desc limit 10;
+------------+-----------+
| id         | a         |
+------------+-----------+
| 1651778746 | 312535479 |
| 1651778745 | 312535478 |
| 1651778744 | 312535477 |
| 1651778743 | 312535476 |
| 1651778742 | 312535475 |
| 1651778741 | 312535474 |
| 1651778740 | 312535473 |
| 1651778739 | 312535472 |
| 1651778738 | 312535471 |
| 1651778737 | 312535470 |
+------------+-----------+
10 rows in set (0.01 sec)

id didn't overflow. But we were expecting id == a+1 but got 165177874*. I guess both originated from the same bugs of Lightning.

@buchuitoudegou
Copy link
Contributor

Possible root cause:
Lightning evaluates the data file's size and infers PrevRowMax and RowIDMax for each chunk. The PrevRowMax is assigned to parser.lastRow.rowID before reading and parsing the data file.

After reading a row from the file, parser.lastRow is set this row, with rowID = parser.lastRow.rowID++. However, since PrevRowMax is an estimate, it obviously doesn't strictly equal the true rowID, and thus, we may have id much larger than what we expected.

Similarly, when PrevRowMax is over-estimated (e.g. overflows int(32)), we may probably make id exceeding the maximum of int.

It's intuitive to add a check or re-estimate these values, but would it be better to have a global ID allocator managing these stuff for us? /cc @gozssky @niubell

@VelocityLight VelocityLight added the affects-6.1 This bug affects the 6.1.x(LTS) versions. label May 20, 2022
@D3Hunter
Copy link
Contributor

this issue affects 4.0/6.0, don't affects 5.x?

@buchuitoudegou
Copy link
Contributor

this issue affects 4.0/6.0, don't affects 5.x?

It might affect all of them.

@sleepymole sleepymole removed the affects-4.0 This bug affects 4.0.x versions. label Jun 29, 2022
@sleepymole sleepymole reopened this Jun 29, 2022
@ti-chi-bot ti-chi-bot added may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. labels Jun 29, 2022
@D3Hunter D3Hunter removed may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. labels Jun 29, 2022
@VelocityLight VelocityLight added the affects-6.5 This bug affects the 6.5.x(LTS) versions. label Dec 2, 2022
@buchuitoudegou
Copy link
Contributor

This issue has a workaround: setting the auto-increment key as bigint or unsigned bigint can allow tidb to have a larger key space so that the auto-increment key won't exceed the limit.

#34288 is not sophisticated enough so it was reverted temporarily. We will try to find out a better way to resolve it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-6.0 affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.2 affects-6.3 affects-6.4 affects-6.5 This bug affects the 6.5.x(LTS) versions. component/lightning This issue is related to Lightning of TiDB. severity/major sig/migrate type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants