TTL tasks may fail if the TiDB sets isolation-read.engines to tidb,tiflash and the table doesn't have TiFlash replica. #56402

Closed
YangKeao opened this issue Sep 29, 2024 · 3 comments · Fixed by #56604
Assignees: YangKeao
Labels: affects-7.1, affects-7.5, affects-8.1, severity/major, sig/sql-infra, type/bug

@YangKeao

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Start a TiDB server with isolation-read.engines set to tidb,tiflash.
  2. Create a new TTL table:
CREATE TABLE t1 (
    id int PRIMARY KEY,
    created_at TIMESTAMP
) TTL = `created_at` + INTERVAL 3 MONTH;
  3. Query the tidb_ttl_job_history table. You'll find that the TTL job has failed (see scan_task_err in summary_text):
+----------------------------------+----------+-----------------+--------------+------------+----------------+---------------------+---------------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+--------------+-------------------+----------+
| job_id                           | table_id | parent_table_id | table_schema | table_name | partition_name | create_time         | finish_time         | ttl_expire          | summary_text                                                                                                                                                                                                                                                               | expired_rows | deleted_rows | error_delete_rows | status   |
+----------------------------------+----------+-----------------+--------------+------------+----------------+---------------------+---------------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+--------------+-------------------+----------+
| 8cdb068f35564d7c81f3795a4cb6afe7 |      104 |             104 | test         | t1         | NULL           | 2024-09-29 14:43:42 | 2024-09-29 14:44:00 | 2024-06-29 14:43:42 | {"total_rows":0,"success_rows":0,"error_rows":0,"total_scan_task":1,"scheduled_scan_task":1,"finished_scan_task":1,"scan_task_err":"[planner:1815]Internal : No access path for table 't1' is found with 'tidb_isolation_read_engines' = '', valid values can be 'tikv'."} |            0 |            0 |                 0 | finished |
+----------------------------------+----------+-----------------+--------------+------------+----------------+---------------------+---------------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+--------------+-------------------+----------+
1 row in set (0.01 sec)
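
Note that the status column above reads finished even though the scan task failed; the error only shows up inside summary_text. A minimal sketch of how a monitoring script might detect this, assuming summary_text is always valid JSON and that a successful job either omits scan_task_err or leaves it empty (both are assumptions, and scan_error is a hypothetical helper, not part of TiDB):

```python
import json


def scan_error(summary_text):
    """Return the scan error recorded in a TTL job summary, or None.

    Hypothetical helper: assumes summary_text is a JSON object and that
    a missing or empty "scan_task_err" means the scan succeeded.
    """
    summary = json.loads(summary_text)
    return summary.get("scan_task_err") or None


# Rebuild the summary from the job history row shown in this report.
err = (
    "[planner:1815]Internal : No access path for table 't1' is found with "
    "'tidb_isolation_read_engines' = '', valid values can be 'tikv'."
)
summary_text = json.dumps({
    "total_rows": 0,
    "success_rows": 0,
    "error_rows": 0,
    "total_scan_task": 1,
    "scheduled_scan_task": 1,
    "finished_scan_task": 1,
    "scan_task_err": err,
})

if scan_error(summary_text) is not None:
    print("TTL scan failed:", scan_error(summary_text))
```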

Either of the following behaviors would be acceptable:

  1. Don't acquire the scan task if the node itself cannot read the table (e.g. the table doesn't have a TiFlash replica, but the node is set to use the tidb,tiflash engines).
  2. Always use the tidb,tikv,tiflash engines for TTL sessions.
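
The two choices can be sketched roughly as follows. This is only an illustrative model, not TiDB's actual code: the Table type and the functions can_read, should_acquire_scan_task, and ttl_session_engines are all hypothetical names, and the sketch assumes a table is always readable from TiKV and readable from TiFlash only when it has a replica.

```python
from dataclasses import dataclass

# All storage engines a TTL session could be allowed to read from.
ALL_ENGINES = ("tidb", "tikv", "tiflash")


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for table metadata."""
    name: str
    has_tiflash_replica: bool = False


def can_read(table, engines):
    """Assumption: user tables always live in TiKV, and in TiFlash
    only when a TiFlash replica exists."""
    if "tikv" in engines:
        return True
    return "tiflash" in engines and table.has_tiflash_replica


def should_acquire_scan_task(table, node_engines):
    """Choice 1: a node skips scan tasks for tables it cannot read."""
    return can_read(table, node_engines)


def ttl_session_engines(node_engines):
    """Choice 2: TTL sessions ignore the node setting entirely."""
    return ALL_ENGINES


t1 = Table("t1")  # no TiFlash replica, as in the reproduce steps
node_engines = ("tidb", "tiflash")

print(should_acquire_scan_task(t1, node_engines))
print(can_read(t1, ttl_session_engines(node_engines)))
```

With choice 1 alone, a cluster in which every node is set to tidb,tiflash would never run the scan for t1, so choice 2 looks like the simpler way to guarantee progress.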

4. What is your TiDB version? (Required)

mysql> select tidb_version();
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tidb_version()                                                                                                                                                                                                                                |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Release Version: v8.3.0
Edition: Community
Git Commit Hash: 1a0c3ac3292fff7742faa0c00a662ccb66ba40db
Git Branch: HEAD
UTC Build Time: 2024-08-20 10:13:01
GoVersion: go1.21.10
Race Enabled: false
Check Table Before Drop: false
Store: tikv |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
@time-and-fate

A similar earlier issue (from an on-call ticket): #45202

@YangKeao

YangKeao commented Sep 29, 2024

> A previous similar issue (which is from an oncall ticket): #45202

👍. It seems that we did nothing for #45202, right?

For TTL, I don't think this is an acceptable situation. Users have no way to control whether a TTL task gets allocated to a particular node, so anyone who hits this issue has no workaround 🤔

@time-and-fate

Yes, I only recorded that issue for the on-call ticket, and I haven't heard of anyone else working on it.
Considering these two issues, letting these internal sessions ignore the setting looks like a reasonable and simple solution.

(Btw, internal sessions (or system sessions) have been somewhat problematic in my experience. 😅 They sometimes behave in surprising ways, like not picking up updates to global variables, which is also mentioned in my issue.)
