resource_control: add runaway queries #14242

Merged
merged 51 commits into from
Jun 26, 2023
Changes from 16 commits
Commits
7acbe33
add runaway doc
Connor1996 Jun 15, 2023
45c2bf2
address comment
Connor1996 Jun 15, 2023
2309e90
refine doc
Connor1996 Jun 15, 2023
68068c2
Update tidb-resource-control.md
hfxsd Jun 16, 2023
473eff4
Apply suggestions from code review
hfxsd Jun 16, 2023
127f67f
Update sql-statement-alter-resource-group.md
hfxsd Jun 16, 2023
e25f4e2
Update tidb-resource-control.md
hfxsd Jun 16, 2023
24d4de4
Update sql-statement-create-resource-group.md
hfxsd Jun 16, 2023
de80e1a
Update tidb-resource-control.md
hfxsd Jun 16, 2023
9913183
Update tidb-resource-control.md
hfxsd Jun 16, 2023
9bd2bdb
Update sql-statement-alter-resource-group.md
hfxsd Jun 16, 2023
5ede79a
Apply suggestions from code review
hfxsd Jun 16, 2023
4c41a24
Update tidb-resource-control.md
hfxsd Jun 16, 2023
d6799b8
Update tidb-resource-control.md
hfxsd Jun 16, 2023
af99a7f
Update tidb-resource-control.md
hfxsd Jun 16, 2023
62f5bdc
Apply suggestions from code review
hfxsd Jun 16, 2023
4f0cdb3
Update tidb-resource-control.md
hfxsd Jun 16, 2023
22d9002
Update tidb-resource-control.md
hfxsd Jun 16, 2023
6d98642
Apply suggestions from code review
hfxsd Jun 16, 2023
f086871
Update tidb-resource-control.md
hfxsd Jun 16, 2023
4148077
Update sql-statements/sql-statement-create-resource-group.md
hfxsd Jun 16, 2023
1681fb5
Update tidb-resource-control.md
hfxsd Jun 16, 2023
0b48c10
Update sql-statements/sql-statement-create-resource-group.md
hfxsd Jun 16, 2023
e8bbec6
address comment
Connor1996 Jun 16, 2023
fbbc67b
add error
Connor1996 Jun 16, 2023
308c646
add admin table description
Connor1996 Jun 16, 2023
cc1ac79
rename
Connor1996 Jun 17, 2023
e6b02b8
fix anchor
Connor1996 Jun 17, 2023
4985fba
Apply suggestions from code review
hfxsd Jun 18, 2023
7a1c149
add QUERY_LIMIT in display result
hfxsd Jun 18, 2023
d33fd91
Update information-schema-resource-groups.md
hfxsd Jun 18, 2023
0aabeec
Update tidb-resource-control.md
hfxsd Jun 18, 2023
4672450
rename
Connor1996 Jun 18, 2023
ef8fc37
Update tidb-resource-control.md
hfxsd Jun 18, 2023
1e69618
Update error-codes.md
hfxsd Jun 18, 2023
ece554b
Update tidb-resource-control.md
hfxsd Jun 19, 2023
b3b4670
Update tidb-resource-control.md
hfxsd Jun 19, 2023
28d4e1d
Apply suggestions from code review
hfxsd Jun 19, 2023
777d1df
clean
Connor1996 Jun 19, 2023
2587820
Apply suggestions from code review
hfxsd Jun 19, 2023
f6be118
Update information-schema-resource-groups.md
hfxsd Jun 19, 2023
fc6bddb
Update sql-statement-alter-resource-group.md
hfxsd Jun 19, 2023
461160a
Update sql-statement-create-resource-group.md
hfxsd Jun 19, 2023
f557a50
Update sql-statement-drop-resource-group.md
hfxsd Jun 19, 2023
e1885a4
Apply suggestions from code review
hfxsd Jun 19, 2023
c706dc3
Update mysql-schema.md
hfxsd Jun 20, 2023
eb6ed9a
Apply suggestions from code review
hfxsd Jun 21, 2023
2ccfd51
Apply suggestions from code review
hfxsd Jun 21, 2023
b10fee4
Update tidb-resource-control.md
hfxsd Jun 25, 2023
88305a6
Apply suggestions from code review
hfxsd Jun 26, 2023
3875d60
Merge branch 'master' into runaway
hfxsd Jun 26, 2023
55 changes: 41 additions & 14 deletions sql-statements/sql-statement-alter-resource-group.md
@@ -27,12 +27,35 @@ ResourceGroupOptionList ::=
DirectResourceGroupOption ::=
"RU_PER_SEC" EqOpt stringLit
| "PRIORITY" EqOpt ResourceGroupPriorityOption
| "BURSTABLE"
| "BURSTABLE" EqOpt Boolean
| "QUERY_LIMIT" EqOpt '(' ResourceGroupRunawayOptionList ')'
| "QUERY_LIMIT" EqOpt '(' ')'
| "QUERY_LIMIT" EqOpt "NULL"

ResourceGroupPriorityOption ::=
LOW
ResourceGroupPriorityOption ::=
LOW
| MEDIUM
| HIGH

ResourceGroupRunawayOptionList ::=
DirectResourceGroupRunawayOption
| ResourceGroupRunawayOptionList DirectResourceGroupRunawayOption
| ResourceGroupRunawayOptionList ',' DirectResourceGroupRunawayOption

DirectResourceGroupRunawayOption ::=
"EXEC_ELAPSED" EqOpt stringLit
| "ACTION" EqOpt ResourceGroupRunawayActionOption
| "WATCH" EqOpt ResourceGroupRunawayWatchOption "DURATION" EqOpt stringLit

ResourceGroupRunawayWatchOption ::=
EXACT
| SIMILAR

ResourceGroupRunawayActionOption ::=
DRYRUN
| COOLDOWN
| KILL
```

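TiDB supports the following `DirectResourceGroupOption`, where [Request Unit (RU)](/tidb-resource-control.md#什么是-request-unit-ru) is TiDB's unified abstraction unit for system resources such as CPU and IO.
@@ -42,10 +65,13 @@ TiDB supports the following `DirectResourceGroupOption`, where [Request Unit (RU)](/tidb-
| `RU_PER_SEC` | The rate at which RUs are backfilled per second | `RU_PER_SEC = 500` indicates that this resource group is backfilled with 500 RUs per second. |
| `PRIORITY` | The absolute priority of tasks processed on TiKV | `PRIORITY = HIGH` indicates high priority. If not specified, the default is `MEDIUM`. |
| `BURSTABLE` | Allows the resource group to use spare system resources after exceeding its quota. |
| `QUERY_LIMIT` | When a query's execution meets this condition, the query is identified as a Runaway Query and the corresponding action is performed | `QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=KILL, WATCH=EXACT DURATION='10m')` indicates that a query is identified as a Runaway Query when its execution time exceeds 60 seconds, that the query is terminated, and that any SQL statement with the same text is terminated immediately during the following 10 minutes. `QUERY_LIMIT=()` or `QUERY_LIMIT=NULL` means that Runaway control is disabled. For details about the parameters, see [Manage queries that consume more resources than expected (Runaway Queries)](/tidb-resource-control.md#管理资源消耗超出预期的查询-runaway-queries). |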

> **Note:**
>
> The `ALTER RESOURCE GROUP` statement can only be executed when the global variable [`tidb_enable_resource_control`](/system-variables.md#tidb_enable_resource_control-从-v660-版本开始引入) is set to `ON`.
>
> The `ALTER RESOURCE GROUP` statement supports incremental modification: parameters that are not specified remain unchanged. However, `QUERY_LIMIT` is treated as a whole, and its parameters cannot be modified partially.
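
The whole-clause behavior of `QUERY_LIMIT` described in the note can be sketched as follows. This is a minimal illustration that assumes `rg1` already exists with `QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=KILL)`; that starting state is an assumption made for the example, not something stated on this page.

```sql
-- Assumed starting state (hypothetical): rg1 has
--   QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=KILL)
-- To change only the action, restate every QUERY_LIMIT sub-option,
-- because QUERY_LIMIT cannot be modified partially.
ALTER RESOURCE GROUP rg1
  QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=COOLDOWN);
-- Options outside QUERY_LIMIT (RU_PER_SEC, PRIORITY, BURSTABLE) are not
-- specified here and therefore keep their current values.
```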

## Examples

@@ -74,18 +100,19 @@ SELECT * FROM information_schema.resource_groups WHERE NAME ='rg1';
```

```sql
+------+------------+----------+-----------+
| NAME | RU_PER_SEC | PRIORITY | BURSTABLE |
+------+------------+----------+-----------+
| rg1 | 100 | MEDIUM | YES |
+------+------------+----------+-----------+
+------+------------+----------+-----------+-------------+
| NAME | RU_PER_SEC | PRIORITY | BURSTABLE | QUERY_LIMIT |
+------+------------+----------+-----------+-------------+
| rg1 | 100 | MEDIUM | YES | NULL |
+------+------------+----------+-----------+-------------+
1 rows in set (1.30 sec)
```

```sql
ALTER RESOURCE GROUP rg1
RU_PER_SEC = 200
PRIORITY = LOW;
PRIORITY = LOW
QUERY_LIMIT = (EXEC_ELAPSED='1s' ACTION=COOLDOWN WATCH=EXACT[30s]);
```

Review comment from @glorv (Contributor), Jun 21, 2023:

@CabinfeverB I don't see that the parser supports the `WATCH=EXACT[30s]` syntax.

```sql
@@ -97,11 +124,11 @@ SELECT * FROM information_schema.resource_groups WHERE NAME ='rg1';
```

```sql
+------+------------+----------+-----------+
| NAME | RU_PER_SEC | PRIORITY | BURSTABLE |
+------+------------+----------+-----------+
| rg1 | 200 | LOW | NO |
+------+------------+----------+-----------+
+------+------------+----------+-----------+----------------------------------------------------+
| NAME | RU_PER_SEC | PRIORITY | BURSTABLE | QUERY_LIMIT |
+------+------------+----------+-----------+----------------------------------------------------+
| rg1 | 200 | LOW | YES | EXEC_ELAPSED=1s, ACTION=COOLDOWN, WATCH=EXACT[30s] |
+------+------------+----------+-----------+----------------------------------------------------+
1 rows in set (1.30 sec)
```

41 changes: 32 additions & 9 deletions sql-statements/sql-statement-create-resource-group.md
@@ -27,12 +27,34 @@ ResourceGroupOptionList ::=
DirectResourceGroupOption ::=
"RU_PER_SEC" EqOpt stringLit
| "PRIORITY" EqOpt ResourceGroupPriorityOption
| "BURSTABLE"
| "BURSTABLE" EqOpt Boolean
| "QUERY_LIMIT" EqOpt '(' ResourceGroupRunawayOptionList ')'
| "QUERY_LIMIT" EqOpt '(' ')'
| "QUERY_LIMIT" EqOpt "NULL"

ResourceGroupPriorityOption ::=
LOW
| MEDIUM
| HIGH

ResourceGroupRunawayOptionList ::=
DirectResourceGroupRunawayOption
| ResourceGroupRunawayOptionList DirectResourceGroupRunawayOption
| ResourceGroupRunawayOptionList ',' DirectResourceGroupRunawayOption

DirectResourceGroupRunawayOption ::=
"EXEC_ELAPSED" EqOpt stringLit
| "ACTION" EqOpt ResourceGroupRunawayActionOption
| "WATCH" EqOpt ResourceGroupRunawayWatchOption "DURATION" EqOpt stringLit

ResourceGroupRunawayWatchOption ::=
EXACT
| SIMILAR

ResourceGroupRunawayActionOption ::=
DRYRUN
| COOLDOWN
| KILL
```

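The `ResourceGroupName` of a resource group is globally unique and cannot be duplicated.
@@ -43,7 +65,8 @@ TiDB supports the following `DirectResourceGroupOption`, where [Request Unit (RU)](/tidb-
|---------------|--------------|--------------------------------------|
| `RU_PER_SEC` | The rate at which RUs are backfilled per second | `RU_PER_SEC = 500` indicates that this resource group is backfilled with 500 RUs per second. |
| `PRIORITY` | The absolute priority of tasks processed on TiKV | `PRIORITY = HIGH` indicates high priority. If not specified, the default is `MEDIUM`. |
| `BURSTABLE` | Allows the resource group to use spare system resources after exceeding its quota. |
| `BURSTABLE` | Allows the resource group to use spare system resources after exceeding its quota. |
| `QUERY_LIMIT` | When a query's execution meets this condition, the query is identified as a Runaway Query and the corresponding control is applied | `QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=KILL, WATCH=EXACT DURATION='10m')` indicates that a query is identified as a Runaway Query when its execution time exceeds 60 seconds, that the query is terminated, and that any SQL statement with the same text is terminated immediately during the following 10 minutes. If `QUERY_LIMIT` is not specified, or is set to `QUERY_LIMIT=()` or `QUERY_LIMIT=NULL`, Runaway control is disabled. For details about the parameters, see [Manage queries that consume more resources than expected (Runaway Queries)](/tidb-resource-control.md#管理资源消耗超出预期的查询-runaway-queries). |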

> **Note:**
>
@@ -75,7 +98,7 @@ Query OK, 0 rows affected (0.08 sec)

```sql
CREATE RESOURCE GROUP IF NOT EXISTS rg2
RU_PER_SEC = 200;
RU_PER_SEC = 200 QUERY_LIMIT=(EXEC_ELAPSED='100ms', ACTION=KILL);
```

```sql
@@ -87,12 +110,12 @@ SELECT * FROM information_schema.resource_groups WHERE NAME ='rg1' or NAME = 'rg
```

```sql
+------+------------+----------+-----------+
| NAME | RU_PER_SEC | PRIORITY | BURSTABLE |
+------+------------+----------+-----------+
| rg1 | 100 | HIGH | YES |
| rg2 | 200 | MEDIUM | NO |
+------+------------+----------+-----------+
+------+------------+----------+-----------+---------------------------------+
| NAME | RU_PER_SEC | PRIORITY | BURSTABLE | QUERY_LIMIT |
+------+------------+----------+-----------+---------------------------------+
| rg1 | 100 | HIGH | YES | NULL |
| rg2 | 200 | MEDIUM | NO | EXEC_ELAPSED=100ms, ACTION=KILL |
+------+------------+----------+-----------+---------------------------------+
2 rows in set (1.30 sec)
```

53 changes: 53 additions & 0 deletions tidb-resource-control.md
@@ -160,6 +160,59 @@ SET RESOURCE GROUP rg1;
SELECT /*+ RESOURCE_GROUP(rg1) */ * FROM t limit 10;
```

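### Manage queries that consume more resources than expected (Runaway Queries)

> **Warning:**
>
> This feature is currently experimental and is not recommended for use in production environments.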

A Runaway Query is a query whose execution time or resource consumption exceeds expectations. Starting from v7.2.0, TiDB resource control supports managing Runaway Queries. You can set conditions to identify Runaway Queries and automatically take action on them, which prevents Runaway Queries from exhausting cluster resources and affecting other normal queries.

Supported condition settings:

- `EXEC_ELAPSED`: a query is identified as a Runaway Query when its execution time exceeds the limit.

Supported actions:

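- `DRYRUN`: takes no action. It is mainly used to observe whether the condition settings are reasonable.
- `COOLDOWN`: lowers the execution priority of the query to the lowest level. The query continues to run at low priority and does not occupy the resources of other operations.
- `KILL`: the identified query is terminated automatically.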
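
One way to use `DRYRUN` as described above is to start with an observation-only resource group before choosing a stricter action. The following is a minimal sketch; the group name `rg_observe` and the thresholds are hypothetical.

```sql
-- Hypothetical group name. Identify queries that run longer than 60 seconds
-- but take no action, to check whether the threshold is reasonable.
CREATE RESOURCE GROUP IF NOT EXISTS rg_observe
  RU_PER_SEC = 500
  QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=DRYRUN);

-- Once the threshold looks right, switch to a real action.
-- EXEC_ELAPSED is restated because QUERY_LIMIT is set as a whole.
ALTER RESOURCE GROUP rg_observe
  QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=COOLDOWN);
```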

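To prevent too many concurrent Runaway Queries from exhausting system resources before the condition identifies them, TiDB introduces a quick identification mechanism. With the `WATCH` clause, once a query is identified as a Runaway Query, the current TiDB instance directly marks matching queries as Runaway Queries for a following period (defined by `DURATION`), without waiting for the condition to identify them. Quick identification supports two matching methods:

- `EXACT`: only SQL statements with exactly the same SQL text are quickly identified.
- `SIMILAR`: literal values are ignored, and all SQL statements with the same pattern are matched.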
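
To illustrate the difference between the two matching methods, consider the statements below. The table `t` appears in the earlier hint example; the column `id` and the literal values are assumptions made for illustration.

```sql
-- Suppose this statement has been identified as a Runaway Query.
SELECT * FROM t WHERE id = 1;

-- WATCH=EXACT: not matched, because the SQL text differs.
-- WATCH=SIMILAR: matched, because only the literal value differs.
SELECT * FROM t WHERE id = 2;
```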

You can manage the Runaway Queries of a resource group by configuring the `QUERY_LIMIT` field in [`CREATE RESOURCE GROUP`](/sql-statements/sql-statement-create-resource-group.md) or [`ALTER RESOURCE GROUP`](/sql-statements/sql-statement-alter-resource-group.md).

The parameters of `QUERY_LIMIT` are as follows:

| Parameter | Description | Note |
|---------------|--------------|--------------------------------------|
| `EXEC_ELAPSED` | A query is identified as a Runaway Query when its execution time exceeds this value | `EXEC_ELAPSED='60s'` means that a query is identified as a Runaway Query if it takes more than 60 seconds to execute. |
| `ACTION` | The action taken when a Runaway Query is identified | The optional values are `DRYRUN` (no action), `COOLDOWN` (run at the lowest priority), and `KILL` (terminate the query). |
| `WATCH` | Quickly matches already identified Runaway Queries: if the same or a similar query is encountered again within a certain period, the corresponding action is performed immediately | Optional. For example, `WATCH=SIMILAR DURATION='60s'` or `WATCH=EXACT DURATION='60s'`. `SIMILAR` matches by Plan Digest, and `EXACT` matches by SQL text. |
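
According to the `ResourceGroupRunawayOptionList` grammar in the `CREATE RESOURCE GROUP` and `ALTER RESOURCE GROUP` references, the options inside `QUERY_LIMIT` can be separated either by commas or by whitespace. The two statements below are a sketch of equivalent spellings under that grammar, not output verified against a running cluster.

```sql
-- Comma-separated options.
ALTER RESOURCE GROUP rg1
  QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=KILL, WATCH=EXACT DURATION='10m');

-- Whitespace-separated options; same meaning according to the grammar.
ALTER RESOURCE GROUP rg1
  QUERY_LIMIT=(EXEC_ELAPSED='60s' ACTION=KILL WATCH=EXACT DURATION='10m');
```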

Examples:

1. Create the `rg1` resource group with a quota of 500 RUs per second, define queries that run longer than 60 seconds as Runaway Queries, and lower the priority of Runaway Queries.

```sql
CREATE RESOURCE GROUP IF NOT EXISTS rg1 RU_PER_SEC = 500 QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=COOLDOWN);
```

2. Modify the `rg1` resource group to terminate Runaway Queries directly and, for the next 10 minutes, immediately mark queries with the same pattern as Runaway Queries.

```sql
ALTER RESOURCE GROUP rg1 QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=KILL, WATCH=SIMILAR DURATION='10m');
```

3. Modify the `rg1` resource group to cancel the Runaway Query check.

```sql
ALTER RESOURCE GROUP rg1 QUERY_LIMIT=NULL;
```
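
To check the effect of statements like the ones above, you can query `information_schema.resource_groups`, as the `CREATE RESOURCE GROUP` and `ALTER RESOURCE GROUP` examples do. A minimal sketch:

```sql
-- The QUERY_LIMIT column shows the Runaway settings of rg1.
-- In the reference examples, it is NULL for a group created without QUERY_LIMIT.
SELECT * FROM information_schema.resource_groups WHERE NAME = 'rg1';
```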

## Disable resource control

1. Execute the following command to disable the resource control feature: