
pd not applying placement to partitions of new table #4467

Closed
Tracked by #18030
kolbe opened this issue Dec 15, 2021 · 8 comments
Labels
type/bug The issue is confirmed as a bug.


kolbe (Contributor) commented Dec 15, 2021

Bug Report

What did you do?

I deployed a cluster of 9 TiKV stores in 3 geographical regions and gave each store a "region" label to match the cloud region it's deployed to.

MySQL [test]> select store_id, address, label from information_schema.tikv_store_status order by store_id;
+----------+------------------+-------------------------------------------------+
| store_id | address          | label                                           |
+----------+------------------+-------------------------------------------------+
|        1 | 10.138.0.7:20160 | [{"key": "region", "value": "us-west1"}]        |
|        4 | 10.138.0.5:20160 | [{"key": "region", "value": "us-west1"}]        |
|        7 | 10.138.0.6:20160 | [{"key": "region", "value": "us-west1"}]        |
|        8 | 10.156.0.4:20160 | [{"key": "region", "value": "europe-west3"}]    |
|        9 | 10.156.0.2:20160 | [{"key": "region", "value": "europe-west3"}]    |
|       10 | 10.156.0.3:20160 | [{"key": "region", "value": "europe-west3"}]    |
|       11 | 10.148.0.4:20160 | [{"key": "region", "value": "asia-southeast1"}] |
|       12 | 10.148.0.3:20160 | [{"key": "region", "value": "asia-southeast1"}] |
|       13 | 10.148.0.2:20160 | [{"key": "region", "value": "asia-southeast1"}] |
+----------+------------------+-------------------------------------------------+
9 rows in set (0.00 sec)
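
The report does not show how those labels were assigned. In a tiup-deployed cluster they are typically set per TiKV instance in the topology file, roughly as in the sketch below; the hosts are copied from the output above, everything else is an illustrative assumption rather than the actual topology:

# Illustrative tiup topology snippet (one host per region shown; not taken from the real deployment)
tikv_servers:
  - host: 10.138.0.7
    config:
      server.labels: { region: us-west1 }
  - host: 10.156.0.4
    config:
      server.labels: { region: europe-west3 }
  - host: 10.148.0.4
    config:
      server.labels: { region: asia-southeast1 }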

I created 3 placement policies, one for each of the regions where TiKV stores are deployed.

MySQL [test]> show placement;
+-----------------+------------------------------------------------------------+------------------+
| Target          | Placement                                                  | Scheduling_State |
+-----------------+------------------------------------------------------------+------------------+
| POLICY americas | PRIMARY_REGION="us-west1" REGIONS="us-west1"               | NULL             |
| POLICY asia     | PRIMARY_REGION="asia-southeast1" REGIONS="asia-southeast1" | NULL             |
| POLICY europe   | PRIMARY_REGION="europe-west3" REGIONS="europe-west3"       | NULL             |
+-----------------+------------------------------------------------------------+------------------+
3 rows in set (0.00 sec)
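
The CREATE statements themselves are not included in the report; policies matching the SHOW PLACEMENT output above would normally be created with statements along these lines (reconstructed, not copied from the session):

CREATE PLACEMENT POLICY americas PRIMARY_REGION="us-west1" REGIONS="us-west1";
CREATE PLACEMENT POLICY asia PRIMARY_REGION="asia-southeast1" REGIONS="asia-southeast1";
CREATE PLACEMENT POLICY europe PRIMARY_REGION="europe-west3" REGIONS="europe-west3";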

I created a table with 3 partitions, each of which uses a different placement policy.

MySQL [test]> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `country` char(2) NOT NULL,
  `userdata` varchar(2056) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
PARTITION BY LIST COLUMNS(country) (
  PARTITION `europe` VALUES IN ("DE","FR","GB") /*T![placement] PLACEMENT POLICY=`europe` */,
  PARTITION `americas` VALUES IN ("US","CA","MX") /*T![placement] PLACEMENT POLICY=`americas` */,
  PARTITION `asia` VALUES IN ("SG","CN","JP") /*T![placement] PLACEMENT POLICY=`asia` */
)

MySQL [test]> show placement;
+----------------------------------+------------------------------------------------------------+------------------+
| Target                           | Placement                                                  | Scheduling_State |
+----------------------------------+------------------------------------------------------------+------------------+
| POLICY americas                  | PRIMARY_REGION="us-west1" REGIONS="us-west1"               | NULL             |
| POLICY asia                      | PRIMARY_REGION="asia-southeast1" REGIONS="asia-southeast1" | NULL             |
| POLICY europe                    | PRIMARY_REGION="europe-west3" REGIONS="europe-west3"       | NULL             |
| TABLE test.t1 PARTITION europe   | PRIMARY_REGION="europe-west3" REGIONS="europe-west3"       | INPROGRESS       |
| TABLE test.t1 PARTITION americas | PRIMARY_REGION="us-west1" REGIONS="us-west1"               | INPROGRESS       |
| TABLE test.t1 PARTITION asia     | PRIMARY_REGION="asia-southeast1" REGIONS="asia-southeast1" | INPROGRESS       |
+----------------------------------+------------------------------------------------------------+------------------+
6 rows in set (0.01 sec)
MySQL [test]> select region_id, group_concat(store_id order by store_id) store_id, group_concat(json_extract(label, '$[0].value') order by label) label from information_schema.tikv_region_status join information_schema.tikv_region_peers using (region_id) join information_schema.tikv_store_status using (store_id) where table_name='t1' group by region_id order by region_id;
+-----------+-----------+---------------------------------------------------------------+
| region_id | store_id  | label                                                         |
+-----------+-----------+---------------------------------------------------------------+
|       216 | 8,12,13   | "asia-southeast1","asia-southeast1","europe-west3"            |
|       220 | 1,8,12,13 | "asia-southeast1","asia-southeast1","europe-west3","us-west1" |
|       240 | 10,12,13  | "asia-southeast1","asia-southeast1","europe-west3"            |
+-----------+-----------+---------------------------------------------------------------+
3 rows in set (0.00 sec)

What did you expect to see?

After creating the new table, I expected all replicas for each of the partitions to be placed in the region designated in the associated placement policy.

What did you see instead?

PD does not seem to follow the placement policy.

pd.log

What version of PD are you using (pd-server -V)?

Release Version: v5.3.0
Edition: Community
Git Commit Hash: fe6fab9268d2d6fd34cd22edd1cf31a302e8dc5c
Git Branch: heads/refs/tags/v5.3.0
UTC Build Time:  2021-11-22 01:51:40
kolbe added the type/bug label on Dec 15, 2021
kolbe (Contributor, Author) commented Dec 16, 2021

xhebox (Contributor) commented Dec 16, 2021

After some research, I found that the root cause may be the isolation_level in the placement rules. As of pingcap/tidb@2a02e6e, it is set to "region" according to the RFC.

That means PD will try to meet the isolation-level requirement by scheduling peers onto stores with different "region" labels. So while the syntax is completely legal, PD just won't place 3 replicas on stores with the same region label (because of the isolation requirement, that is seen as a "bad schedule").

@rleungx Could you confirm the behavior? If that is true, then this is expected behavior for PD; it is more of an issue with TiDB's behavior.
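
For reference, the rule TiDB generates for such a partition presumably looks like the ones kolbe dumps with pd-ctl later in this thread, but with the isolation level still present. The snippet below is a reconstruction based on that later output, not output captured from the affected cluster:

{
  "group_id": "TiDB_DDL_64",
  "id": "partition_rule_64_0",
  "role": "voter",
  "count": 3,
  "label_constraints": [
    { "key": "region", "op": "in", "values": ["europe-west3"] }
  ],
  "location_labels": ["region", "zone", "rack", "host"],
  "isolation_level": "region"
}

With isolation_level set to "region", PD refuses to place two of the three voters on stores that share the same region label, which conflicts with the constraint that all three replicas live in europe-west3.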

rleungx (Member) commented Dec 16, 2021

Yes, once you set the isolation level, PD will try to make the distribution of regions meet the isolation requirement.

xhebox (Contributor) commented Dec 16, 2021

So the isolation level takes higher priority than the constraint requirements.

@morgo What do you think about that? We could simply stop setting the isolation level from TiDB; then there would be no such problem. Even though that may not satisfy the isolation requirement, PD would still score labels by LocationLabels, which means it will try its best to spread replicas.

morgo commented Dec 16, 2021

@xhebox Sounds good. But let's break this into two requests:

  1. The replication API should report "UNABLE TO SCHEDULE" instead of "INPROGRESS" if the rule cannot be applied immediately. There are many scenarios where this could happen (for example, FOLLOWERS=N when the number of stores is < N).
  2. We remove the isolation requirement as proposed. This will fix this scenario.

I can write docs for (1). I think it will be a common misconfiguration.

kolbe (Contributor, Author) commented Dec 18, 2021

I manually edited the placement rules using pd-ctl to remove "isolation_level" from the rules generated by TiDB, and PD quickly applied the expected scheduling:

kolbe@kolbe-us-west1-3:~$ tiup ctl:v5.3.0 pd config placement-rules show
Starting component `ctl`: /home/kolbe/.tiup/components/ctl/v5.3.0/ctl pd config placement-rules show
[
  {
    "group_id": "TiDB_DDL_64",
    "id": "partition_rule_64_0",
    "index": 2,
    "start_key": "7480000000000000ff4000000000000000f8",
    "end_key": "7480000000000000ff4100000000000000f8",
    "role": "voter",
    "count": 3,
    "label_constraints": [
      {
        "key": "region",
        "op": "in",
        "values": [
          "europe-west3"
        ]
      },
      {
        "key": "engine",
        "op": "notIn",
        "values": [
          "tiflash"
        ]
      }
    ],
    "location_labels": [
      "region",
      "zone",
      "rack",
      "host"
    ],
    "version": 1
  },
  {
    "group_id": "TiDB_DDL_65",
    "id": "partition_rule_65_0",
    "index": 2,
    "start_key": "7480000000000000ff4100000000000000f8",
    "end_key": "7480000000000000ff4200000000000000f8",
    "role": "voter",
    "count": 3,
    "label_constraints": [
      {
        "key": "region",
        "op": "in",
        "values": [
          "us-west1"
        ]
      },
      {
        "key": "engine",
        "op": "notIn",
        "values": [
          "tiflash"
        ]
      }
    ],
    "location_labels": [
      "region",
      "zone",
      "rack",
      "host"
    ],
    "version": 1
  },
  {
    "group_id": "TiDB_DDL_67",
    "id": "partition_rule_67_0",
    "index": 2,
    "start_key": "7480000000000000ff4300000000000000f8",
    "end_key": "7480000000000000ff4400000000000000f8",
    "role": "voter",
    "count": 3,
    "label_constraints": [
      {
        "key": "region",
        "op": "in",
        "values": [
          "asia-southeast1"
        ]
      },
      {
        "key": "engine",
        "op": "notIn",
        "values": [
          "tiflash"
        ]
      }
    ],
    "location_labels": [
      "region",
      "zone",
      "rack",
      "host"
    ],
    "version": 1
  },
  {
    "group_id": "pd",
    "id": "default",
    "start_key": "",
    "end_key": "",
    "role": "voter",
    "count": 3
  }
]
MySQL [test]> select region_id, group_concat(store_id order by store_id) store_id, group_concat(json_extract(label, '$[0].value') order by label) label from information_schema.tikv_region_status join information_schema.tikv_region_peers using (region_id) join information_schema.tikv_store_status using (store_id) where table_name='t1' group by region_id order by region_id;
+-----------+----------+-------------------------------------------------------+
| region_id | store_id | label                                                 |
+-----------+----------+-------------------------------------------------------+
|       216 | 8,9,10   | "europe-west3","europe-west3","europe-west3"          |
|       220 | 1,4,7    | "us-west1","us-west1","us-west1"                      |
|       240 | 11,12,13 | "asia-southeast1","asia-southeast1","asia-southeast1" |
+-----------+----------+-------------------------------------------------------+
3 rows in set (0.01 sec)
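
The exact pd-ctl invocations used for the edit are not shown above. Removing the field can be done roughly as in the sketch below; the load/save subcommands and flags are the documented pd-ctl placement-rules interface, but treat the exact commands as an assumption rather than a transcript of the actual session:

# Sketch of the edit workflow (assumed, not a transcript)
tiup ctl:v5.3.0 pd config placement-rules load --out=rules.json   # dump all current rules to a file
# ... delete the "isolation_level" field from each TiDB_DDL_* rule in rules.json ...
tiup ctl:v5.3.0 pd config placement-rules save --in=rules.json    # write the edited rules back to PD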

bb7133 commented Dec 22, 2021

@xhebox I guess that this can be closed by pingcap/tidb#30859, right?
/cc @morgo

xhebox (Contributor) commented Dec 23, 2021

@xhebox I guess that this can be closed by pingcap/tidb#30859, right? /cc @morgo

Yes, I've created pingcap/tidb#30960 to track the other proposal from morgo. This issue can be closed now.
