[YSQL] Support backup for pre-split multi-tablet range tables #4873

Closed
ndeodhar opened this issue Jun 24, 2020 · 5 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/high High Priority

Comments

@ndeodhar
Contributor

ndeodhar commented Jun 24, 2020

Jira Link: DB-2111
Currently, ysql_dump, which is used for distributed backups, adds the SPLIT INTO clause (the tablet partitioning information) only for hash tables. We should add support for pre-split range tables.

@ndeodhar ndeodhar added the area/ysql Yugabyte SQL (YSQL) label Jun 24, 2020
@OlegLoginov OlegLoginov changed the title [YSQL] Support backup for pre-split range tables [YSQL] Support backup for pre-split multi-tablet range tables Nov 4, 2020
@OlegLoginov
Contributor

OlegLoginov commented Nov 17, 2021

The idea is simple:

  1. CREATE TABLE ... SPLIT AT VALUES ... creates the range table, and the split points are stored in the table's record on the master, in the sys-catalog.
  2. ysql_dump should read that stored info and decode it back into a SPLIT AT VALUES (...) clause, so that ysql_dump regenerates the correct, original CREATE TABLE statement.
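
Step 2 can be sketched roughly as follows. This is only an illustration in Python, not the actual ysql_dump code (which is C): the function name `format_split_clause` and the representation of decoded split points as tuples of integers are assumptions, and real key values would also need type-aware quoting.

```python
from typing import Sequence


def format_split_clause(split_points: Sequence[Sequence[int]]) -> str:
    """Render already-decoded split points as a SPLIT AT VALUES clause.

    Hypothetical helper: each split point is a tuple holding a prefix of
    the primary key columns, as in the CREATE TABLE documentation.
    """
    if not split_points:
        # A single-tablet range table has no split points, so no clause.
        return ""
    rows = ", ".join(
        "(" + ", ".join(str(value) for value in point) + ")"
        for point in split_points
    )
    return f"SPLIT AT VALUES ({rows})"


clause = format_split_clause([(100,), (200,), (200, 5)])
print(clause)  # SPLIT AT VALUES ((100), (200), (200, 5))
```

The dump would then append this clause to the generated CREATE TABLE statement, so that the restored table is created with the same tablets as the original.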

@OlegLoginov
Contributor

OlegLoginov commented Nov 17, 2021

Example of the range table: https://docs.yugabyte.com/latest/api/ysql/the-sql-language/statements/ddl_create_table/

SPLIT AT VALUES
For range-sharded tables, you can use the SPLIT AT VALUES clause to set the split points at which the table is pre-split into tablets.

Example

CREATE TABLE tbl(
  a int,
  b int,
  primary key(a asc, b desc)
) SPLIT AT VALUES((100), (200), (200, 5));

In the example above, there are three split points and so four tablets will be created:

tablet 1: a=<lowest>, b=<lowest> to a=100, b=<lowest>
tablet 2: a=100, b=<lowest> to a=200, b=<lowest>
tablet 3: a=200, b=<lowest> to a=200, b=5
tablet 4: a=200, b=5 to a=<highest>, b=<highest>
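
To make the mapping from split points to tablets concrete, here is a small Python sketch. It assumes, as the list above suggests, that a split point naming fewer columns than the primary key is padded with `<lowest>`; the helper names `pad` and `tablet_ranges` are made up for illustration and are not part of YugabyteDB.

```python
from typing import List, Sequence, Tuple

LOWEST, HIGHEST = "<lowest>", "<highest>"
Bound = Tuple[object, ...]


def pad(point: Sequence[object], num_key_cols: int) -> Bound:
    """Fill the key columns missing from a split point with <lowest>."""
    return tuple(point) + (LOWEST,) * (num_key_cols - len(point))


def tablet_ranges(split_points: Sequence[Sequence[object]],
                  num_key_cols: int) -> List[Tuple[Bound, Bound]]:
    """Return one (start, end) pair per tablet: N split points give N + 1 tablets."""
    bounds = (
        [(LOWEST,) * num_key_cols]
        + [pad(point, num_key_cols) for point in split_points]
        + [(HIGHEST,) * num_key_cols]
    )
    # Consecutive bounds delimit consecutive tablets.
    return list(zip(bounds, bounds[1:]))


for i, (start, end) in enumerate(tablet_ranges([(100,), (200,), (200, 5)], 2), 1):
    print(f"tablet {i}: {start} to {end}")
```

For the example table this yields four ranges, with the third running from `(200, <lowest>)` to `(200, 5)`.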

@m-iancu m-iancu assigned yifanguan and unassigned OlegLoginov Dec 7, 2021
@m-iancu m-iancu added this to YQL-beta Dec 7, 2021
@jaki
Contributor

jaki commented Dec 8, 2021

Commits b14485a and 96beb9e add generic YSQL backup/restore handling for any partitioning mismatch. Even if the SPLIT clause provided by ysql_dump (or the lack of one) produces partitioning that doesn't match what's in the backup, these commits add checks that detect the mismatch and then repartition the live table to match the backup before or during import.

In theory, this fix should extend to range-partitioned tables. However, since the repartition takes time (and takes locks on master), it may be worth adding a SPLIT AT VALUES clause in ysql_dump to avoid the repartition upon restore (in most, though not all, cases, for the same reasons as in hash partitioning: see issue #8229).
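
As a rough illustration of why emitting the clause helps, the restore-time decision can be modeled like this. This is a simplified Python sketch, not the code from those commits: `needs_repartition` is a hypothetical name, and the real check works on tablet partition metadata on the master rather than on plain tuples.

```python
from typing import Sequence, Tuple

SplitPoint = Tuple[int, ...]


def needs_repartition(precreated: Sequence[SplitPoint],
                      snapshot: Sequence[SplitPoint]) -> bool:
    """True when the table created from the dump must be re-split to match the backup."""
    return [tuple(p) for p in precreated] != [tuple(p) for p in snapshot]


snapshot_splits = [(100,), (200,), (200, 5)]

# Dump without a SPLIT AT VALUES clause: the table is pre-created with a
# single tablet, so restore has to repartition it.
print(needs_repartition([], snapshot_splits))               # True

# Dump that carries the original split points: no repartition is needed.
print(needs_repartition(snapshot_splits, snapshot_splits))  # False
```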

Also, @yifanguan, note that the first commit I mention adds a disabled test YBBackupTest.TestYSQLRangeSplitConstraint. Please enable it with your changes.

@m-iancu m-iancu added the priority/high High Priority label Feb 3, 2022
@mheer9

mheer9 commented Feb 22, 2022

As requested, the schema DDL is attached to the ticket:
https://yugabyte.zendesk.com/agent/tickets/2562

m-iancu added a commit that referenced this issue Mar 1, 2022
…ations

Summary:
After 96beb9e, restoring a backup will
re-partition relations to match the splits from their actual tablet snapshots
if they differ from the pre-created ones.

As a result, even if the split information from the ysql_dump file is missing
or not accurate, restore should go through correctly.

Therefore, as a stop-gap fix for backup-restore, downgrade the error raised when
dumping a multi-tablet range-split relation (table or index) to a warning.
This diff also fixes an issue in the TestYSQLRangeSplitConstraint index-data validation
code: previously, that data read was actually using the table instead of the index.

Fully supporting range-split tables in ysql_dump (SPLIT AT clause) is still
needed for the regular ysql_dump export flow and may also improve the performance of
restore, so this will be addressed in a follow-up commit.

Test Plan:
YBBackupTest.TestYSQLRangeSplitConstraint
org.yb.pgsql.TestYsqlDump

Reviewers: yguan, jason

Reviewed By: yguan, jason

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D15700
yifanguan added a commit that referenced this issue Mar 3, 2022
…range-split relations

Summary:
In this backport diff, the format specifier for `yb_table_properties.num_tablets`
is changed from `PRIu64` to `%u` because the field's type is `uint32_t` instead of `uint64_t`
on this backport branch.
The expected output files of the TestYsqlDump test are also updated accordingly for branch 2.12.
This test passed on Jenkins.

After 96beb9e, restoring a backup will
re-partition relations to match the splits from their actual tablet snapshots
if they differ from the pre-created ones.

As a result, even if the split information from the ysql_dump file is missing
or not accurate, restore should go through correctly.

Therefore, as a stop-gap fix for backup-restore, downgrade the error raised when
dumping a multi-tablet range-split relation (table or index) to a warning.
This diff also fixes an issue in the TestYSQLRangeSplitConstraint index-data validation
code: previously, that data read was actually using the table instead of the index.

Fully supporting range-split tables in ysql_dump (SPLIT AT clause) is still
needed for the regular ysql_dump export flow and may also improve the performance of
restore, so this will be addressed in a follow-up commit.

Original Commit: 3156825

Original Differential Revision: https://phabricator.dev.yugabyte.com/D15700

Test Plan:
YBBackupTest.TestYSQLRangeSplitConstraint
org.yb.pgsql.TestYsqlDump

Reviewers: mihnea, jason

Reviewed By: jason

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D15758
jayant07-yb pushed a commit to jayant07-yb/yugabyte-db that referenced this issue Mar 8, 2022
…plit relations

yifanguan added a commit that referenced this issue Mar 15, 2022
…ange-split relations

Summary:
In this backport diff, for `pg_dump.c` and `ruleutils.c`, the format specifier for `yb_table_properties.num_tablets`
is changed from `PRIu64` to `%u` because the field's type is `uint32_t` instead of `uint64_t`
on this backport branch.
The expected output files of the `TestYsqlDump` test are also updated accordingly for branch 2.8.
Changes to `yb_ysql_dump_verifier.out` are omitted because the test file `yb_ysql_dump_verifier.sql` does not exist on branch 2.8.

After 96beb9e, restoring a backup will
re-partition relations to match the splits from their actual tablet snapshots
if they differ from the pre-created ones.

As a result, even if the split information from the ysql_dump file is missing
or not accurate, restore should go through correctly.

Therefore, as a stop-gap fix for backup-restore, downgrade the error raised when
dumping a multi-tablet range-split relation (table or index) to a warning.
This diff also fixes an issue in the TestYSQLRangeSplitConstraint index-data validation
code: previously, that data read was actually using the table instead of the index.

Fully supporting range-split tables in ysql_dump (SPLIT AT clause) is still
needed for the regular ysql_dump export flow and may also improve the performance of
restore, so this will be addressed in a follow-up commit.

Original Commit: 3156825

Original Differential Revision: https://phabricator.dev.yugabyte.com/D15700

Test Plan:
YBBackupTest.TestYSQLRangeSplitConstraint
org.yb.pgsql.TestYsqlDump

Reviewers: mihnea, jason

Reviewed By: jason

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D15940
@tverona1 tverona1 added status/awaiting-triage Issue awaiting triage and removed status/awaiting-triage Issue awaiting triage labels Apr 5, 2022
@yugabyte-ci yugabyte-ci added the kind/bug This issue is a bug label Jun 9, 2022
@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Jun 17, 2022
@yugabyte-ci yugabyte-ci moved this to Done in YQL-beta Jun 17, 2022
@m-iancu
Contributor

m-iancu commented Jun 17, 2022

Closing, as this is solved for backups by 3156825.
However, for ysql_dump this is still not solved and should be addressed in a separate issue.

Projects
Status: Done

8 participants