Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Correctly handle table/index splits in backup-restore #8229

Closed
Tracked by #10175
m-iancu opened this issue Apr 29, 2021 · 2 comments
Closed
Tracked by #10175

[YSQL] Correctly handle table/index splits in backup-restore #8229

m-iancu opened this issue Apr 29, 2021 · 2 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) priority/high High Priority
Milestone

Comments

@m-iancu
Copy link
Contributor

m-iancu commented Apr 29, 2021

When backing up a YSQL database we store both the (YSQL) metadata and the DocDB snapshots.
When restoring we re-create the YSQL objects, and then match the snapshots data for each to restore the actual data.
In that process we currently check that the number of splits of the YSQL objects (tables/indexes) re-created from the metadata match with those from the corresponding snapshot.
To avoid issues there, we added the SPLIT INTO/AT clause for ysql_dump which ensures the objects get re-created with the correct number of splits.
However that has a few exceptions. Known ones are:

  • unique constraints: represented as unique indexes which can be split into tablets, but does not have SPLIT clause syntax support.
  • extensions: create extension can internally create objects (tables, indexes, unique constraints).

These can lead to mismatch between the tablet splits and therefore an error on restore.

This can be fixed individually by adding splitting information to the YSQL dump (e.g. SPLIT INTO/AT syntax), or generically by inferring the number of splits from the snapshot itself.
For instance, when the number of splits does not match, we could re-create the objects with the correct number of splits from the snapshot rather than erroring out.

@m-iancu m-iancu added the area/ysql Yugabyte SQL (YSQL) label Apr 29, 2021
@m-iancu m-iancu added this to the 2.7.x milestone Apr 29, 2021
@m-iancu m-iancu changed the title [YSQL] Backups should handle splits for unique constraints [YSQL] Correctly handle splits for unique constraints in backup-restore Apr 29, 2021
@m-iancu m-iancu changed the title [YSQL] Correctly handle splits for unique constraints in backup-restore [YSQL] Correctly handle table/index splits in backup-restore Aug 24, 2021
@m-iancu m-iancu assigned m-iancu and jaki and unassigned m-iancu Aug 24, 2021
@jaki
Copy link
Contributor

jaki commented Aug 31, 2021

Backed up a table with unique constraint. Restored on a cluster with tserver flag ysql_num_shards_per_tserver=1, and got this error:

Error running import_snapshot: Invalid argument (yb/master/catalog_manager_ent.cc:1237): Unable to import snapshot meta file snapshot/test.snapshot: Found by id 000030af000030008000000000004003 PGSQL_TABLE_TYPE table 'abc_i_key' in namespace id 000030af000030008000000000000000 has number of tablets=1, expected=2: SNAPSHOT_FAILED (master error 23)

@OlegLoginov suggests not using syntax to solve this because it's not possible for CREATE EXTENSION (like, what if creating extension involves tables, some which have been dynamically split, some not?).

@jaki
Copy link
Contributor

jaki commented Aug 31, 2021

When trying to restore to a different database, I get a slightly different error:

Error running import_snapshot: Internal error (yb/master/catalog_manager_ent.cc:1370): Unable to import snapshot meta file snapshot/test.snapshot: Wrong number of tablets in created PGSQL_TABLE_TYPE table 'abc_i_key' in namespace id 00004005000030008000000000000000: 1 (expected 2): SNAPSHOT_FAILED (master error 23)

This is according to the two cases of if (new_namespace_id == table_data->old_namespace_id).

@ttyusupov ttyusupov modified the milestones: 2.7.x, 2.9.x Sep 9, 2021
@bmatican bmatican added the priority/high High Priority label Sep 17, 2021
jaki added a commit that referenced this issue Sep 30, 2021
Summary:
Sometimes, ysql_dump may not provide enough partitioning hints.  Then,
when importing snapshots, an error may be thrown for a mismatch on
number of tablets between the snapshot table and ysql_dump-created
table.

Fix this by recreating partitions for the YSQL table in this case.  Do
not yet address the case where number of tablets are the same but
partitions are split differently.

In the implementation, do not worry about waiting for index tablets to
finish creating on tservers since that isn't done for YCQL, which
already recreates table, either.

Add test YBBackupTest.TestYSQLChangeDefaultNumTablets and disabled test
YBBackupTest.TestYSQLRangeSplitConstraint.

Also,

- add a VLOG for AsyncCreateReplica response
- fix AddTabletUnlocked thread annotation from REQUIRES_SHARED to
  REQUIRES
- fix ScopedTabletInfoCommitter to avoid double CommitMutation
- move ScopedTabletInfoCommitter from anonymous namespace to
  catalog_entity_info.h so that ent::CatalogManager can use it
- make ScopedTabletInfoCommitter a template class ScopedInfoCommitter so
  that it can later be used for other objects like TableInfo

Do not close issue #8229 since there will be a part 2 handling partition
schemas that differ on split points but have the same number of
partitions.

Test Plan:
    ./yb_build.sh --cxx-test tools_yb-backup-test_ent \
      --gtest_filter YBBackupTest.TestYSQLChangeDefaultNumTablets

Reviewers: nicolas, oleg

Reviewed By: oleg

Subscribers: ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D13122
jaki added a commit that referenced this issue Oct 21, 2021
Summary:
Since commit b14485a ([#8229] backup:
repartition table if needed on YSQL restore), there is a rare deadlock
issue between ProcessTabletReportBatch and RepartitionTable.  It can be
hit when

- thread 1 (RepartitionTable): import a YSQL snapshot where the number
  of tablets for a table mismatch between the cluster and the external
  snapshot
- thread 2 (ProcessTabletReportBatch): process tablets of that table
  from heartbeat

A more in-depth sequence of steps follows:

1. t1: table->LockForWrite (WriteLock)
1. t2: table->LockForRead (ReadLock)
1. t1: tablet->StartMutation (WriteLock)
2. t1: table_lock.Commit (UpgradeToCommitLock; blocks on t2 ReadLock)
3. t2: tablet->LockForWrite (WriteLock; blocks on t1 WriteLock)

To fix, for ProcessTabletReportBatch, take table write lock instead of
read lock.  The table metadata isn't mutated, so this is purely for
deadlock avoidance reasons (since only one writer is allowed at a time).

Bogdan thinks we should expect table write lock to be taken whenever
tablet write lock is taken.

Close: #10304

Test Plan:
    ./yb_build.sh \
      --cxx-test tools_yb-backup-test_ent \
      --gtest_filter YBBackupTest.TestYSQLChangeDefaultNumTablets \
      -n 1000 \
      --tp 1 \
      fastdebug

Reviewers: bogdan, nicolas

Reviewed By: nicolas

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D13459
@jaki jaki closed this as completed in 96beb9e Oct 23, 2021
jaki added a commit that referenced this issue Nov 1, 2021
Summary:
Since commit b14485a ([#8229] backup:
repartition table if needed on YSQL restore), there is a rare deadlock
issue between ProcessTabletReportBatch and RepartitionTable.  It can be
hit when

- thread 1 (RepartitionTable): import a YSQL snapshot where the number
  of tablets for a table mismatch between the cluster and the external
  snapshot
- thread 2 (ProcessTabletReportBatch): process tablets of that table
  from heartbeat

A more in-depth sequence of steps follows:

1. t1: table->LockForWrite (WriteLock)
1. t2: table->LockForRead (ReadLock)
1. t1: tablet->StartMutation (WriteLock)
2. t1: table_lock.Commit (UpgradeToCommitLock; blocks on t2 ReadLock)
3. t2: tablet->LockForWrite (WriteLock; blocks on t1 WriteLock)

To fix, for ProcessTabletReportBatch, take table write lock instead of
read lock.  The table metadata isn't mutated, so this is purely for
deadlock avoidance reasons (since only one writer is allowed at a time).

Bogdan thinks we should expect table write lock to be taken whenever
tablet write lock is taken.

Original Commit: cc90f01

Original Differential Revision: https://phabricator.dev.yugabyte.com/D13459

Test Plan:
    ./yb_build.sh \
      --cxx-test tools_yb-backup-test_ent \
      --gtest_filter YBBackupTest.TestYSQLChangeDefaultNumTablets \
      -n 1000 \
      --tp 1 \
      fastdebug

Reviewers: nicolas

Reviewed By: nicolas

Subscribers: bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D13730
jaki added a commit that referenced this issue Nov 2, 2021
…tore (2)

Summary:
Commit b14485a handles backup restore
for YSQL when number of tablets in the external snapshot doesn't match
that in the cluster.  Do part 2 of handling cases where the number of
tablets match but the partition boundaries don't.  This may cost some
performance because partitions need to be inspected for each YSQL table.

Master branch commit f9c8d2f ([#8148]
docdb: Potential issue on crash after creating post-split tablets) is
not in this backport 2.8 branch.  To clarify what that commit does, it
changes the way yb-master chooses the split point: rather than take the
midway point between start and end, use the middle based on data in the
tablet.  Therefore, ManualTabletSplit test is adjusted so that the split
point is taken as the middle disregarding data.

Original Commit: 96beb9e

Original Differential Revision: https://phabricator.dev.yugabyte.com/D13300

Test Plan:
New test:

    ./yb_build.sh --cxx-test tools_yb-backup-test_ent \
      --gtest_filter YBBackupTest.TestYSQLManualTabletSplit

Run other YSQL backup/restore tests, and make sure that partitions don't
appear to mismatch (i.e. make sure master log doesn't have "Partition
boundaries mismatch for table").

Reviewers: oleg

Reviewed By: oleg

Subscribers: bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D13731
jasonyb pushed a commit that referenced this issue Nov 11, 2021
…tore

Summary:
Sometimes, ysql_dump may not provide enough partitioning hints.  Then,
when importing snapshots, an error may be thrown for a mismatch on
number of tablets between the snapshot table and ysql_dump-created
table.

Fix this by recreating partitions for the YSQL table in this case.  Do
not yet address the case where number of tablets are the same but
partitions are split differently.

In the implementation, do not worry about waiting for index tablets to
finish creating on tservers since that isn't done for YCQL, which
already recreates table, either.

Add test YBBackupTest.TestYSQLChangeDefaultNumTablets and disabled test
YBBackupTest.TestYSQLRangeSplitConstraint.

Also,

- add a VLOG for AsyncCreateReplica response
- fix AddTabletUnlocked thread annotation from REQUIRES_SHARED to
  REQUIRES
- fix ScopedTabletInfoCommitter to avoid double CommitMutation
- move ScopedTabletInfoCommitter from anonymous namespace to
  catalog_entity_info.h so that ent::CatalogManager can use it
- make ScopedTabletInfoCommitter a template class ScopedInfoCommitter so
  that it can later be used for other objects like TableInfo

Do not close issue #8229 since there will be a part 2 handling partition
schemas that differ on split points but have the same number of
partitions.

For backport to 2.6, besides adjusting to API differences, do the
following:

- Add num_tablets mismatch check since commit
  2f9d4dc ([backup] Added a check for
  'num_tablets' in CatalogManager::ImportSnapshot.) is not in this
  backport 2.6 branch.  Also add part of commit
  aa394e8 ([#9933][Part-0] Update logic
  for using num_tablets from internal or user requests.) touching this
  because it is needed due to backport commit
  ac653d2 ([Backport
  2.6][#9933][YCQL][Part-1] DESC TABLE does not directly match the
  "CREATE TABLE" command for number of tablets.).
- Change ycql_num_tablets and ysql_num_tablets flags to
  yb_num_shards_per_tserer and ysql_num_shards_per_tserver in the test
  since the former flags aren't available in this backport 2.6 branch.
  Since these flags are only granular to multiples of the number of
  tservers (3), adjust the later expected number of tablets from 2 to 6.
- Add REQUIRES(lock_) to AddTabletUnlocked

Original Commit: b14485a

Orignal Differential Revision: https://phabricator.dev.yugabyte.com/D13122

Test Plan:
    ./yb_build.sh --cxx-test client_backup-txn-test
    ./yb_build.sh --cxx-test integration-tests_snapshot-test
    ./yb_build.sh --cxx-test tools_yb-admin-test_ent
    ./yb_build.sh --cxx-test tools_yb-backup-test_ent
    ./yb_build.sh --java-test org.yb.cql.TestYbBackup

Reviewers: oleg, nicolas

Reviewed By: nicolas

Subscribers: bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D13851
jasonyb pushed a commit that referenced this issue Nov 11, 2021
Summary:
Since commit b14485a ([#8229] backup:
repartition table if needed on YSQL restore), there is a rare deadlock
issue between ProcessTabletReportBatch and RepartitionTable.  It can be
hit when

- thread 1 (RepartitionTable): import a YSQL snapshot where the number
  of tablets for a table mismatch between the cluster and the external
  snapshot
- thread 2 (ProcessTabletReportBatch): process tablets of that table
  from heartbeat

A more in-depth sequence of steps follows:

1. t1: table->LockForWrite (WriteLock)
1. t2: table->LockForRead (ReadLock)
1. t1: tablet->StartMutation (WriteLock)
2. t1: table_lock.Commit (UpgradeToCommitLock; blocks on t2 ReadLock)
3. t2: tablet->LockForWrite (WriteLock; blocks on t1 WriteLock)

To fix, for ProcessTabletReportBatch, take table write lock instead of
read lock.  The table metadata isn't mutated, so this is purely for
deadlock avoidance reasons (since only one writer is allowed at a time).

Bogdan thinks we should expect table write lock to be taken whenever
tablet write lock is taken.

For backport to 2.6, since commit
afd8775 ([#9182] Fix return from
CatalogManager::ProcessTabletReport) is not present, do changes to
ProcessTabletReport instead.

Depends on D13851

Original Commit: cc90f01

Original Differential Revision: https://phabricator.dev.yugabyte.com/D13459

Test Plan:
    ./yb_build.sh \
      --cxx-test tools_yb-backup-test_ent \
      --gtest_filter YBBackupTest.TestYSQLChangeDefaultNumTablets \
      -n 1000 \
      --tp 1 \
      fastdebug

Reviewers: nicolas

Reviewed By: nicolas

Subscribers: bogdan, ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D13852
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) priority/high High Priority
Projects
None yet
Development

No branches or pull requests

4 participants