-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DB Clone][YSQL] Clone to a time before drop table is failing in case of colocated databases #23230
Comments
More details on why is ysql_dump unable to set the tablegroup id correctly in the dump:
The problem comes from the fact that the tablegroup manager will not have an entry of the dropped table inside its maps and thus the tablegroup manager will return a |
The solution can be one of the two ways: I am leaning toward solution 2 as it is more straight forward and dealing with hiding tablegroup mappings means persisting this info in sys catalog. This seems an overkill for the purpose of getting the tablegroup id which we can get from the parent table. |
… case of colocated databases Summary: Prior to this diff, clone was failing when we were cloning to a time before a drop table operation happened on the source database. The problem is demonstrated in the colocated case only and shouldn't be mixed with D35178. The root cause of the problem is that ysql_dump is unable to set the tablegroup_id correctly in the sql script. More specifically in the following snippet (notice the 0 which is a wrong default value): ``` -- For YB colocation backup, must preserve implicit tablegroup pg_yb_tablegroup oid SELECT pg_catalog.binary_upgrade_set_next_tablegroup_oid('0'::pg_catalog.oid); ``` This is because calling `yb_table_properties()` on the dropped table will not set the tablegroup_id in the response. As part of executing `yb_table_properties()`, `GetTableSchema` is called which in turn doesn't set the tablegroup_id properly if the requested table is a dropped one. The problem is that the tablegroup manager will not have an entry of the dropped table inside its inmemory maps and thus the tablegroup manager will return a nullptr when asked to get the tablegroup of the dropped table using `FindByTable`. The diff fixes the issue by returning the tablegroup_id using the **colocation parent table id** rather than asking the tablegroup manager for the tablegroup_id. The idea is that we can always infer the tablegroup_id from the parent table as all the colocated tables share the same tablegroup_id. We also guarantee that the colocated database corresponding to the returned `tablegroup_id` should still be there as the table is guaranteed not to be in a deleted state (There is a previous check in `GetTableSchemInternal` for that). Test Plan: Switched the test CloneAfterDropTable to a parametrized test where the parameter tests both colocation and non-colocation cases. ./yb_build.sh --cxx-test integration-tests_minicluster-snapshot-test --gtest_filter Colocation/PgCloneTestParam.CloneAfterDropTable/1 Reviewers: asrivastava, zdrudi Reviewed By: zdrudi Subscribers: ybase, slingam Differential Revision: https://phorge.dev.yugabyte.com/D36702
Summary: 944bd87 [#23308] DocDB: Do not show InvalidFlags in flags UI 3c5c40a [DOC-328] [2024.1.1] Change data capture (CDC) Observability in YugabyteDB Anywhere (#22733) eb39516 [DOC-397] [2024.1.1] Create provider validation (#23039) e948dee [#DOC-348] Documentation for Migration Assessment in yugabyted UI. (#23288) 80835a8 [#23230] [DB Clone][YSQL]: Fix Cloning to a time before drop table in case of colocated databases 57e930e [docs] 404 fixes for July 5-19 (#23277) e166cee [PLAT-13956]: Fix RBAC test failure 7e10e9f [WIP][docs] Documentation for logical replication with PG connector (#23065) 86c4e4c [doc][2024.1.1] yb shell client download (#23170) a03ccda [#23045] DocDB: Support for table locks using tserver local object lock manager 836345c [doc][yba][2024.1.1] Kubernetes Prometheus (#23097) c842e1d [PLAT-14750]: Add runtime config to show the Premium V2 Storage type in case of Azure during Create/Edit scenario b0392f3 [PLAT-14734]: Highlight Decommission node status to indicate node is in bad or unhealthy state 307e2fc Fix the UT pipeline for local provider e0ff600 [PLAT-14696] DDL atomicity check metric and alert1 f229cc6 [PLAT-14739]: Make xCluster sync lock per config uuid Test Plan: Jenkins: rebase: pg15-cherrypicks Reviewers: jason, tfoucher Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D36955
Jira Link: DB-12167
Description
Take the following experiment on a colocated database:
Clone will fail at step 4 with the error:
ERROR: Clone operation aborted: Internal error (yb/master/catalog_manager_ext.cc:2336): Created table not found: 00004002000030008000000000004003.colocation.parent.uuid: OBJECT_NOT_FOUND (master error 3)
The problem is in ysql_dump step where the table group oid in the dump script is supposed to be identical to the tablegroup id at the source database. However the table group is set to 0 as the following snippet shows:
Issue Type
kind/bug
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: