
[xCluster] Create API to Wait for Replication Drain #10978

Closed

rahuldesirazu opened this issue Jan 3, 2022 · 2 comments
Labels
area/docdb YugabyteDB core features kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@rahuldesirazu
Contributor

rahuldesirazu commented Jan 3, 2022

Jira Link: DB-1155

Description

For XCluster, we don't have an explicit API to verify when the Producer and Consumer are in sync. Because XCluster replication is asynchronous, some lag is expected in normal operation. However, there are some use cases where we want to know whether the Producer has completely sent all operations to the Consumer (a drain).

  1. Cluster Migration. Stop writes on the old cluster (Producer). Don't use the new cluster (Consumer) until all transactions have been sent over.
  2. Schema Alters. Since we don't have full DDL support yet, it can be safer to stop writes and change both sides before continuing. When doing this, we need to ensure there are no outstanding ops with the old schema.
  3. Testing. The customer is using synthetic data and wants to verify that XCluster works and has drained before starting the next round of test data.
  4. Truncates and PITR. (@rahuldesirazu for more info)

Additionally, if we extended this "drain" API to accept an OpID or Timestamp, we could use it to wait for a particular catch-up window when bootstrapping XCluster.


Implementation Notes:

  1. Currently, the user can look at per-tablet metrics through Prometheus or Platform: "async_replication_committed_lag_micros", "last_checkpoint_opid_index".
  2. We need to aggregate these into a per-table (or more user-friendly) abstraction and verify that all tablets in that table are caught up.
  3. To ensure that we don't have any new writes, we need to check the "current_term" or "last_appended" on each tablet. Some clarification is here.
  4. We need some heuristic that verifies a drain rather than just the absence of additional writes. Maybe query all tablets that are not caught up, then do one final query after we think we're caught up to verify? A rough sketch of this polling loop follows the list.
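
A minimal sketch of that polling heuristic, purely for illustration: `TabletProgress`, the fetcher callback, and `WaitForReplicationDrain` below are hypothetical names rather than YugabyteDB APIs, and the eventual implementation landed in CatalogManager/CDCService (see the commits referenced later in this issue).

```
// Sketch only: hypothetical polling loop for a replication-drain check.
// The per-tablet progress fetcher is injected as a callback; in practice it
// would be an RPC asking each tablet's TServer for its replication progress.
#include <chrono>
#include <cstdint>
#include <functional>
#include <string>
#include <thread>
#include <vector>

struct TabletProgress {
  std::string tablet_id;
  int64_t last_caught_up_micros;  // latest physical time the consumer is known to have replicated to
};

using FetchProgressFn =
    std::function<std::vector<TabletProgress>(const std::string& table_id)>;

// Returns once every tablet of `table_id` has replicated past `target_time_micros`.
void WaitForReplicationDrain(const std::string& table_id,
                             int64_t target_time_micros,
                             const FetchProgressFn& fetch,
                             std::chrono::milliseconds poll_interval) {
  auto pending_tablets = [&]() {
    std::vector<std::string> out;
    for (const auto& t : fetch(table_id)) {
      if (t.last_caught_up_micros < target_time_micros) {
        out.push_back(t.tablet_id);
      }
    }
    return out;
  };

  do {
    // Poll until no tablet is behind the target time.
    while (!pending_tablets().empty()) {
      std::this_thread::sleep_for(poll_interval);
    }
    // One final full query after we appear caught up, to confirm nothing
    // slipped in between polls.
  } while (!pending_tablets().empty());
}
```
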
@rahuldesirazu rahuldesirazu added area/docdb YugabyteDB core features priority/high High Priority labels Jan 3, 2022
@nspiegelberg
Contributor

This API would need to disable splitting while waiting for the drain or take that into account.

@nspiegelberg nspiegelberg assigned hari90 and unassigned Rerenah Mar 29, 2022
@nspiegelberg nspiegelberg changed the title [xCluster] Create API WaitForReplicationLag == 0 [xCluster] Create API to Wait for Replication Drain Mar 30, 2022
@bmatican
Contributor

@hari90 taking this one back for now, as it might be a bit more work than initially thought.

I'll pass you a separate one as an intro to producer side code first, instead.

@bmatican bmatican removed the priority/high High Priority label Mar 30, 2022
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 8, 2022
LambdaWL added a commit that referenced this issue Jul 13, 2022
…logManager and CDCService

Summary:
Issue: #10978
Design Doc: https://docs.google.com/document/d/1HB6zT2MX3NmKnlhhhJYbuCisVV-JKvgSO9gJAhul9OE/edit

As a first step, implemented the API logic on CatalogManager and CDCService.
- CatalogManager: repeatedly sends RPCs to the relevant TServers through the API on CDCService
- CDCService: checks `cdc_metrics` for tablets to determine whether replication on a tablet is caught up (to some user-specified point in time)

**Notes on how the notion of caught-up is decided:**
A new `cdc_metric` named `last_caughtup_physicaltime` is added. This metric records a point in time up to which we can safely assume the consumer has caught up with the producer. In other words, it tracks the progress the consumer has made in the replication.
The metric is updated as follows:
- If the consumer currently has the latest record on the producer, i.e. the lag is zero, update the metric to `GetCurrentTimeMicros()`
- Otherwise, update the metric to the larger of its own value and `last_checkpoint_physicaltime`. This ensures that the metric value is always non-decreasing.
The update happens in `GetChanges()`, since it is the only place where the consumer can make progress in the replication.
With `last_caughtup_physicaltime`, the logic of the API is greatly simplified: it merely compares the user-specified timestamp (defaulting to the current time if not provided) against this new metric; a rough sketch of this update-and-compare rule follows below.
**Since the API logic now relies entirely on the new cdc metric, if producer leadership changes and the consumer never sends GetChanges from that point on, the metric would be empty and the API would view the tablet as not drained. For this reason, the API only works while replication is still on.**
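
A minimal sketch of the update-and-compare rule described above, with hypothetical struct and function names: only `last_caughtup_physicaltime`, `last_checkpoint_physicaltime`, `GetChanges()`, and `GetCurrentTimeMicros()` come from the summary; everything else is illustrative and not the actual `cdc_service.cc` code.

```
// Sketch only: the last_caughtup_physicaltime update rule and drain check.
#include <algorithm>
#include <chrono>
#include <cstdint>

struct TabletCdcMetrics {
  int64_t last_caughtup_physicaltime = 0;    // consumer progress; kept non-decreasing
  int64_t last_checkpoint_physicaltime = 0;  // physical time of the last checkpointed record
};

int64_t GetCurrentTimeMicros() {
  using namespace std::chrono;
  return duration_cast<microseconds>(system_clock::now().time_since_epoch()).count();
}

// Invoked on the GetChanges() path, the only place the consumer makes progress.
void UpdateLastCaughtUpTime(TabletCdcMetrics* m, bool lag_is_zero) {
  if (lag_is_zero) {
    // Consumer already has the producer's latest record: caught up as of "now".
    m->last_caughtup_physicaltime = GetCurrentTimeMicros();
  } else {
    // Keep the metric non-decreasing: take the larger of its own value and
    // the last checkpointed physical time.
    m->last_caughtup_physicaltime =
        std::max(m->last_caughtup_physicaltime, m->last_checkpoint_physicaltime);
  }
}

// The drain API then reduces to a comparison against the target timestamp.
bool IsDrainedAsOf(const TabletCdcMetrics& m, int64_t target_time_micros) {
  return m.last_caughtup_physicaltime >= target_time_micros;
}
```
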

Test Plan:
Created three unit tests, command:
```
./yb_build.sh --cxx-test integration-tests_twodc-test --test-timeout-sec 1200 --gtest_filter "*TwoDCTestWaitForReplicationDrain*" -n 20
```
The tests cover three scenarios:
- Consumer is unable to catch up because GetChanges is blocked (added a test flag `block_get_changes` in `cdc_service.cc`)
- Consumer is unable to catch up because tservers are shut down
- User specifies a point in time to check for replication drain

Reviewers: rahuldesirazu, nicolas, jhe

Reviewed By: jhe

Subscribers: ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D17806
LambdaWL added a commit that referenced this issue Jul 15, 2022
…YB-Admin

Summary:
Issue: #10978
Design Doc: https://docs.google.com/document/d/1HB6zT2MX3NmKnlhhhJYbuCisVV-JKvgSO9gJAhul9OE/edit

As the second step, implemented the API on yb-admin. The CLI command is:
```
yb-admin wait_for_replication_drain <comma_separated_list_of_stream_ids> [<timestamp> | minus <interval>]
```
where `minus <interval>` has the same format as in PITR (documentation [[ https://docs.yugabyte.com/preview/explore/cluster-management/point-in-time-recovery-ycql/ | here ]], or see `restore_snapshot_schedule` in `yb-admin_cli_ent.cc`).

If all streams are caught up, the API prints `All replications are caught-up.` to the console. Otherwise, it prints the non-caught-up streams in the following format:
```
Found undrained replications:
- Under Stream <stream_id>:
  - Tablet: <tablet_id>
  - Tablet: <tablet_id>
  // ......
// ......
```

Test Plan:
```
./yb_build.sh --cxx-test yb-admin-test_ent --gtest_filter XClusterAdminCliTest.TestWaitForReplicationDrain -n 20
```

Reviewers: rahuldesirazu, nicolas, jhe

Reviewed By: jhe

Subscribers: slingam, ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D18363
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug labels Jul 30, 2022