
[xCluster] Locality-Aware Placement for M:N tablet mapping in setup_replication #10186

Closed
nspiegelberg opened this issue Oct 5, 2021 · 3 comments
Labels: area/cdc Change Data Capture

@nspiegelberg (Contributor)

Currently, setup_replication only computes a proper tablet mapping between the Consumer & Producer clusters in the 1:1 use case; for the M:N use case, a naive round-robin strategy is used. It would be useful to have a best-effort locality-aware mapping instead.

In addition to getting the list of tablets on the Producer side, get the start & end rowkeys. Compute the maximal rowkey-interval overlap with a local tablet and assign accordingly (only fall back to round-robin assignment on failure).

  • Add an Interval Overlap Calculation (sketched below):
    -- Compute the set of candidate tablets on the Consumer whose [StartKey_C, EndKey_C] range overlaps the Producer’s [StartKey_P, EndKey_P] range.
    -- For each candidate, determine the overlap window:
       [StartKey_O, EndKey_O] == [max(StartKey_C, StartKey_P), min(EndKey_C, EndKey_P)].
    -- Pick the candidate with max(EndKey_O - StartKey_O).
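
A minimal sketch of that overlap calculation, assuming partition keys have already been decoded to comparable integer hash codes (e.g. for hash-partitioned tables); the KeyRange struct and PickConsumerTablet helper are hypothetical names for illustration, not the actual codebase API:

    #include <algorithm>
    #include <cstdint>
    #include <optional>
    #include <string>
    #include <vector>

    // Hypothetical, simplified view of a tablet's partition range, with keys
    // already decoded to integer hash codes (an unbounded end is modeled as
    // the maximum value).
    struct KeyRange {
      std::string tablet_id;
      uint32_t start_key;  // inclusive
      uint32_t end_key;    // exclusive
    };

    // Returns the consumer tablet whose range has the largest overlap with
    // the producer's range, or nullopt if nothing overlaps (the caller would
    // then fall back to round-robin).
    std::optional<std::string> PickConsumerTablet(
        const KeyRange& producer, const std::vector<KeyRange>& consumers) {
      std::optional<std::string> best;
      uint32_t best_overlap = 0;
      for (const auto& c : consumers) {
        // Overlap window: [max(StartKey_C, StartKey_P), min(EndKey_C, EndKey_P)].
        const uint32_t lo = std::max(c.start_key, producer.start_key);
        const uint32_t hi = std::min(c.end_key, producer.end_key);
        if (hi > lo && hi - lo > best_overlap) {
          best_overlap = hi - lo;
          best = c.tablet_id;
        }
      }
      return best;
    }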

We might be able to approximate this faster using a middle key algorithm:

  • For a Producer tablet, calculate the middle key, and assign it to the Consumer tablet that contains that key (sketched below).
    Use the logical MiddleKey, instead of GetEncodedMiddleSplitKey(), so the computation stays fully in-memory.
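
A sketch of this middle-key variant under the same assumptions, reusing the hypothetical KeyRange struct from the previous sketch:

    // Middle-key approximation: map a producer tablet to the consumer tablet
    // containing the midpoint of the producer's range.
    std::optional<std::string> PickByMiddleKey(
        const KeyRange& producer, const std::vector<KeyRange>& consumers) {
      // Logical middle key: the halfway point of the range, computed purely
      // in memory (the in-memory analogue of GetEncodedMiddleSplitKey()).
      const uint32_t middle =
          producer.start_key + (producer.end_key - producer.start_key) / 2;
      for (const auto& c : consumers) {
        if (c.start_key <= middle && middle < c.end_key) {
          return c.tablet_id;
        }
      }
      return std::nullopt;  // No containing tablet; fall back to round-robin.
    }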

This task is the first step towards adding this heuristic to the xDC + tablet splitting use case, to keep reasonably good locality while allowing the two clusters' split keys to diverge. See the Tablet Splitting + xDC Design Doc: https://docs.google.com/document/d/1MlhwjqSypnP8i6r7DJcm85GNIaguV51ZqiekRIyf0eU

@hulien22 (Contributor)

Some code pointers to help out. Currently we process tablet mappings in two locations:

  1. At the end of setting up replication, we initialize the consumer. As part of this, we fetch all of the consumer table's tablets and compare them with the producer tablets to create the tablet mapping. If it is an M:N setup, then this assignment is done round-robin:
    consumer = consumer_tablets_resp.tablet_locations(i % consumer_tablets_size).tablet_id();
  2. For tablet splitting, when a consumer-side tablet splits, we need to re-assign the parent tablet's pollers. This is currently also done in a naive round-robin fashion (a locality-aware sketch follows the code below):
    Status UpdateTableMappingOnTabletSplit(
        cdc::StreamEntryPB* stream_entry,
        const SplitTabletIds& split_tablet_ids) {
      auto* mutable_map = stream_entry->mutable_consumer_producer_tablet_map();
      auto producer_tablets = (*mutable_map)[split_tablet_ids.source];
      mutable_map->erase(split_tablet_ids.source);
      // TODO introduce a better mapping of tablets to improve locality (GH #10186).
      // For now we just distribute the producer tablets between both children.
      for (int i = 0; i < producer_tablets.tablets().size(); ++i) {
        if (i % 2) {
          *(*mutable_map)[split_tablet_ids.children.first].add_tablets() =
              producer_tablets.tablets(i);
        } else {
          *(*mutable_map)[split_tablet_ids.children.second].add_tablets() =
              producer_tablets.tablets(i);
        }
      }
      return Status::OK();
    }
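
For reference, a hedged sketch of what a locality-aware version of this split handler could look like, assuming the producer tablets' key ranges are recorded (reusing the hypothetical KeyRange type from the earlier sketch; RedistributeOnSplit is an illustrative name, not the actual fix):

    // Assign each producer tablet to whichever child of the split covers
    // more of the producer's [start, end) range. The first child covers
    // [parent_start, split_key), the second covers [split_key, parent_end).
    void RedistributeOnSplit(uint32_t split_key,
                             const std::vector<KeyRange>& producer_tablets,
                             std::vector<std::string>* to_first_child,
                             std::vector<std::string>* to_second_child) {
      for (const auto& p : producer_tablets) {
        // Portion of the producer range below vs. at-or-above the split key.
        const uint32_t below =
            split_key > p.start_key ? std::min(split_key, p.end_key) - p.start_key : 0;
        const uint32_t above =
            p.end_key > split_key ? p.end_key - std::max(split_key, p.start_key) : 0;
        (below >= above ? to_first_child : to_second_child)->push_back(p.tablet_id);
      }
    }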

Ideally we could use the same logic in both of these cases, but that is TBD. Tablet splitting already provides the start and end keys for all the producer tablets, but some work would be required if we want these keys for the setup case as well.

@bmatican (Contributor)

@hulien22 to confirm: does this mapping also change dynamically at runtime based on consumer cluster changes, e.g. adding/removing nodes, the load balancer moving tablets around, etc.?

@hulien22 (Contributor)

The mapping won't change, since it's purely a tablet id : tablet id mapping. So moving tablets around won't change anything, but splitting tablets does, as that creates new tablets that need to be added to / removed from the mapping.

hari90 added a commit that referenced this issue May 27, 2022
…ablet counts in xCluster

Summary:
Use the tablet start and end keys to find the best mapping when the tablet counts on the producer and consumer differ. The consumer tablet with the most key-range overlap will be picked. Added start and end keys to ProducerTabletListPB; these are populated at initial CDC stream creation. When producer or consumer tablets split and we have key ranges in ProducerTabletListPB, they are used to construct the new mapping. If the mapping was created on an older build, in which case keys won't be available, we fall back to the old round-robin behavior.
Old code assumed that the same tablet count implied the same key ranges. This need not be true, so it has been fixed. Example: producer with the default 4 tablets vs. consumer with the default 3 tablets plus one consumer tablet split.

Test Plan:
Added new xcluster-tablet-split-itest XClusterTabletMapTest.
Added new validations in xcluster-tablet-split-itest XClusterTabletSplitITest.
Ensure these tests continue to work:
  client-test ClientTest.TestKeyRangeFiltering
  all other xcluster-tablet-split-itest tests

Reviewers: nicolas, jhe, rahuldesirazu

Reviewed By: jhe, rahuldesirazu

Subscribers: slingam, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D17050
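
The summary above describes preferring key-range overlap and falling back to round-robin when ranges are unavailable. A short sketch of that dispatch, reusing the hypothetical PickConsumerTablet from the earlier sketch:

    // Prefer key-range overlap when ranges were recorded at stream creation;
    // streams created on older builds have no key ranges, so keep the old
    // round-robin behavior for them. Names are illustrative only.
    std::string MapProducerTablet(const KeyRange& producer,
                                  const std::vector<KeyRange>& consumers,
                                  size_t producer_index, bool has_key_ranges) {
      if (has_key_ranges) {
        if (auto best = PickConsumerTablet(producer, consumers)) {
          return *best;
        }
      }
      return consumers[producer_index % consumers.size()].tablet_id;
    }
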
hari90 closed this as completed Jun 1, 2022
hari90 added a commit that referenced this issue Jun 2, 2022
…ith different tablet counts in xCluster

Summary:
Use the tablet start and end keys to find the best mapping when the tablet counts on the producer and consumer differ. The consumer tablet with the most key-range overlap will be picked. Added start and end keys to ProducerTabletListPB; these are populated at initial CDC stream creation. When producer or consumer tablets split and we have key ranges in ProducerTabletListPB, they are used to construct the new mapping. If the mapping was created on an older build, in which case keys won't be available, we fall back to the old round-robin behavior.
Old code assumed that the same tablet count implied the same key ranges. This need not be true, so it has been fixed. Example: producer with the default 4 tablets vs. consumer with the default 3 tablets plus one consumer tablet split.

Original commit: a8f3801 / D17050

Test Plan:
Added new xcluster-tablet-split-itest XClusterTabletMapTest.
Added new validations in xcluster-tablet-split-itest XClusterTabletSplitITest.
Ensure these tests continue to work:
  client-test ClientTest.TestKeyRangeFiltering
  all other xcluster-tablet-split-itest tests

Reviewers: nicolas, rahuldesirazu, jhe

Reviewed By: jhe

Subscribers: bogdan, slingam

Differential Revision: https://phabricator.dev.yugabyte.com/D17322