[YSQL] Support PostgreSQL Publication/Replication slot API in CDC #18724

Open
dr0pdb opened this issue Aug 17, 2023 · 0 comments
Labels: area/cdcsdk, area/ysql, current-roadmap, kind/new-feature, priority/high, roadmap-tracking-issue

dr0pdb commented Aug 17, 2023

YugabyteDB CDC is built to support the following use cases and motivations. This ticket tracks the overall progress of the PG compatibility work to support the publication and replication slot APIs.

Motivation:

PostgreSQL compatibility:

  • PostgreSQL has a huge community that needs a PG-compatible API to set up and consume database changes.
  • Offer a complete set of PG community connectors for building and managing secure, clean data pipelines, supporting real-time data integrations, and ETL migrations.

Use cases:

  • Enable microservice-oriented architectures to subscribe to changes:

    • Message buses such as Kafka, Google Pub/Sub, and AWS Kinesis are likely targets for microservices.
    • A search system powered by a service such as Elasticsearch may be used in conjunction with the database that stores the transactions.
    • WebSocket-based consumption through the HTTP endpoint.
  • Downstream data warehousing:

    • Write to data warehouses such as Snowflake, Redshift, and Google BigQuery for downstream analytics.
    • Generically write to S3 as Parquet/JSON/CSV files.

Design Document

Jira Link: DB-7623

Phase 1

| Status | Feature | GitHub Issue | Comments |
| --- | --- | --- | --- |
| | CREATE PUBLICATION adds a new publication to the database. | #18930 | A publication is essentially a group of tables whose data changes are intended to be replicated through CDC. |
| | DROP PUBLICATION removes an existing publication from the database. | #18931 | A publication can only be dropped by its owner or a superuser. |
| | ALTER PUBLICATION can change the attributes of a publication. | #18933 | Allows changing the tables/schemas, the publication properties, and the owner and name of the publication. |
| | Log a notice for unsupported tables in CreatePublication FOR ALL TABLES | #19291 | |
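
For orientation, the Phase 1 statements follow standard PostgreSQL syntax. The sketch below only assembles the statement text in Python. The helper names (`quote_ident`, `create_publication`, `alter_publication_add_table`, `drop_publication`) and the example publication and table names are hypothetical, and which clauses YugabyteDB accepts is tracked in the linked issues rather than confirmed here.

```python
from typing import Optional, Sequence


def quote_ident(name: str) -> str:
    """Quote an SQL identifier PostgreSQL-style, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_publication(name: str, tables: Optional[Sequence[str]] = None) -> str:
    """Build CREATE PUBLICATION; with no table list, publish all tables."""
    if not tables:
        return f"CREATE PUBLICATION {quote_ident(name)} FOR ALL TABLES;"
    table_list = ", ".join(quote_ident(t) for t in tables)
    return f"CREATE PUBLICATION {quote_ident(name)} FOR TABLE {table_list};"


def alter_publication_add_table(name: str, table: str) -> str:
    """Build ALTER PUBLICATION ... ADD TABLE for a single table."""
    return f"ALTER PUBLICATION {quote_ident(name)} ADD TABLE {quote_ident(table)};"


def drop_publication(name: str) -> str:
    """Build DROP PUBLICATION (only the owner or a superuser may run it)."""
    return f"DROP PUBLICATION {quote_ident(name)};"


if __name__ == "__main__":
    # Hypothetical publication and table names, for illustration only.
    print(create_publication("orders_pub", ["orders", "customers"]))
    print(alter_publication_add_table("orders_pub", "shipments"))
    print(drop_publication("orders_pub"))
```

The statements would be sent over an ordinary YSQL connection; the sketch stops at string construction so that it stays independent of any particular driver.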

Phase 2

| Status | Feature | GitHub Issue | Comments |
| --- | --- | --- | --- |
| | Support the CREATE_REPLICATION_SLOT command to create a logical replication slot. | #19211 | Replication slots provide an automated way to ensure that YugabyteDB does not remove WAL segments until they have been received by CDC subscribers. |
| | Support the DROP_REPLICATION_SLOT command to drop a logical replication slot. | #19212 | |
| | Upgrade path for Publication/Replication Slot model | #19261 | Will allow migrating existing CDC streams to work with Publication/Replication Slot APIs. |
| | Ensure Publication/Replication Slot APIs are atomic | #18934 | Creation of a Publication/Replication slot involves communication between the YSQL layer and YB-master. The CRUD operations should be atomic across the processes. |
| | Migrate all tests to use the API | #19599 | |
| | Update the YB CDC connector to support Publication/Replication slot | #19811 | |
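
As a rough illustration of the two walsender commands named above, the Python sketch below builds the command text in the form PostgreSQL documents for its replication protocol (`CREATE_REPLICATION_SLOT name LOGICAL plugin` and `DROP_REPLICATION_SLOT name`). The function names are hypothetical. The slot-name rule (lower-case letters, digits, and underscores, at most 63 characters) is PostgreSQL's, and applying it unchanged to YugabyteDB is an assumption, as is using `yboutput` as the default plugin.

```python
import re

# PostgreSQL's slot-name rule; assumed (not confirmed) to apply unchanged in YSQL.
_SLOT_NAME = re.compile(r"[a-z0-9_]{1,63}")


def validate_slot_name(name: str) -> str:
    """Return the name unchanged if it is a legal slot name, else raise ValueError."""
    if not _SLOT_NAME.fullmatch(name):
        raise ValueError(f"invalid replication slot name: {name!r}")
    return name


def create_replication_slot_command(slot: str, plugin: str = "yboutput") -> str:
    """Build the walsender command that creates a logical replication slot."""
    return f"CREATE_REPLICATION_SLOT {validate_slot_name(slot)} LOGICAL {plugin}"


def drop_replication_slot_command(slot: str) -> str:
    """Build the walsender command that drops a replication slot."""
    return f"DROP_REPLICATION_SLOT {validate_slot_name(slot)}"


if __name__ == "__main__":
    # "inventory_slot" is a made-up slot name.
    print(create_replication_slot_command("inventory_slot"))
    print(drop_replication_slot_command("inventory_slot"))
```

In PostgreSQL these commands are issued on a connection opened in replication mode rather than as ordinary SQL; the sketch leaves the connection handling out.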

Phase 3

| Status | Feature | GitHub Issue | Comments |
| --- | --- | --- | --- |
| ⬜️ | Support streaming a subset of insert/update/delete/truncate operations | #19250 | Allows choosing which operations to stream via the Publication. |
| | Support consuming changes from ReplicationSlot. | #19441 | Subscribers can initiate a replication connection to consume changes via a ReplicationSlot. |
| ⬜️ | Observability features. | #18932 | Will provide visibility into the stream progress. |
| ⬜️ | Support creating temporary Replication Slot. | #19263 | A temporary replication slot automatically gets deleted upon error or end of session. |

dr0pdb added the area/ysql and priority/high labels Aug 17, 2023
dr0pdb self-assigned this Aug 17, 2023
yugabyte-ci added the kind/enhancement label Aug 17, 2023
yugabyte-ci added the kind/new-feature label and removed the kind/enhancement label Aug 31, 2023
dr0pdb added the roadmap-tracking-issue label Sep 18, 2023
dr0pdb changed the title from "[YSQL] Add CRUD API for managing CDC streams" to "[YSQL] YSQL API for CDC" Sep 20, 2023
yugabyte-ci added the area/cdcsdk label Sep 22, 2023
ymahajan changed the title from "[YSQL] YSQL API for CDC" to "[YSQL] [New Feature] Support PostgreSQL compatible Publication/Subscriber API in CDC" Oct 8, 2023
ymahajan changed the title from "[YSQL] [New Feature] Support PostgreSQL compatible Publication/Subscriber API in CDC" to "[YSQL] Support PostgreSQL compatible Publication/Subscriber API in CDC" Oct 8, 2023
ymahajan changed the title from "[YSQL] Support PostgreSQL compatible Publication/Subscriber API in CDC" to "[YSQL] Support PostgreSQL Publication/Subscriber API in CDC" Oct 9, 2023
ymahajan changed the title from "[YSQL] Support PostgreSQL Publication/Subscriber API in CDC" to "[YSQL] Support PostgreSQL Publication/Replication slot API in CDC" Oct 9, 2023
dr0pdb added a commit that referenced this issue Oct 11, 2023
…lot name

Summary:
Add support for deleting a CDCSDK stream via its replication slot name. This is the second step in supporting a SQL syntax for CDC via the PG logical replication model.

The `DeleteCDCStreamRequestPB` proto now accepts a repeated list of `cdcsdk_ysql_replication_slot_name` as well. All the CDCSDK streams with these replication slot names are also deleted as part of this request.
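
The selection rule described above (delete every stream named by stream ID, plus every stream whose replication slot name is listed) can be modelled in a few lines of Python. This is an illustrative model only, not the yb-master implementation: the `Stream` class, the `select_streams_to_delete` function, and the choice to report unmatched identifiers in a separate list are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Stream:
    """Hypothetical stand-in for a CDCSDK stream entry."""
    stream_id: str
    slot_name: Optional[str] = None  # None for streams created without a slot


def select_streams_to_delete(
    streams: Iterable[Stream],
    stream_ids: Iterable[str] = (),
    slot_names: Iterable[str] = (),
) -> Tuple[List[Stream], List[str]]:
    """Return (streams matched by ID or slot name, identifiers that matched nothing)."""
    ids, slots = set(stream_ids), set(slot_names)
    selected = [
        s for s in streams
        if s.stream_id in ids or (s.slot_name is not None and s.slot_name in slots)
    ]
    found_ids = {s.stream_id for s in selected}
    found_slots = {s.slot_name for s in selected if s.slot_name is not None}
    missing = sorted(ids - found_ids) + sorted(slots - found_slots)
    return selected, missing


if __name__ == "__main__":
    catalog = [Stream("s1", "slot_a"), Stream("s2"), Stream("s3", "slot_b")]
    selected, missing = select_streams_to_delete(catalog, ["s2"], ["slot_b", "nope"])
    print([s.stream_id for s in selected], missing)
```

How the real RPC reports identifiers that match nothing is not stated here; the `missing` list is only one way to surface them.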

**Follow Ups**
1. YSQL layer changes: See #18724 as the tracking issue

**Upgrade/Rollback safety**
This diff modifies the sys-catalog entry stored in yb-master, so the changes are not rollback safe. As a result, these changes will be disabled during an upgrade via an autoflag (yb_enable_replication_commands - `LocalPersisted`) and only enabled after the upgrade is finalized.

The clients of the `DeleteCDCStream` RPC are responsible for checking the autoflag `yb_enable_replication_commands`. The client here is the set of YSQL-layer Replication Slot commands, which will be added in future diffs.
Jira: DB-8009

Test Plan:
`./yb_build.sh --cxx-test master_xrepl-test --gtest_filter MasterTestXRepl.TestDeleteCDCStreamWithReplicationSlotName`
`./yb_build.sh --cxx-test master_xrepl-test --gtest_filter MasterTestXRepl.TestDeleteCDCStreamWithStreamIdAndReplicationSlotName`
`./yb_build.sh --cxx-test master_xrepl-test --gtest_filter MasterTestXRepl.TestDeleteCDCStreamNotFound`

Reviewers: hsunder, skumar, asrinivasan

Reviewed By: hsunder

Subscribers: ycdcxcluster, ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D29148
dr0pdb added a commit that referenced this issue Oct 29, 2023
… of replication slots

Summary:
Introduce support for creating, viewing, and dropping replication slots in YSQL. This is part of the project to support Publication/Replication slot API in YSQL (#18724).

Create and drop are supported through two interfaces:
1. Functions:
    - `pg_create_logical_replication_slot`
    - `pg_drop_replication_slot`
2. Walsender commands:
    - CREATE_REPLICATION_SLOT
    - DROP_REPLICATION_SLOT

Both create and drop statements go directly to yb-master. Most of the PG code isn't applicable to YSQL yet and hence is skipped.

For viewing replication slots, we have the view `pg_replication_slots`, which is backed by the function `pg_get_replication_slots`. The schema of the view has been modified by adding an extra yb-specific column `yb_stream_id` of type text.

Limitations:
1. Only the `yboutput` plugin is supported. This will only become relevant once we add support for consuming replication slots, but we are enforcing it from this diff onwards.
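
A minimal sketch of the function-based interface, written in Python and limited to building the SQL text. The two-argument call to `pg_create_logical_replication_slot` (slot name, then plugin) follows the PostgreSQL signature and is assumed to carry over; `quote_literal`, `create_slot_sql`, `drop_slot_sql`, and the slot name `demo_slot` are hypothetical, and the literal quoting is deliberately simplified.

```python
def quote_literal(value: str) -> str:
    """Simplified SQL string-literal quoting: double any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def create_slot_sql(slot: str, plugin: str = "yboutput") -> str:
    """SQL that creates a logical replication slot through the function interface."""
    return (
        "SELECT * FROM pg_create_logical_replication_slot("
        f"{quote_literal(slot)}, {quote_literal(plugin)});"
    )


def drop_slot_sql(slot: str) -> str:
    """SQL that drops a replication slot through the function interface."""
    return f"SELECT pg_drop_replication_slot({quote_literal(slot)});"


# Lists each slot with the yb-specific stream ID column described above.
LIST_SLOTS_SQL = "SELECT slot_name, plugin, yb_stream_id FROM pg_replication_slots;"


if __name__ == "__main__":
    print(create_slot_sql("demo_slot"))
    print(LIST_SLOTS_SQL)
    print(drop_slot_sql("demo_slot"))
```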

Apart from the above, this diff fixes two issues:
1. #19509 - Cleanup of held locks in case of an `ereport(elevel >= ERROR)`. This diff fixes that by making sure that we call `LWLockReleaseAll` in `src/postgres/src/backend/access/transam/xact.c` in case of an error. Thanks to Timothy Elgersma.
2. Skipping cache refresh in case of an error in the execution of a replication command. `src/postgres/src/backend/tcop/postgres.c`. This is ok because we only cache DMLs and none of the replication commands are DMLs. We need to do that because the check `yb_is_dml_command` tries to parse the query to check whether it is a DML or not but it doesn't support replication commands. So any `ereport(elevel >= ERROR)` in the execution of a replication command would lead to a syntax error.

TODOs for future:
1. This diff creates a CDC stream with CDCRecordType as `CHANGE`. We need to extend the `pg_create_logical_replication_slot` and `CREATE_REPLICATION_SLOT` syntax to take the CDCRecordType. This will be done in a future diff.
2. The DROP_REPLICATION_SLOT command allows waiting for a slot to become inactive before dropping it. This is currently unsupported and will be done in a future diff.
3. Temporary replication slots are unsupported. They will be added in the future once we also support consumption via Walsender.

Upgrade/Rollback safety:
These changes rely on sys-catalog changes done in yb-master. As a result, all the commands are disabled during upgrade using an autoflag yb_enable_replication_commands (LocalPersisted) and will only be enabled once the user has committed to the new version.

The autoflag was introduced during the Publication syntax support and is being reused here since these are both part of the same project: https://phorge.dev.yugabyte.com/D28721
Jira: DB-8008, DB-8009, DB-8305

Test Plan:
New unit test

```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot'
```

New Regress test
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressReplicationSlot'
```

I've also updated most of the CDCSDK tests to use the ReplicationSlot commands to create a CDCSDK stream instead of an RPC. The remaining tests will be updated in future diffs.

Reviewers: dsrinivasan, skumar, asrinivasan, aagrawal

Reviewed By: dsrinivasan

Subscribers: ycdcxcluster, bogdan, ybase, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D29194
dr0pdb added a commit that referenced this issue Dec 13, 2023
…g replication slot name

Summary:
**Backport Description**
The merge was clean. The only difference here is that the changes are under a TEST flag.

**Original Description**
Original commit: a8e2b04 / D29148
Add support for deleting a CDCSDK stream via its replication slot name. This is the second step in supporting a SQL syntax for CDC via the PG logical replication model.

The `DeleteCDCStreamRequestPB` proto now accepts a repeated list of `cdcsdk_ysql_replication_slot_name` as well. All the CDCSDK streams with these replication slot names are also deleted as part of this request.

**Follow Ups**
1. YSQL layer changes: See #18724 as the tracking issue

**Upgrade/Rollback safety**
This diff modifies the sys-catalog entry stored in yb-master, so the changes are not rollback safe. As a result, these changes will be disabled during an upgrade via an autoflag (yb_enable_replication_commands - `LocalPersisted`) and only enabled after the upgrade is finalized.

The clients of the `DeleteCDCStream` RPC are responsible for checking the autoflag `yb_enable_replication_commands`. The client here is the set of YSQL-layer Replication Slot commands, which will be added in future diffs.
Jira: DB-8009

Test Plan:
`./yb_build.sh --cxx-test master_xrepl-test --gtest_filter MasterTestXRepl.TestDeleteCDCStreamWithReplicationSlotName`
`./yb_build.sh --cxx-test master_xrepl-test --gtest_filter MasterTestXRepl.TestDeleteCDCStreamWithStreamIdAndReplicationSlotName`
`./yb_build.sh --cxx-test master_xrepl-test --gtest_filter MasterTestXRepl.TestDeleteCDCStreamNotFound`

Reviewers: hsunder, skumar, asrinivasan, xCluster

Reviewed By: skumar

Subscribers: bogdan, ybase, ycdcxcluster

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D30995
asrinivasanyb added a commit that referenced this issue Feb 25, 2024
…_SLOT command

Summary:
This is related to the project to support Replication slot API in YSQL (#18724).
(https://phorge.dev.yugabyte.com/D29194).
This is also related to the PG Compatible Logical Replication Consumption project.

In response to the CREATE_REPLICATION_SLOT command, the server will send a one-row result set.
One of the fields in this result set is "snapshot_name". In PG, this refers to the identifier of the
snapshot exported by the command. The snapshot consumption then happens through the execution of a
SELECT query as of the consistent state referred to by "snapshot_name". In YB terms, it would be the
equivalent of querying the database with a specified read point.

The YB support for the CREATE_REPLICATION_SLOT command returns the uint64 representation of the
HybridTime corresponding to the consistent snapshot time of the consistent snapshot stream that
was created as part of the CREATE_REPLICATION_SLOT command. This support is added in this revision.
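
To make the `snapshot_name` value easier to interpret, here is a small Python sketch that splits a uint64 HybridTime into its physical and logical parts. It assumes the usual YugabyteDB layout, in which the low 12 bits hold a logical counter and the remaining high bits hold microseconds since the Unix epoch; that layout is an assumption of this sketch, not something stated in this revision, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

LOGICAL_BITS = 12  # assumed width of the logical component
LOGICAL_MASK = (1 << LOGICAL_BITS) - 1
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def split_hybrid_time(ht: int) -> Tuple[int, int]:
    """Split a uint64 HybridTime into (physical microseconds, logical counter)."""
    if not 0 <= ht < (1 << 64):
        raise ValueError("HybridTime must fit in an unsigned 64-bit integer")
    return ht >> LOGICAL_BITS, ht & LOGICAL_MASK


def hybrid_time_to_datetime(ht: int) -> datetime:
    """Convert the physical component to a UTC datetime (the logical part is dropped)."""
    micros, _ = split_hybrid_time(ht)
    return _EPOCH + timedelta(microseconds=micros)


def parse_snapshot_name(snapshot_name: str) -> Tuple[int, int]:
    """Parse the decimal text of the snapshot_name field and split it."""
    return split_hybrid_time(int(snapshot_name))


if __name__ == "__main__":
    # Made-up value: 1,700,000,000 seconds after the epoch, logical counter 5.
    example = (1_700_000_000_000_000 << LOGICAL_BITS) | 5
    print(parse_snapshot_name(str(example)), hybrid_time_to_datetime(example))
```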

**UPGRADE/ROLLBACK SAFETY:**
These changes are protected via the preview flag: ysql_yb_enable_replication_commands

The following are the PB changes. These are all for RPC responses.

1. yb::master::CreateCDCStreamResponsePB - addition of 1 optional field
2. yb::tserver::PgCreateReplicationSlotResponsePB - addition of 1 optional field
Jira: DB-10095

Test Plan:
```
./yb_build.sh --cxx-test cdcsdk_consistent_snapshot-test
./yb_build.sh --cxx-test cdcsdk_snapshot-test
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressReplicationSlot'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot'
```

Added new test:
```
./yb_build.sh --cxx-test integration-tests_cdcsdk_consistent_snapshot-test --gtest_filter CDCSDKConsistentSnapshotTest.TestSnapshotNameFromCreateReplicationSlot
```

Reviewers: skumar, stiwary, xCluster, hsunder, aagrawal

Reviewed By: stiwary

Subscribers: ycdcxcluster, ybase, yql, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D32610
asrinivasanyb added a commit that referenced this issue Apr 24, 2024
…s view

Summary:
This is related to the project to support Replication slot API in YSQL (#18724).
(https://phorge.dev.yugabyte.com/D29194).
This is also related to the PG Compatible Logical Replication Consumption project.

The schema of the pg_replication_slots view has been modified by adding an extra yb-specific
column yb_restart_commit_ht, which is an int8.

The value of this column is a uint64 representation of the commit Hybrid Time corresponding
to the restart_lsn. This can be used by the client (like YB-PG Connector) to perform a
consistent snapshot (as of the consistent_point) in the case when a replication slot already
exists.
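
Because the column type is a signed int8 while the value is described as a uint64, a client may want to normalise what it reads before treating it as a HybridTime. The Python sketch below shows the standard two's-complement conversion in both directions. The function names are hypothetical, and whether values above the signed range ever occur in practice is not stated here.

```python
UINT64_MASK = (1 << 64) - 1


def int8_to_uint64(value: int) -> int:
    """Reinterpret a signed 64-bit integer (SQL int8) as an unsigned 64-bit integer."""
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("value does not fit in a signed 64-bit integer")
    return value & UINT64_MASK


def uint64_to_int8(value: int) -> int:
    """Inverse conversion: reinterpret an unsigned 64-bit integer as signed."""
    if not 0 <= value <= UINT64_MASK:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    # Non-negative values pass through unchanged; negative ones wrap around.
    print(int8_to_uint64(42), int8_to_uint64(-1))
```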

UPGRADE/ROLLBACK SAFETY:
These changes are protected via the preview flag: ysql_yb_enable_replication_commands
Jira: DB-10956

Test Plan:
Manual Testing
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressReplicationSlot'
./yb_build.sh --java-test 'org.yb.pgsql.TestYsqlUpgrade#upgradeIsIdempotent'
./yb_build.sh --java-test 'org.yb.pgsql.TestYsqlUpgrade#upgradeIsIdempotentSingleConn'

Reviewers: stiwary, skumar

Reviewed By: stiwary

Subscribers: yql, ycdcxcluster

Differential Revision: https://phorge.dev.yugabyte.com/D34279
asrinivasanyb added a commit that referenced this issue Apr 25, 2024
…_replication_slots view

Summary:
**Backport Description**
There were no merge conflicts

**Original Description**
Original commit: 3956dbd / D34279
Summary
This is related to the project to support Replication slot API in YSQL (#18724).
(https://phorge.dev.yugabyte.com/D29194).
This is also related to the PG Compatible Logical Replication Consumption project.

The schema of the pg_replication_slots view has been modified by adding an extra yb-specific
column yb_restart_commit_ht, which is an int8.

The value of this column is a uint64 representation of the commit Hybrid Time corresponding
to the restart_lsn. This can be used by the client (like YB-PG Connector) to perform a
consistent snapshot (as of the consistent_point) in the case when a replication slot already
exists.

UPGRADE/ROLLBACK SAFETY:
These changes are protected via the preview flag: ysql_yb_enable_replication_commands
Jira: DB-10956

Test Plan:
Manual Testing
./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressReplicationSlot'
./yb_build.sh --java-test 'org.yb.pgsql.TestYsqlUpgrade#upgradeIsIdempotent'
./yb_build.sh --java-test 'org.yb.pgsql.TestYsqlUpgrade#upgradeIsIdempotentSingleConn'

Reviewers: stiwary, skumar

Reviewed By: stiwary

Subscribers: ycdcxcluster, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34525
svarnau pushed a commit that referenced this issue May 25, 2024
…s view
ZhenYongFan pushed a commit to ZhenYongFan/yugabyte-db that referenced this issue Jun 15, 2024
…mn in pg_replication_slots view
@dr0pdb dr0pdb reopened this Aug 1, 2024
3 participants