From 0ee02bb5579a33ce80a7febd2e135e888c81deac Mon Sep 17 00:00:00 2001
From: Adam Bloom
Date: Mon, 20 Jun 2022 17:01:16 -0600
Subject: [PATCH 1/2] postgres source: fix CDC setup order docs

---
 docs/integrations/sources/postgres.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/docs/integrations/sources/postgres.md b/docs/integrations/sources/postgres.md
index 9d4338cdbf07..8ee70ab18e22 100644
--- a/docs/integrations/sources/postgres.md
+++ b/docs/integrations/sources/postgres.md
@@ -125,17 +125,7 @@ We recommend using a user specifically for Airbyte's replication so you can mini
 
 We recommend using a `pgoutput` plugin as it is the standard logical decoding plugin in Postgres. In case the replication table contains a lot of big JSON blobs and table size exceeds 1 GB, we recommend using a `wal2json` instead. Please note that `wal2json` may require additional installation for Bare Metal, VMs \(EC2/GCE/etc\), Docker, etc. For more information read [wal2json documentation](https://github.com/eulerto/wal2json).
 
-#### 4. Create replication slot
-
-Next, you will need to create a replication slot. Here is the query used to create a replication slot called `airbyte_slot`:
-
-```text
-SELECT pg_create_logical_replication_slot('airbyte_slot', 'pgoutput');
-```
-
-If you would like to use `wal2json` plugin, please change `pgoutput` to `wal2json` value in the above query.
-
-#### 5. Create publications and replication identities for tables
+#### 4. Create publications and replication identities for tables
 
 For each table you want to replicate with CDC, you should add the replication identity \(the method of distinguishing between rows\) first. We recommend using `ALTER TABLE tbl1 REPLICA IDENTITY DEFAULT;` to use primary keys to distinguish between rows. After setting the replication identity, you will need to run `CREATE PUBLICATION airbyte_publication FOR TABLE ;`. This publication name is customizable. Please refer to the [Postgres docs](https://www.postgresql.org/docs/10/sql-alterpublication.html) if you need to add or remove tables from your publication in the future.
 
@@ -145,6 +135,16 @@ Please note that:
 
 The UI currently allows selecting any tables for CDC. If a table is selected that is not part of the publication, it will not replicate even though it is selected. If a table is part of the publication but does not have a replication identity, that replication identity will be created automatically on the first run if the Airbyte user has the necessary permissions.
 
+#### 5. Create replication slot
+
+Next, you will need to create a replication slot. Here is the query used to create a replication slot called `airbyte_slot`:
+
+```text
+SELECT pg_create_logical_replication_slot('airbyte_slot', 'pgoutput');
+```
+
+If you would like to use `wal2json` plugin, please change `pgoutput` to `wal2json` value in the above query.
+
 #### 6. Start syncing
 
 When configuring the source, select CDC and provide the replication slot and publication you just created. You should be ready to sync data with CDC!

From 5e2dd59f648f7302acf8928e4619cef9f05f4716 Mon Sep 17 00:00:00 2001
From: Liren Tu
Date: Tue, 21 Jun 2022 11:51:30 -0700
Subject: [PATCH 2/2] Update docs/integrations/sources/postgres.md

---
 docs/integrations/sources/postgres.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/integrations/sources/postgres.md b/docs/integrations/sources/postgres.md
index 8ee70ab18e22..311fa4a905da 100644
--- a/docs/integrations/sources/postgres.md
+++ b/docs/integrations/sources/postgres.md
@@ -137,7 +137,9 @@ The UI currently allows selecting any tables for CDC. If a table is selected tha
 
 #### 5. Create replication slot
 
-Next, you will need to create a replication slot. Here is the query used to create a replication slot called `airbyte_slot`:
+Next, you will need to create a replication slot. It's important to create the publication first (as in step 4) before creating the replication slot. Otherwise, you can run into exceptions if there is any update to the database between the creation of the two.
+
+Here is the query used to create a replication slot called `airbyte_slot`:
 
 ```text
 SELECT pg_create_logical_replication_slot('airbyte_slot', 'pgoutput');
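
Reviewer note: the ordering this series documents (replication identity, then publication, then replication slot) can be illustrated with a small sketch. It is not part of the patch and not Airbyte code. The helper name `cdc_setup_statements` and the second table `tbl2` are hypothetical; the SQL statements themselves are the ones quoted in the docs above, and the sketch only builds the statement strings without connecting to a database.

```python
import re

# Only plain, unquoted identifiers are accepted in this sketch (an assumption
# made to keep the string building safe; real table names may need quoting).
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check(name: str) -> str:
    """Reject anything that is not a plain, unquoted SQL identifier."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def cdc_setup_statements(tables, publication="airbyte_publication",
                         slot="airbyte_slot", plugin="pgoutput"):
    """Return the CDC setup SQL in the order described by the patched docs."""
    if not tables:
        raise ValueError("at least one table is required")
    tables = [_check(t) for t in tables]
    statements = []
    # Step 4, first half: set the replication identity on every table.
    for table in tables:
        statements.append(f"ALTER TABLE {table} REPLICA IDENTITY DEFAULT;")
    # Step 4, second half: create the publication for those tables.
    statements.append(
        f"CREATE PUBLICATION {_check(publication)} FOR TABLE {', '.join(tables)};"
    )
    # Step 5: create the replication slot only after the publication exists.
    statements.append(
        f"SELECT pg_create_logical_replication_slot('{_check(slot)}', '{_check(plugin)}');"
    )
    return statements


if __name__ == "__main__":
    for statement in cdc_setup_statements(["tbl1", "tbl2"]):
        print(statement)
```

Passing `plugin="wal2json"` yields the variant of the slot query that the docs mention for the `wal2json` plugin.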