TiCDC: The sync-point table got replicated (esp. in bdr-mode) #10576

Closed
kennytm opened this issue Feb 1, 2024 · 4 comments · Fixed by #10587
Labels
affects-6.5: This bug affects the 6.5.x(LTS) versions.
affects-7.1: This bug affects the 7.1.x(LTS) versions.
affects-7.5: This bug affects the 7.5.x(LTS) versions.
affects-7.6
area/ticdc: Issues or PRs related to TiCDC.
found/gs
report/customer: Customers have encountered this bug.
severity/major
type/bug: The issue is confirmed as a bug.

Comments

@kennytm
Contributor

kennytm commented Feb 1, 2024

What did you do?

  1. Create two clusters

    tiup playground v7.6.0 --host 10.0.0.2 --db 1 --kv 1 --pd 1 --ticdc 1 --tiflash 0 --without-monitor
    rm /tmp/tidb-4000.sock
    tiup playground v7.6.0 --host 10.0.0.48 --db 1 --kv 1 --pd 1 --ticdc 1 --tiflash 0 --without-monitor

  2. Set up two changefeeds between them

    # 2to48.toml
    bdr-mode = true
    enable-sync-point = true
    sync-point-interval = "31s"

    # 48to2.toml
    bdr-mode = true
    enable-sync-point = true
    sync-point-interval = "37s"

    tiup cdc:v7.6.0 cli changefeed create --pd http://10.0.0.2:2379 --sink-uri mysql://root@10.0.0.48 -c 2to48 --config 2to48.toml
    tiup cdc:v7.6.0 cli changefeed create --pd http://10.0.0.48:2379 --sink-uri mysql://root@10.0.0.2 -c 48to2 --config 48to2.toml

  3. Wait for several minutes (because minSyncPointInterval is 30s 🤷)

  4. Check the syncpoint tables

    mysql --host 10.0.0.2 --port 4000 -u root -e 'select * from tidb_cdc.syncpoint_v1;'
    mysql --host 10.0.0.2 --port 4000 -u root -e 'select count(*) from tidb_cdc.syncpoint_v1;'
    
    mysql --host 10.0.0.48 --port 4000 -u root -e 'select * from tidb_cdc.syncpoint_v1;'
    mysql --host 10.0.0.48 --port 4000 -u root -e 'select count(*) from tidb_cdc.syncpoint_v1;'

What did you expect to see?

The syncpoint table on 10.0.0.2 should contain only entries from the 48to2 changefeed,
and similarly the one on 10.0.0.48 should contain only entries from the 2to48 changefeed.

What did you see instead?

Both clusters contained sync-point entries from both changefeeds.

Versions of the cluster

v7.6.0

@kennytm kennytm added type/bug The issue is confirmed as a bug. area/ticdc Issues or PRs related to TiCDC. found/gs labels Feb 1, 2024
@kennytm
Contributor Author

kennytm commented Feb 1, 2024

What happened here is that tidb_cdc.* is not treated as a system schema (unlike DM_HEARTBEAT.*), so the syncpoint_v1 table will be captured by CDC and replicated.

I think two changes should be made:

  1. Make isSysSchema() also treat tidb_cdc.* as a system schema. This also helps the non-BDR "A→B→C" scenario, where the syncpoint table written by A→B pollutes B→C.
  2. Set @@TIDB_CDC_WRITE_SOURCE for all operations in https://github.com/pingcap/tiflow/blob/v7.6.0/cdc/syncpointstore/mysql_syncpoint_store.go, so the A→B change won't even get pulled from B's TiKV in B→A.
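
To illustrate change 1, here is a minimal, self-contained Go sketch. It is not the actual tiflow code: the contents of builtinSysSchemas, the case-insensitive comparison, and the helper shouldReplicate are assumptions made for illustration. Only the names isSysSchema, tidb_cdc and DM_HEARTBEAT come from this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// ticdcSystemSchema is the schema holding syncpoint_v1 in this issue.
const ticdcSystemSchema = "tidb_cdc"

// builtinSysSchemas is an illustrative list (an assumption, not the list
// tiflow really uses) of schemas that are already skipped today.
var builtinSysSchemas = map[string]struct{}{
	"information_schema": {},
	"performance_schema": {},
	"mysql":              {},
	"dm_heartbeat":       {},
}

// isSysSchema reports whether db should be treated as a system schema.
// The proposed change is the explicit tidb_cdc check.
// Case-insensitive matching is an assumption of this sketch.
func isSysSchema(db string) bool {
	name := strings.ToLower(db)
	if name == ticdcSystemSchema {
		return true
	}
	_, ok := builtinSysSchemas[name]
	return ok
}

// shouldReplicate is a hypothetical helper: tables in system schemas
// are never captured, so they can never reach a downstream cluster.
func shouldReplicate(schema string) bool {
	return !isSysSchema(schema)
}

func main() {
	for _, s := range []string{"tidb_cdc", "DM_HEARTBEAT", "test"} {
		fmt.Printf("%-12s replicate=%v\n", s, shouldReplicate(s))
	}
}
```

With a check like this, tidb_cdc.syncpoint_v1 would be dropped at capture time regardless of bdr-mode, which covers both the A⇄B and the A→B→C cases.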

@kennytm kennytm added affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-7.6 labels Feb 1, 2024
@fubinzh

fubinzh commented Feb 2, 2024

/severity major

@asddongmen
Contributor

> What happened here is that tidb_cdc.* is not treated as a system schema (unlike DM_HEARTBEAT.*), so the syncpoint_v1 table will be captured by CDC and replicated.
>
> I think two changes should be made:
>
>   1. Make isSysSchema() also treat tidb_cdc.* as a system schema. This also helps the case of non-BDR "A→B→C" scenario, where the syncpoint table of A→B polluted B→C.
>   2. Set @@TIDB_CDC_WRITE_SOURCE for all operations in v7.6.0/cdc/syncpointstore/mysql_syncpoint_store.go, so the A→B change won't even get pulled from B's TiKV in B→A.

Thank you for the advice. In my opinion, only step 1 is necessary. Once the tidb_cdc schema is treated as a system schema, CDC will not capture it at all.

@seiya-annie

/found customer

@ti-chi-bot ti-chi-bot bot added the report/customer Customers have encountered this bug. label Jun 4, 2024

4 participants