Reduce duplication of DM's syncer and CDC's MySQL sink #3242

Open
lance6716 opened this issue Nov 2, 2021 · 5 comments
Labels
area/dm Issues or PRs related to DM. area/ticdc Issues or PRs related to TiCDC. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. type/enhancement The issue or PR belongs to an enhancement.

Comments

@lance6716
Contributor

lance6716 commented Nov 2, 2021

I noticed that a lot of functions could be reused:

(a finished item)


https://github.com/pingcap/ticdc/blob/4bc1853a10287f12aff5706123c0cfc39feae7d2/cdc/sink/mysql.go#L1203 vs https://github.com/pingcap/ticdc/blob/4bc1853a10287f12aff5706123c0cfc39feae7d2/dm/syncer/dml.go#L713


sink/causality.go vs syncer/causality.go

@lance6716 lance6716 added question Further information is requested. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed question Further information is requested. labels Nov 2, 2021
@fatelei
Contributor

fatelei commented Nov 2, 2021

can I have a try to make a pull request?

@lance6716
Contributor Author

lance6716 commented Nov 2, 2021

> can I have a try to make a pull request?

thanks! we’re looking forward to your contribution.

also please keep pr in a small size, that’s better for reviewing.

@lance6716 lance6716 changed the title Reduce duplication of DM and CDC's MySQL sink Reduce duplication of DM's syncer and CDC's MySQL sink Nov 2, 2021
@fatelei
Contributor

fatelei commented Nov 2, 2021

> > can I have a try to make a pull request?
>
> thanks! we’re looking forward to your contribution.
>
> also please keep pr in a small size, that’s better for reviewing.

ok

@Rustin170506 Rustin170506 added area/dm Issues or PRs related to DM. area/ticdc Issues or PRs related to TiCDC. labels Nov 5, 2021
@amyangfei
Contributor

amyangfei commented Nov 19, 2021

Causality detection is also a common library used both in DM and TiCDC. ref: https://github.com/pingcap/ticdc/issues/3286

@lance6716
Contributor Author

lance6716 commented Dec 30, 2021

Some comparison of the DML parts of mysqlSink and Syncer:

dml comes from:

  • CDC: periodically gets resolved transactions from txnCache
  • DM: reads the (DML) job from a channel; the sender on that channel is the binlog event loop

processing stages:

  • CDC: causality -> generate DML -> batch a SQL transaction
  • DM: compact (optional) -> causality -> generate DML -> batch a SQL transaction
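
To make the shared causality stage concrete, here is a minimal sketch of key-based conflict detection. It is not the code from sink/causality.go or syncer/causality.go; the `causality` type, `assign` method, and worker numbering are hypothetical names chosen for illustration. The assumption is that rows touching the same key must go to the same worker, and that a row whose keys are already spread over two workers forces a flush before continuing.

```go
package main

// causality remembers which worker last handled each row key.
// Hypothetical sketch; not the DM or TiCDC implementation.
type causality struct {
	relations map[string]int
}

func newCausality() *causality {
	return &causality{relations: make(map[string]int)}
}

// assign picks a worker for a row identified by keys.
// If the keys were previously seen on two different workers, the relations
// are cleared and reset is true: the caller is expected to wait for all
// workers to finish before dispatching this row.
func (c *causality) assign(keys []string, fallback int) (worker int, reset bool) {
	worker = -1
	for _, k := range keys {
		w, seen := c.relations[k]
		if !seen {
			continue
		}
		if worker != -1 && w != worker {
			// Keys span two workers: ordering cannot be kept without a flush.
			c.relations = make(map[string]int)
			worker, reset = -1, true
			break
		}
		worker = w
	}
	if worker == -1 {
		worker = fallback
	}
	// Record the keys so later rows on the same keys follow this worker.
	for _, k := range keys {
		c.relations[k] = worker
	}
	return worker, reset
}
```

A caller would pass the keys of each row change plus a fallback worker (for example, round-robin) and flush all workers whenever `reset` is true.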

compact

  • DM: for a frequently updated row, only replicate the latest data
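
The idea of compaction can be sketched as follows. This is a simplified illustration, not DM's compactor: `RowChange` and `compact` are hypothetical names, and the sketch ignores how INSERT, UPDATE and DELETE on the same row would have to be merged in a real implementation.

```go
package main

// RowChange is a hypothetical, simplified row change (not a DM or TiCDC type).
type RowChange struct {
	Key   string // value of the primary or unique key identifying the row
	Value string // latest row data, reduced to one string for brevity
}

// compact keeps only the last change for every key, preserving the relative
// order of the changes that survive.
func compact(changes []RowChange) []RowChange {
	last := make(map[string]int, len(changes)) // key -> index of its latest change
	for i, c := range changes {
		last[c.Key] = i
	}
	out := make([]RowChange, 0, len(last))
	for i, c := range changes {
		if last[c.Key] == i {
			out = append(out, c)
		}
	}
	return out
}
```

For a batch that updates row `1`, then row `2`, then row `1` again, only the change to row `2` and the second change to row `1` remain.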

generate DML:

  • CDC: only generates DELETE and REPLACE per row
  • DM: generates many types, including INSERT ON DUPLICATE UPDATE, and will try to merge successive rows of one table and one type into one DML, in other words, one DML with many row changes
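
The difference between per-row and merged DML can be illustrated with a small sketch. `genReplace` and `placeholders` are hypothetical helpers, not functions from either codebase; REPLACE is used only as an example statement type, and table or column names containing backticks are not escaped here.

```go
package main

import "strings"

// placeholders returns "(?,?,...)" with n question marks.
func placeholders(n int) string {
	return "(" + strings.TrimSuffix(strings.Repeat("?,", n), ",") + ")"
}

// genReplace builds a single REPLACE statement for all given rows of one table.
// Passing one row gives the per-row form; passing several successive rows of
// the same table gives one DML that carries many row changes.
func genReplace(table string, cols []string, rows [][]interface{}) (string, []interface{}) {
	var sb strings.Builder
	sb.WriteString("REPLACE INTO `" + table + "` (`" + strings.Join(cols, "`,`") + "`) VALUES ")
	args := make([]interface{}, 0, len(rows)*len(cols))
	for i, row := range rows {
		if i > 0 {
			sb.WriteString(",")
		}
		sb.WriteString(placeholders(len(cols)))
		args = append(args, row...)
	}
	return sb.String(), args
}
```

Merging trades fewer round trips for larger statements, so a real implementation would likely cap the number of rows per statement.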

batch a SQL transaction

  • CDC: when the batch is not full, the minimum lag is the ticker interval
  • DM: when the batch is not full but the channel is empty, execute the batch directly
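
A rough sketch of the "flush when the channel is empty" strategy described for DM is shown below. `batchUntilIdle` and its `flush` callback are hypothetical; jobs are reduced to strings. A ticker-based variant, as described for CDC, would instead wait on a `time.Ticker` before flushing a partial batch.

```go
package main

// batchUntilIdle groups jobs from ch into batches of at most batchSize.
// A partial batch is flushed as soon as no further job is immediately
// available, instead of waiting for a timer. It returns when ch is closed.
func batchUntilIdle(ch <-chan string, batchSize int, flush func([]string)) {
	for {
		// Block until the first job of the next batch arrives.
		job, ok := <-ch
		if !ok {
			return
		}
		batch := []string{job}
	fill:
		for len(batch) < batchSize {
			select {
			case next, more := <-ch:
				if !more {
					break fill // channel closed: flush what we have
				}
				batch = append(batch, next)
			default:
				break fill // channel empty: do not wait for more jobs
			}
		}
		flush(batch)
	}
}
```

With this approach the lag of a partial batch depends on how quickly the sender refills the channel, not on a fixed interval.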

cyclic replication

  • CDC: when generating DML, a cyclic marker row is also generated

cc @amyangfei @GMHDBJD
