This repository has been archived by the owner on Aug 21, 2023. It is now read-only.

add checksum to make sure dumped data is correct #251

Open
lichunzhu opened this issue Mar 23, 2021 · 1 comment
Labels: help wanted, priority/P2 (medium priority issue), severity/moderate (this issue is a moderate bug)

Comments

@lichunzhu
Contributor

Feature Request

Is your feature request related to a problem? Please describe:

Currently, we cannot tell whether dumped data is correct because there is no checksum mechanism.

Describe the feature you'd like:

Add checksum to make sure the dumped data is correct.

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Optimization:

@lichunzhu added the difficulty/2-medium, priority/P2, and severity/moderate labels on Mar 23, 2021
@kennytm
Collaborator

kennytm commented Mar 24, 2021

First, we should record the SHA-256 hash and size of each file, and the COUNT(*) of each table, somewhere. I suggest we do not reuse the metadata file, because its format is not well suited to verification.

Then, we do the actual checksum of data.

MySQL has the CHECKSUM TABLE statement, but TiDB does not support it (pingcap/tidb#1895). Furthermore, MySQL's CHECKSUM TABLE is not guaranteed to be stable, nor is its checksum method explicitly documented. So let's not rely on CHECKSUM TABLE.

We could reuse sync-diff-inspector's CRC32 checksum from https://github.com/pingcap/tidb-tools/blob/0297393b93b9dbc57fc07a17c898dd621467ef7f/pkg/dbutil/common.go#L373, but we had better change the function signature so that it does not take a *model.TableInfo 😏.
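As a rough illustration of a signature that does not depend on *model.TableInfo, the sketch below builds a checksum query from plain column names. It assumes an order-independent scheme that XORs a CRC32 over each row; the exact SQL used in tidb-tools may differ, and `BuildCRC32ChecksumQuery` and `quoteIdent` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent backtick-quotes a MySQL identifier, doubling any embedded backticks.
func quoteIdent(name string) string {
	return "`" + strings.ReplaceAll(name, "`", "``") + "`"
}

// BuildCRC32ChecksumQuery is a hypothetical helper that needs only names,
// not a *model.TableInfo. It assumes a scheme where each row is reduced to
// CRC32(CONCAT_WS(...)) and the rows are combined with BIT_XOR, so the
// result does not depend on row order. Because CONCAT_WS skips NULLs, the
// ISNULL flags are appended to tell a NULL apart from a missing value.
// The caller must pass at least one column.
func BuildCRC32ChecksumQuery(schema, table string, columns []string) string {
	quoted := make([]string, len(columns))
	isNull := make([]string, len(columns))
	for i, c := range columns {
		quoted[i] = quoteIdent(c)
		isNull[i] = "ISNULL(" + quoted[i] + ")"
	}
	return fmt.Sprintf(
		"SELECT BIT_XOR(CAST(CRC32(CONCAT_WS(',', %s, CONCAT(%s))) AS UNSIGNED)) AS checksum FROM %s.%s",
		strings.Join(quoted, ", "),
		strings.Join(isNull, ", "),
		quoteIdent(schema),
		quoteIdent(table),
	)
}
```

The returned string could be passed to `QueryRowContext` on a `*sql.DB`. Comparing the result against the dumped files would still require computing the same value from the dump side, which this sketch does not attempt.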


3 participants