Skip to content
This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

sync: refactor dml.go #2234

Merged
merged 93 commits into from
Oct 25, 2021
Merged

sync: refactor dml.go #2234

merged 93 commits into from
Oct 25, 2021

Conversation

GMHDBJD
Copy link
Collaborator

@GMHDBJD GMHDBJD commented Oct 18, 2021

What problem does this PR solve?

refactor dml.go for later develop

  • use INSERT ON DUPLICATE KEY UPDATE instead of REPLACE in safemode
  • generate DML SQL string before execute rather than before addJob.

Check List

Tests

  • orginal test
  • unit test

Copy link
Collaborator Author

@GMHDBJD GMHDBJD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL

}

// genKeyList format keys.
func genKeyList(table string, columns []*model.ColumnInfo, dataSeq []interface{}) string {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as

func genKeyList(table string, columns []*model.ColumnInfo, dataSeq []interface{}) string {


// genMultipleKeys gens keys with UNIQUE NOT NULL value.
// if not UNIQUE NOT NULL value, use table name instead.
func genMultipleKeys(ti *model.TableInfo, value []interface{}, table string) []string {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as

func genMultipleKeys(ti *model.TableInfo, value []interface{}, table string) []string {

}

// genWhere generates where condition.
func (dml *DML) genWhere(buf *strings.Builder) []interface{} {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactor from

func genWhere(buf *strings.Builder, columns []*model.ColumnInfo, data []interface{}) {

@GMHDBJD GMHDBJD added status/PTAL This PR is ready for review. Add this label back after committing new changes needs-cherry-pick-release-2.0 This PR should be cherry-picked to release-2.0. Remove this label after cherry-picked to release-2.0 and removed status/WIP This PR is still work in progress labels Oct 19, 2021
@lance6716 lance6716 changed the title sync: refactor dml.go [WIP] sync: refactor dml.go Oct 19, 2021
@lance6716
Copy link
Collaborator

I forget why we choose INSERT ON DUPLICATE KEY UPDATE? seems INSERT ON DUPLICATE KEY UPDATE and REPLACE can both use BatchGet when have multiple values

image
test they two ⬆️

syncer/dml.go Outdated Show resolved Hide resolved
syncer/dml.go Show resolved Hide resolved
syncer/dml.go Show resolved Hide resolved
syncer/dml.go Outdated Show resolved Hide resolved
syncer/dml.go Outdated Show resolved Hide resolved
syncer/dml.go Show resolved Hide resolved
syncer/dml.go Outdated Show resolved Hide resolved
syncer/dml.go Show resolved Hide resolved
@GMHDBJD
Copy link
Collaborator Author

GMHDBJD commented Oct 20, 2021

I forget why we choose INSERT ON DUPLICATE KEY UPDATE? seems INSERT ON DUPLICATE KEY UPDATE and REPLACE can both use BatchGet when have multiple values

I check the TiDB code, REPLACE use get and INSERT ON DUPLICATE KEY UPDATE use batch get.
https://github.com/pingcap/tidb/blob/94e30df8e2d8ba2a1a26f153f40067ba3acd78eb/executor/replace.go#L130
https://github.com/pingcap/tidb/blob/94e30df8e2d8ba2a1a26f153f40067ba3acd78eb/executor/insert.go#L72

Another reason is that if update statement doesn't update index value, change it to REPLACE statement will delete and insert it, which will update index. While change it to INSERT ON DUPLICATE KEY UPDATE will be a update, which will not update index.

Furthermore, if we are in safemode, REPLACE on same value will still delete and insert the row again, but INSERT ON DUPLICATE UPDATE will not update the row at all.

I tested it before and it was in line with this conclusion

@lance6716
Copy link
Collaborator

lance6716 commented Oct 20, 2021

in my memory REPLACE will also use batch get to get all rows before mutation https://github.com/pingcap/tidb/blob/94e30df8e2d8ba2a1a26f153f40067ba3acd78eb/executor/replace.go#L233

good to reduce index kv mutation 👍 seems ON DUPLICATE KEY UPDATE will use TableCommon.UpdateRecord. compared with TableCommon.AddRecord, it will not put index KVs.

Copy link
Contributor

@Ehco1996 Ehco1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

syncer/dml_test.go Outdated Show resolved Hide resolved
@ti-chi-bot ti-chi-bot added the status/LGT1 One reviewer already commented LGTM label Oct 21, 2021
@GMHDBJD
Copy link
Collaborator Author

GMHDBJD commented Oct 22, 2021

/run-all-tests

Copy link
Collaborator

@lance6716 lance6716 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest lgtm

syncer/dml.go Outdated
@@ -33,6 +33,7 @@ import (
// genDMLParam stores pruned columns, data as well as the original columns, data, index.
type genDMLParam struct {
tableID string // as a key in map like `schema`.`table`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to note this is target table ID (maybe in future PR)

syncer/dml.go Outdated
sqls = append(sqls, sql)
values = append(values, value)
keys = append(keys, ks)
dmls = append(dmls, newDML(insert, param.safeMode, param.tableID, param.sourceTable, nil, value, nil, originalValue, columns, ti))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slightly recommend that if DML struct will always stay same package with syncer, I think it's more readable that we use literal to create it. Like

DML{op: insert, ..., sourceTableInfo: ti}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So many args, so keep this function.

syncer/dml.go Outdated
key := genKeyList(table, cols, vals)
if len(key) > 0 { // ignore `null` value.
multipleKeys = append(multipleKeys, key)
// TODO: break here? one unique index is enough?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is added by me about one year ago, now it can be removed 😂

@@ -87,18 +84,15 @@ type job struct {

func (j *job) String() string {
// only output some important information, maybe useful in execution.
return fmt.Sprintf("tp: %s, sql: %s, args: %v, key: %s, ddls: %s, last_location: %s, start_location: %s, current_location: %s", j.tp, j.sql, j.args, j.key, j.ddls, j.location, j.startLocation, j.currentLocation)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can we know the DML values in debug mode now?

@ti-chi-bot ti-chi-bot added status/LGT2 Two reviewers already commented LGTM, ready for merge and removed status/LGT1 One reviewer already commented LGTM labels Oct 25, 2021
@GMHDBJD
Copy link
Collaborator Author

GMHDBJD commented Oct 25, 2021

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 2ebdb46

@ti-chi-bot ti-chi-bot merged commit d625271 into pingcap:master Oct 25, 2021
ti-chi-bot pushed a commit to ti-chi-bot/dm that referenced this pull request Oct 25, 2021
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created: #2256.

ti-chi-bot added a commit that referenced this pull request Oct 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
needs-cherry-pick-release-2.0 This PR should be cherry-picked to release-2.0. Remove this label after cherry-picked to release-2.0 size/XXL status/can-merge status/LGT2 Two reviewers already commented LGTM, ready for merge status/PTAL This PR is ready for review. Add this label back after committing new changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants