Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: support select for update no wait #12775

Merged
merged 29 commits into from
Nov 4, 2019
Merged

Conversation

cfzjywxk
Copy link
Contributor

@cfzjywxk cfzjywxk commented Oct 17, 2019

What problem does this PR solve?

support select for update nowait
to implement

# Session 1:

mysql> CREATE TABLE t (i INT, PRIMARY KEY (i)) ENGINE = InnoDB;

mysql> INSERT INTO t (i) VALUES(1),(2),(3);

mysql> START TRANSACTION;

mysql> SELECT * FROM t WHERE i = 2 FOR UPDATE;
+---+
| i |
+---+
| 2 |
+---+

# Session 2:

mysql> START TRANSACTION;

mysql> SELECT * FROM t WHERE i = 2 FOR UPDATE NOWAIT;
ERROR 3572 (HY000): Statement aborted because lock(s) could not be acquired immediately and NOWAIT is set.

What is changed and how it works?

need to discuss: how to process forUpdateTS

for pessimistic txn
in same txn, user tries

session1:
select for update key1 fail(ErrLock or max execution time or sth else.  lock key1 not get)
(the error is not "deadLockErr", rollBack pessmisitc key async)

select for update key1 again(this time succ, key locked like in "nowait" situation)
(the async rollback rollbacked the lock acquired in last step, since the forUpdateTs will not be updated
in "select for update" path)

session2:
after session1 step2
user tries to select for update key1,  lock succ(should fail)
  1. do we need to make kvproto
    bool nowait -> int waitSec to make it more flexible to support possible [wait n | nowait]

Check List

Tests

  • Unit test

Code changes

  • Has interface methods change

Side effects

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release note

executor/executor.go Outdated Show resolved Hide resolved
// Check lock conflict error for nowait, if nowait set and key locked by others
// report error immediately and do no resolve locks any more
if c.lockNoWait {
if keyErr.GetLocked() != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this lock be a committed or rollbacked lock in pessimistic txn and need check lock state? I'm not familiar with this :D

sessionctx/context.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
// the async rollback operation rollbacked the lock just acquired
newForUpdateTS, tsErr := a.Ctx.GetStore().GetOracle().GetTimestamp(ctx)
if tsErr != nil {
return nil, tsErr
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if fail happened here, UpdateTS will not be updated, and the previous async rollback is possiblely ongoing,
still risk that the later select for update rollbacked ? @coocood

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error rarely happens, so even if this happened, the async rollback causes the transaction to fail, it is acceptable.

Copy link
Contributor Author

@cfzjywxk cfzjywxk Oct 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tiancaiamao @coocood @jackysp
I'll add and use getWithRetry, but I think here still risk with nowait. User get this statement result error, next statement pessimistic lock getForUpdateTs not updated(select for update, update, delete operations), ths "async rollback" risk still exists, how should we process the error here?
Or maybe we make rollback sync for "nowait" locked error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sync rollback may still send duplicated requests later.

executor/executor.go Outdated Show resolved Hide resolved
store/tikv/error.go Outdated Show resolved Hide resolved
executor/adapter.go Outdated Show resolved Hide resolved
store/tikv/2pc.go Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
session/pessimistic_test.go Show resolved Hide resolved
@@ -112,6 +112,8 @@ type twoPhaseCommitter struct {
regionTxnSize map[uint64]int
// Used by pessimistic transaction and large transaction.
ttlManager
// Used for select for update in ms, 0 means always wait, 1 means nowait
lockWaitTime uint64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should lockWaitTime be a struct field in twoPhaseCommitter?

begin;
select xxx for update no wait
select yyy for update  // The committer is still there but `lockWaitTime` change?
commit;

How about pass lockWaitTime as an argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pessimisticLockSingleBatch and other proc functions from getProcFuncByType have the same declaration, seems need more change to pass it as a variable into NewRequest call

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a refactoring and twoPhaseCommitAction can carry more information now. You can see how actionPessimisticLock is done on the master branch.

Never mind, as we can also merge the PR and refactor later.

Copy link
Contributor Author

@cfzjywxk cfzjywxk Oct 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roger, I've learned the action refactoration when resolving conflicts, this is refactored passing time param as parameter not member of twoPhaseCommitter

@cfzjywxk
Copy link
Contributor Author

cfzjywxk commented Oct 25, 2019

still has some unstable unit tests

[2019-10-25T08:06:02.802Z] ----------------------------------------------------------------------
[2019-10-25T08:06:02.802Z] FAIL: ddl_test.go:442: testSuite6.TestRenameTable
[2019-10-25T08:06:02.802Z] 
[2019-10-25T08:06:02.802Z] ddl_test.go:495:
[2019-10-25T08:06:02.802Z]     result.Check(testkit.Rows("1", "100000", "100001"))
[2019-10-25T08:06:02.802Z] /home/jenkins/agent/workspace/tidb_ghpr_unit_test/go/src/github.com/pingcap/tidb/util/testkit/testkit.go:62:
[2019-10-25T08:06:02.802Z]     res.c.Assert(resBuff.String(), check.Equals, needBuff.String(), res.comment)
[2019-10-25T08:06:02.802Z] ... obtained string = "" +
[2019-10-25T08:06:02.802Z] ...     "[1]\n" +
[2019-10-25T08:06:02.802Z] ...     "[100000]\n" +
[2019-10-25T08:06:02.802Z] ...     "[2000001]\n"
[2019-10-25T08:06:02.802Z] ... expected string = "" +
[2019-10-25T08:06:02.802Z] ...     "[1]\n" +
[2019-10-25T08:06:02.802Z] ...     "[100000]\n" +
[2019-10-25T08:06:02.802Z] ...     "[100001]\n"
[2019-10-25T08:06:02.802Z] ... sql:select * from rename2.t1, args:[]
[2019-10-25T08:06:02.802Z] 
[2019-10-25T08:06:02.802Z] ddl_test.go:495:
[2019-10-25T08:06:02.802Z]     result.Check(testkit.Rows("1", "100000", "100001"))
[2019-10-25T08:06:02.802Z] ddl_test.go:445:
[2019-10-25T08:06:02.802Z]     c.Assert(failpoint.Disable("github.com/pingcap/tidb/meta/autoid/mockAutoIDChange"), IsNil)
[2019-10-25T08:06:02.802Z] ... value *errors.errorString = &errors.errorString{s:"failpoint: failpoint is disabled"} ("failpoint: failpoint is disabled")
[2019-10-25T08:06:02.802Z]

@cfzjywxk
Copy link
Contributor Author

@jackysp @tiancaiamao PTAL

@cfzjywxk
Copy link
Contributor Author

/run-all-tests

Copy link
Contributor

@tiancaiamao tiancaiamao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coocood
Copy link
Member

coocood commented Oct 25, 2019

/release

@crazycs520
Copy link
Contributor

/run-all-tests

@cfzjywxk
Copy link
Contributor Author

@jackysp PTAL

Copy link
Member

@jackysp jackysp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Should we wait for tikv/tikv#5680 to merge?

@youjiali1995
Copy link
Contributor

/release

@cfzjywxk
Copy link
Contributor Author

cfzjywxk commented Nov 1, 2019

/run-all-tests

@cfzjywxk
Copy link
Contributor Author

cfzjywxk commented Nov 1, 2019

/release

@jackysp jackysp added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Nov 4, 2019
@cfzjywxk
Copy link
Contributor Author

cfzjywxk commented Nov 4, 2019

LGTM
Should we wait for tikv/tikv#5680 to merge?

tikv pull requests merged, can we merge this now ? @coocood @jackysp

Copy link
Member

@coocood coocood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coocood
Copy link
Member

coocood commented Nov 4, 2019

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Nov 4, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Nov 4, 2019

/run-all-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution sig/transaction SIG:Transaction status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants