Add keyonly support for seek #7419

shafreeck · 2018-08-16T13:37:05Z

What problem does this PR solve?

Return only keys when doing a seek.

What is changed and how it works?

Add a new Option: KeyOnly

// Set the KeyOnly option before seeking; it stays in effect until you set it to false in this transaction.
txn.SetOption(kv.KeyOnly, true)

The TiKV RPC method already supports this option, so nothing needs to change on the remote server.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)

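    // KeyOnly stays set for the rest of this transaction.
    txn.SetOption(kv.KeyOnly, true)
    // Seek takes a kv.Key, so convert the string literal explicitly.
    iter, err := txn.Seek(kv.Key("your key"))
    if err != nil {
        return err
    }
    for iter.Valid() {
        // With KeyOnly set, only the key is expected to carry data.
        fmt.Println(iter.Key(), iter.Value())
        if err := iter.Next(); err != nil {
            return err
        }
    }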

No code

sre-bot · 2018-08-16T13:37:08Z

Hi contributor, thanks for your PR.

This patch needs to be approved by one of the admins. They should reply with "/ok-to-test" to accept this PR for running tests automatically.

CLAassistant · 2018-08-16T13:37:12Z

All committers have signed the CLA.

siddontang · 2018-08-16T16:00:44Z

store/tikv/scan.go

@@ -98,10 +100,6 @@ func (s *Scanner) Next() error {
 			s.Close()
 			return errors.Trace(err)
 		}
-		if len(s.Value()) == 0 {


Seems we need to check keyOnly here?

We should not assume that the key does not exist when len(s.Value()) == 0; that conflicts with the concept of key-only. So the TiKV server would not (and must not) return an empty value to indicate a nonexistent key when doing a seek.

AFAIK, the current situation is that TiKV allows empty values, but TiDB does not because an empty value in the BufferStore indicates deletion.

AFAIK, the current situation is that TiKV allows empty values

Yeah, that's why I said an empty value does not mean that a key does not exist. The code here is weird, and I don't know what the original design purpose was. Can anyone give some information?

But you cannot remove it directly here. Be aware that resolveCurrentLock() used snapshot.Get(), which returns nil when the key does not exist. You have to change the behaviour of Get if you really want to accept empty values.

I checked the keyOnly option and the CI passed. However, I think the problem is not resolved. The keyOnly option does not change the behavior of TiKV server, so is there any case that the server returns a key which does not exist?

I see now: when a key is deleted and the prewrite succeeded but the commit failed, the TiKV server returns the key together with an error. I am trying to fix it now.
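The empty-value convention discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: memBuffer, snapshot, and get are hypothetical names, not TiDB's BufferStore or snapshot API. It shows why an empty value in the write buffer hides a key, while an absent key in the snapshot is a separate case.

```go
package main

import "fmt"

// memBuffer is a hypothetical stand-in for a transaction's write buffer.
// An empty value marks the key as deleted in this transaction.
type memBuffer map[string][]byte

// snapshot is a hypothetical stand-in for committed data.
type snapshot map[string][]byte

// get looks in the buffer first and falls back to the snapshot.
// It reports found=false when the buffer holds a tombstone for the key
// or when neither layer knows the key.
func get(buf memBuffer, snap snapshot, key string) (val []byte, found bool) {
	if v, ok := buf[key]; ok {
		if len(v) == 0 {
			return nil, false // tombstone: deleted in this transaction
		}
		return v, true
	}
	v, ok := snap[key]
	if !ok {
		return nil, false
	}
	return v, true
}

func main() {
	snap := snapshot{"a": []byte("1"), "b": []byte("2")}
	buf := memBuffer{"b": nil, "c": []byte("3")}
	for _, k := range []string{"a", "b", "c", "d"} {
		_, found := get(buf, snap, k)
		fmt.Println(k, found)
	}
}
```

Here "b" is reported as missing because the buffer holds a tombstone for it, even though the snapshot still has a value.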

shafreeck · 2018-08-17T03:29:12Z

The KeyOnly option only takes effect on the snapshot for now. I think it needs to work on BufferStore as well, but I have not found a graceful way to add this option to BufferStore. Do you have any suggestions?

disksing · 2018-08-17T04:04:21Z

em, I think BufferStore does not have to support KeyOnly. The option was invented for improving performance -- but it will not help BufferStore.

shafreeck · 2018-08-17T04:15:04Z

@disksing Yeah, I agree, KeyOnly just makes the behavior consistent between a buffer store and a snapshot. It does not help for performance. It is acceptable for me to ignore KeyOnly Option for buffer store.

disksing · 2018-08-17T06:36:26Z

@shafreeck Good point. As an alternative, I think we can make it 'undefined behavior' to call Value() when the KeyOnly option is set. Then it will be ok to return either nil or a value.

shafreeck · 2018-08-17T14:30:32Z

@disksing Great idea!
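A minimal sketch of what that contract means for callers, assuming a hypothetical in-memory scan function rather than the real TiKV scanner: when keyOnly is set, values may come back nil, so code that only reads keys behaves the same either way.

```go
package main

import "fmt"

// pair is a hypothetical key/value entry returned by a scan.
type pair struct {
	Key   string
	Value []byte
}

// scan is a hypothetical stand-in for a range read over sorted data.
// When keyOnly is true the values are left nil; callers must not rely on them.
func scan(data []pair, start string, keyOnly bool) []pair {
	var out []pair
	for _, p := range data {
		if p.Key < start {
			continue
		}
		if keyOnly {
			out = append(out, pair{Key: p.Key})
		} else {
			out = append(out, p)
		}
	}
	return out
}

// keys reads only the Key field, so it is safe under either setting.
func keys(ps []pair) []string {
	out := make([]string, 0, len(ps))
	for _, p := range ps {
		out = append(out, p.Key)
	}
	return out
}

func main() {
	data := []pair{{"a", []byte("1")}, {"b", []byte("2")}, {"c", []byte("3")}}
	fmt.Println(keys(scan(data, "b", true)))
	fmt.Println(keys(scan(data, "b", false)))
}
```

Both calls print the same key list; only the unused Value field differs.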

shafreeck · 2018-08-21T17:29:47Z

Any further suggestions? @disksing @siddontang

disksing · 2018-08-22T11:00:05Z

em, you need to get CI to pass. You can use make dev to reproduce the test failure.

disksing · 2018-08-27T11:27:05Z

store/tikv/scan.go

@@ -98,7 +100,8 @@ func (s *Scanner) Next() error {
 			s.Close()
 			return errors.Trace(err)
 		}
-		if len(s.Value()) == 0 {
+
+		if len(s.Value()) == 0 && !s.keyOnly {


As I said, resolveCurrentLock uses snapshot.Get() to resolve the lock. When len(s.Value()) == 0, it means the key does not exist. It has nothing to do with the keyOnly flag.

disksing · 2018-08-28T02:07:33Z

It should be correct now. Could you add a test case that seeks keys with keyOnly option?

shafreeck · 2018-08-28T06:46:15Z

@disksing Test cases added

siddontang · 2018-08-29T03:50:13Z

PTAL @disksing

disksing · 2018-08-29T06:10:58Z

LGTM.

tiancaiamao · 2018-08-29T12:14:23Z

kv/kv.go

@@ -46,6 +46,8 @@ const (
 	// BypassLatch option tells 2PC commit to bypass latches, it would be true when the
 	// transaction is not conflict-retryable, for example: 'select for update', 'load data'.
 	BypassLatch
+	// KeyOnly retrieve only keys, it can be used in scan now


add . at the end of a sentence.

tiancaiamao · 2018-08-29T12:36:23Z

store/tikv/scan.go

 			s.Close()
 			return errors.Trace(err)
 		}
-		if len(s.Value()) == 0 {
+
+		if resolved && len(s.Value()) == 0 {


If resolved is false from https://github.com/pingcap/tidb/pull/7419/files#diff-d1ae066017b8533eea6b56eb55c4382cR125, the logic is different from the old code.

It is different, and it is more appropriate and clear. The value returned from a seek cannot be nil if no error occurred.

IMHO, when several conditions are combined, it's not more appropriate and clear.

resolved == false is ambiguous: was there no lock, so nothing needed resolving?
Or was there a lock and resolving it failed, so it's false?

When I read resolved && len(s.Value()) == 0, I have to consider four combinations:

resolve succeeded && key does not exist

resolve succeeded && key exists

resolve failed && key does not exist

resolve failed && key exists

I think it's more appropriate to split the conditions: check for a lock first, then decide whether to resolve it, and check the key result afterwards.

@tiancaiamao
I love your ideas. Combining several conditions is indeed not human-friendly. I did that to keep the same style as the old code.

Let me explain my thinking in more detail, and then talk about how to improve it.

I said "more appropriate and clear" relative to the old code; it was not an argument about the if conditions.

The source of evil here is the method resolveCurrentLock: it works entirely by side effect. It is invoked unconditionally and checks GetError() inside. Furthermore, it takes no arguments and returns no results, yet it does change the current item of the iterator.

From the original code we cannot tell that there is any relation between len(s.Value()) and resolveCurrentLock, but there actually is one. The resolved value I added indicates that relation, which is why I said more appropriate and clear. I want to focus on the KeyOnly feature in this PR, so I did not change resolveCurrentLock too much, in order to achieve my goal while keeping the old code style.

Now that I have made that clear, let's talk about the solution. Maybe we have two options:

1. Focus on the KeyOnly feature here and do it the way I did. Maybe we can refactor resolveCurrentLock in another PR.

2. Refactor resolveCurrentLock and do it the way you proposed, all in this PR.

Either option is OK for me, so I am looking for your suggestions.

it works entirely by side effect

Good catch!

I want to focus on the KeyOnly feature in this PR

Or we can reset unrelated changes

Refactor resolveCurrentLock and do it the way you proposed, all in this PR.

I think option 2 is acceptable; this refactor is just a tiny change. @shafreeck

OK, I will make another commit.

tiancaiamao · 2018-08-29T12:59:45Z

store/tikv/scan.go

@@ -27,13 +27,14 @@ type Scanner struct {
 	snapshot     *tikvSnapshot
 	batchSize    int
 	valid        bool
+	keyOnly      bool


Why is keyOnly added to Scanner? It's already accessible from Scanner.snapshot.keyOnly.

It can be omitted, thanks.

store/tikv/lock_test.go

shenli · 2018-08-29T14:39:12Z

@tiancaiamao PTAL

tiancaiamao · 2018-08-30T02:47:59Z

store/tikv/scan.go

@@ -115,18 +117,18 @@ func (s *Scanner) startTS() uint64 {
 	return s.snapshot.version.Ver
 }

-func (s *Scanner) resolveCurrentLock(bo *Backoffer) error {
+func (s *Scanner) resolveCurrentLock(bo *Backoffer) (bool, error) {
 	current := s.cache[s.idx]
 	if current.GetError() == nil {


Would you please move this if branch out of the resolveCurrentLock function, into the caller?
Then the resolved return value can be removed.

    current := s.cache[s.idx]
    if current.GetError() == nil {
        ... // no lock, no need to resolve lock
    } else {
        err := resolveCurrentLock()
    }

tiancaiamao · 2018-08-30T03:03:28Z

store/tikv/scan.go

 			s.Close()
 			return errors.Trace(err)
 		}
-		if len(s.Value()) == 0 {
+
+		if resolved && len(s.Value()) == 0 {


IMHO, when several conditions are combined, it's not more appropriate and clear.

resolved == false is ambiguous: was there no lock, so nothing needed resolving?
Or was there a lock and resolving it failed, so it's false?

When I read resolved && len(s.Value()) == 0, I have to consider four combinations:

resolve succeeded && key does not exist

resolve succeeded && key exists

resolve failed && key does not exist

resolve failed && key exists

I think it's more appropriate to split the conditions: check for a lock first, then decide whether to resolve it, and check the key result afterwards.

shafreeck · 2018-08-30T10:18:28Z

@tiancaiamao PTAL?

tiancaiamao · 2018-08-30T12:06:43Z

LGTM

tiancaiamao · 2018-08-30T12:06:53Z

/run-all-tests

shafreeck force-pushed the scankeyonly branch from 3178067 to 06c71b8 August 16, 2018 13:55

siddontang reviewed Aug 16, 2018

zhangjinpeng87 requested a review from disksing August 17, 2018 03:01

shafreeck force-pushed the scankeyonly branch from 06c71b8 to 8e894c2 August 27, 2018 09:27

disksing reviewed Aug 27, 2018

disksing added the status/LGT1 label Aug 29, 2018

tiancaiamao reviewed Aug 29, 2018

shafreeck force-pushed the scankeyonly branch 2 times, most recently from ad4df95 to 7a9cd60 August 29, 2018 13:22

tiancaiamao reviewed Aug 30, 2018

shafreeck force-pushed the scankeyonly branch from 7ffd680 to 6cea784 August 30, 2018 09:20

shafreeck added 3 commits August 30, 2018 18:05

Add keyonly support for seek

6f7e483

Check an empty value

0991155

Checks empty value only when doing resolve

64a23f7

shafreeck and others added 5 commits August 30, 2018 18:05

Add unit tests for KeyOnly option

de75a3f

Remove keyOnly field from Scanner

222a30c

Fix the comment

35dd4d8

Refactor resolveCurrentLock

74b4413

Add test case for KeyOnly being false

d4e82a2

shafreeck force-pushed the scankeyonly branch from 6cea784 to d4e82a2 August 30, 2018 10:05

tiancaiamao added the status/LGT2 and component/tikv labels and removed the status/LGT1 label Aug 30, 2018

Merge branch 'master' into scankeyonly

596c79c

ngaut approved these changes Aug 30, 2018

ngaut merged commit b4fdaf3 into pingcap:master Aug 30, 2018
