store/tikv: implement local latch for transaction #6268

AndreMouche · 2018-04-11T09:30:53Z

@coocood @tiancaiamao @disksing @zhangjinpeng1987 PTAL

zhangjinpeng87 · 2018-04-12T03:34:01Z

store/tikv/latch/latch.go

+func (l *Latch) acquire(startTS uint64) (acquire, timeout, newWait bool) {
+	l.Lock()
+	defer l.Unlock()
+	timeout = startTS <= l.lastCommitTs


A latch may cover multiple keys, so isn't the timeout judgement wrong here?

tiancaiamao · 2018-04-12T02:53:54Z

store/tikv/latch/latch.go

+}
+
+// acquire tries to get current key's lock for the transaction with startTS.
+// acquire is true when success


Why not just name it success?

tiancaiamao · 2018-04-12T03:05:38Z

store/tikv/latch/latch.go

+	if timeout {
+		return
+	}
+	if len(l.waiting) == 0 || l.waiting[0] != startTS {


I'm curious: when would l.waiting[0] == startTS happen?

tiancaiamao · 2018-04-12T03:13:55Z

store/tikv/latch/latch.go

+	l.Lock()
+	defer l.Unlock()
+	if startTS != l.waiting[0] {
+		panic(fmt.Sprintf("invalid front ts %d, latch:%+v", startTS, l))


tiancaiamao · 2018-04-12T03:17:17Z

store/tikv/latch/latch.go

+func (l *Latch) release(startTS uint64, commitTS uint64) (isEmpty bool, front uint64) {
+	l.Lock()
+	defer l.Unlock()
+	if startTS != l.waiting[0] {


if len(l.waiting) == 0 || startTS != l.waiting[0]

tiancaiamao · 2018-04-12T03:38:01Z

store/tikv/latch/latch.go

+
+// Latch stores a key's waiting transactions.
+type Latch struct {
+	// A queue by startTs of those waiting transactions.


Both acquired and waiting txns are put into the waiting queue.
If waiting[0] == T, we can't tell whether T has acquired the latch or is still waiting for it, which is not good.
I suggest adding a holder field to make the case much clearer.

tiancaiamao · 2018-04-12T03:41:53Z

store/tikv/latch/latch.go

+	// The number of latches that the transaction has acquired.
+	acquiredCount int
+	// The number of latches whose waiting queue contains current transaction.
+	waitedCount int


Why do we need this field?

We need the field in Release for the cases where Acquire failed.

It can be changed to waiting bool.

tiancaiamao · 2018-04-12T03:44:06Z

store/tikv/latch/latch.go

+// NewLatches create a Latches with fixed length,
+// the size will be rounded up to the power of 2.
+func NewLatches(size int) Latches {
+	powerOfTwoSize := 1 << uint(math.Ceil(math.Log2(float64(size))))


This line is inefficient and ugly.
Tell her how to do it @lamxTyler

sort.SearchInts([]int{1,2,4,8,16,32,64...}, size)? 😄

Actually, I don't know any bit magic for finding the highest set bit. But there is a function in math/bits called Len32, which returns the bit length of a value after removing leading zeros. So one way to do it is:

if size&(size-1) == 0 {
	powerOfTwoSize = size
} else {
	powerOfTwoSize = 1 << uint(bits.Len32(uint32(size)))
}
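
For reference, the rounding discussed in this thread can be sketched as below. This is only an illustration, not the code that was merged: `roundUpPowerOfTwo` is a hypothetical helper name, and treating sizes below 1 as 1 is an assumption. It uses `bits.Len` on `size-1`, which avoids the separate power-of-two check.

```go
package main

import (
	"fmt"
	"math/bits"
)

// roundUpPowerOfTwo returns the smallest power of two that is >= size.
// Hypothetical helper for illustration; sizes below 1 are treated as 1.
func roundUpPowerOfTwo(size int) int {
	if size <= 1 {
		return 1
	}
	// bits.Len(x) is the number of bits needed to represent x, so applying
	// it to size-1 yields the exponent of the next power of two.
	return 1 << uint(bits.Len(uint(size-1)))
}

func main() {
	for _, n := range []int{1, 2, 3, 8, 9, 1000} {
		fmt.Println(n, roundUpPowerOfTwo(n))
	}
}
```

Exact powers of two map to themselves, so no extra branch is needed for that case.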

tiancaiamao · 2018-04-12T03:44:42Z

store/tikv/latch/latch.go

+// the size will be rounded up to the power of 2.
+func NewLatches(size int) Latches {
+	powerOfTwoSize := 1 << uint(math.Ceil(math.Log2(float64(size))))
+	latches := make([]Latch, powerOfTwoSize, powerOfTwoSize)


make([]Latch, powerOfTwoSize)

disksing · 2018-04-12T04:09:06Z

store/tikv/latch/latch.go

+		panic(fmt.Sprintf("invalid front ts %d, latch:%+v", startTS, l))
+	}
+	if commitTS > l.lastCommitTs {
+		l.lastCommitTs = commitTS


maxCommitTS seems more appropriate than lastCommitTs.

disksing · 2018-04-12T04:10:09Z

store/tikv/latch/latch.go

+	}
+	l.waiting = l.waiting[1:]
+	if len(l.waiting) == 0 {
+		isEmpty = true


I guess it's better to reset the slice with l.waiting = l.waiting[:0]? @tiancaiamao

if l.waiting == nil, l.waiting = l.waiting[:0] would panic.

if l.waiting == nil, it will panic at L65.
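
As a side note on the Go semantics in question: reslicing a nil slice to `[:0]` is legal in Go, while indexing its first element or reslicing it from `[1:]` panics. The small program below checks each case; the `panics` helper and the `waiting` variable are made up for illustration and are not part of the PR.

```go
package main

import "fmt"

// panics reports whether calling f triggers a runtime panic.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	var waiting []uint64 // nil slice, like the field of a zero-valued struct

	// Reslicing a nil slice to length 0.
	fmt.Println("waiting[:0] panics:", panics(func() { waiting = waiting[:0] }))
	// Reading the front element of an empty slice.
	fmt.Println("waiting[0] panics: ", panics(func() { _ = waiting[0] }))
	// Dropping the front element of an empty slice.
	fmt.Println("waiting[1:] panics:", panics(func() { waiting = waiting[1:] }))
}
```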

tiancaiamao · 2018-04-12T05:32:23Z

store/tikv/latch/latch.go

+
+// GenLock generates Lock for the transaction with startTS and keys.
+func (latches Latches) GenLock(startTS uint64, keys [][]byte) Lock {
+	hashes := make(map[int]bool)


hashes is not necessary.

slots := make([]int, 0, len(keys))
for _, key := range keys {
	slots = append(slots, latches.hash(key))
}
sort.Ints(slots)

tiancaiamao · 2018-04-12T05:34:23Z

store/tikv/latch/latch.go

+			lock.waitedCount++
+		}
+		if timeout || !acquired {
+			return


timeout doesn't clear the lock?

I'd like to call Release outside this function.

tiancaiamao · 2018-04-12T05:39:00Z

store/tikv/latch/latch.go

+		slotID := lock.requiredSlots[lock.acquiredCount]
+		acquired, timeout, newWait = latches[slotID].acquire(lock.startTS)
+		if newWait {
+			lock.waitedCount++


The waitedCount always confuses me...

coocood · 2018-04-12T06:21:36Z

store/tikv/latch/latch.go

+// Latch stores a key's waiting transactions.
+type Latch struct {
+	// A queue by startTs of those waiting transactions.
+	waiting      []uint64


The number of latches with a waiting queue is much less than the number of latches.
The slice has a pointer which has a big impact on GC when the number of latches is large.
We can move the waiting queue out of latches to a map to reduce GC pressure.

like

type Latch struct {
	Head       uint64
	HasWaiting bool
	LastCommit uint64
	sync.Mutex
}

type Latches struct {
	Slots        []Latch
	WaitingQueue map[int][]int // key is slotID, value is slice of txn startTS
	WaitingLock  sync.Mutex
}

There is no map used in current code @coocood
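
To make the proposed layout concrete, here is a rough, self-contained sketch of pointer-free slots with the waiting queues kept in a side map. It is an illustration under assumptions, not the merged implementation: the lowercase names, the use of 0 as "free", the lock ordering (slot lock first, then the map lock), and the rule that a waiting transaction does not call `acquireSlot` again until it is woken are all choices made for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// latch is one slot. It holds no slice or pointer fields; the queue lives elsewhere.
type latch struct {
	sync.Mutex
	head        uint64 // startTS of the holder; 0 means free (assumes 0 is never a real startTS)
	hasWaiting  bool   // true if waitingQueues has an entry for this slot
	maxCommitTS uint64 // largest commitTS released through this slot
}

type latches struct {
	slots []latch

	mu            sync.Mutex       // guards waitingQueues
	waitingQueues map[int][]uint64 // slotID => startTS of waiting transactions
}

func newLatches(n int) *latches {
	return &latches{slots: make([]latch, n), waitingQueues: make(map[int][]uint64)}
}

// acquireSlot tries to take slot id for startTS. stale is true when a commit
// at or after startTS has already gone through the slot.
func (ls *latches) acquireSlot(id int, startTS uint64) (success, stale bool) {
	l := &ls.slots[id]
	l.Lock()
	defer l.Unlock()
	if startTS <= l.maxCommitTS {
		return false, true
	}
	if l.head == 0 {
		l.head = startTS
	}
	if l.head == startTS {
		return true, false
	}
	// Slot is busy: park the transaction in the side map.
	l.hasWaiting = true
	ls.mu.Lock()
	ls.waitingQueues[id] = append(ls.waitingQueues[id], startTS)
	ls.mu.Unlock()
	return false, false
}

// releaseSlot frees slot id and hands it to the next waiter, if any.
func (ls *latches) releaseSlot(id int, startTS, commitTS uint64) (next uint64, hasNext bool) {
	l := &ls.slots[id]
	l.Lock()
	defer l.Unlock()
	if l.head != startTS {
		panic(fmt.Sprintf("invalid front ts %d, head %d", startTS, l.head))
	}
	if commitTS > l.maxCommitTS {
		l.maxCommitTS = commitTS
	}
	l.head = 0
	if !l.hasWaiting {
		return 0, false // fast path: the map lock is never taken
	}
	ls.mu.Lock()
	defer ls.mu.Unlock()
	queue := ls.waitingQueues[id]
	next = queue[0]
	if len(queue) == 1 {
		delete(ls.waitingQueues, id)
		l.hasWaiting = false
	} else {
		ls.waitingQueues[id] = queue[1:]
	}
	l.head = next
	return next, true
}

func main() {
	ls := newLatches(4)
	fmt.Println(ls.acquireSlot(1, 10))     // txn 10 takes the free slot
	fmt.Println(ls.acquireSlot(1, 12))     // txn 12 has to wait
	fmt.Println(ls.releaseSlot(1, 10, 11)) // txn 10 commits at 11 and wakes txn 12
	fmt.Println(ls.acquireSlot(1, 12))     // txn 12 now holds the slot
}
```

In this sketch only slots that actually have waiters cost a map entry, which is the point of the suggestion above.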

coocood · 2018-04-12T10:37:18Z

store/tikv/latch/latch.go

+	l.Lock()
+	defer l.Unlock()
+
+	if timeout = startTS <= l.maxCommitTS; timeout {


s/timeout/stale

if startTS <= l.maxCommitTS {
	timeout = true
	return
}

coocood · 2018-04-12T10:38:07Z

store/tikv/latch/latch.go

+		return
+	}
+
+	if l.hasWaiting == false {


if !l.hasWaiting {

coocood · 2018-04-12T10:51:31Z

store/tikv/latch/latch.go

+}
+
+func (latches *Latches) acquireSlot(slotID int, startTS uint64) (success, timeout, new bool) {
+	success, timeout, new = latches.slots[slotID].acquire(startTS)


We never use the returned value new.

disksing · 2018-04-12T10:25:20Z

store/tikv/latch/latch.go

+		return
+	}
+
+	if l.hasWaiting == false {


!l.hasWaiting

disksing · 2018-04-12T10:58:02Z

store/tikv/latch/latch.go

+		l.hasWaiting = true
+		newWait = true
+	}
+	success = l.head == startTS


Seems newWait and success will always be the same, as long as we don't acquire multiple times for a same ts.

I'm also confused by newWait

coocood · 2018-04-12T10:58:50Z

store/tikv/latch/latch.go

+	latches.Lock()
+	defer latches.Unlock()
+	if waiting, ok := latches.waitingQueue[slotID]; ok {
+		latches.waitingQueue[slotID] = append(waiting, startTS)


Is it possible that we are already in the waiting queue?

coocood · 2018-04-12T11:47:42Z

store/tikv/latch/latch.go

+}
+
+func (latches *Latches) releaseSlot(slotID int, startTS, commitTS uint64) (hasNext bool, nextStartTS uint64) {
+	latches.Lock()


We can avoid lock here.
Check the slot first, if the slot doesn't have waiting queue, we don't need to lock.

No, we cannot tell from the Latch whether the slot has a waiting queue.

coocood · 2018-04-12T12:04:46Z

store/tikv/latch/latch.go

+			slots[size] = v
+			size++
+		}
+	}


This may be easier to read.

dedup := slots[:1]
for i := 1; i < len(slots); i++ {
	if slots[i] != slots[i-1] {
		dedup = append(dedup, slots[i])
	}
}

@tiancaiamao What do you think?

Well, @coocood is right! It's easier to read; you can change it this way.

if len(slots) == 0 {
	return NewLock(startTS, nil)
}

As for me, I always prefer the C style, for example:

don't let append do the implicit work

manually allocate when resizing a slice

prefer the i := 0; i < len(); i++ style over range

It seems clumsy but is less likely to go wrong.
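
Putting the pieces of this thread together, the sort-then-dedup step might look like the sketch below. `dedupSlots` is a hypothetical standalone helper (the PR does this inside GenLock), and it returns nil for empty input instead of building a Lock. The empty check matters because `slots[:1]` panics on an empty slice.

```go
package main

import (
	"fmt"
	"sort"
)

// dedupSlots sorts slot IDs and removes duplicates in place, so every slot
// appears once and in ascending order. The input slice is modified.
func dedupSlots(slots []int) []int {
	if len(slots) == 0 {
		return nil
	}
	sort.Ints(slots)
	// dedup shares the backing array with slots; the write position never
	// overtakes the read position i, so unread elements are not clobbered.
	dedup := slots[:1]
	for i := 1; i < len(slots); i++ {
		if slots[i] != slots[i-1] {
			dedup = append(dedup, slots[i])
		}
	}
	return dedup
}

func main() {
	fmt.Println(dedupSlots([]int{5, 1, 5, 3, 1}))
}
```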

zhangjinpeng87 · 2018-04-13T03:31:41Z

store/tikv/latch/latch.go

+		slots = append(slots, latches.hash(key))
+	}
+	sort.Ints(slots)
+	if len(slots) == 0 {


if len(slots) <= 1?

zhangjinpeng87 · 2018-04-13T04:58:49Z

store/tikv/latch/latch.go

+
+// Release releases all latches owned by the `lock` and returns the wakeup list.
+// Preconditions: the caller must ensure the transaction is at the front of the latches.
+func (latches *Latches) Release(lock *Lock, commitTS uint64) (wakeupList []uint64) {


What will commitTS be when releasing an uncommitted txn?

It will be zero.

zhangjinpeng87 · 2018-04-13T05:05:48Z

store/tikv/latch/latch.go

+	}
+
+	latches.Lock()
+	if waiting, ok := latches.waitingQueue[slotID]; ok {


If latch.hasWaiting is true, then ok must be true.

zhangjinpeng87 · 2018-04-13T05:35:32Z

store/tikv/latch/latch.go

+	// Whether there is any transaction in waitingQueue except head.
+	hasWaiting bool
+	// The startTS of the transaction which is the head of waiting transactions.
+	head        uint64


s/head/waitingQueueHead

I think it is better to add an occupied function to Latch to judge whether the Latch is available. Lock should use occupied to implement acquire.

zhangjinpeng87 · 2018-04-13T06:26:44Z

store/tikv/latch/latch.go

+		return
+	}
+	// Empty latch
+	if !latch.occupied() {


acquired := latch.acquire(start_ts)
if acquired {
	return
}
...

So we check stale outside the latch but do the acquire inside it? I don't think that is good style.
If we do this, I think we should also implement latch.isStale(startTS), latch.setHasWaiting(), and so on.

coocood · 2018-04-13T06:33:18Z

store/tikv/latch/latch.go

+// Preconditions: the caller must ensure the transaction is at the front of the latches.
+func (latches *Latches) Release(lock *Lock, commitTS uint64) (wakeupList []uint64) {
+	wakeupCount := lock.acquiredCount
+	if lock.waiting {


In Release, waiting is always false.

coocood · 2018-04-13T06:35:31Z

store/tikv/latch/latch.go

+		if success {
+			lock.acquiredCount++
+			lock.waiting = false
+			continue


Should not continue here.
success and stale may both be true.

if !stale {
	continue
} else {
	return
}

According to the definition of acquiredCount and success, it's ok to continue here.

coocood · 2018-04-13T11:07:20Z

store/tikv/latch/latch.go

+type Latches struct {
+	slots []Latch
+	// The waiting queue for each slot(slotID => slice of startTS).
+	waitingQueue map[int][]uint64


waitingQueues

coocood · 2018-04-13T11:09:48Z

store/tikv/latch/latch.go

+	// The number of latches that the transaction has acquired.
+	acquiredCount int
+	// Whether current transaction is waiting
+	waiting bool


coocood · 2018-04-13T11:11:43Z

store/tikv/latch/latch.go

+// Latch stores a key's waiting transactions information.
+type Latch struct {
+	// Whether there is any transaction in waitingQueue except head.
+	hasWaiting bool


hasMoreWaiting

coocood · 2018-04-13T11:14:00Z

store/tikv/latch/latch.go

+}
+
+// hash return hash int for current key.
+func (latches *Latches) hash(key []byte) int {


coocood · 2018-04-13T11:16:19Z

store/tikv/latch/latch.go

+// Release releases all latches owned by the `lock` and returns the wakeup list.
+// Preconditions: the caller must ensure the transaction is at the front of the latches.
+func (latches *Latches) Release(lock *Lock, commitTS uint64) (wakeupList []uint64) {
+	wakeupCount := lock.acquiredCount


I think releaseCount is better

coocood · 2018-04-13T11:18:00Z

store/tikv/latch/latch.go

+	if startTS != latch.waitingQueueHead {
+		panic(fmt.Sprintf("invalid front ts %d, latch:%#v", startTS, latch))
+	}
+	if latch.maxCommitTS < commitTS {


latch.maxCommitTS = mathutil.Max(latch.maxCommitTS, commitTS)

coocood · 2018-04-13T11:50:08Z

LGTM

tiancaiamao · 2018-04-15T01:01:08Z

LGTM @zhangjinpeng1987 @disksing

AndreMouche · 2018-04-16T01:47:22Z

friendly ping @zhangjinpeng1987 @disksing

zhangjinpeng87 · 2018-04-16T02:13:34Z

store/tikv/latch/latch.go

+	latch := &latches.slots[slotID]
+	latch.Lock()
+	defer latch.Unlock()
+	if startTS != latch.waitingQueueHead {


Releasing a Lock that is still in waiting status will cause a panic here.

We panic here since it should never happen.

disksing · 2018-04-16T01:52:29Z

store/tikv/latch/latch.go

+	return NewLock(startTS, dedup)
+}
+
+// hash return slotID for current key.


slotID returns ...

disksing · 2018-04-16T01:54:53Z

store/tikv/latch/latch.go

+		latches.waitingQueues[slotID] = append(waitingQueue, startTS)
+	} else {
+		latches.waitingQueues[slotID] = []uint64{startTS}
+	}


I think we can simplify it to latches.waitingQueues[slotID] = append(latches.waitingQueues[slotID], startTS)
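
The simplification works because indexing a Go map with a missing key yields the zero value, here a nil slice, and `append` accepts a nil slice. A tiny sketch, with `enqueue` as a hypothetical helper name:

```go
package main

import "fmt"

// enqueue appends startTS to the waiting queue of slotID.
// A missing key reads as a nil slice, which append handles,
// so no explicit existence check is needed.
func enqueue(queues map[int][]uint64, slotID int, startTS uint64) {
	queues[slotID] = append(queues[slotID], startTS)
}

func main() {
	queues := make(map[int][]uint64)
	enqueue(queues, 3, 100) // first waiter creates the entry
	enqueue(queues, 3, 101) // second waiter extends it
	fmt.Println(queues[3])
}
```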

disksing · 2018-04-16T02:13:02Z

store/tikv/latch/latch.go

+		releaseCount++
+	}
+	wakeupList = make([]uint64, 0, releaseCount)
+	for id := 0; id < releaseCount; id++ {


disksing · 2018-04-16T02:27:52Z

store/tikv/latch/latch.go

+	} else {
+		latches.waitingQueues[slotID] = waiting[1:]
+	}
+	latches.Unlock()


How about extract L162-173 to a method?

disksing · 2018-04-16T02:48:16Z

store/tikv/latch/latch.go

+	} else {
+		latches.waitingQueues[slotID] = waiting[1:]
+	}
+	latches.Unlock()


How about extract L162-173 as a method?

zhangjinpeng87 · 2018-04-16T07:44:01Z

store/tikv/latch/latch.go

 	waiting := latches.waitingQueues[slotID]
-	hasNext = true
-	nextStartTS = waiting[0]
+	front = waiting[0]


What if the waiting queue is empty?

zhangjinpeng87 · 2018-04-16T09:00:47Z

store/tikv/latch/latch.go

+	latches.Lock()
+	defer latches.Unlock()
+	waiting := latches.waitingQueues[slotID]
+	front = waiting[0]


What if waiting is empty?

It will return on L158

disksing

LGTM.

disksing · 2018-04-16T09:13:19Z

/run-all-tests

AndreMouche added 3 commits April 11, 2018 17:26

store/tikv: implement local latch for transaction

958eed5

tikv/latch_test: update comments

13fd547

Merge branch 'master' into latch

a3c8d2c

zhangjinpeng87 reviewed Apr 12, 2018


tiancaiamao reviewed Apr 12, 2018


disksing reviewed Apr 12, 2018


tiancaiamao reviewed Apr 12, 2018


coocood reviewed Apr 12, 2018


AndreMouche added 2 commits April 12, 2018 17:28

address comments

9c0f9d2

merge master

2ae2922

coocood reviewed Apr 12, 2018


disksing reviewed Apr 12, 2018


coocood reviewed Apr 12, 2018


AndreMouche added 5 commits April 12, 2018 21:21

address comments

b87033b

address comments

fa10a54

Merge branch 'master' into latch

af981cc

address comments

ed700f4

address comments

5488231

zhangjinpeng87 reviewed Apr 13, 2018


AndreMouche added 2 commits April 13, 2018 13:59

address comments

e71ffe2

address comments

c84c8de

zhangjinpeng87 reviewed Apr 13, 2018


coocood reviewed Apr 13, 2018


address comments

58cabb8

Merge branch 'master' into latch

d0b2200

zhangjinpeng87 reviewed Apr 16, 2018


disksing reviewed Apr 16, 2018


AndreMouche added 2 commits April 16, 2018 14:40

address comments

65bfe3f

Merge branch 'master' into latch

d00fa95

zhangjinpeng87 reviewed Apr 16, 2018


disksing approved these changes Apr 16, 2018


AndreMouche merged commit 3d183a9 into pingcap:master Apr 16, 2018

AndreMouche deleted the latch branch April 16, 2018 09:41
