Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix potential deadlock between Revoke and (Grant or Checkpoint) #14080

Merged
merged 1 commit into from
Jun 9, 2022
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 10 additions & 7 deletions server/lease/lessor.go
Original file line number Diff line number Diff line change
Expand Up @@ -286,17 +286,17 @@ func (le *lessor) Grant(id LeaseID, ttl int64) (*Lease, error) {
revokec: make(chan struct{}),
}

if l.ttl < le.minLeaseTTL {
l.ttl = le.minLeaseTTL
}

le.mu.Lock()
defer le.mu.Unlock()

if _, ok := le.leaseMap[id]; ok {
return nil, ErrLeaseExists
}

if l.ttl < le.minLeaseTTL {
l.ttl = le.minLeaseTTL
}

if le.isPrimary() {
l.refresh(0)
} else {
Expand Down Expand Up @@ -326,6 +326,12 @@ func (le *lessor) Revoke(id LeaseID) error {
le.mu.Unlock()
return ErrLeaseNotFound
}

// We shouldn't delete the lease inside the transaction lock, otherwise
// it may lead to deadlock with Grant or Checkpoint operations, which
// acquire the le.mu firstly and then the batchTx lock.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Revoke doesn't follow Grant and Checkpoint in executing all changes under lock? If we look into code of similar caching structs (manages both in memory and bbolt representation of state) like RaftCluster, it does everything including backend changes under lock.

Do you know why on line 337 we unlock le.mu?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why on line 337 we unlock le.mu?

My understanding is the following transaction (which deletes the lease and all keys attached to the lease) may be time consuming if there are lots of keys being attached to the lease. Actually there is no need to acquire the le.mu during the backend transaction.

But both Grant and Checkpoint follow the pattern lease operation --> backend db operation --> lease operation, so for simplicity, they just hold thele.mufor the whole process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that there is a chance of deadlock here, however I'm worried about removing lease from leaseMap and releasing lock before db operation is executed. This might be totally unfounded worry as backend operation as flushed only once 5 seconds, however I need to think a little on it. Have you thought on it more?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be OK. The le.mu is used to protect the data structures inside lessor, and the backend transaction is protected by separate transaction lock. Just as I mentioned previously, there is no reason to hold the le.mu during the transaction.

Please note that during Grant and Checkpoint, lessor may also persist the lease, but both of them will require the transaction lock before persisting the lease data. So it's safe.

Copy link
Member

@chaochn47 chaochn47 Jun 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding,

before the change

Grant P1 (stands for Process 1)
1. acquires lessor.mu R1 (stands for Resource 1)
2. acquires backend tx R2
3. releases backend tx R2
4. releases lessor.mu R1

Revoke P2
1. acquires lessor.mu R1
2. releases lessor.mu R1
3. acquires backend tx R2
4. acquires lessor.mu R1
5. releases backend tx R2
6. releases lessor.mu R1

There is a potential deadlock for sure

  • P1 is holding R1 and wait to acquire R2
  • P2 is holding R2 and wait to acquire R1

Why not we move P2 step 4 to happen after step 5? Namely delete(le.leaseMap, id) after txn.End() which releases the backend transaction.

The benefit of this idea is the rangeDeleter will eveually call lessor.Detach() to look up leaseID in lessor.leaseMap. So the order is preserved and won't cause the issue that #16035 is trying to fix.

delete(le.leaseMap, id)

defer close(l.revokec)
// unlock before doing external work
le.mu.Unlock()
Expand All @@ -344,9 +350,6 @@ func (le *lessor) Revoke(id LeaseID) error {
txn.DeleteRange([]byte(key), nil)
}

le.mu.Lock()
defer le.mu.Unlock()
delete(le.leaseMap, l.ID)
// lease deletion needs to be in the same backend transaction with the
// kv deletion. Or we might end up with not executing the revoke or not
// deleting the keys if etcdserver fails in between.
Expand Down