Fix lease expiration check #19092
The new test verifies that `leasingKV.Get()` checks for lease expiration: it fails if `Get()` (incorrectly) returns stale values while the client is network partitioned from the rest of the system. Signed-off-by: Upamanyu Sharma <upamanyu@mit.edu>
This helps fix the failing test TestLeasingGetChecksForExpiration. Previously, the `leasing` library relied on *not* receiving something over the `session.Done()` channel in `readySession()`. Failing to immediately receive over the channel does not guarantee that the lease is actually still valid. Signed-off-by: Upamanyu Sharma <upamanyu@mit.edu>
client/v3/lease.go
@@ -136,6 +136,10 @@ type Lease interface {
	// (see https://github.com/etcd-io/etcd/pull/7866)
	KeepAlive(ctx context.Context, id LeaseID) (<-chan *LeaseKeepAliveResponse, error)

	// Unexpired returns true iff the lease is unexpired (more precisely: iff the
	// lease was unexpired during the execution of the Unexpired() call).
	Unexpired(id LeaseID) bool
Let's avoid negative logic in the interface. `!Expired(id)` should be simpler.
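To make the suggestion concrete, here is a minimal sketch of what a positively phrased `Expired(id)` could look like if the client tracked a per-lease deadline. Only `LeaseID` comes from the diff; `deadlineTracker`, `Renew`, the injectable clock, and `demoExpired` are hypothetical names for illustration and are not part of the etcd client.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// LeaseID mirrors the identifier type named in the diff.
type LeaseID int64

// deadlineTracker is a hypothetical stand-in for client-side lease state.
// It is not the etcd client's real implementation.
type deadlineTracker struct {
	mu        sync.Mutex
	deadlines map[LeaseID]time.Time
	now       func() time.Time // injectable clock, so the example is deterministic
}

func newDeadlineTracker(now func() time.Time) *deadlineTracker {
	return &deadlineTracker{deadlines: make(map[LeaseID]time.Time), now: now}
}

// Renew records that the lease is considered valid until now+ttl.
func (t *deadlineTracker) Renew(id LeaseID, ttl time.Duration) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.deadlines[id] = t.now().Add(ttl)
}

// Expired reports whether the lease is unknown or its deadline has passed.
// Callers that want the negated form can write !Expired(id).
func (t *deadlineTracker) Expired(id LeaseID) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	d, ok := t.deadlines[id]
	return !ok || !t.now().Before(d)
}

// demoExpired checks a 10s lease right after renewal and again 11s later.
func demoExpired() (before, after bool) {
	clock := time.Unix(0, 0)
	tr := newDeadlineTracker(func() time.Time { return clock })
	tr.Renew(1, 10*time.Second)
	before = tr.Expired(1)
	clock = clock.Add(11 * time.Second) // advance the fake clock past the deadline
	after = tr.Expired(1)
	return before, after
}

func main() {
	before, after := demoExpired()
	fmt.Println(before, after)
}
```

Treating an unknown lease as expired is a design choice of this sketch, made so that a missing entry can never be mistaken for a valid lease.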
Signed-off-by: Upamanyu Sharma <upamanyu@mit.edu>
Fixes #19091.
This PR adds and uses a client-side `Unexpired()` method that determines whether a lease is valid by checking whether its expiration time is after the current time. This helps fix a bug in the client `leasing` library and provides a general API for checking lease validity, which was not previously exposed to users.

This adds a new test for `leasing` that does a few Puts and Gets with delays, using a client that gets network partitioned. The test often fails on the old `leasing` implementation. The bug within `leasing` was here:

etcd/client/v3/leasing/kv.go
Lines 469 to 473 in 9fa35e5

This code tries to determine whether a session is "ready" by checking whether a non-blocking receive on `lkv.session.Done()` fails. However, failing to immediately receive on a `chan` does not guarantee anything: the other side's send or `close` may just have been delayed by a bit.

Fortunately for reproducing the problem, there's a delay already present in the code: the `Done()` channel is closed by a background loop that sleeps and periodically checks whether the lease is expired. If the lease expires while this loop is sleeping, there is a delay between the true expiration time and when the `Done()` channel is closed, which seems to make the test fail relatively reliably (it still sometimes takes a few tries to get a failure).

Using `Unexpired()`, the test always appears to pass.

I couldn't find other places using etcd leases with the problematic pattern from the `leasing` library, but I'm admittedly not sure how the `Lease` library is used by others.