
lease: do lease pile-up reduction in the background #9699

Closed
wants to merge 1 commit

Conversation

@mgates mgates commented May 4, 2018

This moves lease pile-up reduction into a goroutine. This prevents timeouts
when the lessor is locked for a long time (mostly when there are a lot of
leases).

We had a problem where, when we had a lot of leases (~100k) and a leader election happened, the lessor was locked for a long time, causing timeouts when we tried to grant a lease. This change locks the lessor in batches, which allows leases to be created while the leader is initializing the lessor. It still won't start expiring leases until it has finished rewriting all of them, though.
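
For illustration, here is a minimal standalone sketch of the batching idea. The lessor and lease types, the promote helper, and the batch size are simplified stand-ins, not the actual patch:

```
package main

import (
	"fmt"
	"sync"
	"time"
)

// Simplified stand-ins for the real types in lease/lessor.go.
type lease struct{ expiry time.Time }

type lessor struct {
	mu     sync.RWMutex
	leases map[int64]*lease
}

// promote rewrites every lease expiry, but only holds the write lock for one
// batch at a time, so concurrent Grant-style writers are never blocked long.
func (le *lessor) promote(extend time.Duration, batch int) {
	// Snapshot the lease IDs under a short read lock.
	le.mu.RLock()
	ids := make([]int64, 0, len(le.leases))
	for id := range le.leases {
		ids = append(ids, id)
	}
	le.mu.RUnlock()

	// Rewrite expiries batch by batch, releasing the lock between batches.
	for start := 0; start < len(ids); start += batch {
		end := start + batch
		if end > len(ids) {
			end = len(ids)
		}
		le.mu.Lock()
		for _, id := range ids[start:end] {
			if l, ok := le.leases[id]; ok {
				l.expiry = time.Now().Add(extend)
			}
		}
		le.mu.Unlock()
	}
}

func main() {
	le := &lessor{leases: map[int64]*lease{}}
	for i := int64(0); i < 100000; i++ {
		le.leases[i] = &lease{}
	}
	// Run in the background, as the PR does, so new leases can be granted
	// while the rewrite is still in progress.
	done := make(chan struct{})
	go func() { le.promote(time.Minute, 1000); close(done) }()
	<-done
	le.mu.RLock()
	fmt.Println("rewrote", len(le.leases), "leases")
	le.mu.RUnlock()
}
```

The point is that the write lock is only ever held for one batch, so a concurrent grant can slip in between batches instead of waiting for the whole rewrite.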

Before:

```
BenchmarkLessorPromote1-16                        500000 4036 ns/op
BenchmarkLessorPromote10-16                       500000 3932 ns/op
BenchmarkLessorPromote100-16                      500000 3954 ns/op
BenchmarkLessorPromote1000-16                     300000 3906 ns/op
BenchmarkLessorPromote10000-16                    300000 4639 ns/op
BenchmarkLessorPromote100000-16                      100 27216481 ns/op
BenchmarkLessorPromote1000000-16                     100 325164684 ns/op
```

After:

```
BenchmarkLessorPromote1-16                        500000 3769 ns/op
BenchmarkLessorPromote10-16                       500000 3835 ns/op
BenchmarkLessorPromote100-16                      500000 3829 ns/op
BenchmarkLessorPromote1000-16                     500000 3665 ns/op
BenchmarkLessorPromote10000-16                    500000 3800 ns/op
BenchmarkLessorPromote100000-16                   300000 4114 ns/op
BenchmarkLessorPromote1000000-16                  300000 5143 ns/op
```

@mgates
Author

mgates commented May 4, 2018

I'll have some more realistic test examples on Monday.

@cosgroveb
Contributor

This addresses #9496

```
	return ls
}

func (le *lessor) Promote(extend time.Duration) {
	le.mu.Lock()
	defer le.mu.Unlock()
	le.Ready = false
```
Contributor

Needs lock around le.Ready?

```
	le.Ready = false
	go func() {
		le.mu.RLock()
		le.demotec = make(chan struct{})
```
Contributor

le.demotec needs to be protected by the write lock.
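
For context, the snippet quoted above assigns le.demotec while holding only the read lock. A minimal standalone sketch of the pattern the reviewer is asking for, with made-up types and method bodies rather than the merged code:

```
package main

import (
	"fmt"
	"sync"
)

// Made-up minimal types; only the locking pattern is the point here.
type lessor struct {
	mu      sync.RWMutex
	demotec chan struct{}
}

func (le *lessor) promote() {
	go func() {
		// Publish demotec under the write lock...
		le.mu.Lock()
		le.demotec = make(chan struct{})
		le.mu.Unlock()

		// ...then hold only the read lock for the longer-running work.
		le.mu.RLock()
		defer le.mu.RUnlock()
		// rewrite lease expiries here
	}()
}

func (le *lessor) isPrimary() bool {
	le.mu.RLock()
	defer le.mu.RUnlock()
	return le.demotec != nil
}

func main() {
	le := &lessor{}
	le.promote()
	fmt.Println(le.isPrimary()) // may print false if the goroutine hasn't run yet
}
```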

@gyuho
Contributor

gyuho commented May 9, 2018

Have you benchmarked with real-world workloads?

@mgates
Author

mgates commented May 16, 2018

Sorry for the benchmarking delay - still trying to find the time.

@mgates
Author

mgates commented May 18, 2018

Still busy, but I blocked out some time at the end of next week to get it done. Sorry for the delay.

@mgates
Author

mgates commented May 25, 2018

Ok - so I'm not sure what the best way to demonstrate this in a real-worldish way is, but here's what I did:

  1. set up the test cluster with goreman start
  2. put 1.9 million leases into etcd
  3. Ran this snippet, which tries to grant a new lease every half second and kills the leader after 10 seconds:

```
while true; do ETCDCTL_API=3 etcdctl --command-timeout 0.2s --endpoints "http://127.0.0.1:2379,http://127.0.0.1:22379,http://127.0.0.1:32379" lease grant  60  > /dev/null ; echo `date` $? ; sleep 0.5; done & sleep 10; pkill -f  -- "[-]-listen-peer-urls $(ETCDCTL_API=2 etcdctl  member list  | grep isLeader=true | cut -f 3 -d "="| cut -f 1 -d " ")"
```

On the branch it had 1 timeout exceeded error, but on master it had 60 over the 61 seconds it took before it recovered.

Sorry it took so long - if these changes look plausible to you, we can try to test them against our live, large workloads.

@mgates mgates force-pushed the break_up_lease_promotion branch 2 times, most recently from 10bf546 to 628570e on May 25, 2018 at 21:18
@mgates
Author

mgates commented Jun 4, 2018

Hey folks - just wanted to check in about this - does it seem ok, at least in theory? If so, happy to work more on it.

lease/lessor.go Outdated
```
@@ -329,66 +332,84 @@ func (le *lessor) unsafeLeases() []*Lease {
	for _, l := range le.leaseMap {
		leases = append(leases, l)
	}
	sort.Sort(leasesByExpiry(leases))
```
Contributor

this change can be moved to a separate PR?

@xiang90
Contributor

xiang90 commented Jun 4, 2018

@mgates

Can you:

  1. describe what problem this PR solves (in which cases it can happen and how this PR helps)
  2. move the sorting change into another PR

I can understand this PR as it is, but I guess it is easier for other people to review if you do what I suggested.

@mgates
Author

mgates commented Jun 4, 2018

Sure.
We had a problem where, when we had a lot of leases (~100k) and a leader election happened, the lessor was locked for a long time, causing timeouts when we tried to grant a lease. This change locks the lessor in batches, which allows leases to be created while the leader is initializing the lessor. It still won't start expiring leases until it has finished rewriting all of them, though.

mgates pushed a commit to mgates/etcd that referenced this pull request Jun 6, 2018
Because the leases were sorted inside UnsafeLeases() the lessor mutex
ended up being locked while the whole map was sorted. This pulls the
sorting outside of the lock, per feedback on
etcd-io#9699
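
The gist of that follow-up commit, as a standalone sketch with simplified types: copy the leases while holding the mutex, then sort after releasing it. The real code in lease/lessor.go sorts with leasesByExpiry, as in the hunk quoted earlier, rather than sort.Slice:

```
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// Simplified stand-ins for the real types in lease/lessor.go.
type Lease struct {
	ID     int64
	expiry time.Time
}

type lessor struct {
	mu       sync.RWMutex
	leaseMap map[int64]*Lease
}

// unsafeLeases only copies the map; the caller must hold le.mu.
func (le *lessor) unsafeLeases() []*Lease {
	leases := make([]*Lease, 0, len(le.leaseMap))
	for _, l := range le.leaseMap {
		leases = append(leases, l)
	}
	return leases
}

// Leases releases the lock before the O(n log n) sort, so the mutex is only
// held for the cheap copy.
func (le *lessor) Leases() []*Lease {
	le.mu.RLock()
	ls := le.unsafeLeases()
	le.mu.RUnlock()

	sort.Slice(ls, func(i, j int) bool { return ls[i].expiry.Before(ls[j].expiry) })
	return ls
}

func main() {
	now := time.Now()
	le := &lessor{leaseMap: map[int64]*Lease{
		1: {ID: 1, expiry: now.Add(2 * time.Minute)},
		2: {ID: 2, expiry: now.Add(1 * time.Minute)},
	}}
	for _, l := range le.Leases() {
		fmt.Println(l.ID) // 2 then 1, soonest expiry first
	}
}
```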
@xiang90
Contributor

xiang90 commented Jun 6, 2018

We had a problem where, when we had a lot of leases (~100k) and a leader election happened, the lessor was locked for a long time, causing timeouts when we tried to grant a lease. This change locks the lessor in batches, which allows leases to be created while the leader is initializing the lessor. It still won't start expiring leases until it has finished rewriting all of them, though.

can you move this to the PR description and also put this into the commit message?

Thanks.

@mgates
Author

mgates commented Jun 6, 2018

@xiang90 done, thanks a lot.

@mgates mgates force-pushed the break_up_lease_promotion branch 2 times, most recently from 591a377 to 4c331ff on June 6, 2018 at 21:04
lease/lessor.go Outdated
```
@@ -144,6 +144,9 @@ type lessor struct {
	stopC chan struct{}
	// doneC is a channel whose closure indicates that the lessor is stopped.
	doneC chan struct{}

	// when the lease pile-up reduction is done this is true
```
Contributor

mention when this will be set to false?

@xiang90
Contributor

xiang90 commented Jun 6, 2018

@mgates What happens when a lease is revoked while the goroutine created in Promote is still running? Is there a risk that the copy of the lease will be re-added to the heap?

@mgates
Author

mgates commented Jun 7, 2018

That's a good question - I'm pretty sure that it would get updated, but since the revoke method deletes it from the lease map and deletes the appropriate keys, that wouldn't actually be a problem, and the lease would get cleaned up eventually. It will get added to the heap, but the heap is already full of revoked leases, which get ignored if they aren't in the map.

We'll write a test to confirm, though.

@mgates
Author

mgates commented Jun 7, 2018

Hey there, we looked into this more, and it doesn't seem possible to write an automated test of this behavior. We did some manual work, added some sleeps, and feel good about the behavior: the lease TTL will get refreshed and possibly rewritten for the pile-up avoidance, it'll get added to the heap, and it will be ignored once the expiry checker gets to it, because it isn't in the lease map.
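
A standalone sketch of why a stale heap entry is harmless; the types and names here are hypothetical stand-ins for the LeaseWithTime heap items seen elsewhere in this thread. The expiry checker simply skips any ID that is no longer in the lease map:

```
package main

import (
	"container/heap"
	"fmt"
)

// Hypothetical minimal stand-ins for the lease heap and lease map.
type item struct {
	id         int64
	expiration int64
}

type leaseQueue []*item

func (q leaseQueue) Len() int            { return len(q) }
func (q leaseQueue) Less(i, j int) bool  { return q[i].expiration < q[j].expiration }
func (q leaseQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *leaseQueue) Push(x interface{}) { *q = append(*q, x.(*item)) }
func (q *leaseQueue) Pop() interface{} {
	old := *q
	n := len(old)
	it := old[n-1]
	*q = old[:n-1]
	return it
}

// popExpired pops due items but only reports IDs still present in leaseMap;
// a revoked lease that was re-pushed by the background goroutine is skipped.
func popExpired(q *leaseQueue, leaseMap map[int64]bool, now int64) []int64 {
	var expired []int64
	for q.Len() > 0 && (*q)[0].expiration <= now {
		it := heap.Pop(q).(*item)
		if !leaseMap[it.id] {
			continue // not in the map any more: already revoked, ignore
		}
		expired = append(expired, it.id)
	}
	return expired
}

func main() {
	q := &leaseQueue{}
	heap.Push(q, &item{id: 1, expiration: 5})
	heap.Push(q, &item{id: 2, expiration: 3}) // pretend lease 2 was revoked
	leaseMap := map[int64]bool{1: true}
	fmt.Println(popExpired(q, leaseMap, 10)) // [1]
}
```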

@cyc115 cyc115 left a comment

minor nits

lease/lessor.go Outdated
```
@@ -144,6 +144,11 @@ type lessor struct {
	stopC chan struct{}
	// doneC is a channel whose closure indicates that the lessor is stopped.
	doneC chan struct{}

	// this is false when promotion is starting and gets
```

nit: The comment should start with Ready, the variable name

lease/lessor.go Outdated
```
		item := &LeaseWithTime{id: l.ID, expiration: l.expiry.UnixNano()}
		heap.Push(&le.leaseHeap, item)
	}
	// adjust expiries in case of overlap

```

nit: extra blank line

@mgates mgates force-pushed the break_up_lease_promotion branch 2 times, most recently from 9e56e57 to a4cfc61 on June 12, 2018 at 21:44
This moves lease pile-up reduction into a goroutine which mostly operates on a copy of
the lease list, to avoid locking. This prevents timeouts when the lessor is locked
for a long time (when there are a lot of leases, mostly). This should solve etcd-io#9496.

We had a problem where, when we had a lot of leases (~100k) and a
leader election happened, the lessor was locked for a long time, causing
timeouts when we tried to grant a lease. This change locks the lessor in
batches, which allows leases to be created while the leader is
initializing the lessor. It still won't start expiring leases until it
has finished rewriting all of them, though.

Before:
```
BenchmarkLessorPromote1-16                        500000 4036 ns/op
BenchmarkLessorPromote10-16                       500000 3932 ns/op
BenchmarkLessorPromote100-16                      500000 3954 ns/op
BenchmarkLessorPromote1000-16                     300000 3906 ns/op
BenchmarkLessorPromote10000-16                    300000 4639 ns/op
BenchmarkLessorPromote100000-16                      100 27216481 ns/op
BenchmarkLessorPromote1000000-16                     100 325164684 ns/op
```

After:
```
BenchmarkLessorPromote1-16                        500000 3769 ns/op
BenchmarkLessorPromote10-16                       500000 3835 ns/op
BenchmarkLessorPromote100-16                      500000 3829 ns/op
BenchmarkLessorPromote1000-16                     500000 3665 ns/op
BenchmarkLessorPromote10000-16                    500000 3800 ns/op
BenchmarkLessorPromote100000-16                   300000 4114 ns/op
BenchmarkLessorPromote1000000-16                  300000 5143 ns/op
```
@gyuho gyuho self-assigned this Jun 19, 2018
@gyuho gyuho added this to the etcd-v3.4 milestone Jun 19, 2018
jcalvert pushed a commit to jcalvert/etcd that referenced this pull request Jul 2, 2018
Because the leases were sorted inside UnsafeLeases() the lessor mutex
ended up being locked while the whole map was sorted. This pulls the
sorting outside of the lock, per feedback on
etcd-io#9699
@wenjiaswe
Contributor

cc @jingyih

@gyuho gyuho modified the milestones: etcd-v3.4, etcd-v3.5 Aug 5, 2019
@stale

stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
