
Leases wait for entries to be applied #13690

Merged
serathius merged 2 commits into etcd-io:main from the lease_renew_linearizable branch on Apr 11, 2022

Conversation

ahrtr
Member

@ahrtr ahrtr commented Feb 12, 2022

Fix issue 13675.

When etcdserver receives a LeaseRenew request, it may still be in the middle of processing a LeaseGrant request on exactly the same leaseID. Accordingly it may return a TTL=0 to the client due to a "lease not found" error. Please see my analysis in issue 13675.

Before I go ahead and add an e2e or integration test, I'd like to get more feedback. cc @serathius @ptabor @spzala

@liberize, please kindly let me know whether this resolves your issue. Please clone my branch, build the etcd binaries and verify the fix, thanks. I verified it in my local environment and confirmed that it works.
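
For readers unfamiliar with the failure mode, here is a toy model of the race in Go. The types and helpers below are purely illustrative; this is a sketch of the idea (wait for pending entries to be applied before consulting the lessor), not the actual etcdserver code or this patch.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errLeaseNotFound = errors.New("lease not found")

// lessor is a drastically simplified stand-in for etcd's lessor.
type lessor struct {
	mu     sync.Mutex
	leases map[int64]int64 // leaseID -> TTL in seconds
}

func (l *lessor) grant(id, ttl int64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.leases[id] = ttl
}

func (l *lessor) renew(id int64) (int64, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	ttl, ok := l.leases[id]
	if !ok {
		// Without waiting for the pending LeaseGrant entry to be applied,
		// this is the error that surfaces to the client as TTL=0.
		return 0, errLeaseNotFound
	}
	return ttl, nil
}

type server struct {
	lessor  *lessor
	applied chan struct{} // closed once pending raft entries have been applied
}

// leaseRenew models the behavior after this change: wait for outstanding
// entries (such as a concurrent LeaseGrant on the same ID) to be applied
// before consulting the lessor.
func (s *server) leaseRenew(id int64) (int64, error) {
	<-s.applied
	return s.lessor.renew(id)
}

func main() {
	s := &server{
		lessor:  &lessor{leases: map[int64]int64{}},
		applied: make(chan struct{}),
	}

	// A LeaseGrant for ID 1 is still in the apply pipeline...
	go func() {
		s.lessor.grant(1, 60)
		close(s.applied)
	}()

	// ...so renewing without the wait could hit "lease not found";
	// with the wait, the renew sees the granted lease.
	ttl, err := s.leaseRenew(1)
	fmt.Println(ttl, err) // 60 <nil>
}
```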

@codecov-commenter

codecov-commenter commented Feb 12, 2022

Codecov Report

Merging #13690 (d5835c9) into main (7c472a9) will decrease coverage by 0.09%.
The diff coverage is 80.00%.

❗ Current head d5835c9 differs from pull request most recent head 5360780. Consider uploading reports for the commit 5360780 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #13690      +/-   ##
==========================================
- Coverage   72.68%   72.58%   -0.10%     
==========================================
  Files         467      467              
  Lines       38278    38291      +13     
==========================================
- Hits        27822    27795      -27     
- Misses       8653     8685      +32     
- Partials     1803     1811       +8     
Flag Coverage Δ
all 72.58% <80.00%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
api/v3rpc/rpctypes/error.go 90.47% <ø> (ø)
server/etcdserver/api/v3rpc/util.go 74.19% <ø> (ø)
server/etcdserver/errors.go 0.00% <ø> (ø)
server/etcdserver/v3_server.go 78.44% <80.00%> (-0.22%) ⬇️
server/proxy/grpcproxy/register.go 69.76% <0.00%> (-9.31%) ⬇️
client/v3/leasing/util.go 91.66% <0.00%> (-6.67%) ⬇️
client/v3/concurrency/mutex.go 61.64% <0.00%> (-5.48%) ⬇️
client/v3/concurrency/session.go 88.63% <0.00%> (-4.55%) ⬇️
client/v3/leasing/cache.go 87.77% <0.00%> (-3.89%) ⬇️
client/pkg/v3/testutil/leak.go 67.25% <0.00%> (-2.66%) ⬇️
... and 17 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7c472a9...5360780. Read the comment docs.

@liberize

Verified. It works.

@ahrtr
Member Author

ahrtr commented Feb 12, 2022

I found a similar issue, 6978, which was fixed by PR 7015 more than 5 years ago.

That fix only covered the HTTP endpoints; etcd didn't support gRPC at that time, and the fix was never ported to gRPC later. So I have now ported the fix to gRPC as well.

Will update the existing test cases TestV3LeaseRenewStress and TestV3LeaseTimeToLiveStress later.

@ahrtr ahrtr force-pushed the lease_renew_linearizable branch 4 times, most recently from 6092356 to 05b4471 on February 13, 2022 at 22:38
@ahrtr
Member Author

ahrtr commented Feb 13, 2022

I updated the integration test and the 3.6 changelog. This PR is ready for review. @serathius @spzala @ptabor

@ahrtr
Member Author

ahrtr commented Feb 14, 2022

Just rebased this PR.

@ahrtr ahrtr force-pushed the lease_renew_linearizable branch 3 times, most recently from 68f90a6 to 314db9c on February 15, 2022 at 23:37
@ahrtr
Member Author

ahrtr commented Feb 16, 2022

The semaphoreci failure is due to the Docker Hub rate limit. See the error below:

Step 1/14 : FROM ubuntu:21.10
toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
make[1]: *** [build-docker-test] Error 1
make[1]: Leaving directory `/home/runner/etcd'
make: *** [ensure-docker-test-image-exists] Error 2

Just raised an issue to track this failure.

@ahrtr ahrtr force-pushed the lease_renew_linearizable branch 2 times, most recently from 41cec48 to fb40b4e on February 17, 2022 at 23:25
@ahrtr
Member Author

ahrtr commented Feb 25, 2022

Resolved all comments. PTAL @serathius @ptabor @spzala

@ahrtr
Member Author

ahrtr commented Feb 25, 2022

Just rebased this PR.

Member

@serathius serathius left a comment


LGTM, based on the fact that this restores the behavior of the V2 API (both ApplyWait and the 1 second timeout).

@ahrtr
Member Author

ahrtr commented Mar 3, 2022

cc @spzala @ptabor

@ahrtr ahrtr force-pushed the lease_renew_linearizable branch 2 times, most recently from d1d8a94 to 2493d87 on April 2, 2022 at 06:54
@serathius
Member

I think we need @ptabor's opinion here. In #13868 (comment) he brought up that making LeaseLeases linearizable is a breaking change, so it would be worth discussing it here again.

@serathius
Member

cc @endocrimes

@serathius
Member

serathius commented Apr 4, 2022

Pasting my discussion with @endocrimes about implementing LeaseList.

danielle 5:38 PM
o/ - trying to understand when I’d want to use linearizableReadNotify vs forwarding a request to the leader.
I’m guessing the intention is that linearizableReadNotify is a relatively cheap way to shard reads over etcd servers, but it’s not particularly well documented, so I’m kinda piecing that together from impl - and don’t wanna make too many guesses yet 😅

serathius 6:12 PM
My guess is that we should use linearizableReadNotify. Forwarding the request looks to me like a legacy of the V2 code.

In v2, revoking leases didn't go through raft; the leader just sent a SYNC request through raft that didn't specify which leases should be revoked. This meant that each member could remove different leases depending on the time on that server, so all requests about leases still needed to go through the leader. I assume that when implementing the V3 API this approach was blindly copied. The same problem affects other lease methods, see #13690.

In V3, revoking a lease goes through raft. The LeaseRevoke entry specifies which lease is removed and all members need to agree through quorum. This means it should be enough to use linearizableReadNotify (i.e. check that the member is up to date with the leader's applied index) and then return the local response without contacting the leader.

I'm not super familiar with the lease code and this is just based on quickly reading through it. We should double check with @Piotr Tabor and @spzala. Looking at #13690, I think we should open an issue that describes this incorrect behavior in the V3 API and removes the need to go through the leader.
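
To make the linearizableReadNotify idea above concrete, here is a small self-contained sketch of the pattern being described: ask the leader for its current commit index, wait until the local applied index catches up, then serve the read from local state. The member type and its fields are purely illustrative assumptions, not etcd's actual implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

// member is an illustrative stand-in for an etcd member; not real etcd code.
type member struct {
	appliedIndex atomic.Int64
	applyNotify  chan struct{}  // signalled each time an entry is applied
	leaderCommit func() int64   // stand-in for a ReadIndex round trip to the leader
	localLeases  func() []int64 // stand-in for reading lease state from local storage
}

// linearizableRead waits until this member has applied at least the index the
// leader had committed when the request arrived, then answers from local state
// without forwarding the request to the leader.
func (m *member) linearizableRead(ctx context.Context) ([]int64, error) {
	target := m.leaderCommit()
	for m.appliedIndex.Load() < target {
		select {
		case <-m.applyNotify:
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}
	return m.localLeases(), nil
}

func main() {
	m := &member{
		applyNotify:  make(chan struct{}, 2),
		leaderCommit: func() int64 { return 7 },
		localLeases:  func() []int64 { return []int64{1, 2} },
	}
	m.appliedIndex.Store(5)

	// Simulate the apply loop catching up to the leader's commit index.
	go func() {
		for i := 0; i < 2; i++ {
			m.appliedIndex.Add(1)
			m.applyNotify <- struct{}{}
		}
	}()

	leases, err := m.linearizableRead(context.Background())
	fmt.Println(leases, err) // [1 2] <nil>
}
```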

@ahrtr
Member Author

ahrtr commented Apr 4, 2022

What this PR tries to resolve is definitely a BUG, and I think it's different from what we discussed regarding ListLease.

@serathius serathius mentioned this pull request Apr 6, 2022
@ahrtr
Member Author

ahrtr commented Apr 7, 2022

The following test flake should be resolved by this PR.



        lease_test.go:326: 
            	Error Trace:	lease_test.go:326
            	            				execute.go:29
            	            				asm_amd64.s:1581
            	Error:      	Not equal: 
            	            	expected: -1
            	            	actual  : 19
            	Test:       	TestLeaseGrantRevoke/PeerTLS

@ahrtr
Member Author

ahrtr commented Apr 8, 2022

@ptabor @serathius @spzala Could you take a look at this PR? Once it's merged, I can backport to 3.5.

@serathius
Member

I talked with @ptabor and we agreed that the current lease implementation is bad, and the fact that it is not linearizable is a big problem. This PR addresses some of the issues but not all of them (checking isLeader is not enough to guarantee linearizability). We also think that this is a bug that should be fixed in older releases (definitely v3.5, not sure about v3.4).

I will work on a reproduction and create a proposal to rewrite the API for v3.5.3. @endocrimes this will also cover your work on LeaseLeases and LeaseList.

Member

@serathius serathius left a comment


We will use a different approach to fix the issue.

@ahrtr
Member Author

ahrtr commented Apr 8, 2022

Sounds good to me. Can we move this one out of 3.5.3? I don't think it's a blocker for 3.5.3

When etcdserver receives a LeaseRenew request, it may still be in
the middle of processing a LeaseGrant request on exactly the same
leaseID. Accordingly it may return a TTL=0 to the client due to a
"lease not found" error. So the leader should wait for the pending
entries to be applied before processing such client requests.
Contributor

@ptabor ptabor left a comment


I did another round, and I think this patch is the best we can get in the short term:

  1. It does NOT fix issues when the leader changes, and that's not the goal of this patch.
  2. The waitApplied index seems not to 'get stuck' as I initially thought when the member is disconnected, as ApplyWait() just waits for the current commit index (hard state already received) to be applied. This happens only when all the needed entries are already in the hard state, so the apply process should eventually converge (a rough sketch of this wait follows below).
  3. The solution that changes how 'TTL' is reported does not address the race with 'Grant' that this patch solves.
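
Since point 2 leans on how ApplyWait() behaves, here is a rough, self-contained sketch of that kind of wait: callers block until the apply loop reports an index at least as large as the one they asked for. This is an illustration of the idea only, loosely modeled on a wait-by-index registry, not etcd's pkg/wait code.

```go
package main

import (
	"fmt"
	"sync"
)

// applyWait lets callers block until the apply loop has reached a given index.
type applyWait struct {
	mu      sync.Mutex
	applied uint64
	waiters map[uint64][]chan struct{}
}

func newApplyWait() *applyWait {
	return &applyWait{waiters: map[uint64][]chan struct{}{}}
}

// Wait returns a channel that is closed once Trigger has been called with an
// index >= the requested one. If that already happened, it returns a closed
// channel immediately.
func (w *applyWait) Wait(index uint64) <-chan struct{} {
	w.mu.Lock()
	defer w.mu.Unlock()
	ch := make(chan struct{})
	if w.applied >= index {
		close(ch)
		return ch
	}
	w.waiters[index] = append(w.waiters[index], ch)
	return ch
}

// Trigger records progress of the apply loop and releases satisfied waiters.
func (w *applyWait) Trigger(applied uint64) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.applied = applied
	for idx, chans := range w.waiters {
		if idx <= applied {
			for _, ch := range chans {
				close(ch)
			}
			delete(w.waiters, idx)
		}
	}
}

func main() {
	w := newApplyWait()
	done := w.Wait(3) // e.g. wait for the commit index observed when the request arrived
	go w.Trigger(3)   // the apply loop eventually reaches that index
	<-done
	fmt.Println("entries applied, safe to serve the lease request")
}
```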

@ahrtr
Member Author

ahrtr commented Apr 11, 2022

Thanks @ptabor for the feedback, which basically makes sense.

Just one comment/question on your point "It does NOT fix issues when the leader changes, and that's not the goal of this patch." Just as I explained in 13915#issuecomment-1094491226, a leader change during LeaseRenew shouldn't be a problem. Could you elaborate on why you think it's a problem?

@serathius serathius changed the title from "Support linearizable renew lease" to "Leases wait for entries to be applied" on Apr 11, 2022
@serathius
Member

Just as I explained in 13915#issuecomment-1094491226, a leader change during LeaseRenew shouldn't be a problem. Could you elaborate on why you think it's a problem?

As explained in #13915, checking isLeader doesn't make sense outside the raft protocol. The leader election timeout of 1 second (the default) means it can take the leader up to 1 second to notice that it has been disconnected from the cluster; during that time the isLeader check will still return true and leases will still be served.

Based on @ptabor's comment I think this change makes sense and addresses one part of the problem I missed: the possibility that there is a LeaseRevoke request in the hard state that is still waiting to be applied. With the need to cut v3.5.3 soon, I think this is the best fix we can prepare for now.

@serathius
Member

serathius commented Apr 11, 2022

FYI the integration-1-cpu workflow is failing for the third time. Possibly not related to this PR, but still very worrying, as the error in the last run was:

logger.go:130: 2022-04-11T10:16:18.540Z	FATAL	m0	Verification failed	{"member": "m0", "data-dir": "/tmp/TestV3WatchRestoreSnapshotUnsync158866305/002/etcd3711313259", "error": "backend.ConsistentIndex (8) must be >= last snapshot index (24)"}

Possibly this is a result of enabling ETCD_VERIFY in tests.

@ahrtr
Member Author

ahrtr commented Apr 11, 2022

FYI the integration-1-cpu workflow is failing for the third time. Possibly not related to this PR, but still very worrying, as the error in the last run was:

logger.go:130: 2022-04-11T10:16:18.540Z	FATAL	m0	Verification failed	{"member": "m0", "data-dir": "/tmp/TestV3WatchRestoreSnapshotUnsync158866305/002/etcd3711313259", "error": "backend.ConsistentIndex (8) must be >= last snapshot index (24)"}

Possibly this is a result of enabling ETCD_VERIFY in tests.

I will take a deep dive into this tomorrow.

@ahrtr
Member Author

ahrtr commented Apr 11, 2022

Yes, the failure should not be related to this PR. Previously all tests were green; I just rebased this PR.

The tests are green again.

@serathius serathius merged commit dd08e15 into etcd-io:main Apr 11, 2022
@serathius
Member

@ahrtr Can you backport the change to release-3.5?

@ahrtr
Member Author

ahrtr commented Apr 11, 2022

@ahrtr Can you backport the change to release-3.5?

Sure, will do today.
