WIP: mod/lock #347
Conversation
Ouch, I have totally been working on a lock module too... I told Brandon but did not notify etcd-dev :(
@derekchiang Oh dang, I thought you were still working on the set module :( On the upside, this is a really important bit, so having two people who have attempted to implement it means we can get it right.
@philips I realized I needed a lock module as soon as I started working on the set module, so I started writing a lock module instead. Sorry for not communicating this clearly with the team! My implementation is similar to Ben's, except that mine doesn't allow for renewing a lock, and that instead of using
So what should I do now?
@benbjohnson I like the approach a lot! I think we need to tweak the API slightly to be resilient against long wait times and HTTP connections dying, though:
As it is now, there isn't a way to refresh your place in line, delete a lock that you no longer want, or recover from a severed HTTP connection if the client or master dies. Maybe we should always stream back the body but keep the connection open on wait=true too. Overall it works and looks great on the handful of tests I tried out.
Oh, and we should version this API just like the rest. Start at v1.
@philips Sounds good. How do these changes sound:
You can delete a lock using the |
"Re: stream back the body but keep the connection open": Right now the API creates an id for you that you don't receive until you successfully acquire the lock. I can see how it would be useful to get your id back immediately in languages where keeping an HTTP connection open is expensive or when the critical section is really long (minutes). In that case you would stream back the ticket and give the client the ability to refresh it within the TTL window. We can add this feature later, but it's something to consider.
@benbjohnson Xiang's response refactor got merged so we can probably clean this up and merge it.
@philips I'm still adding the fixes to clean up broken connections. Then I'll merge in @xiangli-cmu's changes and then merge this branch.
@benbjohnson Cool, sounds good. Thanks.
Conflicts:
    server/v2/tests/delete_handler_test.go
    server/v2/tests/get_handler_test.go
    server/v2/tests/post_handler_test.go
    server/v2/tests/put_handler_test.go
    third_party/github.com/coreos/go-etcd/etcd/requests.go
@philips Ok, I got the connection monitoring working well, I versioned the lock to point at |
Overview
This pull request implements a locking module for etcd. The goal is to give server nodes an easy way to lock a resource while it is in use and then release the lock when they are done. This locking mechanism is similar, in premise, to the distributed lock system that ZooKeeper provides.
Because etcd does not have ephemeral nodes, a simpler TTL-based approach was used to provide a fallback for failed nodes. This approach has the benefit that it can be used in languages that don't easily support multithreading, such as Bash scripts.
Adding ephemeral nodes in the future can provide a quicker release of locks acquired by machines that fail completely. However, a TTL should still be used since a failed machine could simply hang and maintain the connection to etcd.
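To make the TTL fallback concrete, here is a small in-memory model in Python. It is only an illustration of the semantics described in this PR (an index per acquisition, expiry after the TTL, renewal only by the current owner), not the module's actual implementation. The class and method names are made up for this sketch, and time is passed in explicitly so the behavior is easy to follow.

```python
import itertools


class TTLLock:
    """Toy model of a single lock key. Not the etcd implementation."""

    def __init__(self):
        self._indexes = itertools.count(1)  # hypothetical index source
        self._holder = None                 # index that currently owns the lock
        self._deadline = 0.0                # time at which the holder's TTL runs out

    def holder(self, now):
        """Return the owning index, or None if the lock is free or expired."""
        if self._holder is not None and now >= self._deadline:
            self._holder = None  # TTL ran out, so the lock frees itself
        return self._holder

    def try_acquire(self, ttl, now):
        """Return a new index if the lock is free, else None.

        A real client would keep waiting instead of getting None back.
        """
        if self.holder(now) is not None:
            return None
        self._holder = next(self._indexes)
        self._deadline = now + ttl
        return self._holder

    def renew(self, index, ttl, now):
        """Extend the TTL, but only if `index` still owns the lock."""
        if index is None or self.holder(now) != index:
            raise LookupError("index no longer owns the lock")
        self._deadline = now + ttl

    def release(self, index, now):
        """Give the lock up early; a stale index is ignored."""
        if self.holder(now) == index:
            self._holder = None
```

A process that hangs without renewing simply loses the lock once its deadline passes, which is the fallback the TTL is meant to provide even if ephemeral nodes are added later.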
Usage
Acquiring a lock
To obtain a lock, simply send a POST request to /mod/lock/<key>. This returns the index of the lock, for example 2. You can use this index to renew or release the lock later.
If another process already has a lock on /myresource, then the request will hang until the other process releases the lock or the TTL causes the lock to expire.
Renewing a lock
If you have a long, multi-step process, you can renew the lock that you originally obtained by using a PUT and the lock index. For example, a renewal can extend lock index 2 for another 60 seconds if the index still owns the lock. If the lock has been released for some reason, then an error will be returned.
Releasing a lock
When you're done with your lock, you can release it by using a DELETE request with the index.
Checking for lock existence
You can check to see the current index that has the lock by using a GET. However, this is mainly for debugging. Locks can be lost in the time between request and response of the GET call.
Notes
I fixed a bug in the functional tests which caused a bunch of failed assertions. I don't think it's anything serious but I wanted to get this PR in front of you guys so you can review the approach.
Don't check in this PR yet. :)
/cc: @xiangli-cmu @philips
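For anyone trying the calls from the Usage section, here is a rough Python sketch that builds the four requests (POST, PUT, DELETE, GET) against /mod/lock/<key>. Several details are assumptions rather than anything stated above: the base URL and port, passing ttl and index as query parameters, and all function names. The path may also change if the API ends up versioned as discussed in the comments.

```python
import urllib.parse
import urllib.request

# Assumption: a local etcd node; adjust host and port to your setup.
BASE_URL = "http://127.0.0.1:4001"


def lock_request(method, key, index=None, ttl=None):
    """Build (but do not send) a request against /mod/lock/<key>."""
    url = f"{BASE_URL}/mod/lock/{key.strip('/')}"
    # Assumption: index and ttl are passed as query parameters.
    params = {}
    if index is not None:
        params["index"] = index
    if ttl is not None:
        params["ttl"] = ttl
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, method=method)


def acquire(key, ttl):
    """POST: wait for the lock; the response carries the lock index."""
    return lock_request("POST", key, ttl=ttl)


def renew(key, index, ttl):
    """PUT: extend the TTL if `index` still owns the lock."""
    return lock_request("PUT", key, index=index, ttl=ttl)


def release(key, index):
    """DELETE: give the lock up."""
    return lock_request("DELETE", key, index=index)


def current(key):
    """GET: look up the index that holds the lock (debugging only)."""
    return lock_request("GET", key)


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode()
```

Something like send(acquire("myresource", ttl=60)) would block until the lock is granted, so a client with a long critical section would need to call renew before the TTL runs out.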