if one etcd node is hang, watch is also blocked #7593

Closed
xiaoyulei opened this issue Mar 24, 2017 · 7 comments

xiaoyulei (Contributor) commented Mar 24, 2017

For example, take three etcd nodes A, B, and C. A and B are fine, but C is shut down.
My program started a watch on etcd before the shutdown, but unfortunately it was connected to C, so it received no events for a long time (over 5 minutes) until it connected to another node, and by then it had already missed some events.
Is there any way to handle this situation?

xiaoyulei changed the title from "if one etcd node is hang, watch is still block" to "if one etcd node is hang, watch is also blocked" on Mar 24, 2017
xiang90 (Contributor) commented Mar 24, 2017

The best way to handle this situation is to set up monitoring properly and retire problematic members.

Additionally, you can create the watch stream with the RequiredLeader setting. A partitioned or faulty member will then drop client streams when RequiredLeader is set.

xiang90 closed this as completed Mar 24, 2017
xiaoyulei (Contributor, Author)

@xiang90 Where is RequiredLeader? I could not find this setting.

heyitsanthony (Contributor)

@YuleiXiao https://godoc.org/github.com/coreos/etcd/clientv3#WithRequireLeader
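
A minimal sketch of wiring this into a watch, assuming the clientv3 API linked above; the import path, endpoints, and key below are illustrative:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		// Hypothetical endpoints for the three members A, B, C.
		Endpoints:   []string{"a.example.com:2379", "b.example.com:2379", "c.example.com:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// WithRequireLeader: the serving member cancels the stream if it loses
	// (or cannot reach) the leader, instead of silently holding it open.
	ctx := clientv3.WithRequireLeader(context.Background())
	for wresp := range cli.Watch(ctx, "foo", clientv3.WithPrefix()) {
		if err := wresp.Err(); err != nil {
			// e.g. a "no leader" error; re-create the watch here, ideally with
			// clientv3.WithRev at the last seen revision so no events are missed.
			fmt.Println("watch dropped:", err)
			break
		}
		for _, ev := range wresp.Events {
			fmt.Printf("%s %q : %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}
```

With a plain context, the stream held by the partitioned member stays open indefinitely, which matches the behavior described in this issue; with WithRequireLeader the loop exits and the caller can re-watch against a healthy endpoint.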

The client should probably be smart about these cases for multi-endpoint setups instead of leaving it to the user to work around. People seem to expect the client to be smart about failing over, and it's a bit much to expect them to write code to switch endpoints (cf. #7321).

roymarantz commented Mar 24, 2017 via email

heyitsanthony (Contributor)

@roymarantz If it's added to the client then it'll automatically be available through the grpcproxy. Adding it to the gateway is less likely since the gateway is not really meant to reason about cluster state / cluster protocols; it only forwards connections.

roymarantz commented Mar 25, 2017 via email

heyitsanthony (Contributor)

@roymarantz the gateway only forwards data. The idea would be to give all the clients a gateway address like 127.0.0.1:27900, then have the gateway forward to endpoints so the clients don't need to be reconfigured for new endpoints. It doesn't understand etcd at the application level (e.g., whether an etcd member has lost the leader).
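
A minimal sketch of that pattern, assuming the client is pointed only at the local gateway address; the gateway start command and endpoint names below are illustrative:

```go
package main

import (
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	// The gateway is assumed to be running locally, e.g. started with something like:
	//
	//   etcd gateway start \
	//       --endpoints=a.example.com:2379,b.example.com:2379,c.example.com:2379 \
	//       --listen-addr=127.0.0.1:27900
	//
	// It only forwards TCP connections and knows nothing about leaders or watches,
	// so WithRequireLeader (above) is still needed for the scenario in this issue.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:27900"}, // the fixed local gateway address, not the members
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()
	// Use cli.Watch / cli.Get as usual; endpoint changes only require restarting
	// the gateway, not reconfiguring every client.
}
```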
