if one etcd node is hang, watch is also blocked #7593

Closed
xiaoyulei opened this issue Mar 24, 2017 · 7 comments

xiaoyulei (Contributor) commented Mar 24, 2017

For example, take three etcd nodes A, B, and C. A and B are fine, but C is shut down.
My program started a watch on etcd before the shutdown, but unfortunately it was connected to C, so it received no events for a long time (over 5 minutes) until it connected to another node, and by then it had already missed some events.
Is there any way to handle this situation?

xiaoyulei changed the title from "if one etcd node is hang, watch is still block" to "if one etcd node is hang, watch is also blocked" on Mar 24, 2017
xiang90 (Contributor) commented Mar 24, 2017

The best way to handle this situation is to set up monitoring properly and retire problematic members.

Additionally, you can create the watch stream with the RequiredLeader setting. A partitioned or faulty member will then drop client streams when RequiredLeader is set.

xiang90 closed this as completed Mar 24, 2017
xiaoyulei (Contributor, Author)

@xiang90 Where is RequiredLeader? I could not find this setting.

heyitsanthony (Contributor)

@YuleiXiao https://godoc.org/github.com/coreos/etcd/clientv3#WithRequireLeader
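
A minimal sketch of wiring this into a watch, assuming the clientv3 API linked above; the import path, endpoints, and key below are illustrative:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		// Hypothetical endpoints for the three members A, B, C.
		Endpoints:   []string{"a.example.com:2379", "b.example.com:2379", "c.example.com:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// WithRequireLeader: the serving member cancels the stream if it loses
	// (or cannot reach) the leader, instead of silently holding it open.
	ctx := clientv3.WithRequireLeader(context.Background())
	for wresp := range cli.Watch(ctx, "foo", clientv3.WithPrefix()) {
		if err := wresp.Err(); err != nil {
			// e.g. a "no leader" error; re-create the watch here, ideally with
			// clientv3.WithRev at the last seen revision so no events are missed.
			fmt.Println("watch dropped:", err)
			break
		}
		for _, ev := range wresp.Events {
			fmt.Printf("%s %q : %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}
```

With a plain context, the stream held by the partitioned member stays open indefinitely, which matches the behavior described in this issue; with WithRequireLeader the loop exits and the caller can re-watch against a healthy endpoint.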

The client should probably be smart about these cases for multi-endpoint setups instead of leaving it to the user to work around. People seem to expect the client to be smart about failing over, and it's a bit much to expect them to write code to switch endpoints (cf. #7321).

roymarantz commented Mar 24, 2017 via email

heyitsanthony (Contributor)

@roymarantz If it's added to the client then it'll automatically be available through the grpcproxy. Adding it to the gateway is less likely since the gateway is not really meant to reason about cluster state / cluster protocols; it only forwards connections.

roymarantz commented Mar 25, 2017 via email

heyitsanthony (Contributor)

@roymarantz the gateway only forwards data. The idea would be to give all the clients a gateway address like 127.0.0.1:27900, then have the gateway forward to endpoints so the clients don't need to be reconfigured for new endpoints. It doesn't understand etcd at the application level (e.g., whether an etcd member has lost the leader).
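
A minimal sketch of that pattern, assuming the client is pointed only at the local gateway address; the gateway start command and endpoint names below are illustrative:

```go
package main

import (
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	// The gateway is assumed to be running locally, e.g. started with something like:
	//
	//   etcd gateway start \
	//       --endpoints=a.example.com:2379,b.example.com:2379,c.example.com:2379 \
	//       --listen-addr=127.0.0.1:27900
	//
	// It only forwards TCP connections and knows nothing about leaders or watches,
	// so WithRequireLeader (above) is still needed for the scenario in this issue.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:27900"}, // the fixed local gateway address, not the members
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()
	// Use cli.Watch / cli.Get as usual; endpoint changes only require restarting
	// the gateway, not reconfiguring every client.
}
```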
