
Network-partition aware health service #8673

Closed
gyuho opened this issue Oct 10, 2017 · 7 comments

gyuho commented Oct 10, 2017

The clientv3 health balancer should be able to reason about network partitions using keepalive HTTP/2 pings.

#8660 makes the balancer aware of network partitions on time-out errors, but it only handles the case where the client specifies a context time-out.

We can do better:

  1. Client sends linearizable requests with context time-out x
    • or with no time-out, via context.Background
  2. Client configures a keepalive HTTP/2 ping time-out y, where y < x
  3. Balancer pins endpoint A in a 3-node cluster
  4. Member A becomes isolated
  5. A linearizable request to A blocks until time-out x
    • or blocks forever if issued with context.Background

When y < x, the keepalive pings should detect that member A cannot reach the other members, and trigger an endpoint switch before time-out x elapses.

@xiang90 xiang90 added this to the v3.4.0 milestone Oct 10, 2017
@gyuho gyuho changed the title clientv3 network-partiton-aware keepalive HTTP/2 ping clientv3 network-partition-aware keepalive HTTP/2 ping Oct 17, 2017

gyuho commented Oct 17, 2017

/cc @jpbetz


jpbetz commented Oct 17, 2017

/cc @wojtek-t

@gyuho gyuho modified the milestones: etcd-v3.4, etcd-v3.5 Mar 3, 2018
@gyuho gyuho changed the title clientv3 network-partition-aware keepalive HTTP/2 ping Network-partition aware health service May 2, 2018

gyuho commented May 2, 2018

We still plan to do this (maybe in v3.5).

Merging with #8022, since both server and client need to reason about network partitions. The current HTTP/2 ping client and server do not reason about network partitions.

There are cases where the client needs to know that the server is responding to requests. If the server is not responding (but possibly still accepting connections), the balancer should try another endpoint. For instance, a watch may be disconnected, but the connection will stay open and hang instead of connecting to another member.

This heartbeat mechanism is somewhat weaker in that it doesn't need a leader, but there still needs to be some kind of polling to avoid waiting indefinitely on a disconnected socket. The client implementations for heartbeat and lost-leader reconnect can both poll, or the lost-leader path can depend on the heartbeat to cover the wait case.

ref. #7321


gyuho commented Jun 15, 2018

Another use case kubernetes/kubernetes#59848 (comment)

This can happen on a singleton master using an HA etcd cluster:

  1. apiserver is talking to member 0 in the cluster
  2. apiserver crashes, and the watch cache performs its initial list at RV=10
  3. kubelet deletes a pod at RV=11
  4. a new pod is created at RV=12 on a different node
  5. kubelet crashes
  6. apiserver becomes partitioned from the etcd majority, but is able to reestablish a watch from the partitioned member at RV=10
  7. kubelet contacts the apiserver, sees RV=10, and launches the pod


gyuho commented Jun 20, 2018

And just to clarify: currently, the require-leader metadata must be passed to prevent this:

wch := cli.Watch(clientv3.WithRequireLeader(context.Background()), "foo")
// will be closed when the member loses its leader
wch = cli.Watch(context.Background(), "foo")
// blocks forever, even if the member loses its leader


stale bot commented Apr 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 7, 2020
@stale stale bot closed this as completed Apr 28, 2020
@spzala spzala reopened this Jun 8, 2020
@stale stale bot removed the stale label Jun 8, 2020

stale bot commented Sep 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 6, 2020
@stale stale bot closed this as completed Sep 27, 2020