This repository has been archived by the owner on Mar 28, 2020. It is now read-only.

Client Service "tolerate-unready-endpoints" annotation causes connection issues #2030

Closed
gjcarneiro opened this issue Dec 18, 2018 · 6 comments · Fixed by #2063

Comments

@gjcarneiro

I don't know whether this is just a question, a bug in etcd-operator, or a bug in Kubernetes itself.

In any case, I was just playing with a 3-node etcd cluster, looking at the impact on clients when I delete one of the cluster members.

I run a simple command like etcdctl get --prefix / in a loop while I delete one pod and wait for a replacement pod to appear.
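A rough sketch of such a probe loop (the ETCDCTL_CMD/N knobs and the timing output are additions for illustration; the real invocation is just etcdctl get --prefix / against the client Service):

```shell
# Run the read repeatedly and time each attempt, so slow or failed
# reads stand out while a member pod is being deleted and replaced.
probe=${ETCDCTL_CMD:-"etcdctl get --prefix /"}   # override for testing
n=${N:-30}
for i in $(seq 1 "$n"); do
  start=$(date +%s%N)
  if $probe >/dev/null 2>&1; then status=ok; else status=error; fi
  end=$(date +%s%N)
  echo "probe $i: $status $(( (end - start) / 1000000 ))ms"
done
```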

The problem happens when the new pod appears: it is not yet Ready, but the Service endpoints already include it:

$ kubectl get ep example-etcd-cluster-client
[...]
example-etcd-cluster-client   10.125.27.21:2379,10.125.6.10:2379,10.126.8.18:2379   41m

But at the same time the pod is still initialising:

example-etcd-cluster-kbn69vn2xn   0/1     Init:0/1   0          6s    10.125.6.10    hex-48b-pm.k2.gambit   <none>           <none>

As a result, my etcdctl randomly either gets delayed by a second or so, gets an error, or works fast.
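This apparently random behaviour is what you would expect if the client spreads requests across all published endpoints, including the unready one. A toy sketch (plain bash; the endpoint addresses are taken from the output above, and the round-robin client is an assumption for illustration, not etcd's actual balancer):

```shell
# Toy model: a client round-robins over the Service's published
# endpoints; every request routed to the unready pod fails.
endpoints=(10.125.27.21:2379 10.125.6.10:2379 10.126.8.18:2379)
unready=10.125.6.10:2379            # the pod still in Init:0/1
ok=0; failed=0
for i in $(seq 0 8); do             # nine requests, round-robin
  ep=${endpoints[$((i % 3))]}
  if [ "$ep" = "$unready" ]; then
    failed=$((failed + 1))          # connection refused / timeout
  else
    ok=$((ok + 1))
  fi
done
echo "ok=$ok failed=$failed"        # ok=6 failed=3: every third request hits the unready pod
```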

As soon as the new pod finishes initialising, the service is restored.

To be honest, to me this seems like a Kubernetes bug -- it shouldn't add a pod to a Service's endpoints list before the pod is Ready -- but I could be missing something. Otherwise, how can you have zero downtime with etcd, when routine node maintenance requires me to evict pods from nodes once in a while?

Any thoughts?

@gjcarneiro
Author

Argh... after much googling and even reading the Kubernetes source code, I found the problem:

07:42:09 {master} ~/Downloads/etcd-operator$ kubectl --context=dev get svc  example-etcd-cluster-client -o yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"

Why is the tolerate-unready-endpoints annotation being used, and how can I get rid of it?

@hexfusion
Member

Why is the tolerate-unready-endpoints annotation being used, and how can I get rid of it?

Looks like that annotation is deprecated in favor of Service.spec.publishNotReadyAddresses, but we need to understand it better too. Can you keep digging on this for us?

kubernetes/kubernetes#63742

@gjcarneiro
Author

This is something you set here

This is the commit: 5635478

I think this is a very bad idea. Clients will try to connect to an etcd node that is not ready, causing high latency or even connection failures. And unfortunately I don't even see an option to disable this.

@gjcarneiro
Author

Documentation for publishNotReadyAddresses:

publishNotReadyAddresses, when set to true, indicates that DNS implementations must publish the notReadyAddresses of subsets for the Endpoints associated with the Service. The default value is false. The primary use case for setting this field is to use a StatefulSet’s Headless Service to propagate SRV records for its Pods without respect to their readiness for purpose of peer discovery. This field will replace the service.alpha.kubernetes.io/tolerate-unready-endpoints when that annotation is deprecated and all clients have been converted to use this field.

Well, this certainly does not fit the use case here: this is neither a StatefulSet, nor is the Service headless.

@gjcarneiro gjcarneiro changed the title When a pod is still initialising, the -client service endpoints already include it Client Service "tolerate-unready-endpoints" annotation causes connection issues Dec 19, 2018
@gjcarneiro
Author

So, #1257 says "Set TolerateUnreadyEndpoints for service of peer URLs".

I have no problem with the peer service. Makes sense to have TolerateUnreadyEndpoints for it.

But the same patch also added TolerateUnreadyEndpoints to the client service. I think this one was accidental and is harmful. cc @hongchaodeng.

@alaypatel07
Collaborator

@gjcarneiro I agree: the peer URLs are served by a headless service, and the annotation is used there in the same way it would be in a StatefulSet (as mentioned in the documentation of publishNotReadyAddresses).

Each pod checks its own DNS entry before initializing, in this init container:

Name: "check-dns",

If only the headless service (the one serving the peer URLs) is annotated with publishNotReadyAddresses:

  1. A DNS entry will be populated because of the headless service, and check-dns will resolve correctly.
  2. The [cluster-name]-client endpoint will not be populated with an unready pod IP:port-number, resolving this issue.

Good catch :)
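For illustration, a minimal sketch of the split described above (service names and ports assumed to follow etcd-operator's conventions; this is not the operator's actual manifest):

```yaml
# Headless peer Service: keep publishing unready addresses so a new
# member's DNS name resolves during its check-dns init step.
apiVersion: v1
kind: Service
metadata:
  name: example-etcd-cluster          # peer service (name assumed)
spec:
  clusterIP: None                     # headless
  publishNotReadyAddresses: true
  ports:
  - name: peer
    port: 2380
---
# Client Service: no publishNotReadyAddresses, so only Ready pods
# receive client traffic.
apiVersion: v1
kind: Service
metadata:
  name: example-etcd-cluster-client
spec:
  ports:
  - name: client
    port: 2379
```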

hexfusion pushed a commit that referenced this issue Apr 17, 2019
Previously, unready etcd nodes were already receiving client connections
although they were still in the initialization phase and unable to accept
any traffic. This caused connection failures or high latency.

Fixes #2030

Signed-off-by: Christian Köhn <christian.koehn@figo.io>
kapouille pushed a commit to Polystream/etcd-operator that referenced this issue May 16, 2019
Previously, unready etcd nodes were already receiving client connections
although they were still in the initialization phase and unable to accept
any traffic. This caused connection failures or high latency.

Fixes coreos#2030

Signed-off-by: Christian Köhn <christian.koehn@figo.io>

3 participants