This repository has been archived by the owner on Mar 28, 2020. It is now read-only.

is static tls configuration currently supported? #1962

Closed
tkellen opened this issue May 21, 2018 · 7 comments

tkellen commented May 21, 2018

I am running etcd-operator v0.9.2 in "cluster-wide" mode.

Manifest (works perfectly if the TLS section is commented out):

apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: etcd
  namespace: secrets
  annotations:
    etcd.database.coreos.com/scope: clusterwide
spec:
  size: 3
  version: "3.3.5"
  repository: quay.io/coreos/etcd
  TLS:
    static:
      member:
        peerSecret: etcd-peer-tls
        serverSecret: etcd-server-tls
      operatorSecret: etcd-etcd-client-tls

PKI:
tls.tar.gz
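
The attached certificates are not reproduced here. Judging only from the DNSNames reported in the logs further down, they appear to carry wildcard SANs built from the cluster name and namespace. A minimal sketch of that naming pattern follows; `wildcard_sans` is a hypothetical helper, and `cluster.local` as the cluster domain is an assumption taken from the log output, not something the operator is known to enforce.

```python
def wildcard_sans(cluster, namespace, cluster_domain="cluster.local"):
    """Build the wildcard DNS SANs that cover every member pod of a cluster.

    Member pods in the logs are addressed as <pod>.<cluster>.<namespace>.svc,
    so a single wildcard label in front of that suffix covers all of them.
    """
    base = f"{cluster}.{namespace}.svc"
    return [f"*.{base}", f"*.{base}.{cluster_domain}"]


# Values from the manifest above: metadata.name and metadata.namespace.
print(wildcard_sans("etcd", "secrets"))
```

For this manifest the result should be the same two names that etcd lists as DNSNames in the rejected-connection log line.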

Resulting state:

➜ k get pods -n secrets
NAME                    READY     STATUS             RESTARTS   AGE
etcd-zl7vxqvgjl         0/1       Completed          0          3m
etcd-zltrqxjgdr         0/1       Error              0          2m

Logs for etcd-zltrqxjgdr:

2018-05-21 00:15:38.078179 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT_2379_TCP=tcp://10.10.0.178:2379
2018-05-21 00:15:38.078225 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT_2379_TCP_PROTO=tcp
2018-05-21 00:15:38.078231 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_SERVICE_PORT=2379
2018-05-21 00:15:38.078240 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_SERVICE_PORT_CLIENT=2379
2018-05-21 00:15:38.078244 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT=tcp://10.10.0.178:2379
2018-05-21 00:15:38.078247 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT_2379_TCP_PORT=2379
2018-05-21 00:15:38.078251 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT_2379_TCP_ADDR=10.10.0.178
2018-05-21 00:15:38.078256 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_SERVICE_HOST=10.10.0.178
2018-05-21 00:15:38.078280 I | etcdmain: etcd Version: 3.3.5
2018-05-21 00:15:38.078285 I | etcdmain: Git SHA: 70c872620
2018-05-21 00:15:38.078289 I | etcdmain: Go Version: go1.9.6
2018-05-21 00:15:38.078292 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-21 00:15:38.078324 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2018-05-21 00:15:38.078359 I | embed: peerTLS: cert = /etc/etcdtls/member/peer-tls/peer.crt, key = /etc/etcdtls/member/peer-tls/peer.key, ca = , trusted-ca = /etc/etcdtls/member/peer-tls/peer-ca.crt, client-cert-auth = true, crl-file = 
2018-05-21 00:15:38.079901 I | embed: listening for peers on https://0.0.0.0:2380
2018-05-21 00:15:38.079942 I | embed: listening for client requests on 0.0.0.0:2379
2018-05-21 00:15:38.103522 W | etcdserver: could not get cluster response from https://etcd-zl7vxqvgjl.etcd.secrets.svc:2380: Get https://etcd-zl7vxqvgjl.etcd.secrets.svc:2380/members: EOF
2018-05-21 00:15:38.114090 I | embed: rejected connection from "10.1.1.23:55110" (error "tls: \"10.1.1.23\" does not match any of DNSNames [\"*.etcd.secrets.svc\" \"*.etcd.secrets.svc.cluster.local\"] (lookup 23.1.1.10.in-addr.arpa. on 10.10.0.2:53: dial udp 10.10.0.2:53: operation was canceled)", ServerName "etcd-zltrqxjgdr.etcd.secrets.svc", IPAddresses [], DNSNames ["*.etcd.secrets.svc" "*.etcd.secrets.svc.cluster.local"])
2018-05-21 00:15:38.114113 C | etcdmain: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given urls

Logs for etcd-zl7vxqvgjl:

2018-05-21 00:15:05.180636 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT_2379_TCP_PROTO=tcp
2018-05-21 00:15:05.180684 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT_2379_TCP_ADDR=10.10.0.178
2018-05-21 00:15:05.180689 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT_2379_TCP=tcp://10.10.0.178:2379
2018-05-21 00:15:05.180693 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT_2379_TCP_PORT=2379
2018-05-21 00:15:05.180697 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_SERVICE_PORT=2379
2018-05-21 00:15:05.180701 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_SERVICE_PORT_CLIENT=2379
2018-05-21 00:15:05.180707 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_PORT=tcp://10.10.0.178:2379
2018-05-21 00:15:05.180713 W | pkg/flags: unrecognized environment variable ETCD_CLIENT_SERVICE_HOST=10.10.0.178
2018-05-21 00:15:05.180735 I | etcdmain: etcd Version: 3.3.5
2018-05-21 00:15:05.180739 I | etcdmain: Git SHA: 70c872620
2018-05-21 00:15:05.180743 I | etcdmain: Go Version: go1.9.6
2018-05-21 00:15:05.180746 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-21 00:15:05.180750 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2018-05-21 00:15:05.180780 I | embed: peerTLS: cert = /etc/etcdtls/member/peer-tls/peer.crt, key = /etc/etcdtls/member/peer-tls/peer.key, ca = , trusted-ca = /etc/etcdtls/member/peer-tls/peer-ca.crt, client-cert-auth = true, crl-file = 
2018-05-21 00:15:05.181408 I | embed: listening for peers on https://0.0.0.0:2380
2018-05-21 00:15:05.181443 I | embed: listening for client requests on 0.0.0.0:2379
2018-05-21 00:15:05.189312 I | pkg/netutil: resolving etcd-zl7vxqvgjl.etcd.secrets.svc:2380 to 10.1.1.23:2380
2018-05-21 00:15:05.192716 I | pkg/netutil: resolving etcd-zl7vxqvgjl.etcd.secrets.svc:2380 to 10.1.1.23:2380
2018-05-21 00:15:05.192775 I | etcdserver: name = etcd-zl7vxqvgjl
2018-05-21 00:15:05.192802 I | etcdserver: data dir = /var/etcd/data
2018-05-21 00:15:05.192807 I | etcdserver: member dir = /var/etcd/data/member
2018-05-21 00:15:05.192811 I | etcdserver: heartbeat = 100ms
2018-05-21 00:15:05.192815 I | etcdserver: election = 1000ms
2018-05-21 00:15:05.192818 I | etcdserver: snapshot count = 100000
2018-05-21 00:15:05.192827 I | etcdserver: advertise client URLs = https://etcd-zl7vxqvgjl.etcd.secrets.svc:2379
2018-05-21 00:15:05.192832 I | etcdserver: initial advertise peer URLs = https://etcd-zl7vxqvgjl.etcd.secrets.svc:2380
2018-05-21 00:15:05.192839 I | etcdserver: initial cluster = etcd-zl7vxqvgjl=https://etcd-zl7vxqvgjl.etcd.secrets.svc:2380
2018-05-21 00:15:05.198598 I | etcdserver: starting member 88f13611632a87eb in cluster e73c852ab2c57093
2018-05-21 00:15:05.198627 I | raft: 88f13611632a87eb became follower at term 0
2018-05-21 00:15:05.198636 I | raft: newRaft 88f13611632a87eb [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2018-05-21 00:15:05.198646 I | raft: 88f13611632a87eb became follower at term 1
2018-05-21 00:15:05.206801 W | auth: simple token is not cryptographically signed
2018-05-21 00:15:05.211109 I | etcdserver: starting server... [version: 3.3.5, cluster version: to_be_decided]
2018-05-21 00:15:05.213273 I | embed: ClientTLS: cert = /etc/etcdtls/member/server-tls/server.crt, key = /etc/etcdtls/member/server-tls/server.key, ca = , trusted-ca = /etc/etcdtls/member/server-tls/server-ca.crt, client-cert-auth = true, crl-file = 
2018-05-21 00:15:05.213421 I | etcdserver: 88f13611632a87eb as single-node; fast-forwarding 9 ticks (election ticks 10)
2018-05-21 00:15:05.213881 I | etcdserver/membership: added member 88f13611632a87eb [https://etcd-zl7vxqvgjl.etcd.secrets.svc:2380] to cluster e73c852ab2c57093
2018-05-21 00:15:05.998953 I | raft: 88f13611632a87eb is starting a new election at term 1
2018-05-21 00:15:05.998990 I | raft: 88f13611632a87eb became candidate at term 2
2018-05-21 00:15:05.999013 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 2
2018-05-21 00:15:05.999026 I | raft: 88f13611632a87eb became leader at term 2
2018-05-21 00:15:05.999034 I | raft: raft.node: 88f13611632a87eb elected leader 88f13611632a87eb at term 2
2018-05-21 00:15:05.999370 I | etcdserver: setting up the initial cluster version to 3.3
2018-05-21 00:15:06.000400 N | etcdserver/membership: set the initial cluster version to 3.3
2018-05-21 00:15:06.000441 I | etcdserver/api: enabled capabilities for version 3.3
2018-05-21 00:15:06.000466 I | etcdserver: published {Name:etcd-zl7vxqvgjl ClientURLs:[https://etcd-zl7vxqvgjl.etcd.secrets.svc:2379]} to cluster e73c852ab2c57093
2018-05-21 00:15:06.000507 I | embed: ready to serve client requests
2018-05-21 00:15:06.002248 I | embed: serving client requests on [::]:2379
2018-05-21 00:15:06.017930 I | etcdserver/membership: added member 9d2cd063aaf158a8 [https://etcd-zltrqxjgdr.etcd.secrets.svc:2380] to cluster e73c852ab2c57093
2018-05-21 00:15:06.017964 I | rafthttp: starting peer 9d2cd063aaf158a8...
2018-05-21 00:15:06.018004 I | rafthttp: started HTTP pipelining with peer 9d2cd063aaf158a8
2018-05-21 00:15:06.020176 I | rafthttp: started streaming with peer 9d2cd063aaf158a8 (writer)
2018-05-21 00:15:06.020400 I | rafthttp: started streaming with peer 9d2cd063aaf158a8 (writer)
2018-05-21 00:15:06.021183 I | rafthttp: started peer 9d2cd063aaf158a8
2018-05-21 00:15:06.021272 I | rafthttp: added peer 9d2cd063aaf158a8
2018-05-21 00:15:06.021393 I | rafthttp: started streaming with peer 9d2cd063aaf158a8 (stream MsgApp v2 reader)
2018-05-21 00:15:06.021648 I | rafthttp: started streaming with peer 9d2cd063aaf158a8 (stream Message reader)
2018-05-21 00:15:07.998912 W | raft: 88f13611632a87eb stepped down to follower since quorum is not active
2018-05-21 00:15:07.998943 I | raft: 88f13611632a87eb became follower at term 2
2018-05-21 00:15:07.998950 I | raft: raft.node: 88f13611632a87eb lost leader 88f13611632a87eb at term 2
2018-05-21 00:15:09.598875 I | raft: 88f13611632a87eb is starting a new election at term 2
2018-05-21 00:15:09.598905 I | raft: 88f13611632a87eb became candidate at term 3
2018-05-21 00:15:09.598913 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 3
2018-05-21 00:15:09.598922 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 3
2018-05-21 00:15:11.021603 W | rafthttp: health check for peer 9d2cd063aaf158a8 could not connect: dial tcp: lookup etcd-zltrqxjgdr.etcd.secrets.svc on 10.10.0.2:53: no such host
2018-05-21 00:15:11.400073 I | raft: 88f13611632a87eb is starting a new election at term 3
2018-05-21 00:15:11.400106 I | raft: 88f13611632a87eb became candidate at term 4
2018-05-21 00:15:11.400117 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 4
2018-05-21 00:15:11.400156 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 4
2018-05-21 00:15:12.998945 I | raft: 88f13611632a87eb is starting a new election at term 4
2018-05-21 00:15:12.998975 I | raft: 88f13611632a87eb became candidate at term 5
2018-05-21 00:15:12.998984 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 5
2018-05-21 00:15:12.999154 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 5
2018-05-21 00:15:14.698945 I | raft: 88f13611632a87eb is starting a new election at term 5
2018-05-21 00:15:14.699025 I | raft: 88f13611632a87eb became candidate at term 6
2018-05-21 00:15:14.699037 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 6
2018-05-21 00:15:14.699081 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 6
2018-05-21 00:15:14.762599 W | etcdserver: read-only range request "key:\"foo\" " took too long (4.989795589s) to execute
2018-05-21 00:15:16.021830 W | rafthttp: health check for peer 9d2cd063aaf158a8 could not connect: dial tcp: lookup etcd-zltrqxjgdr.etcd.secrets.svc on 10.10.0.2:53: no such host
2018-05-21 00:15:16.298926 I | raft: 88f13611632a87eb is starting a new election at term 6
2018-05-21 00:15:16.298957 I | raft: 88f13611632a87eb became candidate at term 7
2018-05-21 00:15:16.298966 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 7
2018-05-21 00:15:16.298976 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 7
2018-05-21 00:15:16.772827 W | etcdserver: timed out waiting for read index response
2018-05-21 00:15:17.398911 I | raft: 88f13611632a87eb is starting a new election at term 7
2018-05-21 00:15:17.398938 I | raft: 88f13611632a87eb became candidate at term 8
2018-05-21 00:15:17.398951 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 8
2018-05-21 00:15:17.398960 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 8
2018-05-21 00:15:18.798905 I | raft: 88f13611632a87eb is starting a new election at term 8
2018-05-21 00:15:18.798940 I | raft: 88f13611632a87eb became candidate at term 9
2018-05-21 00:15:18.798950 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 9
2018-05-21 00:15:18.799065 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 9
2018-05-21 00:15:19.824141 W | etcdserver: read-only range request "key:\"foo\" " took too long (4.989969325s) to execute
2018-05-21 00:15:19.826778 I | embed: rejected connection from "[::1]:55438" (error "EOF", ServerName "")
2018-05-21 00:15:20.298907 I | raft: 88f13611632a87eb is starting a new election at term 9
2018-05-21 00:15:20.298949 I | raft: 88f13611632a87eb became candidate at term 10
2018-05-21 00:15:20.298972 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 10
2018-05-21 00:15:20.298981 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 10
2018-05-21 00:15:21.022049 W | rafthttp: health check for peer 9d2cd063aaf158a8 could not connect: dial tcp: lookup etcd-zltrqxjgdr.etcd.secrets.svc on 10.10.0.2:53: no such host
2018-05-21 00:15:21.698962 I | raft: 88f13611632a87eb is starting a new election at term 10
2018-05-21 00:15:21.699014 I | raft: 88f13611632a87eb became candidate at term 11
2018-05-21 00:15:21.699030 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 11
2018-05-21 00:15:21.699049 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 11
2018-05-21 00:15:23.198958 I | raft: 88f13611632a87eb is starting a new election at term 11
2018-05-21 00:15:23.198992 I | raft: 88f13611632a87eb became candidate at term 12
2018-05-21 00:15:23.199002 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 12
2018-05-21 00:15:23.199011 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 12
2018-05-21 00:15:23.773055 W | etcdserver: timed out waiting for read index response
2018-05-21 00:15:24.298977 I | raft: 88f13611632a87eb is starting a new election at term 12
2018-05-21 00:15:24.299057 I | raft: 88f13611632a87eb became candidate at term 13
2018-05-21 00:15:24.299093 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 13
2018-05-21 00:15:24.299117 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 13
2018-05-21 00:15:24.896286 W | etcdserver: read-only range request "key:\"foo\" " took too long (4.989408333s) to execute
2018-05-21 00:15:24.897143 I | embed: rejected connection from "[::1]:55460" (error "EOF", ServerName "")
2018-05-21 00:15:25.598977 I | raft: 88f13611632a87eb is starting a new election at term 13
2018-05-21 00:15:25.599015 I | raft: 88f13611632a87eb became candidate at term 14
2018-05-21 00:15:25.599026 I | raft: 88f13611632a87eb received MsgVoteResp from 88f13611632a87eb at term 14
2018-05-21 00:15:25.599035 I | raft: 88f13611632a87eb [logterm: 2, index: 5] sent MsgVote request to 9d2cd063aaf158a8 at term 14
2018-05-21 00:15:26.022265 W | rafthttp: health check for peer 9d2cd063aaf158a8 could not connect: dial tcp: lookup etcd-zltrqxjgdr.etcd.secrets.svc on 10.10.0.2:53: no such host
2018-05-21 00:15:26.798927 I | raft: 88f13611632a87eb is starting a new election at term 14

tkellen commented May 21, 2018

I assume it is due to this TLS error:

embed: rejected connection from "10.1.1.23:55110" (error "tls: \"10.1.1.23\" does not match any of DNSNames [\"*.etcd.secrets.svc\" \"*.etcd.secrets.svc.cluster.local\"] (lookup 23.1.1.10.in-addr.arpa. on 10.10.0.2:53: dial udp 10.10.0.2:53: operation was canceled)", ServerName "etcd-zltrqxjgdr.etcd.secrets.svc", IPAddresses [], DNSNames ["*.etcd.secrets.svc" "*.etcd.secrets.svc.cluster.local"])

#1384 seems related.
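
To unpack that error message: the connection arrived from the bare IP 10.1.1.23, the certificate lists only wildcard DNS names, and the message shows a PTR lookup for 23.1.1.10.in-addr.arpa. being canceled. The Python sketch below is not etcd's code. `san_matches` is a hypothetical helper that approximates X.509 wildcard matching, where `*` stands for exactly one leftmost label. It illustrates why the IP itself can never match and why a hostname recovered through reverse DNS would.

```python
import ipaddress


def san_matches(hostname, pattern):
    """Approximate DNS SAN matching: '*' covers exactly one leftmost label."""
    host_labels = hostname.rstrip(".").lower().split(".")
    pattern_labels = pattern.rstrip(".").lower().split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        # Only the leftmost label may be a wildcard; the rest must be equal.
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


# Values copied from the log line above.
dns_names = ["*.etcd.secrets.svc", "*.etcd.secrets.svc.cluster.local"]
remote_ip = "10.1.1.23"

# Name of the PTR record queried for the remote IP.
print(ipaddress.ip_address(remote_ip).reverse_pointer)

# The bare IP does not match any DNS SAN ...
print(any(san_matches(remote_ip, p) for p in dns_names))

# ... but the member hostname does, if reverse DNS can supply it.
print(any(san_matches("etcd-zl7vxqvgjl.etcd.secrets.svc", p) for p in dns_names))
```

Under these assumptions the check can only succeed if the IP can be mapped back to a hostname, which points at reverse DNS rather than at the certificates themselves.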

tkellen commented May 21, 2018

Confirmed.

etcd 3.1.15 works correctly; anything newer fails as above.

tkellen commented May 21, 2018

etcd-io/etcd#8268 is related as well.

tkellen commented May 21, 2018

Also relevant:
https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md

It looks like there has been a lot of churn around how to manage this from 3.2 on. It is not yet clear to me how to satisfy the constraints defined in that document.

tkellen commented May 21, 2018

etcd-io/etcd#8534 also documents this.

At this point I believe a simple StatefulSet will be easier to manage. I want to love the idea of operators, but as of today they seem like an abstraction too far.

If anyone else finds themselves here, you might appreciate:
https://github.com/sgotti/k8s-persistent-etcd

raoofm commented May 21, 2018

#1323 is what is missing. Once that is fixed, you can probably use it.

tkellen commented May 21, 2018

Following up to say that this persisted while using etcd directly, without the operator. The underlying issue was that my cluster's DNS server (CoreDNS) wasn't configured to handle reverse DNS lookups for the pod CIDR.

etcd-io/etcd#8803 is also relevant to this issue.
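
For anyone who wants to check for the same condition, here is a minimal Python sketch for testing from inside a pod whether a peer's IP resolves back to a name under the expected suffixes. `reverse_names` and `ptr_covers_peer` are hypothetical helpers, and the suffix check is a simplification rather than the matching etcd itself performs. The resolver is injectable so the logic can be exercised without a live DNS server.

```python
import socket


def reverse_names(ip, resolver=socket.gethostbyaddr):
    """Return the hostnames a reverse (PTR) lookup yields for ip, or [] on failure."""
    try:
        primary, aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror both derive from OSError.
        return []
    return [primary, *aliases]


def ptr_covers_peer(ip, suffixes, resolver=socket.gethostbyaddr):
    """True if any name resolved for ip ends with one of the expected suffixes."""
    names = [name.rstrip(".").lower() for name in reverse_names(ip, resolver)]
    return any(name.endswith(suffix) for name in names for suffix in suffixes)
```

Calling `ptr_covers_peer("10.1.1.23", [".etcd.secrets.svc", ".etcd.secrets.svc.cluster.local"])` with the values from the logs above would be expected to return False on a cluster whose DNS server does not answer PTR queries for the pod CIDR, and True once reverse lookups are served.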
