Unable to launch v3.1.0-alpha.1 with full client cert on mac el capitan #6565

Closed
smarterclayton opened this issue Oct 1, 2016 · 19 comments

@smarterclayton
Contributor

smarterclayton commented Oct 1, 2016

Using the latest etcd-v3.1.0-alpha.1 binaries on OS X El Capitan, I am unable to launch an etcd cluster with full client cert auth using the same cert setup (full client and server certs) that worked from 2.x through 3.0.x.

When I applied the patch for X onto v3.1.0-alpha.0, I was able to launch with the following config. The certs I am using have both DNS and IP sections for local IPs, so my initial suspicion was that something around SNI name selection or gRPC host -> tls config was changed, or that something around how etcd expects to receive client certs changed.

$ curl -L https://github.com/coreos/etcd/releases/download/v3.1.0-alpha.1/etcd-v3.1.0-alpha.1-darwin-amd64.zip -o etcd-v3.1.0-alpha.1-darwin-amd64.zip
$ unzip etcd-v3.1.0-alpha.1-darwin-amd64.zip
$ etcd-v3.1.0-alpha.1-darwin-amd64/etcd --listen-peer-urls=https://0.0.0.0:7001 --listen-client-urls=https://0.0.0.0:4001  --advertise-client-urls=https://192.168.1.103:4001 --cert-file openshift.local.config/master/etcd.server.crt --key-file openshift.local.config/master/etcd.server.key --peer-cert-file openshift.local.config/master/etcd.server.crt  --peer-key-file openshift.local.config/master/etcd.server.key --initial-advertise-peer-urls https://192.168.1.103:7001 --initial-cluster=default=https://192.168.1.103:7001 --peer-client-cert-auth --client-cert-auth
2016-10-01 14:58:14.133430 I | etcdmain: etcd Version: 3.1.0-alpha.1
2016-10-01 14:58:14.133519 I | etcdmain: Git SHA: 2469a95
2016-10-01 14:58:14.133522 I | etcdmain: Go Version: go1.7.1
2016-10-01 14:58:14.133528 I | etcdmain: Go OS/Arch: darwin/amd64
2016-10-01 14:58:14.133531 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
2016-10-01 14:58:14.133539 W | etcdmain: no data-dir provided, using default data-dir ./default.etcd
2016-10-01 14:58:14.133569 I | embed: peerTLS: cert = openshift.local.config/master/etcd.server.crt, key = openshift.local.config/master/etcd.server.key, ca = , trusted-ca = , client-cert-auth = true
2016-10-01 14:58:14.134151 I | embed: listening for peers on https://0.0.0.0:7001
2016-10-01 14:58:14.134189 I | embed: listening for client requests on 0.0.0.0:4001
2016-10-01 14:58:14.136236 I | etcdserver: name = default
2016-10-01 14:58:14.136246 I | etcdserver: data dir = default.etcd
2016-10-01 14:58:14.136250 I | etcdserver: member dir = default.etcd/member
2016-10-01 14:58:14.136253 I | etcdserver: heartbeat = 100ms
2016-10-01 14:58:14.136256 I | etcdserver: election = 1000ms
2016-10-01 14:58:14.136259 I | etcdserver: snapshot count = 10000
2016-10-01 14:58:14.136265 I | etcdserver: advertise client URLs = https://192.168.1.103:4001
2016-10-01 14:58:14.136269 I | etcdserver: initial advertise peer URLs = https://192.168.1.103:7001
2016-10-01 14:58:14.136274 I | etcdserver: initial cluster = default=https://192.168.1.103:7001
2016-10-01 14:58:14.241248 I | etcdserver: starting member 3092679e8c56a1a5 in cluster e989df3141e943e1
2016-10-01 14:58:14.241294 I | raft: 3092679e8c56a1a5 became follower at term 0
2016-10-01 14:58:14.241314 I | raft: newRaft 3092679e8c56a1a5 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2016-10-01 14:58:14.241324 I | raft: 3092679e8c56a1a5 became follower at term 1
2016-10-01 14:58:14.244579 I | etcdserver: starting server... [version: 3.1.0-alpha.1, cluster version: to_be_decided]
2016-10-01 14:58:14.244607 I | embed: ClientTLS: cert = openshift.local.config/master/etcd.server.crt, key = openshift.local.config/master/etcd.server.key, ca = , trusted-ca = , client-cert-auth = true
2016-10-01 14:58:14.244889 E | etcdserver: cannot monitor file descriptor usage (cannot get FDUsage on darwin)
2016-10-01 14:58:14.245190 I | membership: added member 3092679e8c56a1a5 [https://192.168.1.103:7001] to cluster e989df3141e943e1
2016-10-01 14:58:14.643319 I | raft: 3092679e8c56a1a5 is starting a new election at term 1
2016-10-01 14:58:14.643470 I | raft: 3092679e8c56a1a5 became candidate at term 2
2016-10-01 14:58:14.643482 I | raft: 3092679e8c56a1a5 received vote from 3092679e8c56a1a5 at term 2
2016-10-01 14:58:14.643498 I | raft: 3092679e8c56a1a5 became leader at term 2
2016-10-01 14:58:14.643509 I | raft: raft.node: 3092679e8c56a1a5 elected leader 3092679e8c56a1a5 at term 2
2016-10-01 14:58:14.643746 I | etcdserver: setting up the initial cluster version to 3.1
2016-10-01 14:58:14.648828 N | membership: set the initial cluster version to 3.1
2016-10-01 14:58:14.648873 I | etcdserver: published {Name:default ClientURLs:[https://192.168.1.103:4001]} to cluster e989df3141e943e1
2016-10-01 14:58:14.648901 I | embed: ready to serve client requests
2016-10-01 14:58:14.648933 I | api: enabled capabilities for version 3.1
2016-10-01 14:58:14.649463 I | embed: serving client requests on [::]:4001
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2016/10/01 14:58:14 Failed to dial [::]:4001: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.

If I remove --peer-client-cert-auth --client-cert-auth then the server starts (so I suspect the wrong cert is being presented). Is this a change to how etcd expects client certs + server certs to be presented?

Attached are the certs. The master.etcd-client.{key,crt} files are the ones our clients use to connect.
certs.zip

@smarterclayton
Contributor Author

smarterclayton commented Oct 1, 2016

$ openssl x509 -in openshift.local.config/master/etcd.server.crt -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 6 (0x6)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=openshift-signer@1475348241
        Validity
            Not Before: Oct  1 18:57:21 2016 GMT
            Not After : Oct  1 18:57:22 2018 GMT
        Subject: CN=127.0.0.1
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (2048 bit)
                Modulus (2048 bit):
                    00:e6:27:19:0a:4e:31:21:79:db:89:21:e8:09:42:
                    a7:84:2a:da:8e:6d:95:8b:17:4b:36:a5:65:69:4b:
                    e5:0a:80:f6:04:71:7e:c9:ee:3e:06:7f:5d:c4:ab:
                    03:a1:1f:dc:85:89:7c:f7:93:94:27:db:95:72:e6:
                    cc:4e:3a:33:e1:0b:28:d8:2d:32:a1:a3:5a:c8:2e:
                    85:75:3b:a9:f1:b3:73:fd:30:8d:c9:08:a3:df:ae:
                    4b:b9:cd:0c:fe:66:fb:64:43:b5:be:32:f5:40:a0:
                    63:98:a2:8b:22:15:12:3f:10:9e:99:c2:bb:c5:b6:
                    15:b8:40:eb:01:2e:fb:91:81:7f:29:e7:42:59:0a:
                    c7:26:02:b2:2c:85:dd:de:27:4c:54:b7:09:87:a8:
                    b5:e6:99:70:bf:d7:d5:e3:d0:55:9a:aa:4c:2e:df:
                    4d:b4:5d:49:79:5d:f6:e0:ee:dc:5a:00:54:90:ee:
                    3f:5f:95:fa:72:58:10:4f:69:49:d3:38:b1:bc:61:
                    7a:0a:55:a8:75:15:1a:65:86:35:41:fa:69:da:4b:
                    bb:6e:67:d5:c7:10:a8:7f:e9:1c:ba:33:7a:8e:7d:
                    6e:df:a8:33:e9:8f:e5:24:f2:bd:6e:69:db:4f:10:
                    87:73:1c:38:ba:0e:e7:a3:4b:32:c9:0f:ba:e8:d4:
                    30:7d
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Alternative Name:
                DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:localhost, DNS:openshift, DNS:openshift.default, DNS:openshift.default.svc, DNS:openshift.default.svc.cluster.local, DNS:127.0.0.1, DNS:172.30.0.1, DNS:192.168.1.103, IP Address:127.0.0.1, IP Address:172.30.0.1, IP Address:192.168.1.103
$ openssl x509 -in openshift.local.config/master/master.etcd-client.crt -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 5 (0x5)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=openshift-signer@1475348241
        Validity
            Not Before: Oct  1 18:57:21 2016 GMT
            Not After : Oct  1 18:57:22 2018 GMT
        Subject: CN=system:master
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (2048 bit)
                Modulus (2048 bit):
                    00:d0:70:5e:9a:f5:b8:f8:f3:3b:4b:3c:c0:f3:55:
                    e0:83:60:2b:c0:4f:b0:c5:08:81:75:4a:c2:09:a7:
                    df:2e:7c:be:e1:14:2a:c5:ff:71:fc:b1:4c:02:7b:
                    72:bf:1d:09:e9:9e:ab:25:67:c4:3f:8c:1d:a3:95:
                    57:2d:a1:71:38:5f:d3:06:5e:fa:41:2c:f8:3c:3d:
                    b8:ea:c0:ac:78:40:c7:eb:39:aa:de:df:21:4a:e7:
                    c7:27:d7:0e:b8:e6:7f:06:5d:60:aa:b9:0c:4c:04:
                    82:51:4b:7e:7b:5b:ff:93:75:0d:ad:bc:17:e4:1a:
                    57:30:11:6a:3a:ed:22:6e:e7:d7:b0:47:43:11:d2:
                    27:5a:b2:23:07:8b:1a:6b:0b:a2:9c:0a:2d:88:98:
                    21:a1:83:73:3a:5e:b3:13:ff:6f:af:32:6e:5f:0d:
                    54:cc:36:7e:49:5c:2f:81:ca:d2:33:d2:fa:1b:19:
                    51:6f:9e:71:8e:f5:0e:48:fd:ad:02:59:00:cd:e1:
                    2e:e3:86:06:1a:11:58:ac:da:17:13:45:7c:7c:c7:
                    d0:0a:70:8b:f3:65:2b:c4:27:2f:38:ef:c5:ec:1c:
                    a2:26:67:f9:5f:28:50:96:40:55:63:fb:66:1a:2c:
                    fd:69:a9:b9:dd:5b:31:a4:83:39:92:71:a9:34:27:
                    2a:95
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE

@gyuho
Contributor

gyuho commented Oct 4, 2016

I tried the certs that you attached, and I get the same errors on Linux with the etcd master branch. As far as I know, there was no significant change in gRPC or the etcd client between alpha.0 and alpha.1.

@heyitsanthony Any idea?

@timothysc

@gyuho Was that with Go 1.7?

@gyuho
Contributor

gyuho commented Oct 4, 2016

@timothysc I tried with Go 1.7.1

@timothysc

So I was trying to compare godep versions on deps, but in v3.1.0-alpha.1 and master the Godeps.json has been dropped.

What are you using for version tracking your dependencies now?

@gyuho
Contributor

gyuho commented Oct 5, 2016

We use glide now.

@timothysc

@smarterclayton Does this problem exist on Go 1.6? All major gRPC deps are the same across 3.0.x and 3.1.

@smarterclayton
Contributor Author

I was able to reproduce it against Go 1.6.2

@gyuho
Contributor

gyuho commented Oct 10, 2016

Just to clarify: with both Go 1.6.x and 1.7.x, the same certs that work in v3.1.0-alpha.0 do not work in v3.1.0-alpha.1, right? As far as I know, 7a48ca4 is the only change between the two releases that might have affected the client TLS path.

Can you share how you generate your certs?

Thanks.

@smarterclayton
Contributor Author

https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L474 and https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L493 and https://github.com/openshift/origin/blob/master/pkg/cmd/server/crypto/crypto.go#L544 are the heart of it.

I think I only checked on v3.1.0-alpha.0/1 for Go 1.7, and alpha.1 for Go 1.6.

No gRPC bump, right? I know that logic in both Go and gRPC changed how client certificates are checked (more flexibility), but as you note the only major change was 7a48ca4.

I'll try bisecting between that commit and v3.1.0-alpha.0, since I know alpha.0 with that commit worked.

@gyuho
Contributor

gyuho commented Oct 14, 2016

@smarterclayton Yes, the gRPC bump only happened after v3.1.0-alpha.1:

69ea359

Thanks!

@smarterclayton
Contributor Author

It looks like d26cfdb with 69ea359 was the last working commit, and c9e06fa with 69ea359 is broken.

@gyuho
Contributor

gyuho commented Oct 20, 2016

@smarterclayton That's strange... c9e06fa doesn't seem to affect any cert-related code path.

So is the gRPC bump related now? I thought v3.1.0-alpha.0 was the last working release, and v3.1.0-alpha.1 does not work even though it does not include that gRPC bump either.

@smarterclayton
Contributor Author

I see the comment saying that the host name passed to dial in gRPC must match the client cert. Since this issue is about client certs failing to connect, it does seem likely that this change altered the name the gRPC client uses when dialing the server.

@gyuho
Contributor

gyuho commented Oct 20, 2016

I will give the certs another try tomorrow. Thanks for investigating this!

@xiang90
Contributor

xiang90 commented Oct 20, 2016

@smarterclayton So this is a gRPC-related issue?

@gyuho
Contributor

gyuho commented Oct 20, 2016

@smarterclayton It works for me on Linux (CoreOS):

/etcd --listen-peer-urls=https://0.0.0.0:7001 --listen-client-urls=https://0.0.0.0:4001 --advertise-client-urls=https://10.240.0.31:4001 --cert-file /home/gyuho/etcd.server.crt --key-file /home/gyuho/etcd.server.key --peer-cert-file /home/gyuho/etcd.server.crt --peer-key-file /home/gyuho/etcd.server.key --initial-advertise-peer-urls https://10.240.0.31:7001 --initial-cluster=default=https://10.240.0.31:7001 --peer-client-cert-auth --client-cert-auth

Could it be an OS X issue? Have you tried the same certs on Linux?

I tested with v3.1.0-rc.0.

@smarterclayton
Contributor Author

Good news: it looks like it is fixed with v3.1.0-rc.0 on OS X. I am doing some more testing to be absolutely sure.

@smarterclayton
Contributor Author

This was fixed in the later v3.1.0-rc.0 release - thanks!

4 participants