Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Peer TLS SAN check couldn't reason about non-FQDN endpoints #8797

Closed
hongchaodeng opened this issue Oct 31, 2017 · 9 comments
Closed

Peer TLS SAN check couldn't reason about non-FQDN endpoints #8797

hongchaodeng opened this issue Oct 31, 2017 · 9 comments

Comments

@hongchaodeng
Copy link
Contributor

BUG REPORT

etcd version: v3.2.9

Problem

As reported in #8534

Root cause analysis

When a new secure peer connection is made, etcd server tries to verify that its remote network address (IP address) is in the SAN of the TLS cert. See: https://github.com/coreos/etcd/blob/v3.2.9/pkg/transport/listener_tls.go#L122

Then it tries to check if the IP address is in the SAN by doing (https://github.com/coreos/etcd/blob/v3.2.9/pkg/transport/listener_tls.go#L122):

  • Reverse lookup the given IP's DNS name. Check if the resolved DNS name matches any SAN? Because the resolved DNS name is FQDN, it won't match anything in the SAN including the wildcard one.
  • Forward lookup all DNS names in the SAN except the wildcards to match with remote IP. However since it couldn't reason about wildcards some valid IPs are missing.

Example walkthrough

Background

  • search domain: .c
  • SAN: *.b
  • Peer host: a.b

Steps:

  • etcd servers gets remote address 10.1.2.3
  • reverse DNS lookup 10.1.2.3 -> a.b.c
  • Match round 1 (reverse lookup):
    • a.b.c doesn't match *.b
  • Match round 2 (forward lookup):
    • There is no DNS names other than wildcards in SAN. So nothing to look up.

Result: match failed.

Expectation: a.b should match *.b. It should succeed.

@hongchaodeng
Copy link
Contributor Author

/cc @hasbro17 @xiang90 @raoofm

@xiang90
Copy link
Contributor

xiang90 commented Oct 31, 2017

I am not sure this is something etcd can address.

We have two options:

  1. verify the SAN matches remote address
  2. do not verify the SAN matches remote address

We did 2 before, and people asked to tight security to do 1.

To make this work, SAN should either contain fqdn or IP address.

@hongchaodeng
Copy link
Contributor Author

hongchaodeng commented Oct 31, 2017

verify the SAN matches remote address

I agree that's the correct thing to happen. But what needs to be clarified is how to "verify"?

SAN is provided by the server to client so that client can verify the server, not the reverse way. Now peer server is trying to use its own cert to verify the client -- this is not what SAN is for. Instead, when peer A receives a connection from peer B, A should actively request B's certificate and verify SAN against B's address that A would talk to (not necessarily the TCP connection remote address).

And that's what original issue is about, when B tried to join A, only B verified A's cert, but A didn't verify B's cert. The correct fix expected would be A would try to verify B's cert.

@xiang90
Copy link
Contributor

xiang90 commented Oct 31, 2017

I agree that's the correct thing to happen. But what needs to be clarified is how to "verify"? SAN is provided by the server to client so that client can verify the server, not the reverse way.

That was the old behavior, only the client verifies the server. etcd server communicates both way, not just single way. A peer can be both server and client to other peers depending on its role (follower vs leader). If we allow one direction, not the other then it is still a broken model. And the bi-directional verification is what you requested exactly previously.

I want to understand what is really needed before making any further decision on this one.

@raoofm
Copy link
Contributor

raoofm commented Nov 1, 2017

@hongchaodeng @xiang90 why can't we have the IP address in the SAN by automating generation of certs on the fly, rather than generating them before deploying (creating them as secrets).

I mean we can use the IP address received from the downward API (status.podIP) for the pod and use an init container or a script that generates the certs before the actual etcd container comes up.

@hongchaodeng
Copy link
Contributor Author

@raoofm
That's etcd deployment issue, not related here. More specifically, etcd operator issue.

We do have plan to support that via k8s csr endpoint: coreos/etcd-operator#1465 .

@raoofm
Copy link
Contributor

raoofm commented Nov 17, 2017

@xiang90 @hongchaodeng any progress here?

@wenjiaswe
Copy link
Contributor

@wenjiaswe

@wenjiaswe
Copy link
Contributor

@raoofm looks like @hongchaodeng 's initiated coreos/etcd-operator#1644 in etcd-operator for this issue. Is there anything needed here? Otherwise we are closing out the long pending issues.

Please feel free to reopen it if it's still relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

5 participants