Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moved PROXY protocol wrap to execute before the TLS wrap #3195

Merged
merged 2 commits into from
Aug 23, 2017

Conversation

solmonk
Copy link
Contributor

@solmonk solmonk commented Aug 17, 2017

In my setup I use AWS ELB with PROXY protocol enabled as a TCP proxy that transparently passes SSL traffic to vault. However, when the vault server receives traffic it throws the following error:

2017/08/16 09:11:03 [ERR] Failed to read proxy prefix: tls: oversized record received with length 22617

It seems Vault currently does not work if both PROXY protocol and TLS are enabled. Based on what I saw in other successful implementations (https://github.com/fabiolb/fabio/blob/master/proxy/listen.go#L28), PROXY protocol listener wrapping should execute before TLS listener wrapping. I tested this change on my setup described above and the problem was fixed.

@jefferai
Copy link
Member

The proxy protocol documentation (https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt) does not explicitly state whether the proxy protocol should be wrapping TLS or TLS should wrap proxy, however, it multiple places it suggests that proxy is supposed to wrap TLS since part of the protocol contains information about the underlying TLS state (ALPN, SNI header, etc. -- see sections 2.2.1 and 2.2.2 for instance). Of course, this is only in version 2, and we only currently handle version 1, and there is no guidance at all for version 1, so you can either infer that the intent should be the same, or not infer anything at all.

Given the lack of any specification in the protocol + the vague hints (at least for v2 which can handle it) that PROXY is supposed to wrap TLS, not the other way around, and given Vault's security focus which encourages end-to-end TLS without unwrapping by a load balancer (which is often required by the industry standards our clients must adhere to), we implemented it with PROXY wrapping TLS. I don't think that's the wrong call, regardless of what Fabio does, and AFAIK it works perfectly fine with ELB (and probably other LBs) in TCP-proxying mode.

Accordingly: I'm happy to have an option that allows flipping the order (with the current behavior the default), but I won't merge a PR that does it unilaterally.

@solmonk
Copy link
Contributor Author

solmonk commented Aug 18, 2017

ELB sends PROXYv1-wrapped TLS traffic as expected (when I tcpdump I see plaintext PROXYv1 header), but Vault server does not understand it. Interestingly 22617 in error message translates into XY when ascii decoded, which kind of implies that Vault tries to unwrap TLS first but fails because of PROXY protocol header.

So after this observation I read through some library codes, and I think the TLS connection should read from underlying PROXY-wrapped conection, so that it consumes PROXY protocol header at the very first. In order to do this PROXY wrapping should occur before TLS wrapping (code-wise).

I agree with your explanation that PROXY should wrap TLS, but isn't Vault's current implementation unwrapping in the wrong order?

@jefferai
Copy link
Member

TLS wrapping is here: https://github.com/hashicorp/vault/blob/master/command/server.go#L454 - inside the NewListener function.

Note how in the block below that is where the proxy protocol wrapping happens, so the proxy handler should be wrapping the tls listener.

So after this observation I read through some library codes, and I think the TLS connection should read from underlying PROXY-wrapped conection, so that it consumes PROXY protocol header at the very first. In order to do this PROXY wrapping should occur before TLS wrapping (code-wise).

What should be happening with Vault right now is that the LB gets a TCP connection with TLS data, then the LB tacks on the proxy stuff in front so you get PROXY(TLS(data)). Then the last wrapper of the listener in Vault is PROXY, which wraps the TLS listener, which wraps the underlying data -- the reverse of what the LB should be sending.

If in your testing this is the ordering of the data that you see in your tcpdumps but to work Vault needs to have the ordering of the listeners changed, then I'm honestly a bit mystified.

@solmonk
Copy link
Contributor Author

solmonk commented Aug 19, 2017

Currently Vault wraps the TLS listener with PROXY listener, so this line of go-proxyproto - which should read and consume PROXY header - will read from underlying TLS connection:
https://github.com/hashicorp/vault/blob/master/vendor/github.com/armon/go-proxyproto/protocol.go#L192
When reading from TLS connection it basically reads record blocks by readRecord() function. When it executes, this line causes record overflow error what I saw: https://github.com/golang/go/blob/master/src/crypto/tls/conn.go#L614
data[3] and data[4] would be X and Y , since it accidentally reads PROXY header first, and it makes the record size exactly 22617, as I saw in the error.
So what I am assuming is that doing the PROXY listener wrapping last also makes it unwrap last, which is not the expected behavior.

@jefferai
Copy link
Member

@solmonk For that line to be reading the TLS connection data, the section above where it peeks at the data and tries to find a proxy proto header mismatch has to fail. If it finds a mismatch it exits out without attempting to read a header line. It could be buggy! But I'd find it odd if it was.

Any chance you can share the tcpdump data you have so I can be looking at the same input data as you? Also, can you share your vault config at the same time so I can see what combination of options you set on the listener?

@solmonk
Copy link
Contributor Author

solmonk commented Aug 19, 2017

My configuration looks like this: 443 receives direct traffic, and 8199 receives from ELB.

storage "consul" {}

listener "tcp" {
  address = "0.0.0.0:443"
  tls_cert_file = "/path/to/cert.pem"
  tls_key_file = "/path/to/key.pem"
}

listener "tcp" {
  address = "0.0.0.0:8199"
  proxy_protocol_behavior = "use_always"
  proxy_protocol_authorized_addrs = "0.0.0.0"
  tls_cert_file = "/path/to/cert.pem"
  tls_key_file = "/path/to/key.pem"
}

tcpdump on failing situation is below. I redacted ip addresses for some reasons, but hope this still helps... Just calling vault status to a Vault behind an ELB with PROXY enabled reproduces the same result.

$ sudo tcpdump -n -X port 8199
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:49:13.233900 IP x.x.x.x.22345 > x.x.x.x.8199: Flags [S], seq 1727503985, win 26883, options [mss 8961,sackOK,TS val 189345529 ecr 0,nop,wscale 8], length 0
	0x0000:  4500 003c b088 4000 ff06 f2b1 xxxx xxxx  E..<..@.....xxxx
	0x0010:  xxxx xxxx 5749 2007 66f7 9e71 0000 0000  xxxxWI..f..q....
	0x0020:  a002 6903 3232 0000 0204 2301 0402 080a  ..i.22....#.....
	0x0030:  0b49 2ef9 0000 0000 0103 0308            .I..........
13:49:13.233936 IP x.x.x.x.8199 > x.x.x.x.22345: Flags [S.], seq 2272620931, ack 1727503986, win 26847, options [mss 8961,sackOK,TS val 213569974 ecr 189345529,nop,wscale 7], length 0
	0x0000:  4500 003c 0000 4000 4006 623b xxxx xxxx  E..<..@.@.b;xxxx
	0x0010:  xxxx xxxx 2007 5749 8775 7183 66f7 9e72  xxxx..WI.uq.f..r
	0x0020:  a012 68df d8af 0000 0204 2301 0402 080a  ..h.......#.....
	0x0030:  0cba d1b6 0b49 2ef9 0103 0307            .....I......
13:49:13.234264 IP x.x.x.x.22345 > x.x.x.x.8199: Flags [.], ack 1, win 106, options [nop,nop,TS val 189345529 ecr 213569974], length 0
	0x0000:  4500 0034 b089 4000 ff06 f2b8 xxxx xxxx  E..4..@.....xxxx
	0x0010:  xxxx xxxx 5749 2007 66f7 9e72 8775 7184  xxxxWI..f..r.uq.
	0x0020:  8010 006a 0f6b 0000 0101 080a 0b49 2ef9  ...j.k.......I..
	0x0030:  0cba d1b6                                ....
13:49:13.234314 IP x.x.x.x.22345 > x.x.x.x.8199: Flags [P.], seq 1:48, ack 1, win 106, options [nop,nop,TS val 189345529 ecr 213569974], length 47
	0x0000:  4500 0063 b08a 4000 ff06 f288 xxxx xxxx  E..c..@.....xxxx
	0x0010:  xxxx xxxx 5749 2007 66f7 9e72 8775 7184  xxxxWI..f..r.uq.
	0x0020:  8018 006a f1a7 0000 0101 080a 0b49 2ef9  ...j.........I..
	0x0030:  0cba d1b6 5052 4f58 5920 5443 5034 2031  ....PROXY.TCP4.x
	0x0040:  3732 2e31 382e 302e 3439 2031 3732 2e32  xx.xx.x.xx.xxx.x
	0x0050:  302e 3634 2e34 3020 3539 3938 3420 3434  x.xx.xx.59984.44
	0x0060:  330d 0a                                  3..
13:49:13.234321 IP x.x.x.x.8199 > x.x.x.x.22345: Flags [.], ack 48, win 210, options [nop,nop,TS val 213569974 ecr 189345529], length 0
	0x0000:  4500 0034 a032 4000 4006 c210 xxxx xxxx  E..4.2@.@...xxxx
	0x0010:  xxxx xxxx 2007 5749 8775 7184 66f7 9ea1  xxxx..WI.uq.f...
	0x0020:  8010 00d2 d8a7 0000 0101 080a 0cba d1b6  ................
	0x0030:  0b49 2ef9                                .I..
13:49:13.234425 IP x.x.x.x.8199 > x.x.x.x.22345: Flags [P.], seq 1:8, ack 48, win 210, options [nop,nop,TS val 213569974 ecr 189345529], length 7
	0x0000:  4500 003b a033 4000 4006 c208 xxxx xxxx  E..;.3@.@...xxxx
	0x0010:  xxxx xxxx 2007 5749 8775 7184 66f7 9ea1  xxxx..WI.uq.f...
	0x0020:  8018 00d2 d8ae 0000 0101 080a 0cba d1b6  ................
	0x0030:  0b49 2ef9 1503 0100 0202 16              .I.........
13:49:13.234476 IP x.x.x.x.8199 > x.x.x.x.22345: Flags [F.], seq 8, ack 48, win 210, options [nop,nop,TS val 213569974 ecr 189345529], length 0
	0x0000:  4500 0034 a034 4000 4006 c20e xxxx xxxx  E..4.4@.@...xxxx
	0x0010:  xxxx xxxx 2007 5749 8775 718b 66f7 9ea1  xxxx..WI.uq.f...
	0x0020:  8011 00d2 d8a7 0000 0101 080a 0cba d1b6  ................
	0x0030:  0b49 2ef9                                .I..
13:49:13.234716 IP x.x.x.x.22345 > x.x.x.x.8199: Flags [.], ack 8, win 106, options [nop,nop,TS val 189345529 ecr 213569974], length 0
	0x0000:  4500 0034 b08b 4000 ff06 f2b6 xxxx xxxx  E..4..@.....xxxx
	0x0010:  xxxx xxxx 5749 2007 66f7 9ea1 8775 718b  xxxxWI..f....uq.
	0x0020:  8010 006a 0f35 0000 0101 080a 0b49 2ef9  ...j.5.......I..
	0x0030:  0cba d1b6                                ....
13:49:13.239790 IP x.x.x.x.22345 > x.x.x.x.8199: Flags [P.], seq 48:233, ack 9, win 106, options [nop,nop,TS val 189345531 ecr 213569974], length 185
	0x0000:  4500 00ed b08c 4000 ff06 f1fc xxxx xxxx  E.....@.....xxxx
	0x0010:  xxxx xxxx 5749 2007 66f7 9ea1 8775 718c  xxxxWI..f....uq.
	0x0020:  8018 006a 1596 0000 0101 080a 0b49 2efb  ...j.........I..
	0x0030:  0cba d1b6 1603 0100 b401 0000 b003 0359  ...............Y
	0x0040:  97c3 4931 a597 4387 5108 cefc 784a 2173  ..I1..C.Q...xJ!s
	0x0050:  49df 4b9e 8bad 616e eadb bfd7 0010 6f00  I.K...an......o.
	0x0060:  004e 00ff c02c c02b c024 c023 c00a c009  .N...,.+.$.#....
	0x0070:  c008 c030 c02f c028 c027 c014 c013 c012  ...0./.(.'......
	0x0080:  009f 009e 006b 0067 0039 0033 0016 009d  .....k.g.9.3....
	0x0090:  009c 003d 003c 0035 002f 000a c007 c011  ...=.<.5./......
	0x00a0:  0005 0004 00af 00ae 008d 008c 008a 008b  ................
	0x00b0:  0100 0039 000a 0008 0006 0017 0018 0019  ...9............
	0x00c0:  000b 0002 0100 000d 0012 0010 0401 0201  ................
	0x00d0:  0501 0601 0403 0203 0503 0603 0005 0005  ................
	0x00e0:  0100 0000 0000 1200 0000 1700 00         .............
13:49:13.239804 IP x.x.x.x.8199 > x.x.x.x.22345: Flags [R], seq 2272620940, win 0, length 0
	0x0000:  4500 0028 a487 4000 4006 bdc7 xxxx xxxx  E..(..@.@...xxxx
	0x0010:  xxxx xxxx 2007 5749 8775 718c 0000 0000  xxxx..WI.uq.....
	0x0020:  5004 0000 670d 0000                      P...g...

It seems the server just receives plaintext PROXY protocol header that starts with PROXY TCP4, and Vault throws TLS record overflow error 15030100020216 immediately. Client sends ClientHello afterwards but Vault refuses it.

For that line to be reading the TLS connection data, the section above where it peeks at the data and tries to find a proxy proto header mismatch has to fail. If it finds a mismatch it exits out without attempting to read a header line. It could be buggy! But I'd find it odd if it was.

Oops I missed that part! but AFAIK Peek() also Read()s from reader - TLS connection in this case - internally, which leads to the same error anyways.

@jefferai
Copy link
Member

Peek shouldn't advance the reader...see https://golang.org/pkg/bufio/#Reader.Peek

Any chance you'd be willing to share the actual cap files in a non public context, e.g. via email/keybase/etc. with me?

@solmonk
Copy link
Contributor Author

solmonk commented Aug 20, 2017

Since it is a buffered reader, Peek actually reads from io.Reader and fills the buffer.
https://github.com/golang/go/blob/master/src/bufio/bufio.go#L129
https://github.com/golang/go/blob/master/src/bufio/bufio.go#L97

I sent the email with the captured tcpdump to you. During the dump I also did some printf debugging and checked here is the exact error position.
https://github.com/hashicorp/vault/blob/master/vendor/github.com/armon/go-proxyproto/protocol.go#L181

@jefferai
Copy link
Member

@solmonk Peek reads from the reader, but if you look at the code in go-proxyproto, you'll see that it peeks until it knows that it's seeing a proxy header, then it does a read up until a new line to get the rest of it. Assuming that the proxy header is in the expected format (the information, then the newline), data after the newline should be the TLS data, which should not have been snarfed.

@jefferai
Copy link
Member

Got the tcpdump. Nothing looks out of sort initially when examining the trace, but will dig in.

If you're happy building a branch of Vault I could make a branch that adds some debugging into go-proxyproto. That would help make it clear exactly what the library thinks is going on (e.g. does it think it's actually not PROXY coming in for some reason, which would then cause it to send the full underlying data to the TLS handler, which would then choke as you saw).

@solmonk
Copy link
Contributor Author

solmonk commented Aug 21, 2017

I think my comments were a bit messy so to sum up my claim:

  1. Peek() reads from a reader in order to know what byte is upcoming next.
  2. The reader is actually tls.Conn in current Vault implementation, since proxyproto wraps the TLS listener.
  3. tls.Conn is a modified reader that it itself tries handshake and reads TLS record blocks, not plaintext.
  4. Therefore proxyproto is unable to peek plaintext PROXY header, since it is not a TLS record block.

As I mentioned I was already doing some kind of debugging by adding some printfs to the code, so yes, I am happy building a branch :)

@jefferai
Copy link
Member

Ah. I think I see what you're saying now:

You're saying that when attempting a read on an underlying tls connection it triggers the TLS handshake. Since the proxy header is not yet fully read in, the handshake is attempted on the proxy data instead of the underlying TLS data.

Is that your analysis?

@solmonk
Copy link
Contributor Author

solmonk commented Aug 21, 2017

Yes and I am pretty confident with it.

@jefferai
Copy link
Member

OK so then it seems like there are two ways forward, which are not exclusive:

  1. Have an option to flip the order, in case someone wants to TLS-wrap a PROXY connection.

  2. Write some kind of specialized conn implementation that can sit between proxy and TLS and allow access to the connection buffer without triggering the TLS handshake, then pass the resultant buffer to an underlying TLS connection as needed. Probably this would fit in go-proxyproto.

@solmonk
Copy link
Contributor Author

solmonk commented Aug 22, 2017

In both cases, the protocol simply consists in an easily parsable header placed
by the connection initiator at the beginning of each connection.

The receiver MUST NOT start processing the connection before it receives a
complete and valid PROXY protocol header.

Quoted from your link of PROXY protocol documentation. PROXY header should be placed at the very first of TCP connection regardless of version, which also means nothing (including TLS) can wrap PROXY. The doc (plus its implementation example) states enough that only header is the difference between two versions. I don't think the current behavior could be an option; instead I consider it a bug that have to be fixed.

@jefferai
Copy link
Member

I was saying that your suggested behavior (TLS wrapping PROXY) could be an option, not the current behavior.

@solmonk
Copy link
Contributor Author

solmonk commented Aug 22, 2017

Hmm that's confusing because my suggested behavior is PROXY protocol wrapping TLS, yet in the code it can be achieved by TLS listener wrapping PROXY listener (just as shown in this PR) because bytestream processing starts from the innermost listener.. Maybe the word "wrap" used in both context causes problem?

@jefferai
Copy link
Member

Bytestream processing doesn't start from the innermost connection, it starts at the outermost.

@solmonk
Copy link
Contributor Author

solmonk commented Aug 22, 2017

I think I did another bad explanation :( The execution order should be that but outer connection always reads from the data that inner connection already processed, unless we make some kind of specialized implementation as you said. I cannot think of any advantages for doing that over just flipping the order though.

@jefferai jefferai added this to the 0.8.2 milestone Aug 22, 2017
@jefferai
Copy link
Member

I have two reasons I'm hesitant about just flipping the order:

  1. It basically works by happy accident, not through any sort of reasonable layered API. If you follow the TLS connection code all the way down, when it gets to a read it goes into a handshake, which then eventually reads a few bytes to figure out basic parameters, which then triggers the proxy code. This just so happens to work but it's not working the way it's supposed to, where the outer layers strip away their bits and leave only inner data for inner connections. I'm worried that it's fragile and that some future Go change could break it.

  2. There's a lot of cognitive dissonance if you know the Go listener/conn model to wrap your head around the fact that it's doing the opposite of what's supposed to happen, so it makes the code less understandable.

Unfortunately the alternatives are pretty onerous.

I'm going to bring this up internally with the team and we'll figure out how we want to proceed.

@jefferai
Copy link
Member

OK, we're going to do it this way. Likely it won't break due to any change in the TLS libraries until at least Go 2, and we don't really want to implement a wrapping connection. Thanks!

@jefferai jefferai merged commit f855da7 into hashicorp:master Aug 23, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants