libp2p TLS 1.3 Handshake #151
Could you include why we want TLS 1.3-like handshakes? Is it because of 0-RTT, or just keeping up with the times?

## Peer Authentication

In order to be able to use arbitrary key types, peers don’t use their host key to sign the x509 certificate they send during the handshake. Instead, the host key is encoded into the [libp2p Public Key Extension](#libp2p-public-key-extension), which is carried in a self-signed certificate. The key used to generate and sign this certificate SHOULD NOT be related to the host's key. Endpoints MAY generate a new key and certificate for every connection attempt, or they MAY reuse the same key and certificate for multiple connections. Endpoints MUST choose a key that will allow the peer to verify the certificate (i.e. choose a signature algorithm that the peer supports), and SHOULD use a key type which allows for efficient signature computation and which reduces the combined size of the certificate and the signature.
Do we do this only because Go doesn't support some protocols in its TLS implementation?
That's one reason, the other is #111.
@anacrolix it’s in the design considerations. TLS 1.3 just takes a single round trip, older versions take two. And there’s little reason to roll out a new handshake protocol on top of an old TLS version anyway.
tls/tls.md
Outdated
The public key allows the peer to calculate the peer ID of the peer it is connecting to. Clients MUST verify that the peer ID derived from the certificate matches the peer ID they intended to connect to, and MUST abort the connection if there is a mismatch.

The peer signs its public key using its host key. This signature provides cryptographic proof that the peer was in possession of the private key at the time the certificate was signed. Peers MUST verify the signature, and abort the connection attempt if signature verification fails.
Q: Which public key is signed?
Q: I feel like it should somehow be bound to the TLS session we are establishing. Otherwise, this proof could be "stolen" and reused by someone else.
It's the public key used to generate the certificate. I'll update the PR to clarify that.
Why do you want to bind it to the session that we're establishing? What we're doing here is basically creating a certificate chain, with the host key at the root. Just that we're not using x509 certificates, due to the limitations described earlier.
I initially thought that the `publicKey` field of `SignedKey` was referring to the host key. I think it's this sentence which is a bit ambiguous:
> The peer signs its public key using its host key.
Since you were just talking about the public key for the peer ID in the prior paragraph, I assumed that's what "public key" referred to here.
It's clearer down below, where it says "The publicKey field of SignedKey contains the public key of the endpoint", although it wouldn't hurt to spell out that the endpoint key is the one used to generate the certificate.
Since it is the endpoint key being signed, if you wanted to have the proof be bound to the session as @Kubuxu suggests, you could just generate a new key and self-signed certificate for each session. Is that correct @marten-seemann?
I'm still not clear about which keys, public keys and certificates are being sent over, and in which fields.
I don't clearly see how the chain confirming ownership of the key used in the TLS handshake is bound to a PeerID.
I have a few ideas regarding possible attacks, and possible reuse in case the TLS key is leaked, but first I need to understand this part fully.
I'm reading it again so bear with me.
Sorry for the confusion here. You sign the public key that was used to generate the certificate with your host private key.
The reason is that the key used to generate the certificate is used by TLS during the handshake to sign the CertificateVerify message. By signing the public key of the certificate we get a chain of verified signatures.
Using the host's public key would be a bad idea. An attacker could just copy the contents of the libp2p Public Key Extension and use that to impersonate this peer. We really need an operation that proves that the peer actually meant to use the key used by TLS.
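The chain of signatures described above can be sketched in Go. This is only an illustration under stated assumptions, not the encoding used by the spec or by go-libp2p-tls: it assumes an Ed25519 host key and an ECDSA P-256 certificate key, it signs the DER (PKIX) encoding of the certificate public key, and the helper names `signCertKey`, `verifyCertKey`, and `demo` are hypothetical.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/ed25519"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"errors"
	"fmt"
)

// signCertKey signs the DER-encoded certificate public key with the host key.
// (Assumption: the real wire format of the signed data may differ.)
func signCertKey(hostKey ed25519.PrivateKey, certPub *ecdsa.PublicKey) ([]byte, error) {
	der, err := x509.MarshalPKIXPublicKey(certPub)
	if err != nil {
		return nil, err
	}
	return ed25519.Sign(hostKey, der), nil
}

// verifyCertKey checks that the host key vouches for this exact certificate key.
func verifyCertKey(hostPub ed25519.PublicKey, certPub *ecdsa.PublicKey, sig []byte) error {
	der, err := x509.MarshalPKIXPublicKey(certPub)
	if err != nil {
		return err
	}
	if !ed25519.Verify(hostPub, der, sig) {
		return errors.New("signature over certificate key is invalid")
	}
	return nil
}

// demo reports whether the honest peer is accepted and whether an attacker
// who copies the extension into a certificate with their own key is rejected.
func demo() (honest, rejected bool, err error) {
	hostPub, hostPriv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return false, false, err
	}
	certKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, false, err
	}
	sig, err := signCertKey(hostPriv, &certKey.PublicKey)
	if err != nil {
		return false, false, err
	}
	honest = verifyCertKey(hostPub, &certKey.PublicKey, sig) == nil

	// The attacker reuses (hostPub, sig) but must present a key they control.
	attackerKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, false, err
	}
	rejected = verifyCertKey(hostPub, &attackerKey.PublicKey, sig) != nil
	return honest, rejected, nil
}

func main() {
	honest, rejected, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("honest peer accepted:", honest)
	fmt.Println("copied extension rejected:", rejected)
}
```

Because the signature covers the certificate key rather than the host's own public key, copying the extension into another certificate does not help an attacker: TLS still requires the CertificateVerify signature to be made with the private key matching the certificate.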
tls/design considerations.md
Outdated
### TLS 1.3 - What about older versions?

The handshake protocol requires TLS 1.3 support. This means that the handshake between two peers that have never communicated before will typically complete in just a single roundtrip. With older TLS versions, a handshake typically takes two roundtrips. By not specifying support for older TLS versions, we increase performance and simplify the protocol.
Most importantly, it would make us vulnerable to downgrade attacks.
### Versioning - How we could roll out a new version of this protocol in the future

An earlier version of this document included a version negotiation mechanism. While it is a desirable property to be able to change things in the future, it also adds a lot of complexity.
What about the protocol string in `multistream-select`? That's a version negotiation mechanism.
Yes, but we may want to speak TLS before multistream. That is, we should allow peers to advertise `/ip4/.../tcp/443/tls/ipfs/Qm...` directly. This should help disguise IPFS traffic.
That works for TCP, but not for QUIC. In QUIC, the first packet sent by the client already contains the ClientHello.
tls/tls.md
Outdated
The publicKey field of SignedKey contains the public key of the endpoint, encoded using the following protobuf.
That seems redundant with #100
tls/tls.md
Outdated
### libp2p Public Key Extension

In order to prove ownership of its host key, an endpoint sends two values:
- the public key corresponding to its host key
So my main issue is this sentence: I don't understand the difference between `public key` and `host key`.
As Noise is more flexible than TLS 1.3, we might support specific Noise handshakes for specific node roles, but TLS 1.3 supports cipher suite agility and conceivably helps obfuscate traffic, though only if integrated with the web-like transports. As a naive first approximation, we might consider (a) replacing secio with Noise in a way that exploits Noise's flexibility, (b) integrating TLS 1.3 with the web-like transports in a way that really permits blending in with web traffic, and (c) correctly permitting both Noise and TLS 1.3 to manage authentication using keys provided by the application.

In fact, there is an enormous nasty can of worms involved in doing traffic obfuscation like I suggest in (b), for which the Tor project recruits an army of young, idealistic, and talented developers to write pluggable transports, and for which the U.S. State Dept pays them like $2 M per year. We should not compete with them here, so maybe the correct solution for obfuscation is actually TLS 1.3 or Noise running over Tor pluggable transports.

We're left with TLS 1.3 being the only secure protocol with cipher suite agility, and Noise providing more handshake flexibility. Increasingly, we view cipher suite agility as harmful, based in part on how hard doing it correctly proved for TLS, so modern protocols should dump cipher suite agility and simply version instead. IPFS appears rather committed to cipher suite agility though.
Looks like a good start! Thanks, @marten-seemann.
- I'd suggest adding a formal definition of the different keys and how they relate to each other.
- To my understanding, the peer generates an ephemeral per-session key, and produces a self-signed certificate without using the host key. This certificate then embeds (via certificate extensions) the host's permanent identity public key, and a signature of that public key with its pairing private key.

Unless I'm mistaken, isn't this scheme subject to man-in-the-middle attacks, where one could replace the outer certificate by extracting the inner host key and injecting it into a new certificate produced with a key they control? I'm sure I'm missing something. It seems there is no cryptographic crosslink between the host key and the ephemeral session key.
P.S.: as per one of my comments, I think using ephemeral session keys ought to be a MUST requirement, as otherwise we would not guarantee Perfect Forward Secrecy.
### Why we're not using the host key for the certificate

The current proposal uses a self-signed certificate to carry the host's public key in the libp2p Public Key Extension. The key used to generate the self-signed certificate has no relationship with the host key. This key can be generated for every single connection, or can be generated at boot time.
IMO another reason for not using the host key directly is to achieve Perfect Forward Secrecy.
TLS 1.3 uses ephemeral Diffie-Hellman for the key exchange mechanism, so it's always PFS, no matter what kind of certificate you use.
## Peer Authentication

In order to be able to use arbitrary key types, peers don’t use their host key to sign the x509 certificate they send during the handshake. Instead, the host key is encoded into the [libp2p Public Key Extension](#libp2p-public-key-extension), which is carried in a self-signed certificate. The key used to generate and sign this certificate SHOULD NOT be related to the host's key. Endpoints MAY generate a new key and certificate for every connection attempt, or they MAY reuse the same key and certificate for multiple connections. Endpoints MUST choose a key that will allow the peer to verify the certificate (i.e. choose a signature algorithm that the peer supports), and SHOULD use a key type which allows for efficient signature computation and which reduces the combined size of the certificate and the signature.
I think generating a new key per session should be a MUST, otherwise Perfect Forward Secrecy might be compromised.
Let me expand on my earlier comment. In TLS 1.3, the client sends a key_share extension in the ClientHello. This key share is the client's ephemeral DH key (i.e. it is a fresh value for every connection). The server sends its part of the ephemeral DH key share in the key_share extension in the ServerHello.
Since TLS 1.3 decouples the key exchange mechanism from the signature algorithms, TLS 1.3 handshakes are always forward secure, no matter what kind of certificate you use.
By the way, you might have heard of eTLS 1.3. It's supposed to be used by businesses who want middleboxes to be able to decode their traffic (and who don't care about their customers' privacy). It removes PFS from the protocol by having the server send a static key share in the ServerHello. The sad thing is, there's no immediate way to detect a server that's misbehaving in that way, other than running heuristics over multiple connections.
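The ephemeral key_share exchange described above can be illustrated with a short Go sketch. It is not a TLS implementation: it only shows that fresh X25519 key shares on both sides give both peers the same secret within a connection, and a different secret on the next connection, independent of any certificate key. The function names `connectionSecret` and `compareConnections` are hypothetical, and the choice of X25519 via `crypto/ecdh` (Go 1.20 or later) is an assumption for illustration.

```go
package main

import (
	"bytes"
	"crypto/ecdh"
	"crypto/rand"
	"fmt"
)

// connectionSecret simulates the key exchange of a single connection:
// both sides generate a fresh key share and combine it with the peer's share.
// (Real TLS 1.3 feeds this value into its key schedule; that is omitted here.)
func connectionSecret() (client, server []byte, err error) {
	curve := ecdh.X25519()
	clientShare, err := curve.GenerateKey(rand.Reader) // sent in the ClientHello
	if err != nil {
		return nil, nil, err
	}
	serverShare, err := curve.GenerateKey(rand.Reader) // sent in the ServerHello
	if err != nil {
		return nil, nil, err
	}
	client, err = clientShare.ECDH(serverShare.PublicKey())
	if err != nil {
		return nil, nil, err
	}
	server, err = serverShare.ECDH(clientShare.PublicKey())
	if err != nil {
		return nil, nil, err
	}
	return client, server, nil
}

// compareConnections reports whether both sides agree within one connection,
// and whether a second connection produces a different secret.
func compareConnections() (agree, fresh bool, err error) {
	c1, s1, err := connectionSecret()
	if err != nil {
		return false, false, err
	}
	c2, _, err := connectionSecret()
	if err != nil {
		return false, false, err
	}
	return bytes.Equal(c1, s1), !bytes.Equal(c1, c2), nil
}

func main() {
	agree, fresh, err := compareConnections()
	if err != nil {
		panic(err)
	}
	fmt.Println("both sides derive the same secret:", agree)
	fmt.Println("second connection uses a new secret:", fresh)
}
```

A server that reuses a static key share, as in the eTLS case mentioned above, would break the second property.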
I'm a bit skeptical about obfuscation. As you note, obfuscation done right is really hard, and we even fail at the most basic aspects. The first one is that you'd have to run your node on port 443; a request to a non-standard HTTPS port will immediately stick out. Second, there's the ALPN extension which is sent unencrypted in the ClientHello, and I'd like to avoid putting a fake value into that field. And then there are hard parts like traffic patterns, which are very characteristic for HTTPS traffic, and which will probably be impossible to replicate in libp2p.
The advantage of TLS 1.3 at this point is that we can use the same handshake logic on TCP and on QUIC. Noise looks really interesting and we might be able to use it on top of TCP, but it's not obvious how to integrate it into the QUIC handshake.
@Kubuxu I just pushed an update that (hopefully) makes it clear which key I'm referring to. Can you please have another look?
It is much clearer now. One issue I see (and it is a small issue that depends on our threat model) is that if the private key for the outer certificate were ever leaked, the attacker could impersonate the owner of the host key forever. What do you think about this?
@Kubuxu: Thanks for the diagram!
That's true, but I'm wondering if this is a realistic threat. After all, losing your host key would also allow an attacker to impersonate you forever (obviously). If anything, the host key is more likely to be compromised, since it's typically written to disk (at least for IPFS) and loaded into memory, whereas the key for the outer certificate only needs to exist in memory.
## Peer Authentication

In order to be able to use arbitrary key types, peers don’t use their host key to sign the x509 certificate they send during the handshake. Instead, the host key is encoded into the [libp2p Public Key Extension](#libp2p-public-key-extension), which is carried in a self-signed certificate. The key used to generate and sign this certificate SHOULD NOT be related to the host's key. Endpoints MAY generate a new key and certificate for every connection attempt, or they MAY reuse the same key and certificate for multiple connections. Endpoints MUST choose a key that will allow the peer to verify the certificate (i.e. choose a signature algorithm that the peer supports), and SHOULD use a key type which allows for efficient signature computation and which reduces the combined size of the certificate and the signature.
> The key used to generate and sign this certificate SHOULD NOT be related to the host's key.
This precludes the optimization we talked about where a peer MAY derive the certificate key from the host key and the verifier MAY skip verifying the signature in the extension if the key matches the certificate key.
I guess SHOULD allows implementations to do what they want, but we may want to say that implementations MUST NOT reject certificates that use the host key.
Yes, SHOULD means that you're not supposed to do it, unless you think you have good reasons for it. I've described some measurements I did in the Design Considerations, section Why we're not using the host key for the certificate, which led me to conclude that this optimization is not worth it. Do you agree?
It probably isn't but there's no reason to preclude it. Basically, I just don't want some over-zealous dev to reject these keys.
Sure, that's why it's a SHOULD. SHOULD implies that you can't rely on the peer following the recommendation, so I don't think any additional text is needed.
TODO: PublicKey.Data looks underspecified. Define precisely how to marshal the key.
Some discussion here: https://github.com/libp2p/specs/pull/100/files.
We have a bunch of specs that only live in PRs now, but don't seem to be blocked by anything major. We should get them merged some time, so we can properly refer to them.
Mostly blocked on reviews, IIRC. But yeah, we should just get MVPs merged and mark them as drafts.
An earlier version of this document included a version negotiation mechanism. While it is a desirable property to be able to change things in the future, it also adds a lot of complexity.

To keep things simple, the current proposal does not include a version negotiation mechanism. A future version of this protocol might:
That's a bit of a hole here. Let's at least make it compatible with negotiation in the future.
Ideally, that's what (2) below provides. We can use the SNI field as it's sent in the first packet.
(LGTM)
Added some minor copy corrections. Overall I think this looks good to mark as Draft and merge.
Co-Authored-By: marten-seemann <martenseemann@gmail.com>
For what it's worth, the If I were to implement this on a short time frame, I'd generate a CA certificate, embed it in the source code as a trusted source, and sign the ephemeral certificates from this CA. But that means we wouldn't be compatible.

### libp2p Public Key Extension
What's the exact string name of the extension to use when generating the certificate?
I'm not sure I understand the question. You can find the OID further down in the document.
This sounds like something that should be fixed in rustls. Accepting self-signed certificates should be a basic feature of any TLS library. I really don't want to increase the length of the certificate chain to work around that shortcoming (thereby increasing the number of signature verifications to be performed during each handshake, as well as significantly increasing the number of bytes transmitted). Any estimate how long would it take to get this fixed in rustls?
That's debatable. I'm pretty sure the authors of rustls would disagree.
They probably don't want to blindly trust self-signed certificates, but I doubt they'd object to a custom certificate validator if cleanly implemented.
@tomaka I'd like to merge this PR very soon. What do you think is the best way forward here?
1. We want to use different key types: RSA, ECDSA, Ed25519, and Secp256k1 (and maybe more in the future?).
2. We want to be able to send the key type along with the key (see https://github.com/libp2p/specs/issues/111).

The first point is problematic in practice, because Go currently only supports RSA and ECDSA certificates. Support for Ed25519 was planned for Go 1.12, but was deferred recently, and the Go team is now evaluating interest in this in order to prioritize their work, so this might or might not happen in Go 1.13. I'm not aware of any plans for Secp256k1 at the moment.
You should fork the required library and use Ed25519 using agl's libraries.
ECDSA kind of works, but secp256k1 should be avoided anyway, due to side channel concerns. RSA should not even be considered for any current use cases, although supporting it for whatever strange future scenarios sounds harmless.
Really, we might want to rephrase this in terms of ecosystem support. That is, TLS is a rather complicated protocol and we don't want to force libp2p implementations to add support for new key types.
> You should fork the required library and use Ed25519 using agl's libraries.

I considered that, but concluded that it's not a good solution. It would mean pulling in a forked x509 package, and in order to use that one, we'd also have to fork tls just to change the import path.
As a general remark, in my opinion whether or not a library exists in the ecosystem should only be a very minor consideration when writing specs. You want your specs to do the right thing, not the thing that can be implemented the most easily.
Which is why I didn't answer your question "Any estimate how long would it take to get this fixed in rustls?" (#151 (comment))
To me how long something takes to implement is not relevant when discussing specs.
Filo has a branch at golang/go#25355. I'm surprised this takes Google so long, given agl's role in Ed25519 being accepted by standards bodies.
@tomaka: Is there any way to define a custom certificate verifier in rustls? Or if that’s not possible, could you temporarily add the certificate to the root CAs until it is verified?
It is possible to have a custom verifier, but once a connection is established it is not possible to know which certificate it corresponded to (cc libp2p/rust-libp2p#211 (comment)). Again, the specs shouldn't be based on whether they are easy to implement.
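For comparison, here is a rough sketch of what a custom certificate verifier looks like in Go's `crypto/tls` (not rustls). It assumes the verifier only needs to check that the first certificate is correctly self-signed; checking the libp2p Public Key Extension is left out. The names `verifySelfSigned`, `newTLSConfig`, and `selfSignedDER` are hypothetical helpers, not taken from go-libp2p-tls.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// verifySelfSigned parses the first certificate and checks that it was
// signed with its own key. It does not consult any root CA pool.
func verifySelfSigned(rawCerts [][]byte, _ [][]*x509.Certificate) error {
	if len(rawCerts) == 0 {
		return errors.New("peer sent no certificate")
	}
	cert, err := x509.ParseCertificate(rawCerts[0])
	if err != nil {
		return err
	}
	return cert.CheckSignature(cert.SignatureAlgorithm, cert.RawTBSCertificate, cert.Signature)
}

// newTLSConfig requires TLS 1.3 and replaces CA chain validation with the
// custom verifier above. InsecureSkipVerify only disables the default chain
// and host name checks; VerifyPeerCertificate is still called.
func newTLSConfig() *tls.Config {
	return &tls.Config{
		MinVersion:            tls.VersionTLS13,
		InsecureSkipVerify:    true,
		VerifyPeerCertificate: verifySelfSigned,
	}
}

// selfSignedDER creates a throwaway self-signed certificate for the demo.
func selfSignedDER() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	// Using the template as its own parent yields a self-signed certificate.
	return x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
}

func main() {
	der, err := selfSignedDER()
	if err != nil {
		panic(err)
	}
	cfg := newTLSConfig()
	err = cfg.VerifyPeerCertificate([][]byte{der}, nil)
	fmt.Println("self-signed certificate accepted:", err == nil)
}
```

A server that wants to authenticate clients the same way would additionally need to request a client certificate, for example by setting `ClientAuth: tls.RequireAnyClientCert`.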
@tomaka Are you ok with merging the specs as they are now, or do you think we need to change anything?
I'm fine with merging that! My "regret" is that the libp2p handshake will still be distinguishable from an HTTP handshake, but there's no way to fix that as far as I know.
@tomaka I'm not sure if it will be distinguishable. TLS 1.3 establishes an encrypted (not authenticated) channel after ClientHello and ServerHello. The Certificate and CertificateRequest messages are sent encrypted, along with EncryptedExtensions. I think this should make the libp2p handshake and the https handshake roughly the same.
@tomaka: The cleartext messages of the handshake will be identical to an HTTP handshake if we use the same ALPN (the spec doesn't say anything about which ALPN to use, so we're free to specify that later). However, as long as people don't run their nodes on port 443, and connect from random port numbers, libp2p traffic can be easily distinguished from genuine HTTP traffic just by looking at the IP header.
In the last couple of weeks I spent a lot of time designing a TLS 1.3 based handshake for libp2p. This document has come a long way, and I managed to remove a lot of the complexity, while at the same time preserving the future extensibility which I originally built into this version of the protocol.
This is a proposal for how we can run the libp2p handshake using TLS 1.3 in the future. Since we're currently only using secio in production, we are completely free to design things the way we want, without having to worry about backwards compatibility. That means that now is the right time to criticize every single aspect of this protocol.
I already implemented a proof of concept in libp2p/go-libp2p-tls#20, feel free to play around with that code as well.