-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TLS without cstruct
#497
TLS without cstruct
#497
Conversation
In case it helps, this patch 05b203e should provide the missing |
this is great. I guess we need to rebase and wait for the x509 release. (plus then adapt lower bounds) |
Nice rebase! The eio patch and a fix for the latest X509 breaking change (renaming of Authenticator functions) can be found at https://github.com/art-w/ocaml-tls/commits/no-cstruct/ |
@dinosaure would you mind to pick @art-w commits here? |
let ( let* ) = Result.bind | ||
|
||
let guard p e = if p then Ok () else Error e | ||
|
||
let split_str ?(start = 0) str off = | ||
String.sub str start off, | ||
String.sub str (start + off) (String.length str - off - start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's likely I introduced this function, but from our learnings in MirageVPN, I guess we should re-visit the allocation and ownership stuff in this PR. The Strong.sub is super-expensive, and we should try to allocate the least possible.
This doesn't need to be in this PR, but before a release. We should as well benchmark the amount of handshakes and throughput before and after this change (with varying block sizes in send()). Not sure whether we have a suitable executable for this right now. Best to compare to OpenSSL on the same machine.
buf | ||
let buf = Bytes.create 8 in | ||
Bytes.set_int64_be buf 0 seq ; | ||
Bytes.unsafe_to_string buf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocation-wise likely a bad idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? We should use Bytes.to_string
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, the entire function and allocation discipline should be different. better to allocate a bytes once, and then filling it & passing offsets along. this requires, as mentioned, a pretty deep rewriting -- which brings the suggestion to first work on benchmarks (for the latest release - e2e, maybe similar to what @reynir worked on for MirageVPN -- but would be great if cstruct allocation could be part of the benchmark), to be able to compare with the changes in here and future changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to allocate a bytes once
Not sure to understand again, the content of this buffer actually depends on the given argument (seq
). We probably can pre-allocate a set of sequence_buf
and use them then but not sure this is what you want. If I think in this way, it's related to the multicore usage than can be done with ocaml-tls
. In such context, "allocate a bytes once" (globally? per-domain?) can be risky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, sorry for the misunderstanding. my intention is: we get some data to be sent out. now, instead of allocating a header for crypto, a header for tls, etc. - let's allocate once a buffer where the encrypted payload together with the protocol header has space. thus we won't need any other allocations, neither a concat or append. this is all local to a send_data operation, not global/per domain.
nothing for this pr, though.
in | ||
Uncommon.Cs.xor nonce s | ||
Uncommon.xor nonce s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, likely better without allocations, and xor_into
@@ -424,9 +427,9 @@ let answer_server_finished state (session : session_data) client_verify fin log | |||
let computed = | |||
Handshake_crypto.finished (state_version state) session.ciphersuite session.common_session_data.master_secret "server finished" log | |||
in | |||
let* () = guard (Cstruct.equal computed fin) (`Fatal `BadFinished) in | |||
let* () = guard (String.equal computed fin) (`Fatal `BadFinished) in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need eqaf here?
lib/handshake_client.ml
Outdated
let piece = Cstruct.sub sh.server_random 24 8 in | ||
let* () = guard (not (Cstruct.equal Packet.downgrade12 piece)) (`Fatal `Downgrade12) in | ||
guard (not (Cstruct.equal Packet.downgrade11 piece)) (`Fatal `Downgrade11) | ||
let piece = String.sub sh.server_random 24 8 in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocates, maybe we have a function in eqaf or define our own to have off and len params for eq?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a try about implementations of String.sub_equal
(which does not guarantee the constant time as eqaf
). My benchmark is available here: https://gist.github.com/dinosaure/7ef54e9431217435e3b81df890192b72. From my results (on AMD), the "naive" implementation (the Oracle one) is fast enough for the worst case. For readability, I will consider this implementation for ocaml-tls
(into the lib/utils.ml
file) than the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, and according to my benchmark, it seems that an equal_sub
which uses String.sub
or an equal_sub
which tries to iterate into given strings
have the same performance. My conclusion is: even if we do a String.sub
here, try to avoid this String.sub
and try to directly read bytes and compare them is not faster than the String.sub
...
| Some idx, Some (psk, _epoch) -> | ||
let* () = guard (idx = 0) (`Fatal `InvalidServerHello) in | ||
Ok (psk.secret, true) | ||
in | ||
let early_secret = Handshake_crypto13.(derive (empty cipher) psk) in | ||
let hs_secret = Handshake_crypto13.derive early_secret shared in | ||
let log = log <+> raw in | ||
let log = log ^ raw in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I remember, we should be able to use a hash state since we never need the entire log data. this is for sure a separate PR.
let s_mac, rt1 = Core.split_str rt0 mac in | ||
let c_key, rt2 = Core.split_str rt1 key in | ||
let s_key, rt3 = Core.split_str rt2 key in | ||
let c_iv , s_iv = Core.split_str rt3 iv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is 10 string.sub. we can easily have fewer by using the offset and string.sub directly
I think this is already pretty big, and we should work towards merging it. From my point of view, what is important:
Then we can compare and merge this PR and prepare performance improvements in subsequent changesets (e.g. allocate less). We as well have these decrypt/encrypt_into functions in mirage-crypto that we should use. If someone would be eager to develop the benchmark, that'd speed up our release schedule:) |
I don't understand the tls-eio mdx failures... I see two options: remove it or someone who's into that stuff proposing a fix /cc @talex5 (who developed these bits) |
It doesn't seem to have much to do with tls-eio or mdx; just trying to load tls in utop no longer works:
|
thanks for your investigation. @dinosaure is it known and is there a workaround for digestif not being loadable into ocaml toplevel (due to virtual library?)? |
You need to load an implementation of |
Co-authored-by: Hannes Mehnert <hannes@mehnert.org>
force-pushed on top of main (#500 was merged), and fixed speed.ml |
CHANGES: * API breaking change: remove usage of Cstruct.t inside of TLS, use bytes and string instead (mirleft/ocaml-tls#497 by @art-w, @hannesm, @dinosaure, @reynir) Performance is up to 3x improved (bandwidth), 2x improvement for handshake/s on an Intel Core(TM) i7-5600U CPU @ 2.60GHz * FEATURE: add tls-miou-unix package, which adds miou support for TLS (mirleft/ocaml-tls#494 mirleft/ocaml-tls#503 @dinosaure) * FEATURE: tls-lwt and tls-async: allow TLS over an existing connection `Tls_lwt.client_of_channels : Tls.Config.client -> ?host:[`host] Domain_name.t -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` and `Tls_lwt.server_of_channels : Tls.Config.server -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` (mirleft/ocaml-tls#499 @art-w @MisterDA) * API breaking changes: revise errors - reduce the polymorphic variant in size, align it with RFC specified errors, be in parts more precise about errors, in other parts skip data (mirleft/ocaml-tls#505, @hannesm - fixes mirleft/ocaml-tls#491) NB: if you relied on a specific error constructor, please open an issue * Remove unused constructors from Packet.{alert_type, compression_methods, client_certificate_type, extension_type} (mirleft/ocaml-tls#505, @hannesm) NB: if you relied on specific constructors, please open an issue * API breaking change: Tls.Config.{server,client} now return a result type instead of raising an exception (mirleft/ocaml-tls#502, @hannesm, fixes mirleft/ocaml-tls#411) * FEATURE: add bench/speed.exe, a benchmark for bandwidth (for different ciphersuites) and handshakes (different key exchanges and private keys) (mirleft/ocaml-tls#500 @hannesm @dinosaure @reynir) * BUGFIX: tests/feedback.exe update with TLS 1.3 semantics, run as test (mirleft/ocaml-tls#501, @hannesm - reported by @dinosaure)
CHANGES: * API breaking change: remove usage of Cstruct.t inside of TLS, use bytes and string instead (mirleft/ocaml-tls#497 by @art-w, @hannesm, @dinosaure, @reynir) Performance is up to 3x improved (bandwidth), 2x improvement for handshake/s on an Intel Core(TM) i7-5600U CPU @ 2.60GHz * FEATURE: add tls-miou-unix package, which adds miou support for TLS (mirleft/ocaml-tls#494 mirleft/ocaml-tls#503 @dinosaure) * FEATURE: tls-lwt and tls-async: allow TLS over an existing connection `Tls_lwt.client_of_channels : Tls.Config.client -> ?host:[`host] Domain_name.t -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` and `Tls_lwt.server_of_channels : Tls.Config.server -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` (mirleft/ocaml-tls#499 @art-w @MisterDA) * API breaking changes: revise errors - reduce the polymorphic variant in size, align it with RFC specified errors, be in parts more precise about errors, in other parts skip data (mirleft/ocaml-tls#505, @hannesm - fixes mirleft/ocaml-tls#491) NB: if you relied on a specific error constructor, please open an issue * Remove unused constructors from Packet.{alert_type, compression_methods, client_certificate_type, extension_type} (mirleft/ocaml-tls#505, @hannesm) NB: if you relied on specific constructors, please open an issue * API breaking change: Tls.Config.{server,client} now return a result type instead of raising an exception (mirleft/ocaml-tls#502, @hannesm, fixes mirleft/ocaml-tls#411) * FEATURE: add bench/speed.exe, a benchmark for bandwidth (for different ciphersuites) and handshakes (different key exchanges and private keys) (mirleft/ocaml-tls#500 @hannesm @dinosaure @reynir) * BUGFIX: tests/feedback.exe update with TLS 1.3 semantics, run as test (mirleft/ocaml-tls#501, @hannesm - reported by @dinosaure)
CHANGES: * API breaking change: remove usage of Cstruct.t inside of TLS, use bytes and string instead (mirleft/ocaml-tls#497 by @art-w, @hannesm, @dinosaure, @reynir) Performance is up to 3x improved (bandwidth), 2x improvement for handshake/s on an Intel Core(TM) i7-5600U CPU @ 2.60GHz * FEATURE: add tls-miou-unix package, which adds miou support for TLS (mirleft/ocaml-tls#494 mirleft/ocaml-tls#503 @dinosaure) * FEATURE: tls-lwt and tls-async: allow TLS over an existing connection `Tls_lwt.client_of_channels : Tls.Config.client -> ?host:[`host] Domain_name.t -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` and `Tls_lwt.server_of_channels : Tls.Config.server -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` (mirleft/ocaml-tls#499 @art-w @MisterDA) * API breaking changes: revise errors - reduce the polymorphic variant in size, align it with RFC specified errors, be in parts more precise about errors, in other parts skip data (mirleft/ocaml-tls#505, @hannesm - fixes mirleft/ocaml-tls#491) NB: if you relied on a specific error constructor, please open an issue * Remove unused constructors from Packet.{alert_type, compression_methods, client_certificate_type, extension_type} (mirleft/ocaml-tls#505, @hannesm) NB: if you relied on specific constructors, please open an issue * API breaking change: Tls.Config.{server,client} now return a result type instead of raising an exception (mirleft/ocaml-tls#502, @hannesm, fixes mirleft/ocaml-tls#411) * FEATURE: add bench/speed.exe, a benchmark for bandwidth (for different ciphersuites) and handshakes (different key exchanges and private keys) (mirleft/ocaml-tls#500 @hannesm @dinosaure @reynir) * BUGFIX: tests/feedback.exe update with TLS 1.3 semantics, run as test (mirleft/ocaml-tls#501, @hannesm - reported by @dinosaure)
#493 rebased to include #495 & #496 plus an update of
tls-async
. The only remaining package about theno-cstruct
move istls-eio
.