
Create QUIC library that can be exposed to JS and uses the Node dgram module #1

Closed
7 tasks done
CMCDragonkai opened this issue Nov 20, 2022 · 219 comments
Labels: development, epic, r&d:polykey:core activity 4 (End to End Networking behind Consumer NAT Devices)

Comments

@CMCDragonkai
Member

CMCDragonkai commented Nov 20, 2022

Specification

We need QUIC in order to simplify our networking stack in PK.

QUIC is a superior UDP layer that can make use of any UDP socket, and create multiplexed reliable streams. It is also capable of hole punching either just by attempting to send ping frames on the stream, or through the unreliable datagrams.

Our goal is to make use of a QUIC library, something that is compilable to desktop and mobile operating systems, expose its functionality to JS, but have the JS runtime manage the actual sockets.

On Node.js, the dgram module can already manage the underlying UDP sockets. Relying on Node.js also ensures that these sockets mix well with the concurrency/parallelism used by the rest of the Node.js system (via libuv), avoiding a second IO system running in parallel.

On Mobile runtimes, they may not have a dgram module readily available. In such cases, having an IO runtime to supply the UDP sockets may be required. But it is likely there are already existing libraries that provide this like https://github.com/tradle/react-native-udp.

The underlying QUIC library is expected to be agnostic to the socket runtime. It will give you data that you need to send to the UDP socket, and it will take data that comes off the UDP socket.
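The receive half of that contract can be sketched roughly as follows. The `NativeConnection` interface and `toRecvInfo` helper are hypothetical stand-ins, not real APIs; they only illustrate how a runtime-owned dgram socket would drive a socket-agnostic QUIC library.

```typescript
// Hypothetical shapes for the socket-agnostic design sketched above.
interface HostPort {
  hostname: string;
  port: number;
}

interface NativeConnection {
  // Plumbs a received UDP datagram into the QUIC library's internal state
  recv(data: Uint8Array, info: { from: HostPort; to: HostPort }): number;
}

// Convert Node dgram's rinfo shape into what such a binding would expect
function toRecvInfo(
  rinfo: { address: string; port: number },
  local: HostPort,
): { from: HostPort; to: HostPort } {
  return {
    from: { hostname: rinfo.address, port: rinfo.port },
    to: local,
  };
}

// Wiring (the socket drives the connection, not the other way around):
// socket.on('message', (data, rinfo) =>
//   conn.recv(data, toRecvInfo(rinfo, localHost)));
```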

However it does have 2 major duties:

  1. The multiplexing and managing of streams.
  2. The TLS encryption/decryption side.

Again, if we want to stay cross-platform, we would not want to bind into Node.js's OpenSSL crypto. Instead the library would need to take a callback of crypto routines to use. However I've found that this is generally not the case with most existing QUIC libraries. But let's see how we go with this.

Additional context

QUIC and NAPI-RS

Sub issues

Tasks

  1. Experiment with Neon or Napi-rs
  2. Experiment with quiche by cloudflare
  3. Create bridge code plumbing UDP sockets and QUIC functions
  4. Create self signed TLS certificate during development - Create QUIC library that can be exposed to JS and uses the Node dgram module #1 (comment)
  5. Extract out the TLS configuration so that it can be set via in-memory PEM variable and key variable. - 2 day - see Enable TLS Configuration with in-memory PEM strings #2
  6. Decide whether application protocols are necessary here, or abstract the quiche Config so the user can decide this (especially since this is not a HTTP3 library). - 0.5 day - see Check if application protocols are required #13
  7. Fix the timeout events and ensure that when a timeout occurs, that the connection gets cleaned up, and we are not inadvertently clearing the timeout due to null. Right now when a quiche client connects to the server, even after closing, the server side is keeping the connection alive. - 1 day
  8. We need lifecycle events for QUICConnection and QUICStream and QUICServer and QUICSocket. This will allow users to hook into the destruction of the object, and perhaps remove their event listeners. These events must be post-facto events. - 0.5 day
  9. [ ] Test the QUICStream and change to BYOB style, so that there can be a byte buffer for it. Testing code should be able to use generator functions similar to our RPC handlers. - 1 day - see Fixed size buffer for QUICStream instead of object stream #5.
  10. Complete the QUICClient with the shared socket QUICSocket. - 3 days
  11. Test the multiplexing/demultiplexing of the UDP socket with multiple QUICClient and a single QUICServer. - 1 day - See Test connection multiplexing across multiple servers and clients #14
  12. [ ] Test the error handling of the QUIC stream, and life cycle and destruction routines. - 1 day - Create tests for QUICStreams #10
  13. Benchmark this system, by sending lots of data through. - 1 day - See Create benchmarks for sending data #15
  14. Propagate the rinfo from the UDP datagram into the conn.recv() so that the streams (either during construction or otherwise) can have its rinfo updated. Perhaps we can just "set" the rinfo properties of the connection every time we do a conn.recv(). Or... we just mutate the conn parameters every time we receive a UDP packet. - See Propagate connection info along side stream creation #16
  15. Ensure that when a user asks stream.connection they can acquire the remote information and also all the remote peer's certificate chain. - See Propagate connection info along side stream creation #16
  16. [ ] Integrate dgram sending and recving for hole punching logic. - see Allow Bidirectional Hole Punching from QUICClient and QUICServer #4
  17. [ ] Integrate the napi program into the scripts/prebuild.js so we can actually build the package in a similar way to other native packages we are using like js-db - see Packaging up js-quic for Linux, Windows, MacOS #7
@CMCDragonkai CMCDragonkai added the development Standard development label Nov 20, 2022
@CMCDragonkai CMCDragonkai changed the title to Create QUIC library that can be exposed to JS and uses the Node dgram module or libuv UDP socket Nov 20, 2022
@CMCDragonkai CMCDragonkai self-assigned this Nov 20, 2022
@CMCDragonkai
Member Author

CMCDragonkai commented Nov 22, 2022

Ok I have managed to get quiche to produce a packet. I haven't actually sent the packet to the UDP socket yet, but I can see how quiche attempts to create an initial packet for me to plumb into the UDP socket.

This involved several challenges.

The first is that self.0.send takes in a reference to a mutable buffer. In Rust there's a preference to avoid copying, similar to C and C++.

This buffer can actually be passed in from the outside, back in JS land. This might be a better idea, as it minimises the amount of copying going on.

Alternatively we can create the array in Rust, then return as a Buffer. However the send call returns the number of bytes written.

Therefore the buffer is pre-allocated to be MAX_DATAGRAM_SIZE and it is subsequently written to.

This means later the array needs to be sliced according to the write size and then returned as a buffer.

This means we generally have 2 styles of calling the Rust functions. We can call them the way Rust expects them to be called, by passing in a buffer as a parameter and returning the write length as well. It turns out that socket.send actually easily supports taking a buffer, offset and length to be used: https://nodejs.org/api/dgram.html#socketsendmsg-offset-length-port-address-callback. In terms of speed, it would make sense to re-use the same buffer over and over.

The other way is more idiomatic JS, which would mean that the function should return the buffer to be sent that is properly sliced up.

I think we should preserve the rust style, and then do any JS-specific idiomatic abstractions on the JS side.
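The Rust-style convention can be sketched like this: one pre-allocated buffer that the native send writes into, returning the number of bytes written, which is then handed straight to socket.send via its (msg, offset, length) form. `nativeSend` and `socketSend` are hypothetical stand-ins for the napi-rs binding and the dgram socket.

```typescript
// Sketch of the "Rust style" calling convention described above.
// MAX_DATAGRAM_SIZE of 1350 is an assumed value, as in quiche's examples.
const MAX_DATAGRAM_SIZE = 1350;
const out = Buffer.alloc(MAX_DATAGRAM_SIZE);

function flushOne(
  nativeSend: (buf: Buffer) => number, // writes into buf, returns bytes written
  socketSend: (msg: Buffer, offset: number, length: number) => void,
): number {
  const written = nativeSend(out);
  // No slicing or copying: dgram's socket.send accepts offset and length
  socketSend(out, 0, written);
  return written;
}
```

Because the same buffer is reused across calls, nothing is allocated per packet on the JS side; the idiomatic-JS alternative would slice and return a fresh Buffer each time.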

At the same time, the send can also return an error which indicates that it is Done. If it is Done, then there are no more packets to send. Using a JS-style call, we can just return undefined in that situation. However I'm not entirely sure if this is actually possible using napi-rs. It seems you have to return a Rust value, and that it automatically gets converted to JS via the napi-rs macros.

With Rust-style calls, Done is returned as an exception, which is kind of strange but also doable. This is very much different from how JS is normally used, and it means we also have to catch non-Done exceptions.

Another thing to realise: whenever you have a buffer of some fixed size, and then a write length indicating how much of that buffer was used, this will always eventually lead to a slicing operation.

When doing any slicing, the function's stack will end up with a pointer (reference) to the slice, it won't do any copying.

Rust does not have dynamically allocated arrays that sit on the stack, so you have to use vectors instead. Vectors are heap-allocated arrays that can be dynamically resized. String is likewise the heap-allocated counterpart to &str. At the same time, Rust manages the memory of vectors and knows when to deallocate them automatically.

So if we wanted to slice out from the array and also return it as a Buffer to JS, then it has to be turned into a vector first, and that vector will be dynamically allocated simply through the vec! macro.

Finally, another problem is sending structs back to JS as objects. This requires the use of the #[napi(object)] macro. However this only works on structs whose fields are all public.

The send info contains information that is necessary to know how to send the packet data to the UDP socket.

It however ends up containing std::net::SocketAddr and std::time::Instant, both of which contain private fields. So these types cannot easily be automatically converted to JS objects. It could probably be done if I implement some sort of interface for the SocketAddr, like https://docs.rs/napi/latest/napi/trait.NapiRaw.html and https://docs.rs/napi/latest/napi/trait.NapiValue.html.

Alternatively, I created my own Host struct that just contains a String and u16. This results in a conversion from the SocketAddr to Host, and wrapping std::time::Instant into External::new() because it's supposed to be opaquely compared at the Rust level.

This led to a few other issues:

  1. You cannot declare nested structs in Rust, you have to give each one a separate name.
  2. There does not appear to be a way to send Rust tuples to JS, napi-rs does not understand how to deal with tuples. I would have thought that they just become arrays on the JS side... but no.
  3. Rust enums cannot be represented either, they become objects on the JS side. This is because Rust enums are actually ADTs. They have a tag, and then the subsequent value/values.

So with all this in mind, I ended up creating another custom struct just so I can get the value out of conn.send.

This reveals a structure that always contains the initial packet. This is probably the packet that leads to connection negotiation. If you call it again right now, you get a JS error with Done as the message.

{
  out: <Buffer c3 00 00 00 01 10 98 f4 9d 84 85 7a dd 8e f6 ef fe 25 d7 84 b3 e1 14 d3 46 1a ab 5d d2 25 f2 61 c7 0d b0 d7 83 10 88 f0 b3 f8 eb 00 40 e8 58 ad 8e 1e ... 1150 more bytes>,
  info: {
    from: { hostname: '127.0.0.1', port: 55551 },
    to: { hostname: '127.0.0.1', port: 55552 },
    at: [External: 2d723f0]
  }
}

Now we can just plumb this into the dgram module to send.
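That plumbing can be sketched as below, assuming the { out, info } shape logged above. `DgramLike` narrows the dgram socket to just the send overload used here; everything else is illustrative.

```typescript
// Hand the packet produced by the native send straight to a dgram socket.
interface HostPort {
  hostname: string;
  port: number;
}

interface DgramLike {
  // Matches dgram.Socket's (msg, port, address, callback) send overload
  send(
    msg: Buffer,
    port: number,
    address: string,
    callback: (err: Error | null) => void,
  ): void;
}

function sendPacket(
  socket: DgramLike,
  out: Buffer,
  info: { from: HostPort; to: HostPort },
): void {
  socket.send(out, info.to.port, info.to.hostname, (err) => {
    if (err != null) throw err; // quiche itself never sees socket errors
  });
}
```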

@CMCDragonkai
Member Author

When should the send method be called? Apparently it should be triggered under a number of different conditions:

https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.send

And in fact it should be looped until it is Done.

The conditions are:

The application should call send() multiple times until Done is returned, indicating that there are no more packets to send. It is recommended that send() be called in the following cases:

  • When the application receives QUIC packets from the peer (that is, any time recv() is also called).
  • When the connection timer expires (that is, any time on_timeout() is also called).
  • When the application sends data to the peer (for example, any time stream_send() or stream_shutdown() are called).
  • When the application receives data from the peer (for example any time stream_recv() is called).

Once is_draining() returns true, it is no longer necessary to call send() and all calls will return Done.
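On the JS side, a flush loop under the "Done is thrown" convention discussed earlier might look like this. `nativeSend` and `transmit` are hypothetical stand-ins for the binding's send call and the socket write.

```typescript
// Keep calling the native send until it throws 'Done', forwarding every
// packet produced; real errors must still propagate.
function flushUntilDone(
  nativeSend: () => Buffer, // assumed to throw Error('Done') when finished
  transmit: (packet: Buffer) => void,
): number {
  let count = 0;
  for (;;) {
    let packet: Buffer;
    try {
      packet = nativeSend();
    } catch (e) {
      if (e instanceof Error && e.message === 'Done') break;
      throw e; // non-Done exceptions are genuine errors
    }
    transmit(packet);
    count += 1;
  }
  return count;
}
```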

@CMCDragonkai
Member Author

So I can imagine attaching callbacks to things like on_timeout to trigger "sending", and any time we receive packets, send data to the peer, or receive data from the peer.

We could call this "socketFlushing"; it will loop until Done. The socket flushing would be triggered every time any of the above things happen, which in turn will flush all the packets to the socket. Will it wait for the socket to have actually sent them by waiting on the callback? I'm not sure if that's necessary.

I guess it depends. On the QUIC stream level, it should be considered independent; you don't wait for the socket to be flushed before returning to the user. But one could await a promise that resolves when they really want the entire socket to be flushed.

@CMCDragonkai
Member Author

Ok I can see that there is no support for tuples at all here: https://github.com/napi-rs/napi-rs/tree/main/examples/napi. Also none for tuple structs, which I use in a newtype-style pattern.

The only thing closest is to create an array like so:

#[napi]
fn to_js_obj(env: Env) -> napi::Result<JsObject> {
  let mut arr = env.create_array(0)?;
  arr.insert("a string")?;
  arr.insert(42)?;
  arr.coerce_to_object()
}

In the generated TS it looks like an object.

I reckon it would be dynamically destructurable as an array. Pretty useful here, to avoid having to define one-off structs.

@CMCDragonkai
Member Author

I've settled on just returning an Array. It beats having to redefine the output type all the time.
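On the TS side that array destructures with tuple-like ergonomics. The element types below are illustrative, not the binding's real signature.

```typescript
// A plain JS array standing in for a Rust tuple returned via napi-rs.
type SendResult = [out: Buffer, written: number];

// Illustrative producer; in practice this value would come from the binding
function exampleSendResult(): SendResult {
  return [Buffer.from([1, 2, 3]), 3];
}

// Destructured exactly as if it were a tuple
const [out, written] = exampleSendResult();
```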

@CMCDragonkai
Member Author

Streams in quiche don't have their own data type. Instead they are identified by a stream ID. The connection struct provides the methods, e.g. https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.stream_recv, and you have to keep track of the stream IDs somehow.

So we would expect that upon creating an RPC call, it would call into the network to create a new stream for a connection.

It seems creating a new stream is basically performing this call: https://docs.rs/quiche/0.16.0/quiche/struct.Connection.html#method.stream_send

You select a stream ID like 0, and you just keep incrementing that number to ensure you have unique stream IDs. The client side just increments the number; the server side just receives the streams.

Note that once we have a stream, the stream is fully duplex (not half-duplex), in-order and reliable.
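One wrinkle on "keep incrementing": in QUIC (RFC 9000) the two low bits of a stream ID encode who initiated the stream and whether it is bidirectional, so client-initiated bidirectional streams are 0, 4, 8, ... and the allocator steps by 4 within that class. A minimal sketch:

```typescript
// Allocates stream IDs within one class of QUIC streams.
// 0 = client-initiated bidirectional; stepping by 4 stays in the same class.
class StreamIdAllocator {
  private next: number;
  constructor(first: number = 0) {
    this.next = first;
  }
  allocate(): number {
    const id = this.next;
    this.next += 4;
    return id;
  }
}
```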

@CMCDragonkai
Member Author

The quiche::connect and quiche::accept should both be factory methods for creating the Connection class.

However in the case of the server side, you can't create a new connection until you've taken a specific packet from the UDP socket and parsed it first. You have to check the internal packet information to decide whether to accept a new connection, or to simply plumb it in as if you received the data for a given stream.

So the order of operations:

  1. Receive packet from UDP socket.
  2. Check if the packet is a new connection, if so, create the new connection, otherwise retrieve an existing connection.
  3. Call conn.recv() to plumb the packet data into some relevant stream.

This does mean we maintain a clients hashmap of ConnectionId => Connection.
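The accept-or-lookup step can be sketched like so. Buffers don't compare by value as Map keys, so the connection ID is hex-encoded first; `accept` stands in for quiche::accept.

```typescript
// Step 2 of the order of operations above: find an existing connection for
// this connection ID, or accept a new one.
function getOrAcceptConnection<C>(
  conns: Map<string, C>,
  dcid: Buffer,
  accept: () => C,
): C {
  const key = dcid.toString('hex'); // value-based key for the Map
  let conn = conns.get(key);
  if (conn === undefined) {
    conn = accept();
    conns.set(key, conn);
  }
  return conn;
}
```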

@CMCDragonkai
Member Author

CMCDragonkai commented Nov 24, 2022

I created a function to randomly generate the connection ID.

It's a random 20 bytes.

Instead of using Rust's ring library to get the random data, I'm just calling into a passed-in JS function to acquire the random data. This unifies the random data usage directly with the Node runtime (and reduces how much Rust code we are using here).

/// Creates random connection ID
///
/// Relies on the JS runtime to provide the randomness system
#[napi]
pub fn create_connection_id<T: Fn(Buffer) -> Result<()>>(
  get_random_values: T
) -> Result<External<quiche::ConnectionId<'static>>> {
  let scid = [0; quiche::MAX_CONN_ID_LEN].to_vec();
  let scid = Buffer::from(scid);
  get_random_values(scid.clone()).or_else(
    |err| Err(Error::from_reason(err.to_string()))
  )?;
  let scid = quiche::ConnectionId::from_vec(scid.to_vec());
  eprintln!("New connection with scid {:?}", scid);
  return Ok(External::new(scid));
}

However I realised that since it's just a 20-byte buffer, instead of passing an External<quiche::ConnectionId<'static>> around, we can just take a Buffer directly that we expect to be of length MAX_CONN_ID_LEN.

Meaning the connection ID is just made in JS side, and then just passed in.

On the Rust side, they can use quiche::ConnectionId::from_ref(&scid), which means there's no need to keep any memory on the Rust side. It's all just on the JS side.

This basically means we generally try to keep as much data on the JS side, and manage minimal amounts of memory on the Rust side.
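The JS-side generation then reduces to a few lines: a 20-byte (quiche's MAX_CONN_ID_LEN) random buffer from the runtime's own CSPRNG, which the Rust side can borrow via ConnectionId::from_ref.

```typescript
// Generate the connection ID entirely on the JS side.
import { randomFillSync } from 'node:crypto';

const MAX_CONN_ID_LEN = 20; // matches quiche::MAX_CONN_ID_LEN

function createConnectionId(): Buffer {
  const scid = Buffer.alloc(MAX_CONN_ID_LEN);
  randomFillSync(scid); // fill in place from Node's CSPRNG
  return scid;
}
```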

@CMCDragonkai
Member Author

The <'static> is used because it's sort of a dummy lifetime: the ConnectionId is actually an enum of either an owned vector or a reference to some memory that exists elsewhere, and the owned vector doesn't need a lifetime annotation.

@CMCDragonkai
Member Author

A better alternative to just Buffer would be our own ConnectionId newtype.

But I can't seem to create a good constructor for it that would take a callback. Problem in napi-rs/napi-rs#1374.

@CMCDragonkai
Member Author

CMCDragonkai commented Nov 24, 2022

On the connecting side, there's an option for the server name.

If QUIC uses the standard TLS verification, we could imagine that this will not work for us. I'm not entirely sure.

We need to provide our custom TLS verification logic. I haven't found how to do this yet with QUIC but it should be possible.


Found it: cloudflare/quiche#326 (comment)

It is in fact possible.

The standard certificate verification must be disabled, and you have to fetch the certificate itself to verify.

@CMCDragonkai
Member Author

The stream methods are mostly done on the rust side.

I've found that the largest type I can make use of is i64, not u32. This gets closer to what JS numbers are capable of, although not all of i64 can be represented in JS. But those are the edge values, and we're unlikely to reach them.

So I've converted to those. I'm still using as instead of the "safer" variants try_into or try_from because it's just more convenient to coerce the numbers.

I've also found how to write the FromNapiValue trait. Basically it ends up calling NAPI-related C++ functions that convert the JS value (known as NAPI value) into a C-like value, and that value can be stored in Rust.

So basically whenever napi-rs can't automatically derive the marshalling code, we have to implement traits like FromNapiValue and ToNapiValue ourselves.

I think the napi macros also expose ways of writing out exactly what the generated types should be. I'm still not entirely sure we want to use the generated types from napi-rs, because we can do it like in js-db where we just write out our TypeScript types manually.

@CMCDragonkai
Member Author

On the JS side, I'm hoping to manage most of the state there. That means things like a map of connection IDs to connection objects, and maps of stream IDs to stream data.

That way the Rust part stays as pure as possible. We would only have 1 UDP socket for all of this, so that would simplify the network management.

@CMCDragonkai
Member Author

It turns out in order to send dgrams, you need to have a connection object created. However if we aren't calling conn.send, we aren't sending the initial packet for handshaking purposes.

This means, we can use conn.dgram_send for the hole punching packets first, before performing the conn.send which would initiate the handshaking process.

One thing I'm not sure about is how the TLS interacts with the dgram send. For hole punching packets we don't care about encrypting the data that much, but if we can encrypt the hole punching packets that would be nice. However this would at the very least require some handshake for the TLS to work... and if we are hole punching then we cannot have already done the TLS handshake.

@CMCDragonkai
Member Author

I've started modularising the rust code for consumption on the TS side.

Some things I've found out.

For enums, there are 2 ways to expose third party enums to the TS side.

The first solution is to do something like this:

use napi::bindgen_prelude::{
  Error,
  FromNapiValue,
  Result,
  Status,
  sys,
};

#[napi]
pub struct CongestionControlAlgorithm(quiche::CongestionControlAlgorithm);

impl FromNapiValue for CongestionControlAlgorithm {
  unsafe fn from_napi_value(env: sys::napi_env, value: sys::napi_value) -> Result<Self> {
    let value = i64::from_napi_value(env, value)?;
    match value {
      0 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::Reno)),
      1 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::CUBIC)),
      2 => Ok(CongestionControlAlgorithm(quiche::CongestionControlAlgorithm::BBR)),
      _ => Err(Error::new(
        Status::InvalidArg,
        "Invalid congestion control algorithm value".to_string(),
      )),
    }
  }
}

The basic idea is the new type pattern, but then we have to implement the FromNapiValue trait, and marshal numbers from JS side to the equivalent Rust values.

The problem here is that the generated code results in class CongestionControlAlgorithm {}, which is just an object type.

On the TS side, there's no indication of how you're supposed to construct this object.

We would most likely have to create a constructor or factory method implementation that enables the ability to construct CongestionControlAlgorithm like CongestionControlAlgorithm.reno() or CongestionControlAlgorithm.bbr().

However this is kind of verbose for what we want.

Another alternative is to use the https://napi.rs/docs/concepts/types-overwrite on the method that uses CongestionControlAlgorithm.

#[napi(ts_arg_type = "0 | 1 | 2")] algo: CongestionControlAlgorithm,

However I found another way: redefine the enum under the #[napi] macro:

/// Equivalent to quiche::CongestionControlAlgorithm
#[napi]
pub enum CongestionControlAlgorithm {
  Reno = 0,
  CUBIC = 1,
  BBR = 2,
}

Then this is converted to TS like:

/** Equivalent to quiche::CongestionControlAlgorithm */
export const enum CongestionControlAlgorithm {
  Reno = 0,
  CUBIC = 1,
  BBR = 2
}

Then when we use it, we use a match to map the values from our enum to the third-party enum.

  #[napi]
  pub fn set_cc_algorithm(&mut self, algo: CongestionControlAlgorithm) -> () {
    return self.0.set_cc_algorithm(match algo {
      CongestionControlAlgorithm::Reno => quiche::CongestionControlAlgorithm::Reno,
      CongestionControlAlgorithm::CUBIC => quiche::CongestionControlAlgorithm::CUBIC,
      CongestionControlAlgorithm::BBR => quiche::CongestionControlAlgorithm::BBR,
    });
  }

This application of the match can be a bit verbose... so if we want to do it multiple times, it can be abstracted into an internal function next to the enum.

Maybe we can provide an Into trait implementation? https://doc.rust-lang.org/rust-by-example/conversion/from_into.html That might help automate this process.

@CMCDragonkai
Member Author

By doing this:

/// Equivalent to quiche::CongestionControlAlgorithm
#[napi]
pub enum CongestionControlAlgorithm {
  Reno = 0,
  CUBIC = 1,
  BBR = 2,
}

impl From<CongestionControlAlgorithm> for quiche::CongestionControlAlgorithm {
  fn from(algo: CongestionControlAlgorithm) -> Self {
    match algo {
      CongestionControlAlgorithm::Reno => quiche::CongestionControlAlgorithm::Reno,
      CongestionControlAlgorithm::CUBIC => quiche::CongestionControlAlgorithm::CUBIC,
      CongestionControlAlgorithm::BBR => quiche::CongestionControlAlgorithm::BBR,
    }
  }
}

impl From<quiche::CongestionControlAlgorithm> for CongestionControlAlgorithm {
  fn from(item: quiche::CongestionControlAlgorithm) -> Self {
    match item {
      quiche::CongestionControlAlgorithm::Reno => CongestionControlAlgorithm::Reno,
      quiche::CongestionControlAlgorithm::CUBIC => CongestionControlAlgorithm::CUBIC,
      quiche::CongestionControlAlgorithm::BBR => CongestionControlAlgorithm::BBR,
    }
  }
}

We get bidirectional transformation of the enums, and we can use algo.into() generically.

@CMCDragonkai
Member Author

The pacing at field gives us an Instant, which is not consistent across platforms. On non-macOS, non-iOS and non-Windows, this can be converted into a timespec.

#[cfg(not(any(target_os = "macos", target_os = "ios", target_os = "windows")))]
fn std_time_to_c(time: &std::time::Instant, out: &mut timespec) {
    unsafe {
        ptr::copy_nonoverlapping(time as *const _ as *const timespec, out, 1)
    }
}

#[cfg(any(target_os = "macos", target_os = "ios", target_os = "windows"))]
fn std_time_to_c(_time: &std::time::Instant, out: &mut timespec) {
    // TODO: implement Instant conversion for systems that don't use timespec.
    out.tv_sec = 0;
    out.tv_nsec = 0;
}

A timespec is basically a timestamp with a seconds component and a nanoseconds component.

There aren't any portable Rust functions right now that convert a std::time::Instant to something like milliseconds, where I could just turn it into a JS timestamp such as a Date.

On the other hand, it is intended to be an opaque type, which is why I have it as an External. However this doesn't help us figure out how to pace the sending of the packet data.

One idea is to just do what the ffi.rs does and convert it to a timespec and then extract out the underlying second and nanosecond data and convert to milliseconds or Date object on JS.

However there's no information on how to do this for the macOS, iOS and Windows cases. So it's kind of useless at the moment.
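If the seconds and nanoseconds of the timespec were extracted on the Rust side, the JS-side conversion to milliseconds (for a Date or a timer) is trivial; the field names below mirror C's struct timespec.

```typescript
// Convert a timespec (seconds + nanoseconds) to a JS millisecond value.
function timespecToMillis(tvSec: number, tvNsec: number): number {
  return tvSec * 1000 + tvNsec / 1e6;
}
```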

@CMCDragonkai
Member Author

I think pacing isn't necessary since it isn't used in the examples, so we can leave it out for now. But the generated types won't be usable as they refer to Instant.

I'm not entirely sure how generated types are supposed to work with External. Unless you're supposed to just create a dummy class type for that type that it is returning.

I'll likely end up writing our own TS file instead of relying on the generated code, but the generated code is useful for debugging.

@CMCDragonkai
Member Author

Ok so I'm starting to experiment with the rust library by calling it from the JS.

One thing I'm unclear about is whether this library, @matrixai/quic, will essentially be about exposing the quiche library directly, or whether it intends to wrap it in its own set of abstractions.

In this library do we also create the socket, and thus expose a socket runtime, such as creating a class Server and thus a class Client?

It seems so, because quiche would expose quiche::accept to create a server connection, and quiche::connect to create a client connection.

We know that in both cases, they can use the same socket address, thus using 1 socket address to act as both client and server.

If we do this, we might just create a class Connection. However we already have this as provided by the native library.

So if we just expose the QUICHE design, then the actual socket management will be in PK.

@CMCDragonkai
Member Author

Ok so then only in the tests would we be making use of the sockets.

@CMCDragonkai
Member Author

Ok as I have been creating the QUIC server, I've found that there are many interactions with the socket during the connection establishment. It makes sense that this library would handle this complexity too. Things like address validation, stateless retries and other things all happen during connection establishment.

@CMCDragonkai
Member Author

Ok I'm at the point where it's time to deal with streams.

So therefore this library does have to do:

  1. Handle the interaction between the connection establishment and socket.
  2. Convert the QUIC streams into actual Web/DOM streams

So there's quite a bit of work still necessary on the JS side.

@CMCDragonkai
Member Author

The ALPNs may not be relevant unless we are working with HTTP3. It also seems we could just refer to our own special strings, since our application level protocol isn't HTTP3.

@CMCDragonkai
Member Author

We could do something like polykey1 as the ALPN. Where 1 would be the application layer protocol which would be RPC oriented...? I'm not sure if this is even needed.

@CMCDragonkai
Member Author

In the server.rs examples in quiche, we see a sort of structure like this:

loop {
  poll(events, timeout); // This is blocking on any event and timeout event
  loop {
    // If no events occurred, then it was a timeout that occurred
    if (!events) {
      for (const conn of conns) {
        // Calls the `on_timeout()` when any timeout event occurs
        conn.on_timeout();
      }
      break;
    }
    // Receive data from UDP socket
    packet = socketRecv();
    Header.parse(packet)
    // Do a bunch of validation and other stuff
    conn = Connection.accept();
    // Plumb into connection receive
    conn.recv();
    if (conn.inEarlyData() || conn.isEstablished()) {
      // handle writable streams (which calls streamSend())
      // handle readable streams (which calls streamRecv() then streamSend())
      // Note all of this is being buffered up in-memory, nothing gets sent out to the sockets yet
    }
  }
  // Now we process things to be sent out
  for (const conn of conns) {
    conn.send();
    socketTo();
  }
  // Now we GC all the conns that are dead
}

In JS we have to do things by registering handlers against event emitters, we cannot block the main thread. The above can be translated into several possible events.

  1. On socket message
  2. On connection timeout (any connection) (although we can be more specific)
  3. On stream data being written
  4. On connection being dead

For the first event, this is easy to do. The UDP dgram socket is already an event emitter; we simply register socket.on('message', handleMessage); and we can do the first part of the loop above, which is to parse the packet data, decide whether to do retries or other protocol actions, and then eventually accept a new connection or plumb the data in.

The final plumbing is a bit strange in terms of processing the streams. Right now it iterates over streams that are writable and then iterates over streams that are readable. In JS this would make sense as event propagation, except quiche doesn't expose a readable event for the stream objects (which aren't even objects, they are IDs relative to quiche). So as we iterate over the streams, we would convert these into events on actual stream objects that we create on the JS side, emitting those events as we iterate over readable and writable. This would only work for streamRecv().

For streamSend() we would have to build a "sink", meaning every time a user calls stream.write() it ends up calling streamSend(). However there's no callback there; it's all synchronous. So how does one know when their stream write has been flushed to the socket? There's no way to know, because quiche decides on the multiplexing. Could that mean stream.write(data, () => { /* resolves only when buffered into quiche, does not mean data is flushed to the socket */ })?
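The event wiring itself can be sketched in a few lines: every trigger that quiche's docs list for send() schedules the same flush routine. `quic` is a hypothetical emitter and the event names are illustrative; they mirror the loop translation above.

```typescript
// Every event that can leave packets pending in quiche schedules a flush.
import { EventEmitter } from 'node:events';

function wireFlush(quic: EventEmitter, flush: () => void): void {
  // UDP messages, connection timeouts, and application-level stream
  // sends/recvs all mean quiche may have packets waiting to go out
  for (const event of ['message', 'timeout', 'streamSend', 'streamRecv']) {
    quic.on(event, flush);
  }
}
```

In practice `flush` would be the loop-until-Done routine over all connections, followed by socket sends.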

@CMCDragonkai
Member Author

The above structure also implies that upon timeouts and readable socket events we also start to send packets out (the // Now we process things to be sent out step). That works because it is assumed that handling writable and readable streams ends up calling streamSend, which then puts data into quiche's buffer that needs to be sent out.

However in JS again, this doesn't seem idiomatic. Instead we should have an event emitter that emits an event whenever any streamSend occurs, indicating that there's data to be sent out. After all, any data we buffer into quiche should eventually be flushed.

Upon doing so, we would want to then perform the conn.send() which plumbs the data out and then socketTo().

Note how it iterates over all connections in an attempt to find whether there's any data to send. It would only make sense to send data from connections that have had stream sends.

Unless... it's possible that connections have data to be sent even if no stream has sent anything, for example keepalive packets. I'm not sure. But in that case, shouldn't that be done on the basis of timeouts?

All of this requires some prototyping.
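The emitter approach above could be sketched as a send scheduler that coalesces many "data was buffered into quiche" triggers into a single flush and then drains conn.send() until Done. This is a sketch under assumed names: `SendableConn` and `SendScheduler` are illustrative, not the real binding API:

```typescript
// Hypothetical subset of the quiche connection API: send() fills `buf`
// with one outgoing packet and returns the byte count, or null when
// there is nothing left to send (quiche's Done).
interface SendableConn {
  send(buf: Uint8Array): number | null;
}

// Coalesces repeated send triggers into one flush, then drains the
// connection until Done.
class SendScheduler {
  protected scheduled = false;
  constructor(
    protected conn: SendableConn,
    protected socketSend: (data: Uint8Array) => void,
  ) {}
  // Call this after every streamSend (or timeout) instead of flushing inline.
  schedule(): void {
    if (this.scheduled) return;
    this.scheduled = true;
    queueMicrotask(() => this.flushNow());
  }
  flushNow(): void {
    this.scheduled = false;
    const buf = new Uint8Array(1350); // typical QUIC max datagram size
    let written: number | null;
    while ((written = this.conn.send(buf)) != null) {
      this.socketSend(buf.slice(0, written));
    }
  }
}
```

Because the flush drains everything the connection has queued, it would also pick up non-stream data like keepalive packets, provided the timeout path also calls schedule().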

@CMCDragonkai
Member Author

I think the tls utilities can be rolled into utils.ts. Some of those utilities aren't really necessary.

We primarily need a way to just produce pem files from the JWK files.

@tegefaulkes
Contributor

I created a benchmark for sending data through a stream. It has a problem however: when running the benchmark we get an error relating to async-init lifecycle locking, something about reading isLocked of undefined. This sounds familiar and might be related to swc again.

Just to save time, as a stop-gap measure I made a quick dummy.ts benchmark that does some basic maths operations. Hopefully the bench results will allow the mac runner to pass.

I also made some changes to the CI yaml file to have build:windows allow failure. Hopefully the pipeline can complete.

@CMCDragonkai
Member Author

We are using swc, but we use it in js-db as well, and there are no problems there. I'm pretty sure we can make it work; something else is broken. Try to do the same thing with ts-node. If it breaks under benches, it would break under ts-node too.

@CMCDragonkai
Member Author

So definitely it should work.

@CMCDragonkai
Member Author

@tegefaulkes

BTW, we use swc regularly via ts-node. If swc were causing the problem, then no prototyping scripts would work when you bring in the QUIC classes you have been writing. You can always check by writing a regular ./test-something.ts script and running it with npm run ts-node ./test-something.ts.

@CMCDragonkai
Member Author

I'm doing my test review fixes in #26. This will be merged into staging.

In the meantime @tegefaulkes you should figure out how to get the benches working and finish off the CI/CD.

I don't think swc should be a problem, but you should just disable it and check. It shouldn't be failing here when it works for js-db.

@CMCDragonkai
Member Author

When you finish the benches and CI/CD work, push changes to staging so I can rebase with #26.

@tegefaulkes
Contributor

The macos:build job is still failing. This time due to a timeout with the client connecting to the server. This is odd since we don't have a problem connecting during the tests that ran just before the benchmarks.

This seems to be a platform-specific problem; I don't see it happening in the other jobs. Testing and fixing this through the CI is going to be slow: the job takes 20 min to complete, so the feedback loop is very long.

I can test on the local mac machine, but I'll need to configure the environment to build the library. It's not quite a 1:1 process with the CI mac platform, so it will take a little bit of working out.

Depending on the priority or what's required at this point, I can just disable the benches and the bench artefact requirements. Maybe the artefacts step has an option to allow no files to be uploaded without erroring?

If the goal is just to reach a published pre-release version then the quickest way is to prevent the benches from failing the job.

@CMCDragonkai
Member Author

I've already used the local Mac and built the library successfully. I don't think it's that difficult.

Try running the benches on the local Mac and find out why it's not working.

@CMCDragonkai
Member Author

CMCDragonkai commented May 22, 2023

Your tests work fine, which means it does connect. If your benches don't work... try doing it on the local Mac without swc enabled.

We are switching to the latest Mac soon on May 31st anyway. So whatever happens on our local Mac should be 1:1 with the CI/CD mac.

@tegefaulkes
Contributor

Fixed. We were binding to, and trying to connect to, the IPv4-mapped address. This was causing a problem on the mac.

We also fixed a side bug where the benchmark summary wasn't being awaited, resulting in concurrent behaviour.

I've pushed the fix, hopefully the CI will finally pass.

@CMCDragonkai
Member Author

IPv4-mapped IPv6 addresses should only be used when you are a dual-stack client attempting to connect to an IPv4 server.

It should not be used on the host-side to avoid confusion.
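For illustration, the mapped form can be detected and unmapped before binding on the host side. A minimal sketch; `toCanonicalIp` is a hypothetical helper, not part of the library:

```typescript
// Detect an IPv4-mapped IPv6 address like '::ffff:127.0.0.1' and recover
// the plain IPv4 form, so hosts bind and compare canonical addresses.
function toCanonicalIp(host: string): string {
  const prefix = '::ffff:';
  if (host.toLowerCase().startsWith(prefix)) {
    const rest = host.slice(prefix.length);
    // Only unmap when the suffix is a dotted-quad IPv4 address
    if (/^\d{1,3}(\.\d{1,3}){3}$/.test(rest)) return rest;
  }
  return host;
}
```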

@CMCDragonkai
Member Author

I noticed that maxReadableStreamBytes and maxWritableStreamBytes are separate parameters for the readable stream and the writable stream.

I think these should be part of our QUICConfig object; we are already adding custom things to it such as tlsConfig, although I'd rather flatten all of it together into a domain-specific configuration inside this library.

@CMCDragonkai
Member Author

I think verifyPem and verifyFromPemFile should be mutually exclusive options.

Also I believe these options are by convention called the ca or something. See the ca option in https://nodejs.org/docs/latest-v18.x/api/tls.html#tlscreatesecurecontextoptions. I think that's more conventional for our purposes.

@CMCDragonkai
Member Author

It appears the default CA stuff is already configured with BoringSSL. I think it uses whatever is already configured in the OS. I haven't confirmed this yet though, since I haven't tried using the QUICClient to connect to the default Cloudflare HTTPS test: https://cloudflare-quic.com. We should try that later.

In the meantime, I'll convert the verify* options to just a single ca option.

@CMCDragonkai
Member Author

It should look like this:

  ca?: string | Array<string> | Uint8Array | Array<Uint8Array>;

We should avoid using Buffer here as much as possible.

Anyway, this would go into the current with_boring_ssl_ctx.
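A polymorphic option like this usually needs normalizing before it crosses into the native layer. A minimal sketch; `normalizeCa` is a hypothetical helper, not part of the library:

```typescript
type CaOption = string | Array<string> | Uint8Array | Array<Uint8Array>;

// Normalize the polymorphic `ca` option into an array of PEM strings
// before handing it to the native layer.
function normalizeCa(ca: CaOption): Array<string> {
  const items: Array<string | Uint8Array> = Array.isArray(ca) ? ca : [ca];
  return items.map((item) =>
    typeof item === 'string' ? item : new TextDecoder('utf-8').decode(item),
  );
}
```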

@CMCDragonkai
Member Author

Node's TLS system by default uses its own bundled Mozilla certificates. A special flag, node --use-openssl-ca, switches it to using the OpenSSL CA store on the operating system.

With BoringSSL it might be behaving as if I were using --use-openssl-ca.

We should make sure we align our behaviour between websockets and quic here to ensure that we are always defaulting to the OS's openssl CA store.

Node's tls module defaults to its own bundled CA store: https://stackoverflow.com/questions/14619576/where-is-the-default-ca-certs-used-in-nodejs.

This means there's no actual need to have a file version of this; we can just use the ca option. I'm removing verifyFromPemFile as it's no longer needed.

@tegefaulkes
Contributor

CI is passing now. I've created a pre-release version, so that should be published once the CI completes again.

@tegefaulkes
Contributor

CI failed with pre-release on mac and linux, same problem for both.

> @matrixai/quic@0.0.2-alpha.0 prebuild
> node ./scripts/prebuild.js --production
Make sure that Cargo.toml version matches the package.json version
npm verb exit 1
npm verb code 1

It failed after checking whether the Cargo.toml version matches the package.json version. Since this is a pre-release, the package version is 0.0.2-alpha.0. Is it even possible for the Cargo.toml to have a version number like that?

@CMCDragonkai mentioned that the pre-release should update the cargo version to match automatically? I'm not sure it's doing that yet. Do we need to update the cargo version manually? Can we set it to the pre-release version number?

@CMCDragonkai
Member Author

The cargo version has to be set manually. There's no automatic version update; see the postversion command...

On the other hand maybe we can update the cargo version...

But I don't know if cargo understands those prerelease version tags. Check if that even exists in Cargo.

If they do exist then we could make postversion do an automatic replacement.

If not, then we may need to discard prerelease tags entirely for this project.

@CMCDragonkai
Member Author

With the closing of #17:

This is done and released as pre-release. However there are some issues and bugs and we are refactoring connection and stream lifecycles and increasing the amount of native state transition tests in #26.

Closing this.

@CMCDragonkai
Member Author

In the future, note that napi-rs seems to have a problem with callback-related code, which we ran into with our custom TLS verification. Watch these 2 threads, as they are currently unanswered:

  1. Why does failing to call a callback result in a panic? napi-rs/napi-rs#1373
  2. Callback usage in factory method results in macro error napi-rs/napi-rs#1374 - author has labelled this as a bug now

Additionally there are some issues with using callbacks in Rust, especially relating to the context. @tegefaulkes, can you write some notes down here about what you learned about callbacks so we have something to refer back to?

@tegefaulkes
Contributor

Sure, I can write some up. Do you want them here?

@CMCDragonkai
Member Author

Yes for now.

@tegefaulkes
Contributor

I've explained the kinds of callbacks that can be done in this comment #30 (comment).

Really there are only 3 methods, since method 3 in the comment seems to be a very old way of doing it.

Method 1 is the simplest, as it handles all of the type conversions between Rust and Node automatically. Args and results are just used as the Rust types and converted to Node types. This function is synchronous and, as far as I know, must only be called within the context of that native function. It may be possible to store the callback and use it later by creating a reference to it to prevent garbage collection, but I'm not sure.

Method 2 is similar to 1 but with more boilerplate. Args and results are JsValue types. To create new JsValues you must use the Env, with something like env.create_string('something') to create a string that can be used with the callback. The callback can be called synchronously. It can even be stored for later by creating a reference to it to prevent garbage collection, and can be called anywhere an Env is available.

Both methods 1 and 2 can be asynchronous, since tasks can be converted to a Promise. I haven't looked too deeply into how to do this.

Both methods 1 and 2 can only be called from the main thread. References and Env explicitly can't be passed between threads; this is important for later. To call a callback across threads, a JsFunction must be converted to a ThreadsafeFunction, which provides an Env when it is called.

Method 4 is similar to Method 2, but the JsFunction is wrapped in a ThreadsafeFunction within a callback |env: Env| { /* calling callback here */ }. This can be moved between threads and called, but there is an annoying limitation: you can call the function and pass args synchronously, BUT you can't get the result of the callback synchronously. You need to await it in the Rust code for the result. This is by design, since ThreadsafeFunctions are added into a queue when called.

This brings us to our use case. The verify_callback that we need to create to override the verification process is the worst-case scenario. By its type definition the callback moves between threads and is synchronous. So we are forced to use a ThreadsafeFunction just to be able to call it, but we can't asynchronously await the result of the callback. We can, however, pass some verify_callback-specific information through this callback.

General summary.

  1. method 1
    a. Is the simplest
    b. Can't be passed around, I haven't confirmed this.
    c. Can be both synchronous and async.
  2. method 2
    a. Slightly more complex.
    b. Can be passed around the main thread but not other threads.
    c. Can be both synchronous and async.
  3. method 4
    a. Most complex
    b. Can be passed around everywhere relatively simply.
    c. Can be both synchronous and async.
    d. Can call with args synchronously to pass data.
    e. Can't get results synchronously.

@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices label Jul 9, 2023
@CMCDragonkai
Member Author

Upstream has labelled this as a bug now. And we also aren't using callbacks for now since we figured out an alternative for custom TLS.
