Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review - Tests, Key and Certificate Generation, Concurrency Control, Custom TLS Verification #26

Merged
merged 22 commits into from
Jul 6, 2023

Conversation

CMCDragonkai
Copy link
Member

@CMCDragonkai CMCDragonkai commented May 21, 2023

Description

This is a review PR for the tests and crypto code that is being used in the tests.

Issues Fixed

Tasks

  • 1. Remove all the tests/fixtures in favour of auto-generated certificates and keys
  • 2. Replace with the new key generation that is more flexible
  • 3. Replace with certificate generation that supports RSA, ECDSA and Ed25519 keys
  • 4. Review why Ed25519 certs have a slow shutdown - only specific cases where the client fails to verify the server TLS certificate and also for ECDSA, there is a 3 second timeout at the end, this is built into quiche.
  • 5. Review why there might be stream lifecycle problems when we increase the stream concurrency between connections. 2 days
  • 8. Figure out if js-timer can be used for repetition (like reset the time and run again the handler). 0.5 days
  • 6. Use js-timer for keepAliveIntervalTimer. 0.5 days
  • 7. Use js-timer for connTimeoutTimer. 0.5 days
  • 9. Custom TLS verification. 3days, 0.5-1 day left
    • Native tests - have the native callback set, and then run the JS callback directly
    • JS tests
    • Send the ack eliciting frame after verification or after sending a short frame
    • This sort of proves that both sides are fully verified for our connection establishment
  • 10. Native tests required for stream life cycle, this will prove how the streams actually work internally. 1 day
  • 11. Test with respect to Review - Tests, Key and Certificate Generation, Concurrency Control, Custom TLS Verification #26 (comment) - test out the max idle timeout situation and also the high water mark of 0, ensure that no JS buffering of streams is occurring. 1 day
  • 12. Monitor integration into the atomic set of operations discussed here Review - Tests, Key and Certificate Generation, Concurrency Control, Custom TLS Verification #26 (comment) - required parameter because QUICConnection methods there are all internal. 1 day
  • 13. maxIdleTimeout needs to default to 1 min, the timeout for starting a connection and the keep alive needs to be strictly less than the maxIdleTimeout. If maxIdleTimeout is 0 then this constraint is ignored.
  • 14. The flow of events for the connection should be diagrammed with an activity diagram: Review - Tests, Key and Certificate Generation, Concurrency Control, Custom TLS Verification #26 (comment) - need to verify all these steps here

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@ghost
Copy link

ghost commented May 21, 2023

👇 Click on the image for a new way to code review

Review these changes using an interactive CodeSee Map

Legend

CodeSee Map legend

@CMCDragonkai
Copy link
Member Author

I've come to the conclusion that JsonWebKey is a far better key object to be used for portability even in-memory.

Right now in PK we are still using in-memory buffers, and that's fine, but if we want flexibility, then JsonWebKey is far superior.

So right now I've actually removing tests/tlsUtils.ts and replacing it with more generic functions in just tests/utils.ts that will be able to generate RSA, ECDSA, and Ed25519 keypairs and certificates while using JsonWebKey as the main portable format.

Note that it's easier to protect the memory of keys if they are buffers. But more difficult to do so for JsonWebKey objects. So in PK we only use Ed25519 keys so that's fine. But if we want to support more keypairs, then it may be a good idea to bring in the code I have right now and integrate it into PK.

@CMCDragonkai
Copy link
Member Author

The main libraries required here are just:

  1. @peculiar/webcrypto
  2. @peculiar/x509

That's it.

The ASN1 libraries are not needed. This is because we solely rely on peculiar's webcrypto to do everything.

The only reason PK is using the ASN1 libraries is to prepare to be more portable and eventually not rely on @peculiar/webcrypto when we are on mobile operating systems.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 21, 2023

So in the fixtures, you generate regular keys and certs, and also you generate CAs.

I'm going to do this all dynamically now. Even CA signing is just a matter of having 2 keypairs.

In fact these keypairs can be different. The signature of the child cert however would be signed by the parent's private key.

@CMCDragonkai
Copy link
Member Author

@tegefaulkes I've noticed that you don't have any tests for QUICServer. No QUICServer.test.ts? Why is this?

@CMCDragonkai
Copy link
Member Author

I'm creating these.

@CMCDragonkai
Copy link
Member Author

@tegefaulkes make sure that exported types are at the bottom.

@tegefaulkes
Copy link
Contributor

Most tests for QUICServer depend heavily on the QUICClient. they are tested as a pair in the QUICClient.test.ts test file. That one test file has gotten pretty large, We could split it up and rename it. But I'm not sure how many tests we can make that test the QUICClient or QUICServer in isolation.

@CMCDragonkai
Copy link
Member Author

The way you do tests between client and server is that server just runs tests on starting the server, with utility functions creating the client. In this sense you are "fixing the client", and testing variation on the server.

On the client, you "fix the server" and you test variations on the client.

Yes they can be tested together, but it depends on how you create tests. Whenever you're testing things you must pick your independent variables and dependent variables.

Ideally you fix everything except one thing. That reduces the amount of state space you need to cover.

Of course the more independent variables that exist, the more multidimensional variation that can exist, that's why we use fast check too to test higher dimensional spaces of configuration.

But you always start with the simplest smallest stuff first before building up to the more complex tests, because the first smaller tests are your sanity check.

@CMCDragonkai
Copy link
Member Author

@tegefaulkes types should always be in types.ts. You put TlsConfig into config.ts.

@CMCDragonkai
Copy link
Member Author

Here's an interesting idea for testing UDP sockets.

There's an option called reuseAddr.

Right now QUICSocket sets this to false... but why? It doesn't really matter to us if if something else were to bind to this port as well? Maybe there's a security issue here, but we could expose this as a configuration option.

Anyway... this enables the ability for another socket to bind to the same address/port. Or it even enables running the QUICSocket on a port something else is using at the same time. Thus enabling 2 QUICSocket on the same port.

Subsequently, you can now send a message to the port, and I believe the message is duplicated to both sockets.

This allows you have 2 sockets that receive the same message sent to it.

This can enable some interesting tests, so that one can also get a copy of the messages that is being sent to the UDP socket used by QUICSocket.

@CMCDragonkai
Copy link
Member Author

I'm adding reuseAddr as an additional option to QUICSocket.start that defaults to false as normal.

But we can enable it for testing, or if there are special reasons to have this later in Polykey.

@CMCDragonkai
Copy link
Member Author

On QUICServer if the the socket is not shared, it will expose reuseAddr option too to be set. Again defaulting to false.

@CMCDragonkai
Copy link
Member Author

Ok the above idea means that with unicast packets, it only gets sent to the last bound socket. So I don't get both packets on both ports.

Basically by doing the above, the new socket will end up taking over, and messages get sent the new socket.

However IF I change QUICSocket to bind to 0.0.0.0 which binds to all interfaces, it will end up receiving the broadcast packet.

So the only way to get a fan-out structure is to have all sockets bound on 0.0.0.0 and then send a broadcast packet to 255.255.255.255.

Now you end up getting packets on the shared sockets.

@CMCDragonkai
Copy link
Member Author

Anyway I'm not sure if we can use this unless we are sending broadcast packets. So I'll leave that solution to be explored later.

I tried also sending 2 unicast packets, both end up being sent to the same socket. I thought that there might be round-robin, but now I'm not sure.

@CMCDragonkai
Copy link
Member Author

One of the challenges using fast check with webcrypto is that we cannot seed webcrypto API.

In PK we are able to do this with Ed25519 because the private key is just a random bunch of bytes.

However for RSA, ECDSA, this is not possible.

So for now I don't think we will be using fast check on our keys and certificates, instead we will just always randomly generate them or use them across a number of tests.

@CMCDragonkai
Copy link
Member Author

Therefore, rather than generating the keys and certs all the time, we can just do it once at the beginning of each test module.

Something like this:

    let keyPairRSA: {
      publicKey: JsonWebKey;
      privateKey: JsonWebKey;
    };
    let certRSA: X509Certificate;
    let keyPairRSAPEM: {
      publicKey: string;
      privateKey: string;
    };
    let certRSAPEM: string;
    let keyPairECDSA: {
      publicKey: JsonWebKey;
      privateKey: JsonWebKey;
    };
    let certECDSA: X509Certificate;
    let keyPairECDSAPEM: {
      publicKey: string;
      privateKey: string;
    };
    let certECDSAPEM: string;
    let keyPairEd25519: {
      publicKey: JsonWebKey;
      privateKey: JsonWebKey;
    };
    let certEd25519: X509Certificate;
    let keyPairEd25519PEM: {
      publicKey: string;
      privateKey: string;
    };
    let certEd25519PEM: string;
    beforeAll(async () => {
      keyPairRSA = await testsUtils.generateKeyPairRSA();
      certRSA = await testsUtils.generateCertificate({
        certId: '0',
        subjectKeyPair: keyPairRSA,
        issuerPrivateKey: keyPairRSA.privateKey,
        duration: 60 * 60 * 24 * 365 * 10,
      });
      keyPairRSAPEM = await testsUtils.keyPairRSAtoPEM(keyPairRSA);
      certRSAPEM = testsUtils.certToPEM(certRSA);
      keyPairECDSA = await testsUtils.generateKeyPairECDSA();
      certECDSA = await testsUtils.generateCertificate({
        certId: '0',
        subjectKeyPair: keyPairECDSA,
        issuerPrivateKey: keyPairECDSA.privateKey,
        duration: 60 * 60 * 24 * 365 * 10,
      });
      keyPairECDSAPEM = await testsUtils.keyPairECDSAtoPEM(keyPairECDSA);
      certECDSAPEM = testsUtils.certToPEM(certECDSA);
      keyPairEd25519 = await testsUtils.generateKeyPairEd25519();
      certEd25519 = await testsUtils.generateCertificate({
        certId: '0',
        subjectKeyPair: keyPairEd25519,
        issuerPrivateKey: keyPairEd25519.privateKey,
        duration: 60 * 60 * 24 * 365 * 10,
      });
      keyPairEd25519PEM = await testsUtils.keyPairEd25519ToPEM(keyPairEd25519);
      certEd25519PEM = testsUtils.certToPEM(certEd25519);
    });

@CMCDragonkai
Copy link
Member Author

Reviewing Node's TLS module, I can see that you can actually set multiple keys and associated cert chains.

This actually enables the TLS system to use multiple keys and cert chains, possibly using different keys. This may actually solve the #17.

In the boring ssl codebase, the function set_private_key translates to calling SSL_CTX_use_PrivateKey.

I think this means it's possible to call this function multiple times.

See: https://github.com/nodejs/node/blob/92a938b4dd318a8732110da03f1f666db2e96f97/lib/internal/tls/secure-context.js#L160-L177

@CMCDragonkai
Copy link
Member Author

Once you can set multiple keys and certs, it's possible that it's the TLS client that figures out which key/cert to use to verify. This is something that's worth considering since we are going to enable Ed25519 here.

@CMCDragonkai
Copy link
Member Author

So boringssl does use the same environment variables:

»» ~/Projects/boringssl
 ♖ ag 'SSL_CERT_DIR'                                                                            (master) pts/7 18:37:51
crypto/x509/x509_def.c
71:#define X509_CERT_DIR_EVP "SSL_CERT_DIR"
»» ~/Projects/boringssl
 ♖ ag 'SSL_CERT_FILE'                                                                           (master) pts/7 18:37:56
crypto/x509/x509_def.c
72:#define X509_CERT_FILE_EVP "SSL_CERT_FILE"
»» ~/Projects/boringssl
 ♖ less crypto/x509/x509_def.c                                                                  (master) pts/7 18:38:01
»» ~/Projects/boringssl
 ♖ ag 'X509_CERT_DIR_EVP'                                                                       (master) pts/7 18:38:06
crypto/x509/x509_def.c
71:#define X509_CERT_DIR_EVP "SSL_CERT_DIR"
82:const char *X509_get_default_cert_dir_env(void) { return X509_CERT_DIR_EVP; }

That does mean that without setting the CA cert deliberately, it just relies on the default ca cert location on the operating system. On NixOS, that would be dependent on the cacert package.

This will be important when pushing the docker container image.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 23, 2023

There must also be a default signature algorithm set too. I'm not sure if I can find it in the boringssl source code. By default we want to also enable ed25519 (which is not enabled by default in boringssl). However there's way to extract the default from the boringssl codebase. So we just have to set it ourselves to be the default.

This is currently set to:

const supportedPrivateKeyAlgosDefault =
  'ed25519:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512';

I think it's better to use the new TLS v1.3 scheme names as written in the RFC and then join it together with :.

@CMCDragonkai
Copy link
Member Author

It appears Node.js relies on:

  • SSL_CTX_use_certificate_chain
  • SSL_CTX_add1_chain_cert

Both functions don't have equivalents in the boring package.

So I'm not sure if it's possible atm to supply multiple independent certificate chains atm.

It seems boringssl supports it. But not the boring package itself.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 23, 2023

I refactored to something like this:

    // Setup all certificates and keys
    // The below may not actually work
    // We assume we can just use certificate and add them to it
    // However this may not be possible
    if let (Some(key), Some(cert)) = (key, cert) {
      for (k, c) in key.iter().zip(cert.iter()) {
        let private_key = boring::pkey::PKey::private_key_from_pem(&k)
          .or_else(
          |err| Err(Error::from_reason(err.to_string()))
        )?;
        ssl_ctx_builder.set_private_key(&private_key).or_else(
          |e| Err(napi::Error::from_reason(e.to_string()))
        )?;
        let x509_cert_chain = boring::x509::X509::stack_from_pem(
          &c.to_vec()
        ).or_else(
          |err| Err(napi::Error::from_reason(err.to_string()))
        )?;
        for (i, cert) in x509_cert_chain.iter().enumerate() {
          if i == 0 {
            ssl_ctx_builder.set_certificate(cert,).or_else(
              |err| Err(Error::from_reason(err.to_string()))
            )?;
          } else {
            ssl_ctx_builder.add_extra_chain_cert(
              cert.clone(),
            ).or_else(
              |err| Err(Error::from_reason(err.to_string()))
            )?;
          }
        }
      }
    }

@CMCDragonkai CMCDragonkai force-pushed the feature-key-gen branch 2 times, most recently from 7437d80 to 997f395 Compare May 24, 2023 06:36
@CMCDragonkai
Copy link
Member Author

Yea I'm pretty sure that the current boring crate does not provide a straight forward way to do this multiple certificate chain.

Also I did find a good resource that explains what all the boringssl functions do: https://commondatastorage.googleapis.com/chromium-boringssl-docs/ssl.h.html. It's much more clearer than all the other manpages.

In particular this https://commondatastorage.googleapis.com/chromium-boringssl-docs/ssl.h.html#SSL_CTX_set_cert_cb is actually one of the main ways in which one could potentially select the correct certificate.

So for now I'm going to disable the ability to set multiple certificates on the Rust side.

@CMCDragonkai
Copy link
Member Author

Ok config tests are done for now. No more TlsConfig anymore. It's just ca, key and cert.

@CMCDragonkai
Copy link
Member Author

I've started documenting the config more clearly. Some of the parameters of quiche requires a deeper test and review on how they work.

In particular:

  1. initialMaxStreamsBidi - we need to find out if this applies across the whole QUIC socket or on per-connection basis. Also what happens if you exceed it and correctly handle for this.
  2. Setting the highWaterMark to 0, we don't want to do repeated buffering in this library. It's enough for quiche itself to already be doing buffering on incoming stream data.
  3. Setting the default of maxIdleTimeout to 0, as per the default in quiche. This enables infinite timeout. Which means it's up to the user to decide if they want a smaller timeout. We can make use of this to enable the usage of js-async-cancellable to the explicitly cancel the connection creation. However we have to know how to actually cancel the connection creation. This would involve a change to the await Promise.race([connection.establishedP, errorP]), as it is currently intended to wait for a timeout error, if instead we use a 0 timeout, then the timeout is driven by JS instead of inside quiche.

I think the whole timeout looping is way too complicated as it is written. It requires some refactoring similar to what happened during the tasks domain in PK.

- Fixed up start erroring out.
- Fixed up graceful TLS test.
- Cleaning up.

[ci skip]
[ci skip]
- fixed up `QUICServer.test.ts`
- fixed up `QUICSocket.test.ts`
- fixed up `concurrency.test.ts`

[ci skip]
- cleaning up `QUICStream.ts` writable stream logic
- cleaning up `QUICStream.ts` readable stream logic
- fixing up `QUICStream` tests

[ci skip]
- logic fixes for early stream ending.
- fixing up tests.
- fixed up remote info for stream.
- fixed problem with process being held open and handling socket errors.
- propagating all events to the top level client or server.
- cleaning up comments and commentary.
- fixed up benchmark.
@tegefaulkes
Copy link
Contributor

I've fixed up the Monitor logic to clean them up when done.

@tegefaulkes
Copy link
Contributor

I'm merging this now. We'll do a proper review in staging.

@tegefaulkes tegefaulkes merged commit 06bb0d9 into staging Jul 6, 2023
@CMCDragonkai
Copy link
Member Author

Great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants