Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support reloading TLS configuration #4204

Closed
wants to merge 9 commits into from
Closed

Support reloading TLS configuration #4204

wants to merge 9 commits into from

Conversation

akshayganeshen
Copy link
Contributor

@akshayganeshen akshayganeshen commented Jun 7, 2018

Features

  • Reload key-pair (changes to "key_file", and "cert_file" and/or underlying files)
  • Reload CA cert (changes to "ca_file"/"ca_path" and/or underlying files)
  • Reconfigure verify incoming/outgoing settings ("verify_incoming", and "verify_outgoing")
  • Enable/disable TLS for agent checks ("enable_agent_tls_for_checks")
  • Reconfigure server name used for verify outgoing ("node_name"/"server_name")
  • Reconfigure TLS handshake settings ("tls_min_version", "tls_cipher_suites", and "tls_prefer_server_cipher_suites")
  • Fall back on previously loaded key-pair if reloading fails (e.g. key file is deleted after startup)

Still TODO

  • Tests

Use-cases

Reload short-lived/automated certificates without downtime. See #2584.

Design

The design is based around a Loader object that supplies TLS configuration dynamically via callbacks. This is similar to the KeyLoader approach used in Nomad (see this pr).

The Agent holds a reference to the Loader instance that generates TLS configurations for both incoming and outgoing connections. During agent reload, the new configuration is passed to the Loader, which is subsequently used for any new connections.

Implementation

The Loader sets up the callback GetConfigForClient to provide a tls.Config for each incoming connection.

When calling Reload, the Loader pre-calculates the incoming and outgoing tls.Config and returns them as appropriate.

Questions

  1. Are there any session-related implications for using the callback approach? There's a note about it in the docs about SessionTicketKey that I don't fully understand.
  2. Is it reasonable to have the Loader be a part of Agent? I wanted to avoid modifying RuntimeConfig, but I'm not sure if there is a better place to put it.
  3. What should happen if there are errors during TLS reconfigure? I assumed that errors should abort the rest of the reload with one exception (the last point on the feature list).

@akshayganeshen akshayganeshen changed the title [WIP] Support reloading TLS configuration Support reloading TLS configuration Jun 11, 2018
@akshayganeshen
Copy link
Contributor Author

Current code is worth looking at. I'm looking into the failing tests.

Please let me know if changes are required. If the implementation looks fine, I'll add the remaining features and tests.

This brings back the default TLS configuration used for checks when
agent TLS is disabled for checks.

Fixes incorrect TLS configuration for https checks.
Adds the missing configuration fields in the TLS config.
This is normally added with http2.ConfigureServer. Since that modifies
the server configuration in-place, and we override the configuration
when using the callback, we have to supply it ourselves.
@banks
Copy link
Member

banks commented Sep 14, 2018

So sorry to have left this without a response this long - I was sure I had responded a while back.

Firstly, thanks for this contribution - it looks really good and is certainly a highly demanded improvement.

Our main issue with merging it as is are:

  1. As noted in SSL certificate reload without restart #2584 we really want to re-use some of the code Nomad wrote for this so our products remain consistent in their behaviour etc. That would need some collab with the Nomad team to refactor their helper lib into something shared and we've not been able to prioritize that yet.
  2. We have it on our near-term (order of months) TODO list to revamp the TLS config for Consul significantly. Part of that will be the work described in part 1 to re-use Nomads code. Another part will be improving the workflow for configuring and setting up certificates for servers. And finally we hope as part of this (no guarantees yet as it's not fully designed) to be able to use Connect CA to (optionally) completely automate provisioning and rotating certificates for all client agents.

Apologies to many folks waiting on this feature or who tried this PR and it solves their problem - we really want to balance getting this improved with the wider plans for Consul. Adding this before we've had time to fully design the above risks introducing behaviour/config that's not exactly what we need but would be re-written or even becomes a breaking change to replace in a few months.

@hanshasselberg
Copy link
Member

@akshayganeshen thanks a lot for you work! I reviewed your PR and it certainly does what it says it is doing! Consul uses the agent TLS config in many more places though which you did not cover. I started working on that based on your PR, but that was just too much work. Before we can start working on reloading TLS config, we need to centralise it. I am working on that now, and reloading afterwards. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants