old certificate is used after switching to on-demand let's encrypt #1994

Closed
root360-AndreasUlm opened this issue Jan 12, 2018 · 2 comments · Fixed by #2015
Labels
bug 🐞 Something isn't working
Comments

@root360-AndreasUlm

Unfortunately, this issue is hard to track down, and I have not yet found a way to reproduce it on my test systems.
I hope that by describing the issue, you can find the cause by checking the code.
It might be solved by the solution discussed in #1991.

1. What version of Caddy are you using (caddy -version)?

Caddy 0.10.10

2. What are you trying to do?

Switch a domain's certificate from a third-party certificate to on-demand Let's Encrypt.

3. What is your entire Caddyfile?

  • Caddyfile
http://*.eu-central-1.compute.amazonaws.com http://www.www.www.* {
    status 403 /
    ratelimit 1 2 hour
    import /etc/caddy/load.d/log
    import /etc/caddy/load.d/ipfilter
}

https://*.eu-central-1.compute.amazonaws.com https://www.www.www.* {
    status 403 /
    ratelimit 1 2 hour
    tls /etc/ssl/default.crt.pem /etc/ssl/default.key.pem
    import /etc/caddy/load.d/hsts
    import /etc/caddy/load.d/log
    import /etc/caddy/load.d/ipfilter
}

http:// {
    redir / {scheme}://www.{host}{uri} 301
    import /etc/caddy/load.d/log
    import /etc/caddy/load.d/ratelimit
    import /etc/caddy/load.d/ipfilter
    tls off
}
https:// {
    redir / {scheme}://www.{host}{uri} 301
    import /etc/caddy/load.d/log
    import /etc/caddy/load.d/letsencrypt_default
    import /etc/caddy/load.d/hsts
    import /etc/caddy/load.d/ratelimit
    import /etc/caddy/load.d/ipfilter
}

import /etc/caddy/conf.d/*.conf
  • Example for conf.d (currently we have ~280 files):
http://*.example.com http://example.com {
    redir 301 {
        / {scheme}://www.{host}{uri}
    }
    import /etc/caddy/load.d/log
    import /etc/caddy/load.d/ratelimit
    import /etc/caddy/load.d/ipfilter
    tls off
}
https://*.example.com https://example.com {
    redir 301 {
        / {scheme}://www.{host}{uri}
    }

    import /etc/caddy/load.d/log
    import /etc/caddy/load.d/letsencrypt_profile1
    import /etc/caddy/load.d/hsts
    import /etc/caddy/load.d/ratelimit
    import /etc/caddy/load.d/ipfilter
}
  • log:
request_id
log / syslog "({request_id}) {combined} {host} {latency_ms}"
errors syslog
  • hsts:
header / Strict-Transport-Security "max-age=31536000; includeSubDomains";
  • letsencrypt_default:
tls "foo+letsencrypt-default@example.com" {
    max_certs 10
    key_type rsa4096
}
  • letsencrypt_profile1:
tls "foo+letsencrypt-profile1@example.com" {
    max_certs 10
    key_type rsa4096
}
  • ipfilter:
# currently empty
  • ratelimit:
#ratelimit / 300 400 minute {
#    whitelist 128.199.222.65/32
#}

I have provided only a small part of the configuration, since the full configuration comes to 31026 lines.
Most of it is vhost configuration that differs only in the matching hostname.
'ratelimit' is deactivated because of xuqingfeng/caddy-rate-limit#23.

4. How did you run Caddy (give the full command and describe the execution environment)?

Started using the sysv-init script, which I extended in PR #1984.

  • started as www-data
  • CADDYPATH=/etc/ssl/caddy/
/usr/local/bin/caddy -agree=true -log=/var/log/caddy.log -conf=/etc/caddy/Caddyfile -disable-tls-sni-challenge

5. Please paste any relevant HTTP request(s) here.

see 8.

6. What did you expect to see?

After a reload, Caddy should start obtaining a Let's Encrypt certificate on the first request made to the domain.

7. What did you see instead (give full error messages and/or log)?

Caddy keeps delivering the old certificate until you switch to non-on-demand mode and reload Caddy, at which point a new certificate is obtained.
The behaviour does not depend on the validity of the old certificate. It happens with both invalid and valid certificates and certificate chains.

8. How can someone who is starting from scratch reproduce the bug as minimally as possible?

This is the hard part.
With a minimal config using just one or a few domains, Caddy behaves as expected.
We have 986 domains with https configuration.
316 are using let's encrypt configuration.
233 are using on-demand let's encrypt.

These are the steps I can use to reproduce it on our production system:

  1. curl with given certificate:
$ curl -Ikv https://example.com
* Rebuilt URL to: https://example.com/
*   Trying 52.XX.XX.XX...
* TCP_NODELAY set
* Connected to example.com (52.XX.XX.XX) port 443 (#0)
{...}
* Server certificate:
*  subject: C=DE; ST=Saxony; O=root360; OU=testing; CN=*.*; emailAddress=admin@*.*
*  start date: Sep  7 10:41:55 2015 GMT
*  expire date: Sep  4 10:41:55 2025 GMT
*  issuer: C=DE; ST=Saxony; L=Leipzig; O=root360; OU=testing; CN=*.*; emailAddress=admin@*.*
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
{...}
> HEAD / HTTP/2
> Host: example.com
> User-Agent: curl/7.57.0
> Accept: */*
> 
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
} [5 bytes data]
< HTTP/2 301 
< location: https://www.example.com/
< server: Caddy
< strict-transport-security: max-age=31536000; includeSubDomains
< content-type: text/plain; charset=utf-8
< date: Fri, 12 Jan 2018 09:08:01 GMT
< 
* Connection #0 to host example.com left intact
HTTP/2 301 
location: https://www.example.com/
server: Caddy
strict-transport-security: max-age=31536000; includeSubDomains
content-type: text/plain; charset=utf-8
date: Fri, 12 Jan 2018 09:08:01 GMT
  2. switch tls config to use on-demand let's encrypt
tls test+letsencrypt@example.com {
  max_certs 10
}
  3. reload caddy
  4. curl after switching to on-demand let's encrypt
$ curl -Ikv https://example.com
* Rebuilt URL to: https://example.com/
*   Trying 52.XX.XX.XX...
* TCP_NODELAY set
* Connected to example.com (52.XX.XX.XX) port 443 (#0)
{...}
* Server certificate:
*  subject: C=DE; ST=Saxony; O=root360; OU=testing; CN=*.*; emailAddress=admin@*.*
*  start date: Sep  7 10:41:55 2015 GMT
*  expire date: Sep  4 10:41:55 2025 GMT
*  issuer: C=DE; ST=Saxony; L=Leipzig; O=root360; OU=testing; CN=*.*; emailAddress=admin@*.*
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
{...}
> HEAD / HTTP/2
> Host: example.com
> User-Agent: curl/7.57.0
> Accept: */*
> 
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
} [5 bytes data]
< HTTP/2 301 
< location: https://www.example.com/
< server: Caddy
< strict-transport-security: max-age=31536000; includeSubDomains
< content-type: text/plain; charset=utf-8
< date: Fri, 12 Jan 2018 09:11:09 GMT
< 
* Connection #0 to host example.com left intact
HTTP/2 301 
location: https://www.example.com/
server: Caddy
strict-transport-security: max-age=31536000; includeSubDomains
content-type: text/plain; charset=utf-8
date: Fri, 12 Jan 2018 09:11:09 GMT
  5. restart caddy (just in case)
  6. curl after restart:
$ curl -Ikv https://example.com
* Rebuilt URL to: https://example.com/
*   Trying 52.XX.XX.XX...
* TCP_NODELAY set
* Connected to example.com (52.XX.XX.XX) port 443 (#0)
{...}
* Server certificate:
*  subject: C=DE; ST=Saxony; O=root360; OU=testing; CN=*.*; emailAddress=admin@*.*
*  start date: Sep  7 10:41:55 2015 GMT
*  expire date: Sep  4 10:41:55 2025 GMT
*  issuer: C=DE; ST=Saxony; L=Leipzig; O=root360; OU=testing; CN=*.*; emailAddress=admin@*.*
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
{...}
> HEAD / HTTP/2
> Host: example.com
> User-Agent: curl/7.57.0
> Accept: */*
> 
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
} [5 bytes data]
< HTTP/2 301 
< location: https://www.example.com/
< server: Caddy
< strict-transport-security: max-age=31536000; includeSubDomains
< content-type: text/plain; charset=utf-8
< date: Fri, 12 Jan 2018 09:14:36 GMT
< 
* Connection #0 to host example.com left intact
HTTP/2 301 
location: https://www.example.com/
server: Caddy
strict-transport-security: max-age=31536000; includeSubDomains
content-type: text/plain; charset=utf-8
date: Fri, 12 Jan 2018 09:14:36 GMT
  7. switch tls config to use let's encrypt without on-demand mode
tls test+letsencrypt@example.com
  8. reload caddy, certificate is requested from let's encrypt
2018/01/12 09:25:12 [INFO][example.com] acme: Obtaining bundled SAN certificate
2018/01/12 09:25:13 [INFO][example.com] AuthURL: https://acme-v01.api.letsencrypt.org/acme/authz/4UwvWG1hCecjtpTRCsGVaP1yp0t90sgUKfbrJGFer-I
2018/01/12 09:25:13 [INFO][example.com] acme: Trying to solve HTTP-01
2018/01/12 09:25:14 [INFO][example.com] Served key authentication
2018/01/12 09:25:16 [INFO][example.com] The server validated our request
2018/01/12 09:25:16 [INFO][example.com] acme: Validations succeeded; requesting certificates
2018/01/12 09:25:17 [INFO] acme: Requesting issuer cert from https://acme-v01.api.letsencrypt.org/acme/issuer-cert
2018/01/12 09:25:17 [INFO][example.com] Server responded with a certificate.
2018/01/12 09:25:17 [INFO][example.com] Certificate written to disk: /etc/ssl/caddy/acme/acme-v01.api.letsencrypt.org/sites/example.com/example.com.crt
  9. curl with not-on-demand mode of let's encrypt
$ curl -Ikv https://example.com
* Rebuilt URL to: https://example.com/
*   Trying 52.XX.XX.XX...
* TCP_NODELAY set
* Connected to example.com (52.XX.XX.XX) port 443 (#0)
{...}
* Server certificate:
*  subject: CN=example.com
*  start date: Jan 12 08:25:17 2018 GMT
*  expire date: Apr 12 08:25:17 2018 GMT
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
{...}
> HEAD / HTTP/2
> Host: example.com
> User-Agent: curl/7.57.0
> Accept: */*
> 
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
} [5 bytes data]
< HTTP/2 301 
< conf: example.com
< location: https://www.example.com/
< server: Caddy
< strict-transport-security: max-age=31536000; includeSubDomains
< content-type: text/plain; charset=utf-8
< date: Fri, 12 Jan 2018 09:28:34 GMT
< 
* Connection #0 to host example.com left intact
HTTP/2 301 
conf: example.com
location: https://www.example.com/
server: Caddy
strict-transport-security: max-age=31536000; includeSubDomains
content-type: text/plain; charset=utf-8
date: Fri, 12 Jan 2018 09:28:34 GMT
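
The curl checks in the steps above could also be scripted when many domains need to be verified after a reload. The following is only a rough sketch in Go, not part of the original report: the names `describeServedCert` and `issuedByLetsEncrypt` are made up for illustration, and matching on the issuer organization string is an assumed heuristic, not an official way to identify a Let's Encrypt certificate.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"os"
	"strings"
)

// issuedByLetsEncrypt reports whether any issuer organization mentions
// Let's Encrypt. Hypothetical helper; string matching is a heuristic only.
func issuedByLetsEncrypt(issuerOrgs []string) bool {
	for _, org := range issuerOrgs {
		if strings.Contains(org, "Let's Encrypt") {
			return true
		}
	}
	return false
}

// describeServedCert performs a TLS handshake with addr, sends serverName
// as SNI, and summarizes the leaf certificate the server presented.
func describeServedCert(addr, serverName string) (string, error) {
	conn, err := tls.Dial("tcp", addr, &tls.Config{
		ServerName: serverName,
		// Like curl -k: we want to inspect the certificate even if it
		// does not verify, so verification is skipped on purpose.
		InsecureSkipVerify: true,
	})
	if err != nil {
		return "", err
	}
	defer conn.Close()

	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return "", fmt.Errorf("no certificate presented by %s", addr)
	}
	leaf := certs[0]
	return fmt.Sprintf("subject=%s issuer=%s notAfter=%s letsencrypt=%t",
		leaf.Subject, leaf.Issuer, leaf.NotAfter.Format("2006-01-02"),
		issuedByLetsEncrypt(leaf.Issuer.Organization)), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: certcheck <hostname>")
		return
	}
	host := os.Args[1]
	summary, err := describeServedCert(host+":443", host)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(host, summary)
}
```

Running it once before and once after the reload for the same hostname would show whether the issuer changed, which is the same comparison the curl transcripts make by hand.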
@root360-AndreasUlm
Author

This issue is slightly different from #1991: it is about switching the certificate, not about multiple certificates that might secure the same site. Still, IMHO it will be solved by implementing site-based certificate caches.

@mholt
Member

mholt commented Jan 13, 2018

Yes, I agree, it seems that if the certificate map/cache was "sandboxed" to each site, it would resolve this issue. I'll investigate!

@mholt mholt added this to the 0.10.11 milestone Jan 16, 2018
@mholt mholt added the bug 🐞 Something isn't working label Jan 16, 2018
@mholt mholt added the in progress 🏃‍♂️ Being actively worked on label Jan 27, 2018
mholt added a commit that referenced this issue Feb 4, 2018
- Expose the list of Caddy instances through caddy.Instances()

- Added arbitrary storage to caddy.Instance

- The cache of loaded certificates is no longer global; now scoped
  per-instance, meaning upon reload (like SIGUSR1) the old cert cache
  will be discarded entirely, whereas before, aggressively reloading
  config that added and removed lots of sites would cause unnecessary
  build-up in the cache over time.

- Key certificates in the cache by their SHA-256 hash instead of
  by their names. This means certificates will not be duplicated in
  memory (within each instance), making Caddy much more memory-efficient
  for large-scale deployments with thousands of sites sharing certs.

- Perform name-to-certificate lookups scoped per caddytls.Config instead
  of a single global lookup. This prevents certificates from stepping on
  each other when they overlap in their names.

- Do not allow TLS configurations keyed by the same hostname to be
  different; this now throws an error.

- Updated relevant tests, with a stark awareness that more tests are
  needed.

- Change the NewContext function signature to include an *Instance.

- Strongly recommend (basically require) use of caddytls.NewConfig()
  to create a new *caddytls.Config, to ensure pointers to the instance
  certificate cache are initialized properly.

- Update the TLS-SNI challenge solver (even though TLS-SNI is disabled
  currently on the CA side). Store temporary challenge cert in instance
  cache, but do so directly by the ACME challenge name, not the hash.
  Modified the getCertificate function to check the cache directly for
  a name match if one isn't found otherwise. This will allow any
  caddytls.Config to be able to help solve a TLS-SNI challenge, with one
  extra side-effect that might actually be kind of interesting (and
  useless): clients could send a certificate's hash as the SNI and
  Caddy would be able to serve that certificate for the handshake.

- Do not attempt to match a "default" (random) certificate when SNI
  is present but unrecognized; return no certificate so a TLS alert
  happens instead.

- Store an Instance in the list of instances even while the instance
  is still starting up (this allows access to the cert cache for
  performing renewals at startup, etc). Will be removed from list again
  if instance startup fails.

- Laid groundwork for ACMEv2 and Let's Encrypt wildcard support.

Server type plugins will need to be updated slightly to accommodate
minor adjustments to their API (like passing in an Instance). This
commit includes the changes for the HTTP server.

Certain Caddyfile configurations might error out with this change, if
they configured different TLS settings for the same hostname.

This change trades some complexity for other complexity, but ultimately
this new complexity is more correct and robust than earlier logic.

Fixes #1991
Fixes #1994
Fixes #1303
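
To illustrate the idea behind the commit message above, here is a simplified sketch in Go. It is not Caddy's actual code: the types `cert`, `certCache` and `siteConfig` and all of their methods are hypothetical, hostname matching is exact only (no wildcards), and there is no locking. It only shows a cache that is keyed by the SHA-256 hash of the certificate, a name index kept per TLS configuration, and a lookup that returns nothing for an unknown name instead of falling back to a default certificate.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cert is a stand-in for a loaded certificate (hypothetical type).
type cert struct {
	Names []string // hostnames the certificate covers
	DER   []byte   // raw certificate bytes
}

// certCache belongs to one server instance. A reload creates a new
// instance with a new, empty cache, so stale certificates are dropped.
type certCache struct {
	byHash map[string]cert
}

func newCertCache() *certCache {
	return &certCache{byHash: make(map[string]cert)}
}

// add stores the certificate under the hex SHA-256 hash of its bytes,
// so the same certificate is held only once per cache.
func (c *certCache) add(ct cert) string {
	sum := sha256.Sum256(ct.DER)
	hash := hex.EncodeToString(sum[:])
	c.byHash[hash] = ct
	return hash
}

// siteConfig is one TLS configuration with its own name-to-hash index.
type siteConfig struct {
	cache *certCache
	names map[string]string
}

func newSiteConfig(cache *certCache) *siteConfig {
	return &siteConfig{cache: cache, names: make(map[string]string)}
}

// load caches the certificate and indexes its names for this config only.
func (s *siteConfig) load(ct cert) {
	hash := s.cache.add(ct)
	for _, name := range ct.Names {
		s.names[name] = hash
	}
}

// lookup returns the certificate for an SNI name known to this config.
// Unknown names yield no certificate; there is no default fallback.
func (s *siteConfig) lookup(sni string) (cert, bool) {
	hash, ok := s.names[sni]
	if !ok {
		return cert{}, false
	}
	ct, ok := s.cache.byHash[hash]
	return ct, ok
}

func main() {
	// Before the reload: the third-party certificate is loaded for the site.
	oldCache := newCertCache()
	oldSite := newSiteConfig(oldCache)
	oldSite.load(cert{Names: []string{"example.com"}, DER: []byte("third-party cert bytes")})
	_, found := oldSite.lookup("example.com")
	fmt.Println("before reload, cached cert found:", found)

	// After the reload: a fresh cache and config without the old certificate.
	newCache := newCertCache()
	newSite := newSiteConfig(newCache)
	_, found = newSite.lookup("example.com")
	fmt.Println("after reload, cached cert found:", found)
}
```

In this sketch, a failed lookup after the reload is the point where an on-demand configuration would go on to obtain a certificate, rather than serving one left over from an earlier configuration.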
@mholt mholt removed the in progress 🏃‍♂️ Being actively worked on label Feb 17, 2018