Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caddy fails to obtain a fresh certificate except if the ssl/acme/users folder is deleted and Caddy reloaded. #2451

Closed
fr33tux opened this issue Jan 31, 2019 · 13 comments · Fixed by #2452
Labels
bug 🐞 Something isn't working

Comments

@fr33tux
Copy link

fr33tux commented Jan 31, 2019

Hello,

1. What version of Caddy are you using (caddy -version)?

Caddy 0.11.2 (+8369a121) (unofficial)

go version go1.11.4 linux/amd64

Compiled and ran on Ubuntu 18.10.

2. What are you trying to do?

I'm trying to get a certificate for a new domain by adding the domain configuration and reloading Caddy.

3. What is your entire Caddyfile?

It's composed of 10 snippets like this one below. The two differences between them are the proxy port and the log paths.

http://redacted.mb.fr33tux.org {
        # Logging
        errors /var/log/caddy/errors_redacted.fr33tux.org.log
        log /var/log/caddy/access_redacted.fr33tux.org.log

        # TLS configuration
        import tls.conf

        # Header configuration
        import headers.conf

        # Proxify
        proxy / 127.0.0.1:8080 {
                transparent
                insecure_skip_verify
        }
}

tls.conf:

tls redacted@redacted {
        must_staple
}

4. How did you run Caddy (give the full command and describe the execution environment)?

Using the systemd unit file provided. ExecStart:
/usr/local/bin/caddy -log stdout -agree=true -conf=/etc/caddy/Caddyfile -root=/var/tmp

5. Please paste any relevant HTTP request(s) here.

n/a

6. What did you expect to see?

journalctl -f -u caddy.service should log that the certificate for the new domain has been successfully obtained. Instead, it logs the following errors.

  • I've figured out that if I delete the whole ssl folder that Caddy uses to store its certificates, and reload Caddy, then it successfully obtains the already configured and the new certificate(s). It just can't obtain new certificates. I haven't tested the renewal.
  • If I delete the ssl/acme/users subfolder and reload Caddy, then the new certificate(s) are successfully obtained too.
  • If I comment all the other domains that already have a certificate to only leave the new one, and reload Caddy, same error than below.

Note that I'm seeing the error TLS handshake error from 66.133.109.36:60828: all certificates configured via GetConfigForClient from numerous IP addresses (and not only LE during certificate obtention like this one) since 0.11.1. This wasn't happening before.

7. What did you see instead (give full error messages and/or log)?

janv. 31 18:01:27 caddy[7225]: 2019/01/31 18:01:27 [INFO] [redacted.fr33tux.org] acme: Obtaining bundled SAN certificate
janv. 31 18:01:28  caddy[7225]: 2019/01/31 18:01:28 [INFO] [redacted.fr33tux.org] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/redacted
janv. 31 18:01:28  caddy[7225]: 2019/01/31 18:01:28 [INFO] [redacted.fr33tux.org] acme: use tls-alpn-01 solver
janv. 31 18:01:28  caddy[7225]: 2019/01/31 18:01:28 [INFO] [redacted.fr33tux.org] acme: Trying to solve TLS-ALPN-01
janv. 31 18:01:30  caddy[7225]: 2019/01/31 18:01:30 http: TLS handshake error from 66.133.109.36:60828: all certificates configured via GetConfigForClient
janv. 31 18:01:34  caddy[7225]: [redacted.fr33tux.org] acme: error: 400 :: urn:ietf:params:acme:error:tls :: remote error: tls: internal error, url:
janv. 31 18:01:34 caddy[7225]: 2019/01/31 18:01:34 [ERROR] SIGUSR1: starting with listener file descriptors: [redacted.fr33tux.org] failed to obtain certificate: acme: Error -> One or more domains had a problem:
janv. 31 18:01:34 caddy[7225]: [redacted.fr33tux.org] acme: error: 400 :: urn:ietf:params:acme:error:tls :: remote error: tls: internal error, url:

8. How can someone who is starting from scratch reproduce the bug as minimally as possible?

It may somehow be related to #2400, but the behavior is the same before and after the published fix for it, and the behavior I observe is slightly different too.
I'm attaching the caddy build I'm using if it helps (plugins are used in this build, but this behavior happens with and without plugins).

caddy.zip

I'll be happy to help if you need further details. Thanks!

@magikstm
Copy link

magikstm commented Jan 31, 2019

Looks like a rights issue in your install sequence.

What was your full install script?

Did you include commands like these?

mkdir -p /etc/ssl/caddy
chown -R www-data:root /etc/ssl/caddy
chmod 770 /etc/ssl/caddy

@fr33tux
Copy link
Author

fr33tux commented Jan 31, 2019

I'm doubtful about a permission issue, as it's working well with 0.11.0 and below with the exact same setup, and what makes it working is deleting the users folder in the ssl directory before reloading..

@mholt
Copy link
Member

mholt commented Jan 31, 2019

I will be submitting a PR soon I'd like you to try, to test that it fixes the problem -- please make sure that the issue is still replicable until then. Thanks!

@fr33tux
Copy link
Author

fr33tux commented Jan 31, 2019

Sure, thank you!

mholt added a commit that referenced this issue Feb 2, 2019
vendor: update certmagic, needed to support this

Hopefully fixes #2451, fixes #2438, and fixes #2414
@mholt
Copy link
Member

mholt commented Feb 2, 2019

If you could, please build Caddy using the fix-sni branch and help test the PR #2452. Report back ASAP and let's get this fixed.

@fr33tux
Copy link
Author

fr33tux commented Feb 2, 2019

Hello,

Built Caddy based on 40ba973, but when tried to obtain a new certificate:

févr. 02 23:14:16  caddy[15381]: 2019/02/02 23:14:16 [INFO] [caddy.fr33tux.org] acme: Obtaining bundled SAN certificate
févr. 02 23:14:16  caddy[15381]: 2019/02/02 23:14:16 [INFO] [caddy.fr33tux.org] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/redacted
févr. 02 23:14:16  caddy[15381]: 2019/02/02 23:14:16 [INFO] [caddy.fr33tux.org] acme: use tls-alpn-01 solver
févr. 02 23:14:16  caddy[15381]: 2019/02/02 23:14:16 [INFO] [caddy.fr33tux.org] acme: Trying to solve TLS-ALPN-01
févr. 02 23:14:18  caddy[15381]: 2019/02/02 23:14:18 http: panic serving 66.133.109.36:34426: runtime error: invalid memory address or nil pointer dereference
févr. 02 23:14:18  caddy[15381]: goroutine 292 [running]:
févr. 02 23:14:18  caddy[15381]: net/http.(*conn).serve.func1(0xc0000ad7c0)
févr. 02 23:14:18  caddy[15381]:         /devel/go/src/net/http/server.go:1746 +0xd0
févr. 02 23:14:18  caddy[15381]: panic(0xe25fe0, 0x2953970)
févr. 02 23:14:18  caddy[15381]:         /devel/go/src/runtime/panic.go:513 +0x1b9
févr. 02 23:14:18  caddy[15381]: github.com/mholt/caddy/vendor/github.com/mholt/certmagic.distributedSolver.challengeTokensPrefix(0x0, 0x0, 0x0, 0x0, 0x203000)
févr. 02 23:14:18  caddy[15381]:         /devel/gosrc/src/github.com/mholt/caddy/vendor/github.com/mholt/certmagic/solvers.go:137 +0x26
févr. 02 23:14:18  caddy[15381]: github.com/mholt/caddy/vendor/github.com/mholt/certmagic.distributedSolver.challengeTokensKey(0x0, 0x0, 0x0, 0xc0004706e0, 0x11, 0xc0005144c8, 0x8)
févr. 02 23:14:18  caddy[15381]:         /devel/gosrc/src/github.com/mholt/caddy/vendor/github.com/mholt/certmagic/solvers.go:143 +0x49
févr. 02 23:14:18  caddy[15381]: github.com/mholt/caddy/vendor/github.com/mholt/certmagic.(*Config).tryDistributedChallengeSolver(0xc0003fadd0, 0xc0000ce000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
févr. 02 23:14:18  caddy[15381]:         /devel/gosrc/src/github.com/mholt/caddy/vendor/github.com/mholt/certmagic/handshake.go:398 +0x88
févr. 02 23:14:18  caddy[15381]: github.com/mholt/caddy/vendor/github.com/mholt/certmagic.(*Config).GetCertificate(0xc0003fadd0, 0xc0000ce000, 0x20, 0x20, 0x0)
févr. 02 23:14:18  caddy[15381]:         /devel/gosrc/src/github.com/mholt/caddy/vendor/github.com/mholt/certmagic/handshake.go:56 +0x463
févr. 02 23:14:18  caddy[15381]: github.com/mholt/caddy/vendor/github.com/mholt/certmagic.(*Config).GetCertificate-fm(0xc0000ce000, 0x7f62fea706c0, 0x655d2b, 0xf552c2)
févr. 02 23:14:18  caddy[15381]:         /devel/gosrc/src/github.com/mholt/caddy/caddytls/config.go:164 +0x34
févr. 02 23:14:18  caddy[15381]: crypto/tls.(*Config).getCertificate(0xc0004a7c80, 0xc0000ce000, 0x1, 0xc0001c6140, 0x3)
févr. 02 23:14:18  caddy[15381]:         /devel/go/src/crypto/tls/common.go:726 +0x30f
févr. 02 23:14:18  caddy[15381]: crypto/tls.(*serverHandshakeState).readClientHello(0xc000191c50, 0xc000498c40, 0x13, 0x8)
févr. 02 23:14:18  caddy[15381]:         /devel/go/src/crypto/tls/handshake_server.go:217 +0x3bc
févr. 02 23:14:18  caddy[15381]: crypto/tls.(*Conn).serverHandshake(0xc000087880, 0x20e9ca0, 0xc0000879a0)
févr. 02 23:14:18  caddy[15381]:         /devel/go/src/crypto/tls/handshake_server.go:47 +0xb6
févr. 02 23:14:18  caddy[15381]: crypto/tls.(*Conn).Handshake(0xc000087880, 0x0, 0x0)
févr. 02 23:14:18  caddy[15381]:         /devel/go/src/crypto/tls/conn.go:1274 +0x177
févr. 02 23:14:18  caddy[15381]: net/http.(*conn).serve(0xc0000ad7c0, 0x21e2300, 0xc0001c0840)
févr. 02 23:14:18  caddy[15381]:         /devel/go/src/net/http/server.go:1762 +0x187
févr. 02 23:14:18  caddy[15381]: created by net/http.(*Server).Serve
févr. 02 23:14:18  caddy[15381]:         /devel/go/src/net/http/server.go:2851 +0x2f5
févr. 02 23:14:22  caddy[15381]: 2019/02/02 23:14:22 [ERROR] Restart failed: [caddy.fr33tux.org] failed to obtain certificate: acme: Error -> One or more domains had a problem:
févr. 02 23:14:22  caddy[15381]: [caddy.fr33tux.org] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Error getting validation data, url:
févr. 02 23:14:22  caddy[15381]: 2019/02/02 23:14:22 [ERROR] SIGUSR1: starting with listener file descriptors: [caddy.fr33tux.org] failed to obtain certificate: acme: Error -> One or more domains had a problem:
févr. 02 23:14:22  caddy[15381]: [caddy.fr33tux.org] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Error getting validation data, url:

Build attached.
caddy.zip

@mholt
Copy link
Member

mholt commented Feb 2, 2019

@fr33tux What is your exact Caddyfile? (I can't reproduce it by guessing, which I've already tried, and it works.)

EDIT: Okay it only happens on SIGUSR1 apparently

@fr33tux
Copy link
Author

fr33tux commented Feb 2, 2019

@fr33tux What is your exact Caddyfile? (I can't reproduce it by guessing, which I've already tried, and it works.)

It only contains a serie of vhosts like this one, with the only change between them being the log paths and the proxy port:

https://caddy.fr33tux.org {
        # Logging
        errors /var/log/caddy/errors_caddy.fr33tux.org.log
        log /var/log/caddy/access_caddy.fr33tux.org.log

        # TLS configuration
        tls redacted@redacted {
               must_staple
        }

        proxy / 127.0.0.1:8080 {
                transparent
                insecure_skip_verify
        }

}

I haven't tried by restarted Caddy with the latest build indeed. Do you need that?

@mholt
Copy link
Member

mholt commented Feb 2, 2019

@fr33tux Thanks for the quick report! I was able to repro the bug and I've pushed a fix. Please try again, thanks!

@fr33tux
Copy link
Author

fr33tux commented Feb 2, 2019

I confirm it fixes the issue (or at least my part, can't say for the others as I don't reproduce their):

févr. 03 00:15:47  caddy[17200]: 2019/02/03 00:15:47 [INFO] [caddy.fr33tux.org] acme: Obtaining bundled SAN certificate
févr. 03 00:15:47  caddy[17200]: 2019/02/03 00:15:47 [INFO] [caddy.fr33tux.org] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/redacted
févr. 03 00:15:47  caddy[17200]: 2019/02/03 00:15:47 [INFO] [caddy.fr33tux.org] acme: use tls-alpn-01 solver
févr. 03 00:15:47  caddy[17200]: 2019/02/03 00:15:47 [INFO] [caddy.fr33tux.org] acme: Trying to solve TLS-ALPN-01
févr. 03 00:15:53  caddy[17200]: 2019/02/03 00:15:53 [INFO] [caddy.fr33tux.org] The server validated our request
févr. 03 00:15:53  caddy[17200]: 2019/02/03 00:15:53 [INFO] [caddy.fr33tux.org] acme: Validations succeeded; requesting certificates
févr. 03 00:15:54  caddy[17200]: 2019/02/03 00:15:54 [INFO] [caddy.fr33tux.org] Server responded with a certificate.
févr. 03 00:15:55  caddy[17200]: 2019/02/03 00:15:55 [INFO] Reloading complete
févr. 03 00:15:55  caddy[17200]: 2019/02/03 00:15:55 [INFO][FileStorage:/etc/ssl/caddy] Stopped certificate maintenance routine

Thank you for the quick fix!
Let me know if you need additional tests, or if I should close this issue.

@tobya tobya added the in progress 🏃‍♂️ Being actively worked on label Feb 2, 2019
@mholt
Copy link
Member

mholt commented Feb 2, 2019

Okay excellent. Thank you for the fast response!

@mholt mholt added bug 🐞 Something isn't working and removed in progress 🏃‍♂️ Being actively worked on labels Feb 3, 2019
mholt added a commit that referenced this issue Feb 5, 2019
#2452)

* caddytls: Fix empty SNI handling (new -default-sni flag)

vendor: update certmagic, needed to support this

Hopefully fixes #2451, fixes #2438, and fixes #2414

* caddytls: Don't overwrite certmagic Manager (fixes #2407)

Supersedes #2447

* vendor: Update certmagic to fix nil pointer deref and TLS-ALPN cleanup

* Improve -default-sni flag help text
@samcro1967
Copy link

I am seeing this on version 0.11.5 from a Windows 10 machine. Was this fix added to 0.11.5?

@luisalvarado
Copy link

If I may, I think I see several minor issues here that when using -service install should be addressed to make it easier for faster deployment. When using the hook.service, caddy -service install should do the following to fix several issues at once:

  1. When it creates the service (using systemd for example) it should also create a default folder for the SSL certificates and a default for the caddyserver file. Like /etc/caddy/ssl for the ssl files and /etc/caddy for the conf file. (As a bonus it should by default be renamed to caddy.service, not Caddy.service (Capital C there).

  2. The default conf folder is to avoid having to guess or manually add the conf parameter.

  3. The default ssl folder is to avoid having several issues when doing the whole systemd process as a it fails if it cant find the folders.

  4. There should be a default log in /var/log/caddy (default with option to set somewhere else, same for conf and ssl folders).

  5. Did not check but an option to automatically restart if the caddyserver file is updated or changed (same option to check if caddyserver file has correct format, if not, do not restart service but send a warning to the logs. If all is good, automatically restart service)

This are just a couple of things that would help in deployment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐞 Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants