Caddy 2.0 resolving addresses #3518

Closed
p3lim opened this issue Jun 22, 2020 · 10 comments

@p3lim

p3lim commented Jun 22, 2020

1. Environment

1a. Operating system and version

Alpine Linux 3.12 (virt version)

Although this issue should be present on any distribution.

1b. Caddy version (run caddy version or paste commit SHA)

v2.0.0 h1:pQSaIJGFluFvu8KDGDODV8u4/QRED/OPyIR+MWYYse8=

From the assets on GitHub releases. Also tested with v1.0.3, also from the release assets.

2. Description

2a. What happens (briefly explain what is wrong)

I run caddy run with the following adjacent Caddyfile:

127.0.0.1:8080
file_server browse

And assume that this will run, presenting me with a locally accessible web directory. Caddy fails to start, see the logs in 2c.

2b. Why it's a bug (if it's not obvious)

This used to work in v1.x, with the following Caddyfile:

127.0.0.1:8080
browse

The problem is that with a typical resolver configuration (/etc/resolv.conf, possibly modified depending on the distribution) where the search path and/or the nameserver is non-default, Caddy (or a library Caddy/certmagic is using) tries to resolve the address/hostname configured in the Caddyfile before starting.

Assuming this /etc/resolv.conf file:

search local
nameserver 8.8.8.8

Everything works (the local search path is the default, unless DHCP provides one).

Change either of those to something non-default (for example, the search path to something internal like search mydomain.com, or the nameserver to a DNS server with records that resolve the address/hostname in the Caddyfile to a different IP than the host's), and it fails with the attached error.

The problem is that Caddy (or a library it's using) attempts to resolve the address/hostname before starting, which is unnecessary and will break in a situation like this.

2c. Log output

2020/06/22 17:18:10.105 INFO    using adjacent Caddyfile
run: loading initial config: loading new config: starting caddy administration endpoint: listen tcp 10.0.1.3:2019: bind: cannot assign requested address
start: caddy process exited with error: exit status 1

The host I ran Caddy on was not 10.0.1.3; my internal DNS has a wildcard A record pointing to that address as a black hole.

Essentially, Caddy tried to resolve 127.0.0.1.lab.

2d. Workaround(s)

  1. Run Caddy 1.x, which works as expected
  2. Use a default search path of local (not feasible/wanted in a properly configured network/environment)
  3. Change the DNS to something that won't resolve the address/hostname or wildcard (like I do)

2e. Relevant links

Referencing this issue: caddyserver/caddy-docker#44

3. Tutorial (minimal steps to reproduce the bug)

Assume DHCP provides a non-standard search path and/or nameserver as described above, or manually modify /etc/resolv.conf to provide a custom search path and/or a nameserver that has 127.0.0.1.lab as a configured record.

curl -sSLO https://github.com/caddyserver/caddy/releases/download/v2.0.0/caddy_2.0.0_linux_amd64.tar.gz
tar xzf caddy*.tar.gz
cat >Caddyfile <<EOF
127.0.0.1:8080
file_server browse
EOF
./caddy run
@mholt
Member

mholt commented Jun 22, 2020

Thanks for opening an issue! We'll look into this.

It's not immediately clear to me what is going on, so I'll need your help to understand it better.

Ideally, we need to be able to reproduce the bug in the most minimal way possible. This allows us to write regression tests to verify the fix is working. If we can't reproduce it, then you'll have to test our changes for us until it's fixed -- and then we can't add test cases, either.

I've attached a template below that will help make this easier and faster! This will require some effort on your part -- please understand that we will be dedicating time to fix the bug you are reporting if you can just help us understand it and reproduce it easily.

This template will ask for some information you've already provided; that's OK, just fill it out the best you can. 👍 I've also included some helpful tips below the template. Feel free to let me know if you have any questions!

Thank you again for your report, we look forward to resolving it!

Template

## 1. Environment

### 1a. Operating system and version

```
paste here
```


### 1b. Caddy version (run `caddy version` or paste commit SHA)

```
paste here
```


### 1c. Go version (if building Caddy from source; run `go version`)

```
paste here
```


## 2. Description

### 2a. What happens (briefly explain what is wrong)




### 2b. Why it's a bug (if it's not obvious)




### 2c. Log output

```
paste terminal output or logs here
```



### 2d. Workaround(s)




### 2e. Relevant links




## 3. Tutorial (minimal steps to reproduce the bug)




Helpful tips

  1. Environment: Please fill out your OS and Caddy versions, even if you don't think they are relevant. (They are always relevant.) If you built Caddy from source, provide the commit SHA and specify your exact Go version.

  2. Description: Describe at a high level what the bug is. What happens? Why is it a bug? Not all bugs are obvious, so convince readers that it's actually a bug.

    • 2c) Log output: Paste terminal output and/or complete logs in a code block. DO NOT REDACT INFORMATION except for credentials.
    • 2d) Workaround: What are you doing to work around the problem in the meantime? This can help others who encounter the same problem, until we implement a fix.
    • 2e) Relevant links: Please link to any related issues, pull requests, docs, and/or discussion. This can add crucial context to your report.
  3. Tutorial: What are the minimum required specific steps someone needs to take in order to experience the same bug? Your goal here is to make sure that anyone else can have the same experience with the bug as you do. You are writing a tutorial, so make sure to carry it out yourself before posting it. Please:

    • Start with an empty config. Add only the lines/parameters that are absolutely required to reproduce the bug.
    • Do not run Caddy inside containers.
    • Run Caddy manually in your terminal; do not use systemd or other init systems.
    • If making HTTP requests, avoid web browsers. Use a simpler HTTP client instead, like curl.
    • Do not redact any information from your config (except credentials). Domain names are public knowledge and often necessary for quick resolution of an issue!
    • Note that ignoring this advice may result in delays, or even in your issue being closed. 😞 Only actionable issues are kept open, and if there is not enough information or clarity to reproduce the bug, then the report is not actionable.

Example of a tutorial:

Create a config file:
{ ... }

Open terminal and run Caddy:

$ caddy ...

Make an HTTP request:

$ curl ...

Notice that the result is ___ but it should be ___.

@mholt mholt added the needs info 📭 Requires more information label Jun 22, 2020
@p3lim
Author

p3lim commented Jun 22, 2020

Updated with more info, feel free to delete this and the template response.

@francislavoie
Member

francislavoie commented Jun 22, 2020

The problem is that something in your network stack is causing localhost to resolve to something different than 127.0.0.1 or ::1

Caddy tries to bind to localhost:2019 for its admin API to provide features like graceful config reloading and changing the config on the fly.

You can either fix your DNS so that localhost resolves to a loopback address like it should, or use the admin global option to set the admin address to 127.0.0.1:2019 explicitly.
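As a rough sketch of the second suggestion, the Caddyfile from the report could gain a global options block that pins the admin endpoint to an IP literal. The scratch directory and the `workdir` variable are assumptions made for illustration; the `admin` option and the `--config`/`--adapter` flags are the ones Caddy 2 documents, but check them against your version.

```shell
# Write the Caddyfile into a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# The global options block must be the first thing in the Caddyfile.
# An IP literal for the admin address means no hostname lookup is involved.
cat >"$workdir/Caddyfile" <<'EOF'
{
    admin 127.0.0.1:2019
}

127.0.0.1:8080
file_server browse
EOF

# Then start Caddy against that file (assumes caddy is on PATH):
# caddy run --config "$workdir/Caddyfile" --adapter caddyfile
```

This only sidesteps the lookup for the admin listener; it does not change how the system resolves localhost for anything else.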

@p3lim
Author

p3lim commented Jun 22, 2020

No other tool I've ever used tries to resolve 127.0.0.1, ::1, or localhost outside the host, because they are protected.

$ ping -c3 localhost
PING localhost (::1): 56 data bytes
64 bytes from ::1: seq=0 ttl=64 time=0.023 ms
64 bytes from ::1: seq=1 ttl=64 time=0.048 ms
64 bytes from ::1: seq=2 ttl=64 time=0.049 ms

--- localhost ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.023/0.040/0.049 ms

$ ping -c3 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: seq=0 ttl=64 time=0.034 ms
64 bytes from 127.0.0.1: seq=1 ttl=64 time=0.042 ms
64 bytes from 127.0.0.1: seq=2 ttl=64 time=0.081 ms

--- 127.0.0.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.034/0.052/0.081 ms

$ curl -vI localhost:8080
*   Trying ::1:8080...
* connect to ::1 port 8080 failed: Connection refused
*   Trying 127.0.0.1:8080...
* connect to 127.0.0.1 port 8080 failed: Connection refused
* Failed to connect to localhost port 8080: Connection refused
* Closing connection 0
curl: (7) Failed to connect to localhost port 8080: Connection refused

Even if I were to use something else, why is Caddy trying to resolve in the first place?

@mholt
Member

mholt commented Jun 25, 2020

Why is your DNS resolver answering with a non-loopback IP? Caddy just does what Go does, and that is respect your DNS settings. (Unlike Chromecast, which hard-codes its DNS resolvers, mmph.)

The admin endpoint binds to localhost by default, so as not to assume either IPv4 or IPv6, but to defer that to your system configuration. That's intentional, so you have more control over Caddy's defaults and over your network stack in general.
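To see how a search path can turn localhost into a name that a DNS server answers, the following sketch prints the names a resolver would typically try for an undotted name. It is a simplification under stated assumptions: the default ndots of 1, a name with no dots and no trailing dot, and only the `search` directive (not `domain` or `options`). The function name `search_candidates` is made up for this example.

```shell
# search_candidates NAME [RESOLV_CONF]
# Print NAME with each search domain appended, then NAME on its own.
# Simplified: assumes an undotted NAME and the default ndots of 1.
search_candidates() {
    name=$1
    conf=${2:-/etc/resolv.conf}
    # If several "search" lines exist, the last one is the one that counts.
    domains=$(awk '$1 == "search" { $1 = ""; d = $0 } END { print d }' "$conf")
    for domain in $domains; do
        echo "$name.$domain"
    done
    echo "$name"
}

# Example against the live configuration:
# search_candidates localhost
```

With a search path of `lab`, this lists `localhost.lab` before `localhost`, which is the kind of name a wildcard A record would answer if DNS is consulted before /etc/hosts.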

I would either:

  • fix your network or system's DNS configuration,
  • change the admin endpoint's listener address, or
  • disable the admin endpoint entirely, if you don't use it

to solve the problem.
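For the last option, a minimal sketch of a Caddyfile that turns the admin endpoint off might look like this. The scratch directory and the `workdir` variable are illustrative assumptions; `admin off` is the documented global option in Caddy 2, and with it disabled, features that rely on the admin API (such as reloading config on the fly) are not available.

```shell
# Write the Caddyfile into a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# "admin off" in the global options block disables the admin endpoint,
# so Caddy has no localhost:2019 listener to bind.
cat >"$workdir/Caddyfile" <<'EOF'
{
    admin off
}

127.0.0.1:8080
file_server browse
EOF

# Then start Caddy against that file (assumes caddy is on PATH):
# caddy run --config "$workdir/Caddyfile" --adapter caddyfile
```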

@mholt mholt closed this as completed Jun 25, 2020
@mholt mholt removed the needs info 📭 Requires more information label Jun 25, 2020
@p3lim
Author

p3lim commented Jun 25, 2020

I found this issue from 2017 in the golang repo, golang/go#22846, where Go apparently decides whether the /etc/hosts file (which defines the loopback addresses) should be consulted based on the contents of /etc/nsswitch.conf. Alpine (the distro I've been using) doesn't have that file: it's a glibc component, and Alpine uses musl. The glibc default is DNS > files, so Go uses that order when nsswitch.conf doesn't exist, but every distro uses the opposite.

Creating the file with echo "hosts: files dns" > /etc/nsswitch.conf lets Caddy resolve localhost to the loopback addresses instead of querying DNS first, as is expected on any distribution.
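A slightly more defensive variant of that one-liner, sketched here as a small helper, only adds the line when the file has no hosts entry yet. The function name `ensure_hosts_order` is made up for this example, and writing to /etc/nsswitch.conf needs root.

```shell
# ensure_hosts_order FILE
# Append "hosts: files dns" to FILE unless it already has a hosts line.
ensure_hosts_order() {
    file=$1
    if [ -f "$file" ] && grep -q '^hosts:' "$file"; then
        echo "$file already has a hosts entry; leaving it alone"
        return 0
    fi
    echo 'hosts: files dns' >>"$file"
}

# As root on the affected host:
# ensure_hosts_order /etc/nsswitch.conf
```

Leaving an existing hosts line untouched avoids clobbering a distribution's own configuration on glibc-based systems.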

A followup of that issue is found at golang/go#35305, where they're considering not following the glibc standard of DNS > /etc/hosts when no /etc/nsswitch.conf is present.

@mholt
Member

mholt commented Jun 25, 2020

@p3lim Nice detective work, that's actually really good to know. Thanks for finding that! I'll point people this direction if they report a similar issue.

@francislavoie
Member

francislavoie commented Jun 25, 2020

@p3lim so if you make a Dockerfile based on the Caddy docker image and add this line, it fixes the problem? https://github.com/docker-library/golang/blob/master/1.14/alpine3.12/Dockerfile#L9

Looping in @hairyhenderson, if that works we should probably add that to the Caddy image by default. The official golang alpine image includes that workaround as well, so I definitely think there's precedent for doing it in the Caddy image as well if it fixes this sort of problem.

I don't understand how the environment should look exactly to reproduce this issue either. If you're able to describe how the system should be set up to replicate, that would be helpful so we can test workarounds.

(Reopening while we make a decision regarding the docker image, might move this issue to the caddy-docker repo as well)

Edit: For reference, this is what I have in my /etc/nsswitch.conf on Ubuntu 20.04:

# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd:         files systemd
group:          files systemd
shadow:         files
gshadow:        files

hosts:          files mdns4_minimal [NOTFOUND=return] dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

@francislavoie francislavoie reopened this Jun 25, 2020
@p3lim
Author

p3lim commented Jun 25, 2020

@francislavoie Yes, and I've seen quite a few other images do that as well (kubernetes control plane, traefik, fluxcd, influxdata, and many many more).

Providing the hosts-specific part should be enough.

Edit: To reproduce, simply use a system that doesn't depend on glibc, like Alpine (or perhaps Void; I think they use musl too). glibc provides /etc/nsswitch.conf; musl doesn't.
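A quick way to check whether a given host is in that state is to look for the hosts line directly. The helper below is only a sketch, and `hosts_order` is a name invented for this example: it prints the lookup sources in order, or `missing` when the file or the hosts line is absent.

```shell
# hosts_order [FILE]
# Print the sources listed on the "hosts:" line of FILE, or "missing".
hosts_order() {
    file=${1:-/etc/nsswitch.conf}
    if [ ! -f "$file" ]; then
        echo "missing"
        return 0
    fi
    # Drop the "hosts:" key and print the remaining fields of the first match.
    awk '$1 == "hosts:" { $1 = ""; sub(/^ +/, ""); print; found = 1; exit }
         END { if (!found) print "missing" }' "$file"
}

# Example against the live configuration:
# hosts_order
```

On a stock Alpine install this would be expected to print `missing`, while a glibc-based distribution usually prints a list that starts with `files`.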

For reference, on a Fedora 31 box:

#
# /etc/nsswitch.conf
#
# An example Name Service Switch config file. This file should be
# sorted with the most-used services at the beginning.
#
# The entry '[NOTFOUND=return]' means that the search for an
# entry should stop if the search in the previous entry turned
# up nothing. Note that if the search failed due to some other reason
# (like no NIS server responding) then the search continues with the
# next entry.
#
# Valid entries include:
#
#       nisplus                 Use NIS+ (NIS version 3)
#       nis                     Use NIS (NIS version 2), also called YP
#       dns                     Use DNS (Domain Name Service)
#       files                   Use the local files in /etc
#       db                      Use the pre-processed /var/db files
#       compat                  Use /etc files plus *_compat pseudo-databases
#       hesiod                  Use Hesiod (DNS) for user lookups
#       sss                     Use sssd (System Security Services Daemon)
#       [NOTFOUND=return]       Stop searching if not found so far
#
# 'sssd' performs its own 'files'-based caching, so it should
# generally come before 'files'.

# To use 'db', install the nss_db package, and put the 'db' in front
# of 'files' for entries you want to be looked up first in the
# databases, like this:
#
# passwd:    db files
# shadow:    db files
# group:     db files

passwd:      sss files systemd
shadow:     files sss
group:       sss files systemd

hosts:      files dns myhostname

bootparams: files

ethers:     files
netmasks:   files
networks:   files
protocols:  files
rpc:        files
services:   files sss

netgroup:   sss

publickey:  files

automount:  files sss
aliases:    files

@francislavoie
Member

Alright I think we can close this now. Thanks @p3lim for the help 😄

caddyserver/caddy-docker#96
