
inconsistent behavior for /etc/resolv.conf with stub resolvers #11033

Open
tgross opened this issue Aug 11, 2021 · 3 comments
Labels
stage/accepted Confirmed, and intend to work on. No timeline commitment though. theme/driver theme/networking type/enhancement

Comments

@tgross
Member

tgross commented Aug 11, 2021

While chatting with @mattrobenolt about exposing Consul DNS to a Nomad task, we ran into something unexpected where the behavior I described in #8343 (comment) doesn't hold in the case where the client is not using systemd-resolved. It turns out that dockerd (or more properly libnetwork) special-cases how it creates a /etc/resolv.conf file for containers when it thinks systemd-resolved is in use. See moby/libnetwork/resolvconf/resolvconf.go#L18-L21
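
As best I can tell, libnetwork treats the host as "using systemd-resolved" when /etc/resolv.conf only lists the 127.0.0.53 stub, and in that case copies /run/systemd/resolve/resolv.conf into the container instead. A quick way to look at both candidate files on such a host (my own check, not something libnetwork exposes):

$ readlink -f /etc/resolv.conf          # usually the stub-resolv.conf symlink
$ cat /run/systemd/resolve/resolv.conf  # the "real" upstream resolvers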

  • with systemd-resolved, docker tasks get the host's (non-stub) resolv.conf.
  • with systemd-resolved, exec tasks get the stub resolv.conf pointing to 127.0.0.53.
  • without systemd-resolved, both docker and exec tasks get the host's resolv.conf.
  • with a network.dns block, both docker and exec tasks get the Nomad-managed resolv.conf generated by GenerateDNSMount (a minimal example follows this list).
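
For reference, here is roughly what that last case looks like in a jobspec; the server address below is a placeholder for wherever a Consul-aware resolver actually listens on your hosts:

network {
  mode = "bridge"

  dns {
    # Placeholder address: point this at the host resolver that can answer Consul queries.
    servers  = ["10.0.2.15"]
    searches = ["consul"]
  }
}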

While the inconsistent behavior is the "fault" of the task driver engine, it makes configuring Consul DNS for tasks in a sensible way challenging and awfully host-specific.

Some proposals:

  • Can Nomad always manage the resolv.conf? This would break backwards compatibility but would make it possible to point tasks to a Nomad-controlled IP; this could be the host's IP or a Consul IP for a stub resolver, etc.
  • Can exec task drivers do the same thing that docker does with the stub resolver? (This is kind of gross and is annoyingly undocumented in Docker, too, as far as we can tell).
  • In lieu of that, a couple of "blessed" configurations for Consul DNS would be super nice to have documented.

Reproduction for the stub resolver behavior.

Use the Vagrant machine found at the root of this repo and run the following jobspec, which has a docker task and an exec task sharing a network namespace:

Jobspec
job "example" {
  datacenters = ["dc1"]

  group "web" {
    network {
      mode = "bridge"
      port "web1" {
        to = 8001
      }
      port "web2" {
        to = 8002
      }
    }

    task "web1" {
      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-f", "-h", "/tmp", "-p", "8001"]
        ports   = ["web1"]
      }

      resources {
        cpu    = 256
        memory = 128
      }
    }

    task "web2" {
      driver = "exec"

      config {
        command = "busybox"
        args    = ["httpd", "-f", "-h", "/tmp", "-p", "8002"]
      }

      resources {
        cpu    = 256
        memory = 128
      }
    }

  }
}
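
Assuming the jobspec above is saved as example.nomad (any filename works), run it and grab the allocation ID used in the exec commands below:

$ nomad job run example.nomad
$ nomad job status example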

The docker driver gets the "real" resolver used by systemd-resolved and found at /run/systemd/resolve/resolv.conf:

$ nomad alloc exec -task web1 d2f cat /etc/resolv.conf
...

nameserver 10.0.2.3
search fios-router.home

But the exec driver gets the stub resolver from the host's /etc/resolv.conf:

$ nomad alloc exec -task web2 d2f cat /etc/resolv.conf
...

nameserver 127.0.0.53
options edns0
search fios-router.home

Now replace systemd-resolved's stub resolver with unbound:

$ sudo apt-get install -y unbound
$ echo 'DNSStubListener=no' | sudo tee -a /etc/systemd/resolved.conf

Restart the VM, then make sure /etc/resolv.conf no longer points at the stub listener:

sudo systemctl stop systemd-resolved
sudo rm /etc/resolv.conf
echo 'nameserver 8.8.8.8' | sudo tee /etc/resolv.conf
sudo systemctl start systemd-resolved
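
As an extra sanity check (not strictly part of the repro), confirm /etc/resolv.conf is now the plain file we just wrote rather than the stub symlink:

$ readlink -f /etc/resolv.conf        # should print /etc/resolv.conf, not .../stub-resolv.conf
$ grep ^nameserver /etc/resolv.conf   # should show only nameserver 8.8.8.8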

Now both of them get the /etc/resolv.conf file from the host:

$ nomad alloc exec -task web1 208 cat /etc/resolv.conf
nameserver 8.8.8.8

$ nomad alloc exec -task web2 208 cat /etc/resolv.conf
nameserver 8.8.8.8

To restore your Vagrant VM to its previous state:
sudo rm /etc/resolv.conf
sudo nano /etc/systemd/resolved.conf # remove the DNSStubListener=no line
sudo apt-get remove -y unbound
sudo ln -fs /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf

Then reboot the VM.

@mattrobenolt
Contributor

mattrobenolt commented Aug 11, 2021

For a little bit more context on my setup specifically: I disabled systemd-resolved entirely and installed CoreDNS to act as a resolver on the host. I then put nameserver 127.0.0.1 in the host's /etc/resolv.conf. To my surprise, docker's resolvconf handling also looks for this and removes any 127.0.0.* nameserver lines, which makes sense since they wouldn't work at all inside a container, and puts in defaults of 8.8.8.8 and 8.8.4.4.

So this was fine. In a pure docker world, I set the default DNS nameserver to the docker bridge IP, so containers can reach out over the bridge to talk to the host resolver.
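
That looks roughly like this in /etc/docker/daemon.json (172.17.0.1 is docker's default bridge gateway; substitute whatever your bridge actually uses):

{
    "dns": ["172.17.0.1"]
}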

This depends on binding the CoreDNS listener to both 127.0.0.1 and the docker0 bridge IP so it can respond. Where this falls apart is that nomad's bridge mode uses a different IP. That would be sorta fine, but nomad seems to lazily create its bridge network when it's first needed, so I can't reliably bind CoreDNS to the nomad bridge network since it doesn't exist yet.

All in all, I ended up working around this by not using 127.0.0.1 in my host /etc/resolv.conf, and instead using the server's private IP, in the 10.0.0.0/8 range. Doing this allows the file to be brought into the container untouched and doesn't need to use the bridge IPs at all.

I don't know exactly what nomad can really do here, but having some sensible way to say "I want to resolve DNS with consul on the host" would be good. What really tripped me up was the undocumented and unexpected behavior of libnetwork and what docker was doing here.

And just for context, here's the trivial CoreDNS config I'm using:

.:53 {
  bind 127.0.0.1 10.0.0.1
  forward . 147.75.207.207 147.75.207.208
  cache
}

consul.:53 {
  bind 127.0.0.1 10.0.0.1
  forward . 127.0.0.1:8600
}

@apollo13
Contributor

I can offer yet another option (docker daemon.json):

{
    "dns": [
        "172.22.3.201"
    ],
    "dns-search": [
        "consul"
    ]
}

where the IP is the private IP of the server. My CoreDNS configuration looks like this:

. {
  forward . /etc/resolv.conf
}

consul {
  forward . dns://127.0.0.1:8600
}

Imo that is kinda the best of both worlds. Docker gets a fixed DNS server and CoreDNS uses whatever was configured on the host.

@lgfa29 lgfa29 added stage/accepted Confirmed, and intend to work on. No timeline commitment though. theme/driver theme/networking labels Aug 19, 2021
@lgfa29
Contributor

lgfa29 commented Aug 19, 2021

Thanks for the detailed report @tgross!

Also thanks @mattrobenolt and @apollo13 for the additional context. We will investigate this further.
