
DNS not working when building Dockerfiles FROM alpine:3.13 only in minikube #10830

Closed
wapiflapi opened this issue Mar 15, 2021 · 16 comments
Labels
co/virtualbox · kind/support · lifecycle/rotten · long-term-support · os/linux

Comments

@wapiflapi

wapiflapi commented Mar 15, 2021

I was investigating broken Docker builds caused by DNS issues and narrowed it down to building FROM alpine:3.13, specifically on minikube inside VirtualBox. So I don't know whether this is the right place for this bug report.

Starting with a fresh minikube / virtualbox install:

me@myhost: $ minikube start --driver=virtualbox
😄  minikube v1.18.1 on Debian bullseye/sid
    ▪ MINIKUBE_ACTIVE_DOCKERD=minikube
✨  Using the virtualbox driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating virtualbox VM (CPUs=2, Memory=3900MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.20.2 on Docker 20.10.3 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v4
🌟  Enabled addons: default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
me@myhost: $ minikube ssh
                         _             _            
            _         _ ( )           ( )           
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __  
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)

$ docker run -it alpine:3.12.4 ping google.com
PING google.com (172.217.171.238): 56 data bytes
64 bytes from 172.217.171.238: seq=0 ttl=61 time=7.985 ms
64 bytes from 172.217.171.238: seq=1 ttl=61 time=8.862 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 7.985/8.423/8.862 ms
$ docker run -it alpine:3.13 ping google.com
Unable to find image 'alpine:3.13' locally
3.13: Pulling from library/alpine
Digest: sha256:a75afd8b57e7f34e4dad8d65e2c7ba2e1975c795ce1ee22fa34f8cf46f96a3be
Status: Downloaded newer image for alpine:3.13
ping: bad address 'google.com'

Testing with debian shows no DNS issues either.

What makes me think this might be a minikube issue is that running the same commands on the host (Ubuntu / Pop!_OS 20.10) works as expected:

wapiflapi@box$ sudo docker run -it alpine:3.12.4 ping google.com
PING google.com (142.250.74.238): 56 data bytes
64 bytes from 142.250.74.238: seq=0 ttl=114 time=18.783 ms
64 bytes from 142.250.74.238: seq=1 ttl=114 time=19.358 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 18.783/19.070/19.358 ms
wapiflapi@box$ sudo docker run -it alpine:3.13 ping google.com
PING google.com (172.217.21.14): 56 data bytes
64 bytes from 172.217.21.14: seq=0 ttl=116 time=9.466 ms
64 bytes from 172.217.21.14: seq=1 ttl=116 time=9.738 ms
64 bytes from 172.217.21.14: seq=2 ttl=116 time=7.548 ms
^C
--- google.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 7.548/8.917/9.738 ms

I'm not sure how to investigate this further, or whether I should submit this as a bug report to Alpine instead.
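
For anyone trying to narrow this down, here is a minimal diagnostic sketch (not from the original report; it assumes busybox's nslookup in the alpine base image and the default VirtualBox NAT resolver at 10.0.2.3):

$ minikube ssh
# ask the NAT resolver directly from the failing image
$ docker run -it alpine:3.13 nslookup google.com 10.0.2.3
# compare against a public resolver to separate a resolver problem
# from a general network problem
$ docker run -it alpine:3.13 nslookup google.com 8.8.8.8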

@wapiflapi (Author)

Okay, minutes after posting this I found:

Which is probably related.

@afbjorklund (Collaborator)

afbjorklund commented Mar 15, 2021

What does /etc/resolv.conf look like? Docker is supposed to handle it for the containers.

A normal VirtualBox setup has an internal eth0 NAT network that runs a DNS server on 10.0.2.3.

                         _             _            
            _         _ ( )           ( )           
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __  
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)

$ more /etc/resolv.conf 
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 10.0.2.3
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:69:76:19 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic eth0
       valid_lft 86299sec preferred_lft 86299sec
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.2.2        0.0.0.0         UG    1024   0        0 eth0
10.0.2.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
10.0.2.2        0.0.0.0         255.255.255.255 UH    1024   0        0 eth0

https://www.virtualbox.org/manual/ch06.html#network_nat
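
One quick way to test whether the NAT resolver itself is at fault is a per-container DNS override (a sketch; --dns is a docker run flag and, as noted later in this thread, is not accepted by docker build):

$ docker run --dns 8.8.8.8 -it alpine:3.13 ping -c 2 google.com

If this succeeds while the default run fails, the problem sits between the container's resolver and the 10.0.2.3 NAT service rather than in the container's network path.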

@afbjorklund added the co/virtualbox, os/linux, and kind/support labels Mar 15, 2021
@afbjorklund (Collaborator)

afbjorklund commented Mar 15, 2021

I should also mention that systemd hates the VirtualBox servers (both DHCP and DNS).

Basically we had to patch it, because it crashed on some of the responses it got...

@wapiflapi (Author)

What does /etc/resolv.conf look like? Docker is supposed to handle it for the containers.

I checked that and should have put it in the issue: /etc/resolv.conf looks the same on the VM (checked over ssh) and in the container, and it matches what you mentioned: 10.0.2.3.

So the VirtualBox setup is as expected, and the resolv.conf in the container is as expected as well. I suspect this is a bug in Alpine's recent release rather than in minikube, but since I'm not an expert I don't feel qualified to make that call, so I thought I'd flag it here.

Thank you for the heads-up about systemd and the VirtualBox DHCP/DNS servers; I'll keep it in mind in the future.

wapiflapi@box:~$ minikube ssh
                         _             _            
            _         _ ( )           ( )           
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __  
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)

$ cat /etc/issue /etc/resolv.conf 
Welcome to minikube
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 10.0.2.3
$ 
$ docker run -it alpine:3.13 cat /etc/issue /etc/resolv.conf
Welcome to Alpine Linux 3.13
Kernel \r on an \m (\l)

# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 10.0.2.3
$ 
$ docker run -it alpine:3.13 ping google.com
ping: bad address 'google.com'

@afbjorklund (Collaborator)

afbjorklund commented Mar 15, 2021

If I recall correctly, there were also some quirks when it came to musl and DNS resolution.

Alpine uses a different C library (musl instead of glibc), and sometimes it behaves differently.

The bug reproduces here (on Ubuntu): it works on the host but not in the minikube VM.

Unable to find image 'alpine:3.13' locally
3.13: Pulling from library/alpine
Digest: sha256:a75afd8b57e7f34e4dad8d65e2c7ba2e1975c795ce1ee22fa34f8cf46f96a3be
Status: Downloaded newer image for alpine:3.13
PING google.com (216.58.207.206): 56 data bytes
64 bytes from 216.58.207.206: seq=0 ttl=115 time=18.773 ms
64 bytes from 216.58.207.206: seq=1 ttl=115 time=20.972 ms
64 bytes from 216.58.207.206: seq=2 ttl=115 time=17.891 ms
^C
--- google.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 17.891/19.212/20.972 ms
$ ping -c 3 google.com
PING google.com (216.58.207.238): 56 data bytes
64 bytes from 216.58.207.238: seq=0 ttl=63 time=15.600 ms
64 bytes from 216.58.207.238: seq=1 ttl=63 time=16.222 ms
64 bytes from 216.58.207.238: seq=2 ttl=63 time=19.676 ms

--- google.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 15.600/17.166/19.676 ms
$ docker run -it alpine:3.13 ping -c 3 google.com
Unable to find image 'alpine:3.13' locally
3.13: Pulling from library/alpine
ba3557a56b15: Pull complete 
Digest: sha256:a75afd8b57e7f34e4dad8d65e2c7ba2e1975c795ce1ee22fa34f8cf46f96a3be
Status: Downloaded newer image for alpine:3.13
ping: bad address 'google.com'
$ docker run -it debian ping -c 3 google.com
Unable to find image 'debian:latest' locally
latest: Pulling from library/debian
e22122b926a1: Pull complete 
Digest: sha256:9d4ab94af82b2567c272c7f47fa1204cd9b40914704213f1c257c44042f82aac
Status: Downloaded newer image for debian:latest
PING google.com (216.58.207.238) 56(84) bytes of data.
64 bytes from arn09s19-in-f14.1e100.net (216.58.207.238): icmp_seq=1 ttl=61 time=13.10 ms
64 bytes from arn09s19-in-f14.1e100.net (216.58.207.238): icmp_seq=2 ttl=61 time=20.1 ms
64 bytes from arn09s19-in-f14.1e100.net (216.58.207.238): icmp_seq=3 ttl=61 time=16.0 ms

--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 5ms
rtt min/avg/max/mdev = 13.950/16.694/20.114/2.563 ms

@afbjorklund (Collaborator)

afbjorklund commented Mar 15, 2021

Docker Machine has flags to toggle this:

   --virtualbox-host-dns-resolver									Use the host DNS resolver [$VIRTUALBOX_HOST_DNS_RESOLVER]
   --virtualbox-no-dns-proxy										Disable proxying all DNS requests to the host [$VIRTUALBOX_NO_DNS_PROXY]

They are available in minikube as well, as:

      --dns-proxy=false: Enable proxy for NAT DNS requests (virtualbox driver only)
      --host-dns-resolver=true: Enable host resolver for NAT DNS requests (virtualbox driver only)

The Docker Machine ticket references these:

https://gitlab.alpinelinux.org/alpine/aports/-/issues/6221

https://www.virtualbox.org/ticket/18171

So it looks like yet another unhappy DNS customer.
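
A sketch of toggling these, assuming the flags only take effect when the VM is created, so the cluster has to be recreated:

$ minikube delete
$ minikube start --driver=virtualbox --host-dns-resolver=false --dns-proxy=false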

wapiflapi added a commit to wapiflapi/openfaas-python3-fastapi-template that referenced this issue Mar 16, 2021
When Alpine 3.13 was released, I hit breaking changes when building in minikube.

In general I think it's better to pin versions, so that something that works keeps working.

See kubernetes/minikube#10830 for the discussion of the original / upstream bug.
@spowelljr added the long-term-support label and removed the triage/long-term-support label May 19, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Aug 17, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Sep 16, 2021
@sharifelgamal removed the lifecycle/rotten label Sep 22, 2021
@r4j4h (Contributor)

r4j4h commented Oct 25, 2021

So I ran into this again today while building from a Dockerfile using alpine 3.14, and found no combination of the --dns-proxy and --host-dns-resolver flags that would prevent the "DNS lookup error" from happening.

I also tried the ENV vars and switches for docker build that were reported to work elsewhere, with no luck: DOCKER_OPTS had no effect, --dns was not accepted by docker build, and --network=host was accepted but did not change anything for me either.

I did find two workarounds: modifying the Docker daemon's dns setting, or overwriting /etc/resolv.conf in the RUN step.

Because the Docker daemon tries to use the host's resolver, one could probably just change the minikube VM's resolv configuration, but I did not succeed in doing so.

Here is the daemon dns setting change. It affects all future docker commands. Ported from this answer for minikube like so:

$ minikube ssh
$ sudo vi /etc/docker/daemon.json
# add the key below to the existing JSON object, then write and save:
#   "dns": ["8.8.8.8"]
$ sudo systemctl restart docker

Then the build succeeded.
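
For reference, a minimal complete /etc/docker/daemon.json containing only this workaround would look like the following (illustrative; merge the dns key into whatever keys the file already has rather than replacing them):

$ cat /etc/docker/daemon.json
{
  "dns": ["8.8.8.8"]
}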

Here is overwriting /etc/resolv.conf in the RUN step:

As described in this answer, I can confirm that prefixing my RUN command with echo "nameserver 8.8.8.8" > /etc/resolv.conf && made the build succeed, without the daemon change above.

This is nice because it only affects the step that needs it, but it requires modifying the Dockerfile itself.
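
Putting the second workaround together, a self-contained sketch (the Dockerfile is hypothetical; apk add --no-cache curl stands in for whatever network-dependent step the real build runs):

$ eval $(minikube docker-env)    # point the local docker client at minikube's daemon
$ cat > Dockerfile <<'EOF'
FROM alpine:3.14
# Override the resolver for this RUN step only. /etc/resolv.conf is a
# daemon-managed bind mount during builds, so the change should not be
# committed into the resulting image layer.
RUN echo "nameserver 8.8.8.8" > /etc/resolv.conf && apk add --no-cache curl
EOF
$ docker build -t dns-workaround .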

@ShahNewazKhan

Ran into this issue, and pinning my docker build to Alpine v3.12 worked, as per this.

@anand-kulk

anand-kulk commented Apr 5, 2022


@r4j4h I ran into the same issue with Docker in minikube; updating daemon.json over ssh worked like a charm! Thanks.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jul 4, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Aug 3, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jstangroome (Contributor)

This is a bug in VirtualBox, finally fixed in v6.1.36:

NAT: Prevent issue when host resolver incorrectly returned NXDOMAIN for unsupported queries (bug #20977)
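
To confirm the fix after upgrading, one would recreate the minikube VM and repeat the original repro, along these lines:

$ VBoxManage --version    # should report 6.1.36 or later
$ minikube delete
$ minikube start --driver=virtualbox
$ minikube ssh -- docker run -it alpine:3.13 ping -c 2 google.com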

This issue was closed.