
DNS resolution fails with dnsPolicy: ClusterFirstWithHostNet and hostNetwork: true #1827

Closed
runningman84 opened this issue May 25, 2020 · 16 comments


@runningman84

Version:

k3s version v1.17.4+k3s1 (3eee8ac)
ubuntu 20.04

K3s arguments:

--no-deploy traefik --no-deploy=servicelb --kubelet-arg containerd=/run/k3s/containerd/containerd.sock

Describe the bug

DNS resolution does not work for my container, which runs with these settings:

      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true

DNS resolution works just fine if I do not use hostNetwork and leave the DNS policy unchanged.
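With ClusterFirstWithHostNet the kubelet should still point the pod at the cluster DNS service, which can be confirmed from inside the affected pod (a quick check of my own, not output from this report):

kubectl exec -it <pod-name> -- cat /etc/resolv.conf
# in a default k3s install this should list the kube-dns ClusterIP:
# nameserver 10.43.0.10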

The CoreDNS service looks fine:

kubectl describe service kube-dns -n kube-system
Name:              kube-dns
Namespace:         kube-system
Labels:            k8s-app=kube-dns
                   kubernetes.io/cluster-service=true
                   kubernetes.io/name=CoreDNS
                   objectset.rio.cattle.io/hash=bce283298811743a0386ab510f2f67ef74240c57
Annotations:       objectset.rio.cattle.io/applied:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"objectset.rio.cattle.io/id":"","objectset.rio.cattle.io/owner-gvk":"k3s.ca...
                   objectset.rio.cattle.io/id: 
                   objectset.rio.cattle.io/owner-gvk: k3s.cattle.io/v1, Kind=Addon
                   objectset.rio.cattle.io/owner-name: coredns
                   objectset.rio.cattle.io/owner-namespace: kube-system
                   prometheus.io/port: 9153
                   prometheus.io/scrape: true
Selector:          k8s-app=kube-dns
Type:              ClusterIP
IP:                10.43.0.10
Port:              dns  53/UDP
TargetPort:        53/UDP
Endpoints:         10.42.0.141:53,10.42.1.117:53,10.42.3.202:53
Port:              dns-tcp  53/TCP
TargetPort:        53/TCP
Endpoints:         10.42.0.141:53,10.42.1.117:53,10.42.3.202:53
Port:              metrics  9153/TCP
TargetPort:        9153/TCP
Endpoints:         10.42.0.141:9153,10.42.1.117:9153,10.42.3.202:9153
Session Affinity:  None
Events:            <none>

As you can see, I can successfully query the individual CoreDNS instances, but access via the cluster IP fails:

bash-5.0# dig @10.42.1.117 www.heise.de

; <<>> DiG 9.14.8 <<>> @10.42.1.117 www.heise.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59239
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.heise.de.                  IN      A

;; ANSWER SECTION:
www.heise.de.           30      IN      A       193.99.144.85

;; Query time: 23 msec
;; SERVER: 10.42.1.117#53(10.42.1.117)
;; WHEN: Mon May 25 21:43:20 UTC 2020
;; MSG SIZE  rcvd: 69

bash-5.0# dig @10.42.3.202 www.heise.de

; <<>> DiG 9.14.8 <<>> @10.42.3.202 www.heise.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36239
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.heise.de.                  IN      A

;; ANSWER SECTION:
www.heise.de.           30      IN      A       193.99.144.85

;; Query time: 14 msec
;; SERVER: 10.42.3.202#53(10.42.3.202)
;; WHEN: Mon May 25 21:43:31 UTC 2020
;; MSG SIZE  rcvd: 69

bash-5.0# dig @10.43.0.10 www.heise.de

; <<>> DiG 9.14.8 <<>> @10.43.0.10 www.heise.de
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

To Reproduce

Run a pod with hostNetwork: true and dnsPolicy: ClusterFirstWithHostNet (a minimal manifest is sketched below).
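For reference, a minimal manifest that should reproduce this setup (the pod name and image are placeholders of mine, not taken from this issue):

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: hostnet-dns-test          # placeholder name
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet
  containers:
  - name: shell
    image: alpine:3.11            # placeholder; any image with nslookup or dig works
    command: ["sleep", "3600"]
EOF

# then try to resolve something through the cluster DNS from inside the pod:
kubectl exec -it hostnet-dns-test -- nslookup kubernetes.default.svc.cluster.local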

Expected behavior

DNS resolution should work fine

Actual behavior

DNS resolution does not work at all

Additional context / logs

DNS resolution works fine with the container network:

~ $ nslookup  www.heise.de
nslookup: can't resolve '(null)': Name does not resolve

Name:      www.heise.de
Address 1: 193.99.144.85 www.heise.de
Address 2: 2a02:2e0:3fe:1001:7777:772e:2:85 www.heise.de

@ikaruswill

ikaruswill commented May 26, 2020

+1, though this seems to happen exclusively on my arm64 nodes.

I'm running image: homeassistant/home-assistant:0.110.2, which requires hostNetwork: true and dnsPolicy: ClusterFirstWithHostNet for discovery of smart-home devices on the local network.

To clarify, I've had this problem for a while now, but I face no such issue on my armv7l devices; it only happens on arm64 devices.

Node 1 information

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 10 (buster)
Release:	10
Codename:	buster
$ uname -r
5.4.32-rockchip64
$ sudo iptables -V
iptables v1.8.2 (legacy)
$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              6
On-line CPU(s) list: 0-5
Thread(s) per core:  1
Core(s) per socket:  3
Socket(s):           2
NUMA node(s):        1
Vendor ID:           ARM
Model:               4
Model name:          Cortex-A53
Stepping:            r0p4
CPU max MHz:         2016.0000
CPU min MHz:         408.0000
BogoMIPS:            48.00
NUMA node0 CPU(s):   0-5
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

Node 2 Information

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.4 LTS
Release:	18.04
Codename:	bionic
$ uname -r
5.4.28-rockchip64
$ sudo iptables -V
iptables v1.6.1
pi@l2:~$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              6
On-line CPU(s) list: 0-5
Thread(s) per core:  1
Core(s) per socket:  3
Socket(s):           2
NUMA node(s):        1
Vendor ID:           ARM
Model:               4
Model name:          Cortex-A53
Stepping:            r0p4
CPU max MHz:         2016.0000
CPU min MHz:         408.0000
BogoMIPS:            48.00
NUMA node0 CPU(s):   0-5
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

@runningman84
Author

I have three amd64 nodes that suffer from this issue...

@brandond
Member

As you can see, I can successfully query the individual CoreDNS instances, but access via the cluster IP fails

So the real issue here is that you cannot access ClusterIP services when using hostNetwork: true.

Are you using Ubuntu's ufw or any other host-based firewall that might be interfering with this traffic?

@samirsss

We had a similar issue even with 1.18.2. We initially tried the host-gw option for flannel, but that didn't help, so we followed the proposed alternatives from here: #751

Specifically, we followed this:
#751 (comment)

@runningman84
Author

runningman84 commented May 27, 2020

I do not have any specific firewall rules or ufw in place:

root@cubi001:~# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */
KUBE-EXTERNAL-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL  all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
KUBE-FORWARD  all  --  anywhere             anywhere             /* kubernetes forwarding rules */
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */
ACCEPT     all  --  cubi001/16           anywhere            
ACCEPT     all  --  anywhere             cubi001/16          

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL  all  --  anywhere             anywhere            

Chain KUBE-EXTERNAL-SERVICES (1 references)
target     prot opt source               destination         

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere             /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000

Chain KUBE-FORWARD (1 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere             ctstate INVALID
ACCEPT     all  --  anywhere             anywhere             /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT     all  --  cubi001/16           anywhere             /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             cubi001/16           /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-PROXY-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-SERVICES (3 references)
target     prot opt source               destination         
REJECT     tcp  --  anywhere             10.43.190.52         /* hass/influxdb:rpc has no endpoints */ tcp dpt:omniorb reject-with icmp-port-unreachable
REJECT     tcp  --  anywhere             10.43.142.88         /* hass/home-assistant:api has no endpoints */ tcp dpt:8123 reject-with icmp-port-unreachable
REJECT     tcp  --  anywhere             10.43.142.88         /* hass/home-assistant:vscode has no endpoints */ tcp dpt:8888 reject-with icmp-port-unreachable

root@cubi001:~# ufw status
Status: inactive

@runningman84
Author

The node-local DNS cache fixed it for me. It would be great if the node-local DNS cache were included in k3s by default, or if the underlying issue were fixed; I think running hostNetwork pods is not uncommon.
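For anyone trying the same workaround: NodeLocal DNSCache is deployed from the upstream Kubernetes manifests and, by default, listens on the link-local address 169.254.20.10 on every node (an assumption on my part that the defaults are used). A quick way to check that it is answering, from the node or a hostNetwork pod:

# 169.254.20.10 is the default listen address in the upstream nodelocaldns manifests
dig @169.254.20.10 kubernetes.default.svc.cluster.local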

@brandond
Member

Are you unable to reach ANY ClusterIP service when using host network, or is it only an issue with the coredns service?

@ikaruswill

In my case I was unable to reach ANY ClusterIP service.

@oskapt

oskapt commented May 30, 2020

(NOTE: Workaround/Solution at the end of this comment)

I'm also up against this, running k3s on 3 x86 VMs under Proxmox. I get no DNS resolution at all from a Pod running with hostNetwork: true and dnsPolicy: ClusterFirstWithHostNet for Plex. I have no firewalls restricting communication between nodes.

From the affected Pod I'm able to do DNS queries out to the LAN DNS server (10.68.0.2) but not to the cluster DNS server (10.43.0.10). I can query ClusterIP services whose Pods are on other nodes if I do so by IP address.

DNS Queries for Internal and External Hosts from Sidecar

This shows queries to 10.43.0.10 failing, while queries to 10.68.0.2 succeed.

root@k01:/# cat /etc/resolv.conf
search infrastructure.svc.cluster.local svc.cluster.local cluster.local cl.monach.us
nameserver 10.43.0.10
options ndots:5
root@k01:/# host plex.tv 10.43.0.10
^Croot@k01:/# host unifi 
^Croot@k01:/# host -v unifi
Trying "unifi.infrastructure.svc.cluster.local"
;; connection timed out; no servers could be reached
root@k01:/# host -v monach.us.
Trying "monach.us"
;; connection timed out; no servers could be reached
root@k01:/# host -v monach.us.
Trying "monach.us"
;; connection timed out; no servers could be reached
root@k01:/# ping 10.43.0.10
PING 10.43.0.10 (10.43.0.10) 56(84) bytes of data.
^C
--- 10.43.0.10 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1009ms

root@k01:/# ping 10.68.0.2
PING 10.68.0.2 (10.68.0.2) 56(84) bytes of data.
64 bytes from 10.68.0.2: icmp_seq=1 ttl=64 time=0.892 ms
64 bytes from 10.68.0.2: icmp_seq=2 ttl=64 time=0.462 ms
^C
--- 10.68.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1016ms
rtt min/avg/max/mdev = 0.462/0.677/0.892/0.215 ms
root@k01:/# host -v monach.us. 10.68.0.2
Trying "monach.us"
Using domain server:
Name: 10.68.0.2
Address: 10.68.0.2#53
Aliases: 

;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38437
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;monach.us.                     IN      A

;; ANSWER SECTION:
monach.us.              7200    IN      A       159.89.221.68

Received 43 bytes from 10.68.0.2#53 in 102 ms
Trying "monach.us"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26803
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;monach.us.                     IN      AAAA

;; AUTHORITY SECTION:
monach.us.              900     IN      SOA     ns-1071.awsdns-05.org. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400

Received 112 bytes from 10.68.0.2#53 in 341 ms
Trying "monach.us"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3141
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;monach.us.                     IN      MX

;; ANSWER SECTION:
monach.us.              300     IN      MX      10 mail.monach.us.

Received 48 bytes from 10.68.0.2#53 in 110 ms
root@k01:/# host -v monach.us. 10.43.0.10
Trying "monach.us"
^C

GET Request to ClusterIP Service by IP / Hostname

cURL requests to the IP succeed, but DNS resolution fails when making the same request to the Service's hostname (10.43.71.169 is the ClusterIP for the vault Service).

root@k01:/# curl -I 10.43.71.169:8200
HTTP/1.1 307 Temporary Redirect
Cache-Control: no-store
Content-Type: text/html; charset=utf-8
Location: /ui/
Date: Sat, 30 May 2020 16:38:02 GMT

root@k01:/# curl -I vault.vault.svc.cluster.local:8200
curl: (6) Could not resolve host: vault.vault.svc.cluster.local

Digging around led me to this, which led me to this, in which the OP says that there is no route to 10.43.0.0/16 via the local cni0 address.

Sure enough, once that route is added, it works:

# On the host

root@k01:~# ip route add 10.43.0.0/16 via 10.42.0.1

# In the container (same as the last command above that failed)

root@k01:/# curl -I vault.vault.svc.cluster.local:8200
HTTP/1.1 307 Temporary Redirect
Cache-Control: no-store
Content-Type: text/html; charset=utf-8
Location: /ui/
Date: Sat, 30 May 2020 16:54:49 GMT

Workaround / Solution

Following the suggestions in this comment, I switched the flannel backend to host-gw, and this problem went away.

There's a lot of good troubleshooting later in that ticket about checksum validation changes in 1.17 versus 1.16, and it leads me to believe that this is Flannel's issue to resolve, not Kubernetes/K3s/Rancher's issue.
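For reference, the flannel backend on k3s is selected with the --flannel-backend server flag; switching to host-gw looks roughly like this (a sketch assuming the standard install script and a systemd-managed service; adjust to your environment):

# at install time:
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--flannel-backend=host-gw" sh -

# or, for an existing install, add the flag to the ExecStart line of the unit,
# e.g. ExecStart=/usr/local/bin/k3s server --flannel-backend=host-gw, then:
sudo systemctl daemon-reload
sudo systemctl restart k3s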

@ikaruswill

Sharing my findings: I upgraded from v1.17.5+k3s1 to v1.18.4+k3s1 two days ago and have since stopped observing that issue.

@DoGab

DoGab commented Jun 28, 2020

Sharing my findings: I upgraded from v1.17.5+k3s1 to v1.18.4+k3s1 two days ago and have since stopped observing that issue.

I can't confirm that. I'm also on the v1.18.4+k3s1 release and still having this issue.

@wpfnihao

wpfnihao commented Jul 10, 2020

I'm also up against this. After adding the --flannel-backend=host-gw argument during k3s deployment, no pods can connect to ClusterIP services.

@brandond
Member

@wpfnihao what are you deployed on? Does your infrastructure support host-gw?
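For context (my own note, not something stated in this thread): host-gw installs a direct route to each peer node's pod subnet via that node's IP, so it only works when all nodes share a layer-2 segment. You can inspect the routes it programs on any node:

# with the default k3s cluster CIDR, pod-subnet routes look like
# 10.42.X.0/24 via <peer node IP>
ip route | grep '10\.42\.'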

@stale

stale bot commented Jul 31, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

stale bot added the status/stale label Jul 31, 2021
stale bot closed this as completed Aug 14, 2021
@oliviermichaelis
Contributor

In case anyone else stumbles on this: I had the same problem after upgrading from v1.23.x to v1.25.4+k3s1 with the vxlan flannel backend. Upgrading to v1.25.5+k3s2 fixed the problem for me :)

@jonassvatos

For the record, upgrading from v1.25.4+k3s1 to v1.25.6+k3s1 also fixed the problem here.
