
DNS lookup timeouts #667

Closed
juan-lee opened this issue Sep 26, 2018 · 41 comments

@juan-lee
Contributor

Symptoms
Outbound requests from pods can see a 5-second delay during DNS lookups. This is known to impact containers based on the Alpine image.

Root Cause
https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts

Workaround
Add the following to the container spec in your impacted pod's manifest.

lifecycle:
  postStart:
    exec:
      command:
      - /bin/sh
      - -c
      - "/bin/echo 'options single-request-reopen' >> /etc/resolv.conf"

What is single-request-reopen?

single-request-reopen (since glibc 2.9)
  Sets RES_SNGLKUPREOP in _res.options.  The resolver
  uses the same socket for the A and AAAA requests.  Some
  hardware mistakenly sends back only one reply.  When
  that happens the client system will sit and wait for
  the second reply.  Turning this option on changes this
  behavior so that if two requests from the same port are
  not handled correctly it will close the socket and open
  a new one before sending the second request.
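
For context, here is a minimal pod sketch showing where that hook sits in the container spec (the pod name and image are placeholders for illustration, not taken from this thread). The hook simply appends the option to /etc/resolv.conf at container start and, as noted below, musl-based images such as Alpine ignore it.

apiVersion: v1
kind: Pod
metadata:
  name: dns-workaround-example    # hypothetical name
spec:
  containers:
  - name: app
    image: nginx                  # placeholder image
    lifecycle:
      postStart:
        exec:
          command:
          - /bin/sh
          - -c
          - "/bin/echo 'options single-request-reopen' >> /etc/resolv.conf"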
@devteng

devteng commented Sep 26, 2018

FYI, I have not tested this personally -- supposedly this will not work for Alpine-based containers, since the musl libc that Alpine uses does not support this option.

kubernetes/kubernetes#62628 (comment)

@epoyraz

epoyraz commented Oct 1, 2018

We had the same issue using an Alpine image.
It's not an AKS problem, but a general Kubernetes issue. See
kubernetes/kubernetes#56903

I would recommend reading this: https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/

We changed our base image to jessie-slim and added this to the pod manifest.

dnsConfig:  
  options:  
    - name: single-request-reopen

More details: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/

I think it is a cleaner solution than adding a postStart hook.
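
For anyone wondering exactly where this goes: below is a minimal sketch of the option inside a Deployment's pod template (the deployment name and image are placeholders for illustration); dnsConfig sits at the pod spec level, alongside containers.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                      # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: debian:jessie-slim   # placeholder; any glibc-based image (musl/Alpine ignores the option)
      dnsConfig:
        options:
        - name: single-request-reopen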

@timwebster9

I'm not sure what to think about this. Is the advice going to be 'you can't run Alpine, and you need to customise DNS if you expect proper DNS performance'?

Could someone from Microsoft weigh in here?

@AXington

AXington commented Oct 5, 2018

This also works, and it works on Alpine pods:

https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
service/networking/custom-dns.yaml
apiVersion: v1
kind: Pod
metadata:
  namespace: default
  name: dns-example
spec:
  containers:
    - name: test
      image: nginx
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 1.2.3.4
    searches:
      - ns1.svc.cluster.local
      - my.dns.search.suffix
    options:
      - name: ndots
        value: "2"
      - name: edns0

@wutingbupt

We have the same problem. Will AKS roll out a solution for us? I don't see an elegant workaround so far.

@wutingbupt

@AXington Your solution reduces the delay from 5 seconds to 2.5 :), but it still doesn't fully work.

@wutingbupt

@juan-lee your workaround didn't work perfectly for the Alpine-based images. Any better ideas?

@juan-lee
Contributor Author

@juan-lee your workaround didn't work perfectly for the Alpine-based images. Any better ideas?

My understanding is that we've rolled the fix out during this week's release. You will have to upgrade your cluster to a newer version of k8s to get the changes.

@wutingbupt

Thanks very much, @juan-lee. I will let you know how it goes.
Br,
Tim

@wutingbupt

@juan-lee Upgrading to 1.11.4 solves this problem. Thanks for your help.

Br,
Tim

@motarski

I checked the official changelog and can't find a specific change that addresses this bug, but amazingly the upgrade seems to fix the issue for our AKS cluster. I hope it will not come back on the next version bump.

@juan-lee
Contributor Author

I checked the official changelog and can't find a specific change that addresses this bug, but amazingly the upgrade seems to fix the issue for our AKS cluster. I hope it will not come back on the next version bump.

The fix isn't in k8s, but rather a kernel patch that is now the default for new agent nodes. The reason you're seeing the problem go away with an upgrade is that your old agent nodes are replaced with ones that have the new kernel.

@motarski

OK, I see. Thanks for the explanation, @juan-lee.

@timwebster9

@juan-lee is the fix only present in the new 'MobyImage' that you have to register for, or is it present in all 1.11.4 images?

@weinong
Contributor

weinong commented Dec 4, 2018

Hi people!
As it stands today, all new VMs (created through a new cluster, scale, or upgrade) have the kernel fix "netfilter: nf_conntrack: resolve clash for matching conntracks", which addresses the first race described in https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts

Thanks

@nodeselector

@weinong any info on when the second race will be addressed?

@kim0

kim0 commented Jun 18, 2019

I just hit this, and single-request-reopen does seem to make it go away! This is on a recent, fresh 1.13.5 AKS cluster. I'm not sure why the DNS server gives the wrong answer to begin with. Is this a CoreDNS issue?
Thanks @juan-lee for the workaround 👍

@kim0

kim0 commented Jun 19, 2019

Hi @weinong, I deployed a cluster a few days back and am still hitting this, so I think it's still not fixed. Is there any update on when a fix is expected to land?
Also taking a chance and pinging @jnoller

@tapanhalani

Hi people. Is this issue fixed with Kubernetes 1.13.5? I want to upgrade my AKS cluster to that version, but I am using multiple deployments with Alpine-based images.

@jemag

jemag commented Nov 20, 2019

Still seeing the issue with kernel version 4.15.0-1063-azure

@Vandersteen

I am also seeing this issue on kernel version 4.15.0-1060-azure

@juan-lee
Contributor Author

@jemag @Vandersteen Recently I've seen DNS timeouts being caused by high resource usage on the nodes where the coredns pods reside. Could either of you check your monitoring and see how saturated your CPU, memory, and disk are?

@jemag

jemag commented Nov 25, 2019

@juan-lee I have replicated this problem under various loads. Currently, with all nodes under 18% CPU usage and memory at around 70% on each node, the problem still happens intermittently. Disk usage is extremely low.

@Vandersteen

@juan-lee I have replicated this on 2 different clusters. One has 40% CPU usage and 60% memory usage; the other has 20% CPU usage and 42% memory usage.

@juan-lee
Contributor Author

juan-lee commented Dec 4, 2019

@jemag @Vandersteen we are looking into it. By chance, are your clusters using Azure CNI? I'm only able to reproduce the issue on Azure CNI clusters.

@jemag

jemag commented Dec 4, 2019

@juan-lee mine are indeed using Azure CNI

@Vandersteen

@juan-lee Yes, we are using Azure CNI

@jluk jluk self-assigned this Dec 5, 2019
@juan-lee
Contributor Author

juan-lee commented Dec 5, 2019

We are still working to get to the bottom of this issue. In the meantime, adding the following to your pod specs can help in most cases.

dnsConfig:  
  options:  
    - name: single-request-reopen

@guitmz

guitmz commented Dec 6, 2019

This started happening for me when we upgraded the cluster to v1.14 (from 1.13). We had the same problem a year ago, and it was fixed after the kernel was patched with the mitigations from https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts

Is there a way we can run CoreDNS in AKS as a DaemonSet to (at least try to) fix this? We need Azure CNI, and we can't use single-request-reopen for all pods because some are Alpine based.

@jnoller
Contributor

jnoller commented Jan 8, 2020

Please also see this issue for intermittent NodeNotReady, DNS latency, and other crashes related to system load: #1373

@kwaazaar

kwaazaar commented Jul 6, 2020

We are still working to get to the bottom of this issue. In the meantime, adding the following to your pod specs can help in most cases.

dnsConfig:  
  options:  
    - name: single-request-reopen

I tried this, but I keep getting the issue. I have a container doing a curl call in a loop with a 2-second pause between attempts, and I hit the timeout about once every 20 attempts. It seems to happen on 3 clusters (OT, ACC and PROD); on the last two the workaround seems to help pretty well, but on the OT cluster (1.16.9) it does not. All clusters use Azure CNI, private networking, and a VPN connection to on-prem DNS servers.
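
For reference, here is a minimal sketch of a reproduction pod along the lines described above (the pod name, image, and target URL are placeholders for illustration, not taken from this thread):

apiVersion: v1
kind: Pod
metadata:
  name: dns-timeout-repro          # hypothetical name
spec:
  dnsConfig:
    options:
    - name: single-request-reopen
  containers:
  - name: curl-loop
    image: buildpack-deps:curl     # placeholder glibc-based image that includes curl
    command:
    - /bin/sh
    - -c
    - |
      while true; do
        # time each request; a ~2.5s or ~5s spike indicates the DNS timeout
        time curl -sS -o /dev/null https://example.com
        sleep 2
      done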

@ghost ghost added the action-required label Aug 21, 2020
@neil-south

This is now affecting us. The workaround

dnsConfig:  
  options:  
    - name: single-request-reopen

seems to work, but obviously this is affecting lots of people (most of whom won't even realise), and needs sorting!

I've even tried creating a new v1.19 AKS cluster using the official mcr.microsoft.com/dotnet/core/aspnet:3.1 Docker base image (as of 29/9/2020), and the issue is still present.

@davesmits

We used the dnsConfig workaround, and initially it didn't work for us. It turned out that it only works with non-Alpine images.

@sikri-eic

sikri-eic commented Oct 9, 2020

Is there a solution/workaround for this issue for Alpine-based images? We tried dnsPolicy: Default to no avail. We are also using Azure CNI (we see a 2500 ms delay on nslookup).

Kubernetes cluster version: 1.18.8
OS Image: Ubuntu 16.04.7 LTS

A newer version of the OS image (18.04) has been deployed very recently; would it help to upgrade?

@timja

timja commented Oct 9, 2020

@sikri-eic deploying nodelocaldns fixed it for us.
18.04 didn't solve it in our case

@sikri-eic

@timja Thank you for the prompt response. I will take a look at nodelocaldns. The documentation sounds like it may actually address this problem.

@Vandersteen

@timja Did you have to do anything special to install nodelocaldns on AKS? It seems like you need to mess around with iptables or something.

@timja

timja commented Oct 9, 2020

It depends on whether you're using kubenet or Azure CNI.

With kubenet, it works with just the standard config.

With Azure CNI, add the -setupebtables flag.

Ref:
#1642

Example config for Azure CNI (note: it took months for a release after our changes, so we're using a forked image, but it looks like upstream has released now):
https://github.com/hmcts/cnp-flux-config/blob/master/k8s/namespaces/kube-system/nodelocaldns/nodelocaldns.yaml
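
As a rough illustration of where that flag goes, here is an abbreviated container spec from a typical node-local-dns DaemonSet (the image tag and IP addresses are placeholders; check the manifests linked above for the real values):

containers:
- name: node-cache
  image: k8s.gcr.io/dns/k8s-dns-node-cache:1.15.14   # placeholder tag
  args:
  - -localip
  - 169.254.20.10,10.0.0.10       # placeholder: node-local link-local IP, kube-dns service IP
  - -conf
  - /etc/Corefile
  - -upstreamsvc
  - kube-dns-upstream
  - -setupebtables                # the extra flag needed for Azure CNI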

@curtdept

curtdept commented Oct 10, 2020

Here is a version for AKS; it runs on everything except virtual nodes.

https://github.com/curtdept/aks_nodelocaldns/blob/main/nodelocaldns.yaml

This worked amazingly well, by the way; it cleaned up tons of issues I had with heartbeats and clustered services.

@palma21
Member

palma21 commented Oct 20, 2020

Marking this as a duplicate/known issue and adding the in-progress feature for it, as well as @curtdept's current workaround (thanks!)

https://github.com/curtdept/aks_nodelocaldns/blob/main/nodelocaldns.yaml

#1492

@ghost

ghost commented Oct 21, 2020

This issue has been marked as a duplicate and has not had any activity for 1 day. It will be closed for housekeeping purposes.

@ghost ghost closed this as completed Oct 21, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Nov 20, 2020
This issue was closed.