DNS resolution for externalName services broken in v1.14.9 #224

Closed
ApsOps opened this issue Apr 13, 2018 · 8 comments


ApsOps commented Apr 13, 2018

After updating to v1.14.9, externalName services no longer resolve. The same setup works with v1.14.8.

/ # dig @100.102.0.17 mysql-external.default.svc.cluster.local

; <<>> DiG 9.11.2-P1 <<>> @100.102.0.17 mysql-external.default.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45659
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;mysql-external.default.svc.cluster.local.	IN A

;; AUTHORITY SECTION:
cluster.local.		60	IN	SOA	ns.dns.cluster.local. hostmaster.cluster.local. 1523602800 28800 7200 604800 60

;; Query time: 1 msec
;; SERVER: 100.102.0.17#53(100.102.0.17)
;; WHEN: Fri Apr 13 07:58:15 UTC 2018
;; MSG SIZE  rcvd: 110

kube-dns pod logs (I've replaced the real domain with example.com; the real domain exists and resolves normally in v1.14.8):

[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197655       1 dns.go:612] Query for "mysql-external.default.svc.cluster.local.", exact: false
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197682       1 dns.go:841] Not a federation query: len(["mysql-external" "default" "svc" "cluster" "local"]) != 4+len(["local" "cluster"])
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197700       1 dns.go:732] Found 1 records for [local cluster svc default mysql-external] in the cache
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197715       1 dns.go:739] getRecordsForPath retval=[{Host:db.aurora.external.x.example.com Port:0 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:/skydns/local/cluster/svc/default/mysql-external}], path=[local cluster svc default mysql-external]
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197739       1 dns.go:641] Records for mysql-external.default.svc.cluster.local.: [{db.aurora.external.x.example.com 0 10 10  false 30 0  /skydns/local/cluster/svc/default/mysql-external}]
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197754       1 dns.go:612] Query for "db.aurora.external.x.example.com.", exact: false
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197789       1 dns.go:860] Not a federation query: "x" != "svc" (serviceSubdomain)
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197797       1 dns.go:732] Found 0 records for [com example x external aurora db] in the cache
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197807       1 dns.go:739] getRecordsForPath retval=[], path=[com example x external aurora db]
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197816       1 dns.go:645] No record found for db.aurora.external.x.example.com.
[kube-dns-aps-78b75f775c-b2j8p kubedns] I0413 08:12:22.197830       1 logs.go:41] skydns: incomplete CNAME chain from "db.aurora.external.x.example.com.": no nameservers configured can not lookup name

I initially thought this was a regression from #210, but I'm not sure. Please let me know what other debug info would help.
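
To spot this failure in a larger log dump, one option is to search for the skydns message quoted above. The following is only a sketch: the function name `find_incomplete_cname_chains` is hypothetical, and the pattern is based solely on the single log line format shown in this issue, so other kube-dns versions may word the message differently.

```python
import re

# Pattern derived from the one skydns log line quoted in this issue
# (assumption: other versions may format the message differently).
_INCOMPLETE_CHAIN = re.compile(
    r'skydns: incomplete CNAME chain from "(?P<name>[^"]+)": (?P<reason>.+)$'
)


def find_incomplete_cname_chains(log_text):
    """Return (name, reason) pairs for each incomplete-CNAME-chain log line."""
    hits = []
    for line in log_text.splitlines():
        match = _INCOMPLETE_CHAIN.search(line)
        if match:
            hits.append((match.group("name"), match.group("reason").strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: kubectl logs <pod> -c kubedns | python this_script.py
    for name, reason in find_incomplete_cname_chains(sys.stdin.read()):
        print(name, "->", reason)
```

Any hit whose reason mentions missing nameservers points at the upstream lookup rather than at the cluster record itself, which matches the "Found 1 records ... in the cache" line earlier in the same log.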

Member

MrHohn commented Apr 13, 2018

I can confirm that DNS resolution for externalName services is broken in 1.14.9 but works in 1.14.8.

With 1.14.9, though, the kubedns logs show that the externalName record is still generated:

I0413 18:43:54.632820       1 dns.go:593] newExternalNameService: storing key test1 with value &{www.google.com 0 10 10  false 30 0  } as test1.default.svc.cluster.local. under [local cluster svc default]

Will spend more time on this.

Member

MrHohn commented Apr 13, 2018

Built an image from d522d10 (one commit before #220), and externalName service DNS resolution works with it.
@grayluck

Member

MrHohn commented Apr 13, 2018

Got some pointers: it seems the nameserver list for skydns somehow gets reset to empty instead of using the nameservers listed in /etc/resolv.conf.

I0413 23:13:58.916300       1 dns.go:612] Query for "test1.default.svc.cluster.local.", exact: false
I0413 23:13:58.916350       1 dns.go:841] Not a federation query: len(["test1" "default" "svc" "cluster" "local"]) != 4+len(["local" "cluster"])
I0413 23:13:58.916400       1 dns.go:732] Found 1 records for [local cluster svc default test1] in the cache
I0413 23:13:58.916413       1 dns.go:739] getRecordsForPath retval=[{Host:www.google.com Port:0 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:/skydns/local/cluster/svc/default/test1}], path=[local cluster svc default test1]
I0413 23:13:58.916462       1 dns.go:641] Records for test1.default.svc.cluster.local.: [{www.google.com 0 10 10  false 30 0  /skydns/local/cluster/svc/default/test1}]
I0413 23:13:58.916485       1 dns.go:612] Query for "www.google.com.", exact: false
I0413 23:13:58.916493       1 dns.go:841] Not a federation query: len(["www" "google" "com"]) != 4+len(["local" "cluster"])
I0413 23:13:58.916505       1 dns.go:732] Found 0 records for [com google www] in the cache
I0413 23:13:58.916516       1 dns.go:739] getRecordsForPath retval=[], path=[com google www]
I0413 23:13:58.916523       1 dns.go:645] No record found for www.google.com.
I0413 23:13:58.916555       1 logs.go:41] skydns: incomplete CNAME chain from "www.google.com.": no nameservers configured can not lookup name
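
The expected behavior described above, using the nameservers from /etc/resolv.conf when none are set explicitly, can be sketched as follows. This is not kube-dns code: `parse_nameservers` and `effective_nameservers` are hypothetical names, and the fallback rule is an assumption drawn from the comment above rather than from the actual implementation.

```python
def parse_nameservers(resolv_conf_text):
    """Extract nameserver addresses from resolv.conf-style text."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' and ';' both start a comment).
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def effective_nameservers(configured, resolv_conf_text):
    """Prefer explicitly configured upstreams, else fall back to resolv.conf.

    An empty result corresponds to the "no nameservers configured" state
    reported in the log line above.
    """
    if configured:
        return list(configured)
    return parse_nameservers(resolv_conf_text)


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        print(effective_nameservers([], handle.read()))
```

Under this assumption, the upstream list should only end up empty when both the explicit configuration and resolv.conf are empty, which is not the situation in the logs above.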

Member

MrHohn commented Apr 13, 2018

Also found out why our e2e test didn't catch this. It turns out the externalName test deliberately digs for just the CNAME record, so the CNAME -> A (or AAAA) resolution path through the upstream server is never exercised.
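
The test gap can be illustrated with a toy model. Everything here is hypothetical: `ToyResolver` is an in-memory stand-in, not the real e2e test or kube-dns, and the IP address is a made-up value from a documentation range. It only shows why a CNAME-only check passes while the full chain fails when no upstream nameservers are available.

```python
class ToyResolver:
    """In-memory stand-in for a cluster DNS server with optional upstreams."""

    def __init__(self, cluster_cnames, upstream_a_records, nameservers):
        self.cluster_cnames = cluster_cnames          # name -> CNAME target
        self.upstream_a_records = upstream_a_records  # name -> list of IPs
        self.nameservers = nameservers                # may be empty

    def query_cname(self, name):
        # Answered from cluster data only; the upstream is never consulted.
        return self.cluster_cnames.get(name)

    def query_a(self, name):
        target = self.cluster_cnames.get(name)
        if target is None:
            return []
        # Following the CNAME to an external name needs an upstream server.
        if not self.nameservers:
            return []  # models the "incomplete CNAME chain" outcome
        return self.upstream_a_records.get(target, [])


SERVICE = "test1.default.svc.cluster.local."
CNAMES = {SERVICE: "www.google.com."}
UPSTREAM = {"www.google.com.": ["203.0.113.7"]}  # made-up address

broken = ToyResolver(CNAMES, UPSTREAM, nameservers=[])
healthy = ToyResolver(CNAMES, UPSTREAM, nameservers=["10.0.0.2"])

# A CNAME-only check cannot tell the two apart.
assert broken.query_cname(SERVICE) == healthy.query_cname(SERVICE)
# Only an A query that follows the chain exposes the difference.
assert broken.query_a(SERVICE) == []
assert healthy.query_a(SERVICE) == ["203.0.113.7"]
```

A test that also requests the A record for the service name would cover the upstream path in this model.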

Member

MrHohn commented Apr 13, 2018

@grayluck is working on a fix.

Member

MrHohn commented Apr 16, 2018

For the record, queries for PTR records that exist only in the upstream server seem to be broken as well.
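
For context, a PTR lookup asks for a name derived from the IP address under in-addr.arpa (or ip6.arpa for IPv6). When that name is not served from cluster data, the query has to be forwarded upstream, much like the external CNAME target above. A small sketch of how the query name is formed, using Python's standard `ipaddress` module (`ptr_query_name` is a hypothetical helper, and the address is a documentation-range example):

```python
import ipaddress


def ptr_query_name(ip):
    """Return the fully qualified reverse-lookup name for an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer + "."


if __name__ == "__main__":
    print(ptr_query_name("192.0.2.10"))
```

The resulting name can be passed to dig with the PTR record type to check whether upstream-only reverse lookups work on a given kube-dns version.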

Member

MrHohn commented Apr 17, 2018

kube-dns 1.14.10 is released with the fix.

MrHohn closed this as completed Apr 17, 2018
Author

ApsOps commented Apr 18, 2018

Confirming that it works with v1.14.10.
