[upgrade/postupgrade] FATAL post-upgrade error: unable to create/update the DNS service: services "kube-dns" not found (using coredns) #2358

VannTen · 2020-12-03T08:55:50Z

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):kubeadm version: &version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.11", GitCommit:"d94a81c724ea8e1ccc9002d89b7fe81d58f89ede", GitTreeState:"clean", BuildDate:"2020-03-12T21:06:11Z", GoVersion:"go1.12.17", Compiler:"gc", Platform:"linux/amd64"}

Environment:

Kubernetes version (use kubectl version): 1.14.1 (upgrading to 1.15.11)
Cloud provider or hardware configuration: VM
OS (e.g. from /etc/os-release): Centos 7
Kernel (e.g. uname -a): Linux 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Others: kubeadm is run by kubespray (but the error was also reproduced by running the kubeadm CLI directly on the master).
After searching the Kubernetes Slack, we found that we are not the first to encounter this problem
(Slack requires an account registration):
https://kubernetes.slack.com/archives/C2V9WJSJD/p1589988061457900
https://kubernetes.slack.com/archives/C2V9WJSJD/p1570707565021200
https://kubernetes.slack.com/archives/CDQSC941H/p1597386500263300

What happened?

After launching the following command:

/usr/local/bin/kubeadm upgrade apply -y v1.15.11 --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=all --allow-experimental-upgrades --allow-release-candidate-upgrades --etcd-upgrade=false --force

kubeadm fails at the post-upgrade stage with the following error:
[upgrade/postupgrade] FATAL post-upgrade error: unable to create/update the DNS service: services "kube-dns" not found
But kubeadm is configured to use CoreDNS. Furthermore, the cluster was already using CoreDNS, and there was no kube-dns Deployment or Service.

What you expected to happen?

kubeadm correctly recognizes coredns and does not try to find a kube-dns service.

How to reproduce it (as minimally and precisely as possible)?

Reproduction seems quite hard. The linked Slack messages mention that this happens sometimes, and sometimes not (!).
We are upgrading several clusters with mostly the same method, and so far we have encountered the problem on only one. The previous upgrade (1.13 -> 1.14) went fine, even though CoreDNS was already in use.

However, some facts that might be related:
After seeing that kubeadm sets clusterDNS in the kubelet config to the address 10.x.x.10 (x depending on the service subnet), I noticed that this clusterIP was taken by another Service (from an application running on the cluster), and that the creation date of that Service fell between the previous upgrade and the one where we encountered the error (so at least the timing makes sense). That clusterDNS setting was not actually used by the kubelets, because kubespray handles the kubelet configuration itself (I think) and uses the third IP in the range (10.x.x.3). But maybe this somehow confuses kubeadm?
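
To make the suspected conflict easier to check, here is a minimal sketch in Python. It relies on kubeadm's default of using the 10th address of the service subnet as the DNS IP, and it looks for any Service already holding that clusterIP. The function names and the subnet 10.239.0.0/18 used in the usage note are illustrative assumptions, not values taken from this cluster. The Service list is assumed to be the "items" array of `kubectl get services --all-namespaces -o json`.

```python
import ipaddress


def kubeadm_dns_ip(service_subnet):
    """Return the 10th address of the service subnet as a string.

    This mirrors the default clusterDNS address kubeadm recommends;
    it is a sketch, not kubeadm's own code.
    """
    network = ipaddress.ip_network(service_subnet)
    return str(network[10])


def services_using_ip(services, ip):
    """Return "namespace/name" for every Service whose clusterIP equals ip.

    `services` is a list of Service objects as plain dicts.
    """
    return [
        "{}/{}".format(svc["metadata"]["namespace"], svc["metadata"]["name"])
        for svc in services
        if svc.get("spec", {}).get("clusterIP") == ip
    ]
```

For example, `kubeadm_dns_ip("10.239.0.0/18")` yields `10.239.0.10`. If `services_using_ip` then returns anything other than the DNS Service, that address is occupied by an unrelated workload.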

Unfortunately, I did not have time to set up a reproducing scenario. If I do, I will update the issue.

Anything else we need to know?

The workaround mentioned in one of the Slack messages works well and allowed us to perform our upgrade. I'll note it here, since it is probably more accessible to future kubeadm users who might stumble upon this:

Workaround

Copy the coredns Service. Create a new Service from the copy, changing the name to "kube-dns" and forcing the clusterIP to 10.x.x.10 (matching what is in your kubeadm config). Then rerun the upgrade command.
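
The copy step can be sketched in Python as a pure transformation of the Service manifest. This is an illustration under assumptions rather than a tested procedure: the function name is hypothetical, and the input is assumed to be the JSON of the existing coredns Service (for instance from `kubectl get service coredns -n kube-system -o json`, where the kube-system namespace is itself an assumption).

```python
import copy

# Metadata fields populated by the API server; they must not be sent back
# when creating a new object from a copy.
SERVER_SET_METADATA = (
    "uid",
    "resourceVersion",
    "creationTimestamp",
    "selfLink",
    "managedFields",
)


def make_kube_dns_service(coredns_service, cluster_ip):
    """Return a new Service manifest named kube-dns with a forced clusterIP.

    The input dict is left untouched.
    """
    svc = copy.deepcopy(coredns_service)

    metadata = svc.setdefault("metadata", {})
    for field in SERVER_SET_METADATA:
        metadata.pop(field, None)
    metadata["name"] = "kube-dns"

    spec = svc.setdefault("spec", {})
    spec["clusterIP"] = cluster_ip
    # Newer API servers also populate clusterIPs; drop it so it cannot
    # disagree with the forced clusterIP.
    spec.pop("clusterIPs", None)

    # Status is reported by the server and is not part of the desired state.
    svc.pop("status", None)
    return svc
```

The resulting dict can be dumped to JSON and created with `kubectl create -f`. The create will be rejected if the chosen clusterIP is still held by another Service, so that address has to be freed first.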


neolit123 · 2020-12-03T18:42:31Z

hi,

[upgrade/postupgrade] FATAL post-upgrade error: unable to create/update the DNS service: services "kube-dns" not found

the service that coredns uses is also called kube-dns, because that was part of the original transition plan in k8s from kube-dns to coredns. #sig-network on k8s slack know more about this topic.
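
Since the error is about the Service name rather than the DNS provider, a quick check is whether any Service is literally named kube-dns, and which Services select the DNS pods. The sketch below is in Python and works on the JSON printed by `kubectl get services -n kube-system -o json`. The function names are hypothetical, and the `k8s-app: kube-dns` selector is what the kubeadm manifests use; other installers may label things differently.

```python
import json


def service_names(service_list_json):
    """Return the names of all Services in a ServiceList JSON document."""
    items = json.loads(service_list_json).get("items", [])
    return [item["metadata"]["name"] for item in items]


def dns_selecting_services(service_list_json, label_value="kube-dns"):
    """Return names of Services whose selector has k8s-app == label_value."""
    items = json.loads(service_list_json).get("items", [])
    return [
        item["metadata"]["name"]
        for item in items
        if (item.get("spec", {}).get("selector") or {}).get("k8s-app") == label_value
    ]


def kube_dns_service_missing(service_list_json):
    """True when no Service is named kube-dns, the name kubeadm looks up."""
    return "kube-dns" not in service_names(service_list_json)
```

A cluster where `dns_selecting_services` returns only `coredns` while `kube_dns_service_missing` is true matches the situation described in this issue: DNS works, but the Service name kubeadm expects does not exist.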

the coredns / kube-dns manifests are here:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/addons/dns/manifests.go

Kubernetes version (use kubectl version): 1.14.1 (upgrading to 1.15.11)

this version is not supported. you'd have to be at 1.18 soon as older versions are going out of support.
if you are able to reproduce the problem with a test cluster that is 1.17 or newer versions we could have a look at it.

if so please re-open the ticket.

neolit123 · 2020-12-03T18:46:18Z

btw, the error is coming from here:
https://github.com/kubernetes/kubernetes/blob/98bc258bf5516b6c60860e06845b899eab29825d/cmd/kubeadm/app/phases/addons/dns/dns.go#L363-L365

the hard to reproduce aspect here only means that somehow the service is not available at that particular moment, which is bad.
we could make some of the operations in the function to be retried, but instead it feels like there is an external problem that has to be better understood.

lgtm87 · 2020-12-17T15:02:53Z

@neolit123 I have faced exactly the same issue during cluster upgrade from 1.16.3 to 1.17.11 version.
kubespray v2.14 was used for cluster upgrade, kubeadm command and output are the following:
"module_args":

{ "_raw_params": "timeout -k 600s 600s /usr/local/bin/kubeadm upgrade apply -y v1.17.11 --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=all --allow-experimental-upgrades --etcd-upgrade=false --force", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "stdin_add_newline": true, "strip_empty_ends": true, "warn": true }
},
"msg": "non-zero return code",
"rc": 1,
"start": "2020-12-17 07:50:21.835844",
"stderr": "W1217 07:50:21.875273 5279 strict.go:54] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "tcpFinTimeout"\nW1217 07:50:21.878019 5279 defaults.go:186] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.239.0.10]; the provided value is: [169.254.25.10]\nW1217 07:50:21.878113 5279 validation.go:28] Cannot validate kube-proxy config - no validator is available\nW1217 07:50:21.878121 5279 validation.go:28] Cannot validate kubelet config - no validator is available\nW1217 07:50:21.888242 5279 common.go:94] WARNING: Usage of the --config flag for reconfiguring the cluster during upgrade is not recommended!\nW1217 07:50:21.890289 5279 strict.go:54] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "tcpFinTimeout"\nW1217 07:50:21.890652 5279 defaults.go:186] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.239.0.10]; the provided value is: [169.254.25.10]\nW1217 07:50:21.890719 5279 validation.go:28] Cannot validate kube-proxy config - no validator is available\nW1217 07:50:21.890726 5279 validation.go:28] Cannot validate kubelet config - no validator is available\n\t[WARNING CoreDNSUnsupportedPlugins]: start version '1.6.7' not supported\n\t[WARNING CoreDNSMigration]: CoreDNS will not be upgraded: start version '1.6.7' not supported\nW1217 07:50:25.293871 5279 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"\nW1217 07:50:27.524497 5279 dns.go:246] the CoreDNS Configuration was not migrated: unable to migrate CoreDNS ConfigMap: start version '1.6.7' not supported. 
The existing CoreDNS Corefile configuration has been retained.\n[upgrade/postupgrade] FATAL post-upgrade error: unable to create/update the DNS service: services "kube-dns" not found\nTo see the stack trace of this error execute with --v=5 or higher",
"stderr_lines": [
"W1217 07:50:21.875273 5279 strict.go:54] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "tcpFinTimeout"",
"W1217 07:50:21.878019 5279 defaults.go:186] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.239.0.10]; the provided value is: [169.254.25.10]",
"W1217 07:50:21.878113 5279 validation.go:28] Cannot validate kube-proxy config - no validator is available",
"W1217 07:50:21.878121 5279 validation.go:28] Cannot validate kubelet config - no validator is available",
"W1217 07:50:21.888242 5279 common.go:94] WARNING: Usage of the --config flag for reconfiguring the cluster during upgrade is not recommended!",
"W1217 07:50:21.890289 5279 strict.go:54] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeproxy.config.k8s.io", Version:"v1alpha1", Kind:"KubeProxyConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "tcpFinTimeout"",
"W1217 07:50:21.890652 5279 defaults.go:186] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.239.0.10]; the provided value is: [169.254.25.10]",
"W1217 07:50:21.890719 5279 validation.go:28] Cannot validate kube-proxy config - no validator is available",
"W1217 07:50:21.890726 5279 validation.go:28] Cannot validate kubelet config - no validator is available",
"\t[WARNING CoreDNSUnsupportedPlugins]: start version '1.6.7' not supported",
"\t[WARNING CoreDNSMigration]: CoreDNS will not be upgraded: start version '1.6.7' not supported",
"W1217 07:50:25.293871 5279 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"",
"W1217 07:50:27.524497 5279 dns.go:246] the CoreDNS Configuration was not migrated: unable to migrate CoreDNS ConfigMap: start version '1.6.7' not supported. The existing CoreDNS Corefile configuration has been retained.",
"[upgrade/postupgrade] FATAL post-upgrade error: unable to create/update the DNS service: services "kube-dns" not found",
"To see the stack trace of this error execute with --v=5 or higher"

The cluster was also using coredns before the upgrade. Can you please advise how to resolve it?

neolit123 · 2020-12-17T16:06:36Z

"\t[WARNING CoreDNSUnsupportedPlugins]: start version '1.6.7' not supported",

my understanding of this problem:

you have 1.16 and try to upgrade to 1.17
your 1.16 cluster has a coredns version that is not supported by the upgrade tooling that is included in 1.17.
1.16 -> 1.17 upgrade fails.

you can try editing the CoreDNS ConfigMap and Deployment to use 1.6.2 (downgrade CoreDNS):
https://github.com/kubernetes/kubernetes/blob/release-1.16/cmd/kubeadm/app/constants/constants.go#L336

if 1.6.7 is something that kubespray installs, please contact the kubespray team.

lgtm87 · 2020-12-18T09:00:13Z

Thank you for the update. The funny thing is that even though the upgrade process fails, coredns 1.6.7 is installed (there are 2 ReplicaSets: one is failing with the corefile-backup ConfigMap, and one is working fine).
I have upgraded another cluster with the same kubespray version and coredns 1.6.7 (also 1.16 -> 1.17) and it finished successfully.
I don't think downgrading coredns is an option for me if we already have 1.6.7. Looking at the logs, that is just a warning, but the fatal message is about "kube-dns service not found"?

VannTen · 2021-02-22T12:33:14Z

I encountered the same issue (with the same cluster) upgrading from 1.17.12 to 1.18.10, using kubespray v2.14.2.
The workaround was to create a copy of the coredns svc named kube-dns, and free the 10.x.0.10 service address (it was used by another service).
@neolit123 Could we reopen this issue ? I lack the permission to do so.

neolit123 closed this as completed Dec 3, 2020

juliohm1978 mentioned this issue Dec 25, 2020

FATAL post-upgrade error: unable to create/update the DNS service: services "kube-dns" not found kubernetes-sigs/kubespray#7083

Closed
