
Kubeadm init fails with "Error writing Crisocket information for the control-plane node: timed out waiting for the condition" #1587

Closed
Ankit-rana opened this issue May 31, 2019 · 24 comments
Labels
priority/awaiting-more-evidence: Lowest priority. Possibly useful, but not yet enough support to actually get it done.
sig/node: Categorizes an issue or PR as relevant to SIG Node.

Comments

@Ankit-rana

Ankit-rana commented May 31, 2019

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version): sudo kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-29T12:00:00Z", GoVersion:"go1.11.10", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):~> kubectl version
    Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.6", GitCommit:"abdda3f9fefa29172298a2e42f5102e777a8ec25", GitTreeState:"clean", BuildDate:"2019-05-08T13:53:53Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.6", GitCommit:"abdda3f9fefa29172298a2e42f5102e777a8ec25", GitTreeState:"clean", BuildDate:"2019-05-08T13:46:28Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
    cat /etc/os-release
    NAME="SLES"
    VERSION="15"
    VERSION_ID="15"
    PRETTY_NAME="SUSE Linux Enterprise Server 15"
    ID="sles"
    ID_LIKE="suse"
    ANSI_COLOR="0;32"
    CPE_NAME="cpe:/o:suse:sles:15"
  • Kernel (e.g. uname -a): Linux master-2 4.12.14-150.17-default #1 SMP Thu May 2 15:15:46 UTC 2019 (bf13fb8) x86_64 x86_64 x86_64 GNU/Linux

What happened?

kubeadm init failed with the error "Error writing Crisocket information for the control-plane node: timed out waiting for the condition"

What you expected to happen?

The "sudo kubeadm init --pod-network-cidr 10.248.0.0/16" command should have set up all the master components successfully.

How to reproduce it (as minimally and precisely as possible)?

crayadm@master-2:~> sudo kubeadm init --pod-network-cidr 10.248.0.0/16
I0531 18:16:32.726064 5686 version.go:237] remote version is much newer: v1.14.2; falling back to: stable-1.13
[init] Using Kubernetes version: v1.13.6
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master-2 localhost] and IPs [10.248.0.210 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master-2 localhost] and IPs [10.248.0.210 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master-2 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.248.0.210]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 14.003255 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master-2" as an annotation
[kubelet-check] Initial timeout of 40s passed.
error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition.
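
One way to see which API call is actually stuck (a sketch; --v is a standard kubeadm verbosity flag, and a reset is needed before re-running init):

sudo kubeadm reset
sudo kubeadm init --pod-network-cidr 10.248.0.0/16 --v=5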

Anything else we need to know?

~> cat /etc/crictl.yaml
runtime-endpoint: unix:///var/run/dockershim.sock
image-endpoint: unix:///var/run/dockershim.sock
timeout: 10
debug: true
~> crictl pods
FATA[0010] failed to connect: failed to connect: context deadline exceeded
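
A couple of quick checks that can narrow this down (a sketch; the socket path is the one from crictl.yaml above, and the dockershim socket only exists while the kubelet is running):

ls -l /var/run/dockershim.sock                                        # does the socket exist at all?
sudo crictl --runtime-endpoint unix:///var/run/dockershim.sock info   # can the CRI endpoint be reached directly?
sudo journalctl -u kubelet --no-pager | tail -n 50                    # recent kubelet errors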
~> systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2019-05-25 14:29:49 UTC; 6 days ago
Docs: http://docs.docker.com
Main PID: 19508 (dockerd)
Tasks: 120
Memory: 103.3M
CPU: 48min 59.396s
CGroup: /system.slice/docker.service
├─ 5878 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/50ba657c71cfeccf8ffd3544334ab2fa9f0576>
├─ 5880 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/830a32f38fec61cd909a31543bc65d2b9b6abe>
├─ 5883 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/95bd179c9c1c90c0f3f97c8e8ee22203b75ecc>
├─ 5897 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/3ccce84564cc38018df299ca780f0929b63fa0>
├─ 5934 /pause
├─ 5941 /pause
├─ 5946 /pause
├─ 5968 /pause
├─ 6045 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/9d66fe3c91d7154e3115620d2c6bc4334548a8>
├─ 6059 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/41fca5033552e43d50a53433228a5f7c3e413c>
├─ 6060 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/a39bffdf8469140338eaabc2d2b490aeaf013b>
├─ 6061 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/2de48a98c6d58d3e7b9322939bd542a9e6a273>
├─ 6094 kube-apiserver --authorization-mode=Node,RBAC --advertise-address=10.248.0.210 --allow-privileged=true --client-ca-file=/etc/kubernetes/pki/ca.crt --enable->
├─ 6111 etcd --advertise-client-urls=https://10.248.0.210:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --in>
├─ 6130 kube-controller-manager --address=127.0.0.1 --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf --authorization-k>
├─ 6140 kube-scheduler --address=127.0.0.1 --kubeconfig=/etc/kubernetes/scheduler.conf --leader-elect=true
├─19508 /usr/bin/dockerd --add-runtime oci=/usr/sbin/docker-runc
└─19515 docker-containerd --config /var/run/docker/containerd/containerd.toml

@neolit123 added the priority/awaiting-more-evidence and sig/node labels on Jun 1, 2019
@neolit123
Member

do you see anything suspicious in the kubelet logs?
this might be an issue that has to be moved to k/k instead of k/kubeadm.

@Ankit-rana
Author

Ankit-rana commented Jun 2, 2019

~> sudo systemctl status kubelet
● kubelet.service - Kubernetes Kubelet Server
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2019-06-02 06:30:57 UTC; 8min ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Main PID: 14593 (hyperkube)
Tasks: 16 (limit: 131072)
Memory: 110.2M
CPU: 14.669s
CGroup: /system.slice/kubelet.service
└─14593 /usr/bin/hyperkube kubelet --logtostderr=true --v=2 --hostname-override=127.0.0.1 --allow-privileged=false --config=/etc/kubernetes/kubelet-config.yaml --volume-plugin-dir=/usr/lib/kubernetes/kubelet-plugins

Jun 02 06:38:50 master-2 hyperkube[14593]: I0602 06:38:50.049825 14593 kubelet_node_status.go:446] Recording NodeHasNoDiskPressure event message for node 127.0.0.1
Jun 02 06:38:50 master-2 hyperkube[14593]: I0602 06:38:50.049841 14593 kubelet_node_status.go:446] Recording NodeHasSufficientPID event message for node 127.0.0.1
Jun 02 06:38:56 master-2 hyperkube[14593]: I0602 06:38:56.046519 14593 kubelet_node_status.go:278] Setting node annotation to enable volume controller attach/detach
Jun 02 06:38:56 master-2 hyperkube[14593]: I0602 06:38:56.049899 14593 kubelet_node_status.go:446] Recording NodeHasSufficientMemory event message for node 127.0.0.1
Jun 02 06:38:56 master-2 hyperkube[14593]: I0602 06:38:56.049936 14593 kubelet_node_status.go:446] Recording NodeHasNoDiskPressure event message for node 127.0.0.1
Jun 02 06:38:56 master-2 hyperkube[14593]: I0602 06:38:56.049951 14593 kubelet_node_status.go:446] Recording NodeHasSufficientPID event message for node 127.0.0.1
Jun 02 06:38:57 master-2 hyperkube[14593]: I0602 06:38:57.046597 14593 kubelet_node_status.go:278] Setting node annotation to enable volume controller attach/detach
Jun 02 06:38:57 master-2 hyperkube[14593]: I0602 06:38:57.051870 14593 kubelet_node_status.go:446] Recording NodeHasSufficientMemory event message for node 127.0.0.1
Jun 02 06:38:57 master-2 hyperkube[14593]: I0602 06:38:57.052543 14593 kubelet_node_status.go:446] Recording NodeHasNoDiskPressure event message for node 127.0.0.1
Jun 02 06:38:57 master-2 hyperkube[14593]: I0602 06:38:57.053043 14593 kubelet_node_status.go:446] Recording NodeHasSufficientPID event message for node 127.0.0.1

Kubelet logs: link

@yelongyu

Got the same issue:

I0618 20:52:58.262122   31415 round_trippers.go:419] curl -k -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.13.4 (linux/amd64) kubernetes/c27b913" 'https://k8s-master:60443/api/v1/nodes/10-10-40-93'
I0618 20:52:58.265602   31415 round_trippers.go:438] GET https://k8s-master:60443/api/v1/nodes/10-10-40-93 404 Not Found in 3 milliseconds
I0618 20:52:58.265620   31415 round_trippers.go:444] Response Headers:
I0618 20:52:58.265626   31415 round_trippers.go:447]     Content-Type: application/json
I0618 20:52:58.265632   31415 round_trippers.go:447]     Content-Length: 192
I0618 20:52:58.265636   31415 round_trippers.go:447]     Date: Tue, 18 Jun 2019 12:52:58 GMT
I0618 20:52:58.265684   31415 request.go:942] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes \"10-10-40-93\" not found","reason":"NotFound","details":{"name":"10-10-40-93","kind":"nodes"},"code":404}
I0618 20:52:58.265837   31415 round_trippers.go:419] curl -k -v -XGET  -H "User-Agent: kubeadm/v1.13.4 (linux/amd64) kubernetes/c27b913" -H "Accept: application/json, */*" 'https://k8s-master:60443/api/v1/nodes/10-10-40-93'
I0618 20:52:58.268370   31415 round_trippers.go:438] GET https://k8s-master:60443/api/v1/nodes/10-10-40-93 404 Not Found in 2 milliseconds
I0618 20:52:58.268389   31415 round_trippers.go:444] Response Headers:
I0618 20:52:58.268396   31415 round_trippers.go:447]     Content-Type: application/json
I0618 20:52:58.268402   31415 round_trippers.go:447]     Content-Length: 192
I0618 20:52:58.268409   31415 round_trippers.go:447]     Date: Tue, 18 Jun 2019 12:52:58 GMT
I0618 20:52:58.268430   31415 request.go:942] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes \"10-10-40-93\" not found","reason":"NotFound","details":{"name":"10-10-40-93","kind":"nodes"},"code":404}
error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition
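
The 404 here means a Node object named 10-10-40-93 never showed up in the API, which usually points at the kubelet registering under a different name (or not registering at all). A quick way to compare, assuming kubectl access on the control plane:

kubectl get nodes -o wide                                              # names the kubelets actually registered with
hostname                                                               # name kubeadm looks up unless --node-name / --hostname-override is set
sudo journalctl -u kubelet --no-pager | grep -i register | tail -n 20  # registration attempts and errors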

@neolit123
Member

there are multiple aspects at play here, but i don't know the exact cause.

cat /etc/crictl.yaml
runtime-endpoint: unix:///var/run/dockershim.sock

why not use /run/containerd/containerd.sock?
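
For reference, a crictl.yaml pointed at containerd would look roughly like this (a sketch; the socket path assumes a default containerd installation):

runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false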

/usr/bin/hyperkube

hyperkube is not something we have e2e test for, so i wouldn't say the kubeadm team supports it.

please try newer versions of k8s and re-open this ticket if the problem persists or if you have found the cause. our e2e test signal for 1.13 is green using containerd and docker.

@xsaardo

xsaardo commented Nov 12, 2019

Kubernetes version

Client Version: v1.15.4
Server Version: v1.15.4

Kubeadm version

kubeadm version: &version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.4", GitCommit:"67d2fcf276fcd9cf743ad4be9a9ef5828adc082f", GitTreeState:"clean", BuildDate:"2019-09-18T14:48:18Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}

Kubectl get pods:

NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-7697bc9b99-j9zd8   1/1     Running   0          43m
kube-system   calico-node-jwrg8                          1/1     Running   0          41m
kube-system   calico-node-r52d8                          1/1     Running   0          43m
kube-system   calico-node-zpzxb                          1/1     Running   0          42m
kube-system   coredns-5c98db65d4-764bm                   1/1     Running   0          48m
kube-system   coredns-5c98db65d4-z8h78                   1/1     Running   0          48m
kube-system   etcd-e1n1-g                                1/1     Running   0          47m
kube-system   kube-apiserver-e1n1-g                      1/1     Running   0          47m
kube-system   kube-controller-manager-e1n1-g             1/1     Running   0          47m
kube-system   kube-proxy-f6px5                           1/1     Running   0          41m
kube-system   kube-proxy-njjcp                           1/1     Running   0          42m
kube-system   kube-proxy-nn6df                           1/1     Running   0          48m
kube-system   kube-scheduler-e1n1-g                      1/1     Running   0          46m

kubectl get nodes -o wide

NAME     STATUS   ROLES    AGE   VERSION   INTERNAL-IP       EXTERNAL-IP   OS-IMAGE               KERNEL-VERSION               CONTAINER-RUNTIME
e1n1-g   Ready    master   48m   v1.15.4   192.168.142.101   <none>        OpenShift Enterprise   3.10.0-957.21.3.el7.x86_64   docker://18.9.8
e2n1-g   Ready    <none>   43m   v1.15.4   192.168.142.102   <none>        OpenShift Enterprise   3.10.0-957.21.3.el7.x86_64   docker://18.9.8
e3n1-g   Ready    <none>   42m   v1.15.4   192.168.142.103   <none>        OpenShift Enterprise   3.10.0-957.21.3.el7.x86_64   docker://18.9.8

We have been trying to add a fourth node to the cluster using the kubeadm join command
kubeadm join 192.168.142.101:6443 --token rp0dqg.t7jdtltndurri2hh --discovery-token-ca-cert-hash sha256:b740086b5dfba97b4e416a95816b19383181b5785b9bc0b2480c43c4dfd5b1d7 -v 256
which fails with the following error:

I1112 16:57:54.572465   54028 round_trippers.go:419] curl -k -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.15.4 (linux/amd64) kubernetes/67d2fcf" 'https://192.168.142.101:6443/api/v1/nodes/e4n1-g'
I1112 16:57:54.573492   54028 round_trippers.go:438] GET https://192.168.142.101:6443/api/v1/nodes/e4n1-g 401 Unauthorized in 1 milliseconds
I1112 16:57:54.573520   54028 round_trippers.go:444] Response Headers:
I1112 16:57:54.573533   54028 round_trippers.go:447]     Content-Type: application/json
I1112 16:57:54.573545   54028 round_trippers.go:447]     Content-Length: 129
I1112 16:57:54.573556   54028 round_trippers.go:447]     Date: Tue, 12 Nov 2019 21:56:31 GMT
I1112 16:57:54.573586   54028 request.go:947] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition

Kubelet service

[root@e4n1-g ~]# systemctl  -l status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Tue 2019-11-12 17:44:02 EST; 53s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 59061 (kubelet)
    Tasks: 58
   Memory: 52.5M
   CGroup: /system.slice/kubelet.service
           └─59061 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --cgroup-driver=systemd

Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.015008   59061 kubelet.go:2252] node "e4n1-g" not found
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.115448   59061 kubelet.go:2252] node "e4n1-g" not found
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.188180   59061 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Unauthorized
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.215887   59061 kubelet.go:2252] node "e4n1-g" not found
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.316037   59061 kubelet.go:2252] node "e4n1-g" not found
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.387478   59061 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Unauthorized
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.416251   59061 kubelet.go:2252] node "e4n1-g" not found
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.516633   59061 kubelet.go:2252] node "e4n1-g" not found
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.587736   59061 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:454: Failed to list *v1.Node: Unauthorized
Nov 12 17:44:56 e4n1-g kubelet[59061]: E1112 17:44:56.616816   59061 kubelet.go:2252] node "e4n1-g" not found

We have seen the "node not found" error on the other nodes that were successfully added to the cluster, so we are not sure if that is part of the problem. We tried the suggestions from other related issues but are still unable to add the node to the cluster. Any help is appreciated.

@neolit123
Member

error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition

could this be a case where the bootstrap token used for join is expired?
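
A quick way to check that, assuming kubeadm is available on the control-plane node (both are standard kubeadm subcommands):

kubeadm token list                          # shows existing bootstrap tokens and their TTL
kubeadm token create --print-join-command   # mints a fresh token and prints a complete join command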

@xsaardo

xsaardo commented Nov 12, 2019

@neolit123 we have tried recreating the token, so it should still be valid

@neolit123
Member

try enabling --v=10 for "join" and observe the API call failures.
this might give a better indication of what is going on.

@xsaardo

xsaardo commented Nov 12, 2019

Here's the output of kubeadm join
kubeadm-join.log

@neolit123
Member

the --v=10 logs just confirm that it's retrying.
right after this happens can you dump a kubelet log using journalctl -xeu kubelet too?

what is different about this node?
what other nodes do you have in the cluster?

@vikramkhatri

@neolit123 - Here is the output.

# journalctl -xeu kubelet
Nov 13 09:24:04 e4n1-g kubelet[68502]: E1113 09:24:04.723961   68502 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:454: F
Nov 13 09:24:04 e4n1-g kubelet[68502]: E1113 09:24:04.752982   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:24:04 e4n1-g kubelet[68502]: E1113 09:24:04.853385   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:24:04 e4n1-g kubelet[68502]: E1113 09:24:04.923760   68502 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Fail
Nov 13 09:24:04 e4n1-g kubelet[68502]: E1113 09:24:04.953533   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:24:05 e4n1-g kubelet[68502]: E1113 09:24:05.053956   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:24:05 e4n1-g kubelet[68502]: E1113 09:24:05.123712   68502 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Fail
Nov 13 09:24:05 e4n1-g kubelet[68502]: E1113 09:24:05.154214   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:24:05 e4n1-g kubelet[68502]: E1113 09:24:05.254451   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:24:05 e4n1-g kubelet[68502]: E1113 09:24:05.324240   68502 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:445: F
Nov 13 09:24:05 e4n1-g kubelet[68502]: E1113 09:24:05.354904   68502 kubelet.go:2252] node "e4n1-g" not found
...
Nov 13 09:26:06 e4n1-g kubelet[68502]: E1113 09:26:06.725968   68502 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:445: F
Nov 13 09:26:06 e4n1-g kubelet[68502]: E1113 09:26:06.754334   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:26:06 e4n1-g kubelet[68502]: E1113 09:26:06.854619   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:26:06 e4n1-g kubelet[68502]: E1113 09:26:06.874603   68502 controller.go:125] failed to ensure node lease exists, will retry
Nov 13 09:26:06 e4n1-g kubelet[68502]: E1113 09:26:06.926154   68502 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.
Nov 13 09:26:06 e4n1-g kubelet[68502]: E1113 09:26:06.954875   68502 kubelet.go:2252] node "e4n1-g" not found
Nov 13 09:26:07 e4n1-g kubelet[68502]: E1113 09:26:07.055283   68502 kubelet.go:2252] node "e4n1-g" not found

I appreciate your help. We have searched through all the Google suggestions we could find but are unable to resolve this. This started when we did kubeadm reset to rebuild the cluster. This node was part of the cluster before, but after the last reset something is preventing it from joining. Your insight will be very helpful.

@vikramkhatri

Ok - Finally, this is solved. The issue was this.

When we did kubeadm reset, it appeared to work. It said that it deleted the directories:

[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes]

And we assumed that it did what it said. When I checked /var/lib/kubelet, it was still there. I tried deleting it manually, and it failed for the pods directory left over from the previous Kubernetes installation. The reason: it could not delete the mount directory because of the immutable attribute Portworx had previously set on it. I had to run chattr -i on the mount directory, and then I was able to delete the pods directory from /var/lib/kubelet. The kubeadm join worked perfectly after that.
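
Roughly, the manual cleanup described above looks like this (a sketch; the wildcard paths are illustrative and follow the layout under /var/lib/kubelet/pods):

lsattr -d /var/lib/kubelet/pods/*/volumes/*/*/mount   # the 'i' flag marks an immutable directory
chattr -i /var/lib/kubelet/pods/*/volumes/*/*/mount   # clear the immutable bit
rm -rf /var/lib/kubelet/pods                          # now the leftover pods directory can be removed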

It consumed our whole day. When kubeadm reset says that it deleted some directories, make sure that they are in fact gone. I hope this helps someone else who runs into similar issues. kubeadm should throw an error if a directory deletion was not successful.

@neolit123
Member

It consumed our whole day. When kubeadm reset says that it deleted some directories, make sure that they are in fact gone.

what is your kubeadm version?

ultimately kubeadm reset is a best effort command and if something is protected we don't want to touch it. what we can fix here is giving a better indication if a delete failed.

@vikramkhatri

we can fix here is giving a better indication if a delete failed

Thank you, as it will be very helpful to know that something did not go right.

# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.4", GitCommit:"67d2fcf276fcd9cf743ad4be9a9ef5828adc082f", GitTreeState:"clean", BuildDate:"2019-09-18T14:48:18Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}

@neolit123
Member

i should have asked for the output from reset too.
did it not show a warning at least?

we did some refactoring for reset in 1.16 and it's not clear to me whether this is already fixed or not.

@vikramkhatri

I was able to scroll up and get the output of kubeadm reset from when this directory was there. It did not show a warning about not being able to delete a file or a directory.

# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W1113 09:56:37.221105  130278 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes]

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually.
For example:
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.

After the above:

# ls -l /var/lib/kubelet/
total 32
-rw-r--r-- 1 root root 1744 Nov 13 10:01 config.yaml
-rw------- 1 root root   62 Oct  3 15:49 cpu_manager_state
drwxr-xr-x 2 root root 4096 Nov 13 10:01 device-plugins
drwxr-xr-x 2 root root 4096 Oct  3 15:49 pki
drwx------ 2 root root 4096 Oct  3 15:49 plugin-containers
drwxr-x--- 3 root root 4096 Nov 12 13:35 plugins
drwxr-x--- 2 root root 4096 Oct  4 12:51 plugins_registry
drwxr-x--- 4 root root 4096 Nov 12 13:31 pods
[root@e4n1-g ~]# rm -fr /var/lib/kubelet
rm: cannot remove ‘/var/lib/kubelet/pods/f5d42183-0eb2-433c-9feb-0e531dbd28ef/volumes/kubernetes.io~csi/pvc-6a8ccb7a-7262-41cf-9c28-5e2e59755445/mount’: Operation not permitted
rm: cannot remove ‘/var/lib/kubelet/pods/a104a67d-cca8-442d-a19d-ea29c1df4185/volumes/kubernetes.io~csi/pvc-e7897257-cca6-45da-a46b-8eab54cb13dc/mount’: Operation not permitted
rm: cannot remove ‘/var/lib/kubelet/pods/a104a67d-cca8-442d-a19d-ea29c1df4185/volumes/kubernetes.io~csi/pvc-11a62da0-cbf9-4d5c-8978-acbedd2d96a4/mount’: Operation not permitted
[root@e4n1-g ~]# ls -l /var/lib/kubelet/
total 4
drwxr-x--- 4 root root 4096 Nov 12 13:31 pods

@neolit123
Member

i will log an issue with your details and investigate if this is fixed in 1.16
thanks.

@Tcarters

Tcarters commented Mar 2, 2021

Hi, has this issue been resolved?
I have the same error here when launching with the command: "kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --ignore-preflight-errors=Mem --node-name=Master"

[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher

@neolit123
Member

for questions please use #kubeadm and the support channels:
https://github.com/kubernetes/kubernetes/blob/master/SUPPORT.md

@Tcarters

Tcarters commented Mar 2, 2021

for questions please use #kubeadm and the support channels:
https://github.com/kubernetes/kubernetes/blob/master/SUPPORT.md

I am not asking a question, I have an issue with kubeadm init ... can you check the error and suggest how to resolve it?

@neolit123
Member

neolit123 commented Mar 2, 2021

the links i gave are the place to ask for support too.

you seem to be ignoring the CPU and Memory checks, so probably the machine doesn't have enough memory and the apiserver cannot start properly.
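
For reference, kubeadm's default preflight checks expect at least 2 CPUs and roughly 1700 MB of RAM; a quick way to see what the machine actually has before deciding to ignore those checks:

nproc     # number of CPUs
free -m   # memory in MB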

@stevanbangle

sudo kubeadm reset
worked for me

@GitZhangChi

sudo kubeadm reset worked for me

gooooooooood

@JiayangZhou

This fixed it for me: #1438 (comment)
Drop 20-etcd-service-manager.conf under /etc/systemd/system/kubelet.service.d.
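
Roughly, assuming the drop-in path from the linked comment, that fix looks like:

sudo rm /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
sudo systemctl daemon-reload
sudo systemctl restart kubelet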
