Kubeadm join fail #344

Closed
PLoic opened this issue Jul 12, 2017 · 12 comments

@PLoic

PLoic commented Jul 12, 2017

What happened?

Hello, I have an issue with kubeadm join. It appears to succeed, because I see:

[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "X.X.X.X:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://X.X.X.X:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://X.X.X.X:6443"
[discovery] Successfully established connection with API Server "X.X.X.X:6443"
[bootstrap] Detected server version: v1.7.0
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"

Node join complete:

  • Certificate signing request sent to master and response
    received.
  • Kubelet informed of new secure connection details.

But on the master node, when I run kubectl get nodes, I only see the master node.

Versions

kubeadm version (use kubeadm version): v1.7.0

Environment:

  • Kubernetes version (use kubectl version): v1.7.0
  • OS (e.g. from /etc/os-release): CentOS 7
  • Kernel (e.g. uname -a): 3.10.0-327.36.3.el7.x86_64

What you expected to happen?

The worker node should be connected to the master.

Thanks for your help

@luxas
Member

luxas commented Jul 12, 2017

Can you output the logs of the kubelet?
journalctl -xeu kubelet

@PLoic
Author

PLoic commented Jul 13, 2017

Problem solved! :)
With journalctl -xeu kubelet I see:
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

I simply changed my docker service (systemd) configuration file to add:

ExecStart=
ExecStart=/usr/bin/dockerd --exec-opt native.cgroupdriver=systemd

After restarting docker and retrying kubeadm join, the worker appears when I run kubectl get nodes.

Thanks for your help !
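
The mismatch reported above can be spotted in the kubelet log before retrying the join. A minimal sketch, assuming the error line has exactly the form quoted in this comment; the helper name parse_cgroup_mismatch is hypothetical and is not part of kubeadm or the kubelet:

```shell
# Hypothetical helper: read kubelet log lines on stdin and print
# "<kubelet driver> <docker driver>" for each cgroup-driver mismatch error.
parse_cgroup_mismatch() {
  sed -n 's/.*kubelet cgroup driver: "\([^"]*\)" is different from docker cgroup driver: "\([^"]*\)".*/\1 \2/p'
}

# Usage on the worker node (assumes journalctl and a kubelet unit are present):
#   journalctl -u kubelet --no-pager | parse_cgroup_mismatch
```

If the helper prints two different values, the two drivers need to be aligned, for example with the dockerd option shown above.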

@piersbarrios I have created an issue about a similar problem: kubernetes/kubernetes#48798

But removing $KUBELET_NETWORK_ARGS from /etc/systemd/system/kubelet.service.d/10-kubeadm.conf seems to make it work.

@PLoic PLoic closed this as completed Jul 13, 2017
@piersbarrios

piersbarrios commented Jul 13, 2017

@PLoic: I believe I have a different issue here, because my logs now show:

Jul 13 15:23:52 k8s-Node171 kubelet[5443]: I0713 15:23:52.666752    5443 kubelet_node_status.go:247] Setting node annotation to enable volume controller attach/detach
Jul 13 15:23:52 k8s-Node171 kubelet[5443]: I0713 15:23:52.669988    5443 kubelet_node_status.go:82] Attempting to register node k8s-node171
Jul 13 15:23:52 k8s-Node171 kubelet[5443]: E0713 15:23:52.671312    5443 kubelet_node_status.go:106] Unable to register node "k8s-node171" with API server: nodes "k8s-node171" is forbidden: node k8s-Node171 cannot modify node k8s-node171
Jul 13 15:23:59 k8s-Node171 kubelet[5443]: I0713 15:23:59.671580    5443 kubelet_node_status.go:247] Setting node annotation to enable volume controller attach/detach

etc...

I am using weave btw
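
In the log above, the rejected request names the node as k8s-Node171 while the registered name is k8s-node171, so the two differ only in letter case. A minimal sketch for checking a hostname for uppercase characters before joining; the helper name is_lowercase_name is hypothetical and is not part of kubeadm:

```shell
# Hypothetical helper: exit 0 only if the given name contains no uppercase letters.
is_lowercase_name() {
  [ "$1" = "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" ]
}

# Usage:
#   is_lowercase_name "$(hostname)" || echo "hostname contains uppercase characters"
```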

@ghost

ghost commented Jul 14, 2017

Same error as piersbarrios:

Jul 14 13:23:27 SZV1000204813 kubelet[120170]: I0714 13:23:27.314297 120170 kubelet_node_status.go:247] Setting node annotation to enable volume controller attach/detach
Jul 14 13:23:27 SZV1000204813 kubelet[120170]: I0714 13:23:27.316602 120170 kubelet_node_status.go:82] Attempting to register node szv1000204813
Jul 14 13:23:27 SZV1000204813 kubelet[120170]: E0714 13:23:27.318670 120170 kubelet_node_status.go:106] Unable to register node "szv1000204813" with API server: nodes "szv1000204813" is forb
Jul 14 13:23:27 SZV1000204813 kubelet[120170]: E0714 13:23:27.571840 120170 eviction_manager.go:238] eviction manager: unexpected err: failed GetNode: node 'szv1000204813' not found

I am using Ubuntu 16.04 and Flannel.

@PLoic PLoic reopened this Jul 14, 2017
@luxas luxas closed this as completed Jul 14, 2017
@luxas
Member

luxas commented Jul 14, 2017

This should be fixed now. Please use v1.7.1 and reopen if you can still reproduce the issue...

@praparn

praparn commented Jul 18, 2017

Dear all,

With v1.7.1 we are facing a problem joining a node to the cluster by IP address. The command "kubeadm --token 8c2350.f55343444a6ffc46 join X.X.X.X:6443" fails with the error below:

kubeadm join kubernetes-ms:6443 --token 8c2350.f55343444a6ffc46
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.06.0-ce. Max validated version: 1.12
[preflight] WARNING: hostname "" could not be reached
[preflight] WARNING: hostname "" lookup : no such host
[preflight] Some fatal errors occurred:
hostname "" a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
[preflight] If you know what you are doing, you can skip pre-flight checks with --skip-preflight-checks

Lab description (all nodes have docker/kubelet/kubectl/kubeadm installed):
Machine name   Role       IP address
kubeserve-ms   Master     192.168.99.200
kubeserve-1    NodePort   192.168.99.201
kubeserve-2    NodePort   192.168.99.202

  1. (kubeserve-ms) Initialize the cluster (as root):
    kubeadm init --pod-network-cidr=10.244.0.0/16 --token 8c2350.f55343444a6ffc46

  2. (kubeserve-ms) Set up cluster access (as a regular user):
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

  3. (kubeserve-ms) Initialize the cluster:
    sudo su -
    kubeadm init --pod-network-cidr=10.244.0.0/16 --token 8c2350.f55343444a6ffc46

  4. (kubeserve-ms) Apply the Weave network add-on:
    kubectl apply -n kube-system -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

  5. (kubeserve-1, kubeserve-2) Join the nodes to the cluster:
    kubeadm --token 8c2350.f55343444a6ffc46 join 192.168.99.200:6443
    Result:
    kubeadm join kubernetes-ms:6443 --token 8c2350.f55343444a6ffc46
    [kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
    [preflight] Running pre-flight checks
    [preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.06.0-ce. Max validated version: 1.12
    [preflight] WARNING: hostname "" could not be reached
    [preflight] WARNING: hostname "" lookup : no such host
    [preflight] Some fatal errors occurred: hostname "" a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
    [preflight] If you know what you are doing, you can skip pre-flight checks with --skip-preflight-checks

Currently my only workaround is to switch back to v1.7.0, which works fine.
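
The preflight failure is about an empty hostname ("") not being a valid DNS-1123 subdomain. A minimal sketch of that rule as a shell check, for testing a name by hand. The helper name is_dns1123_subdomain is hypothetical, the pattern is an assumption reconstructed from the rule the message describes, and length limits are not checked:

```shell
# Hypothetical helper: exit 0 only if $1 consists of lowercase alphanumerics,
# '-' or '.', with every dot-separated label starting and ending in an
# alphanumeric character. Length limits are deliberately not checked.
is_dns1123_subdomain() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
}

# Usage:
#   is_dns1123_subdomain "$(hostname)" || echo "hostname would fail the preflight check"
```

An empty string fails this check, which matches the hostname "" reported in the output above.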

@luxas
Member

luxas commented Jul 18, 2017

@praparn See: #347
It will be fixed in v1.7.2
Meanwhile, you can just set --skip-preflight-checks

@praparn

praparn commented Jul 20, 2017

@luxas Noted, with thanks.

@piersbarrios

@luxas: It wasn't fixed in v1.7.2
(I don't know whether that is intentional or not)

@gicappa

gicappa commented Jul 28, 2017

Not fixed for me either.

kubeadm version: &version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.2", GitCommit:"922a86cfcd65915a9b2f69f3f193b8907d741d9c", GitTreeState:"clean", BuildDate:"2017-07-21T08:08:00Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

uname -a
Linux k8-node-01 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

@codesnk

codesnk commented Jul 30, 2017

Ditto... Running 1.7.2 and the join fails. With --skip-preflight-checks the join reports success, but the master never detects the node and none of the required images get downloaded on the worker nodes, so I imagine it must also be a bug.

PS: I am joining an RPi3 to an x86 master. I went ahead and created the cluster with Kubernetes v1.7.0, which seems to work OK.

@jiangpengcheng

I hit this error too, but in my case it was not related to the cgroup driver.

When the kubelet service is configured with

--feature-gates=RotateKubeletClientCertificate=true,RotateKubeletClientCertificate=true

the kubeadm join command appears to succeed, but the master node cannot find the joining node,

and journalctl -efu kubelet gives the following messages:

Aug 25 04:14:12 storage3 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Aug 25 04:14:12 storage3 systemd[1]: Starting kubelet: The Kubernetes Node Agent...
Aug 25 04:14:12 storage3 kubelet[14516]: I0825 04:14:12.663893   14516 feature_gate.go:144] feature gates: map[RotateKubeletClientCertificate:true]
Aug 25 04:14:12 storage3 kubelet[14516]: I0825 04:14:12.682075   14516 certificate_manager.go:355] Requesting new certificate.

It looks like it hangs while requesting the new certificate.

It's not urgent, as those two features are in alpha. After disabling them everything is OK. I'm posting this just for the record.
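
The flag value quoted above names the same gate twice, and the kubelet log shows a map with a single entry. A small sketch for spotting repeated keys in a feature-gates string; the helper name duplicate_feature_gates is hypothetical and is not a kubelet or kubeadm command:

```shell
# Hypothetical helper: print every feature-gate key that appears more than once
# in a comma-separated "Key=value,Key=value" string.
duplicate_feature_gates() {
  printf '%s\n' "$1" | tr ',' '\n' | cut -d= -f1 | sort | uniq -d
}

# Usage:
#   duplicate_feature_gates "RotateKubeletClientCertificate=true,RotateKubeletClientCertificate=true"
```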
