
Anolis OS 8.9: installation fails with kube-vip mode #2473

Open · emf1002 opened this issue Dec 6, 2024 · 5 comments
Labels: bug (Something isn't working)

Comments

emf1002 commented Dec 6, 2024

Which version of KubeKey has the issue?

kk version: &version.Info{Major:"3", Minor:"1", GitVersion:"v3.1.7", GitCommit:"da475c670813fc8a4dd3b1312aaa36e96ff01a1f", GitTreeState:"clean", BuildDate:"2024-10-30T09:41:20Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}

What is your OS environment?

Anolis OS 8.9

KubeKey config file

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: qianmo
spec:
  hosts: 
  ## You should complete the SSH information of the hosts
  - {name: node1, address: 192.168.154.189, password: "rootadmin"}
  - {name: node2, address: 192.168.154.188, password: "rootadmin"}
  - {name: node3, address: 192.168.154.187, password: "rootadmin"}
  roleGroups:
    etcd:
    - node1
    master:
    - node1
    worker:
    - node[1:3]
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    internalLoadbalancer: kube-vip
    externalDNS: false
    address: "192.168.154.186"
    port: 6443
  system:
    ntpServers:
      - time1.cloud.tencent.com
      - ntp.aliyun.com
      - node1 # Set the node name in `hosts` as the NTP server if there is no access to public NTP servers.
    timezone: "Asia/Shanghai"
  kubernetes:
    version: v1.31.2
    containerManager: containerd
    clusterName: cluster.local
    apiserverArgs:
    - service-node-port-range=80-65535
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
  registry:
    privateRegistry: ""
    registryMirrors: ['http://192.168.154.128:8081']
    insecureRegistries:
    - http://192.168.154.128:8081

A clear and concise description of what happened.

Installation fails with internalLoadbalancer: kube-vip; with haproxy it succeeds.

Relevant log output

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.002366938s
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is not healthy after 4m0.001034363s

Unfortunately, an error has occurred:
        context deadline exceeded

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
        - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: could not initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
15:13:46 CST stdout: [node1]
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W1206 15:13:46.410906   20266 reset.go:123] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get "https://lb.kubesphere.local:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": dial tcp 192.168.154.186:6443: connect: no route to host
[preflight] Running pre-flight checks
W1206 15:13:46.410973   20266 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Deleted contents of the etcd data directory: /var/lib/etcd
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
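
The "no route to host" error against https://lb.kubesphere.local:6443 (192.168.154.186) suggests the VIP was never announced on node1, i.e. the kube-vip static pod did not come up or could not bind the address. A minimal diagnostic sketch, reusing the containerd socket path from the kubeadm output above (<CONTAINER_ID> is a placeholder):

    # Check whether the VIP was ever bound to an interface on node1
    ip addr show | grep 192.168.154.186

    # Find the kube-vip container and inspect its logs
    crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube-vip
    crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs <CONTAINER_ID>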

Additional information

No response

emf1002 added the bug label on Dec 6, 2024
@junlintianxiazhifulinzhongguo

It works in my testing here; no problems.

Base environment

[root@master2 ~]# uname -a
Linux master2 5.10.134-17.2.an8.x86_64 #1 SMP Fri Aug 9 15:52:23 CST 2024 x86_64 x86_64 x86_64 GNU/Linux

[root@master2 ~]# cat /etc/os-release
NAME="Anolis OS"
VERSION="8.9"
ID="anolis"
ID_LIKE="rhel fedora centos"
VERSION_ID="8.9"
PLATFORM_ID="platform:an8"
PRETTY_NAME="Anolis OS 8.9"
ANSI_COLOR="0;31"
HOME_URL="https://openanolis.cn/"

[root@master2 ~]#
[root@master2 ~]# kk version
kk version: &version.Info{Major:"3", Minor:"1", GitVersion:"v3.1.7", GitCommit:"da475c670813fc8a4dd3b1312aaa36e96ff01a1f", GitTreeState:"clean", BuildDate:"2024-10-30T09:41:20Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}

Config file

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: master1, address: 192.168.5.124, internalAddress: 192.168.5.124, user: root, privateKeyPath: "~/.ssh/id_rsa"}
  - {name: master2, address: 192.168.5.125, internalAddress: 192.168.5.125, user: root, privateKeyPath: "~/.ssh/id_rsa"}
  - {name: master3, address: 192.168.5.127, internalAddress: 192.168.5.127, user: root, privateKeyPath: "~/.ssh/id_rsa"}
  roleGroups:
    etcd:
    - master[1:3]
    control-plane:
    - master[1:3]
    worker:
    - master[1:3]
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    internalLoadbalancer: kube-vip
    externalDNS: false
    domain: lb.kubesphere.local
    address: "192.168.5.123"
    port: 6443
  system:
    # The ntp servers of chrony.
    ntpServers:
      - ntp.aliyun.com
      - master1 # Set the node name in `hosts` as the NTP server if there is no access to public NTP servers.
    timezone: "Asia/Shanghai"
  kubernetes:
    version: v1.26.15
    clusterName: cluster.local
    autoRenewCerts: true
    containerManager: containerd
    apiserverArgs:
    - service-node-port-range=10000-65535
    # maxPods is the number of Pods that can run on this Kubelet. [Default: 110]
    maxPods: 110
    # Specify which proxy mode to use. [Default: ipvs]
    proxyMode: ipvs
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: []
    insecureRegistries: []
  addons: []

Result

[root@master2 ~]# kubectl get nodes -o wide
NAME      STATUS   ROLES                  AGE   VERSION    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE        KERNEL-VERSION             CONTAINER-RUNTIME
master1   Ready    control-plane,worker   15h   v1.26.15   192.168.5.124   <none>        Anolis OS 8.9   5.10.134-16.2.an8.x86_64   containerd://1.7.13
master2   Ready    control-plane,worker   15h   v1.26.15   192.168.5.125   <none>        Anolis OS 8.9   5.10.134-17.2.an8.x86_64   containerd://1.7.13
master3   Ready    control-plane,worker   15h   v1.26.15   192.168.5.127   <none>        Anolis OS 8.9   5.10.134-17.2.an8.x86_64   containerd://1.7.13

[root@master2 ~]# kubectl get pods -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS        AGE
kube-system   calico-kube-controllers-57db949bd8-6gbdf   1/1     Running   0               15h
kube-system   calico-node-47kxm                          1/1     Running   0               15h
kube-system   calico-node-6gf9c                          1/1     Running   0               15h
kube-system   calico-node-g2q68                          1/1     Running   1 (3m10s ago)   15h
kube-system   coredns-5b486d6f8b-jf9zv                   1/1     Running   0               15h
kube-system   coredns-5b486d6f8b-z85ct                   1/1     Running   0               15h
kube-system   kube-apiserver-master1                     1/1     Running   0               15h
kube-system   kube-apiserver-master2                     1/1     Running   0               15h
kube-system   kube-apiserver-master3                     1/1     Running   1 (3m10s ago)   15h
kube-system   kube-controller-manager-master1            1/1     Running   0               15h
kube-system   kube-controller-manager-master2            1/1     Running   0               15h
kube-system   kube-controller-manager-master3            1/1     Running   1 (3m10s ago)   15h
kube-system   kube-proxy-cpfcr                           1/1     Running   0               15h
kube-system   kube-proxy-m6x9r                           1/1     Running   1 (3m10s ago)   15h
kube-system   kube-proxy-ng7vc                           1/1     Running   0               15h
kube-system   kube-scheduler-master1                     1/1     Running   0               15h
kube-system   kube-scheduler-master2                     1/1     Running   0               15h
kube-system   kube-scheduler-master3                     1/1     Running   1 (3m10s ago)   15h
kube-system   kube-vip-master1                           1/1     Running   3 (15h ago)     15h
kube-system   kube-vip-master2                           1/1     Running   3 (14h ago)     15h
kube-system   kube-vip-master3                           1/1     Running   1 (3m10s ago)   15h
kube-system   nodelocaldns-4j5kv                         1/1     Running   0               15h
kube-system   nodelocaldns-v25hm                         1/1     Running   0               15h
kube-system   nodelocaldns-xhffm                         1/1     Running   2 (3m10s ago)   15h
[root@master2 ~]#

[root@master1 ~]# ip ad
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp6s18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether bc:24:11:b2:e0:0e brd ff:ff:ff:ff:ff:ff
    inet 192.168.5.124/24 brd 192.168.5.255 scope global noprefixroute enp6s18
       valid_lft forever preferred_lft forever
    inet 192.168.5.123/32 scope global enp6s18
       valid_lft forever preferred_lft forever
    inet6 fe80::be24:11ff:feb2:e00e/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

@junlintianxiazhifulinzhongguo

I looked into the issues on the kube-vip side: it seems to work fine on k8s v1.28.x and below, but breaks starting from v1.29.x.
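
If you need the cluster up in the meantime, one option is to pin a pre-1.29 Kubernetes release in the KubeKey config and retry. A sketch, assuming your kk release ships v1.28.x packages (the version string is only an example):

    # In the cluster config, pin a release before v1.29, e.g.:
    #   kubernetes:
    #     version: v1.28.2
    ./kk create cluster -f config-sample.yaml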

emf1002 (Author) commented Dec 19, 2024

I looked into the issues on the kube-vip side: it seems to work fine on k8s v1.28.x and below, but breaks starting from v1.29.x.

OK, thanks.

@redscholar (Collaborator)

kubernetes/kubeadm#2414
In kubeadm 1.29 the permissions of the admin.conf file were restricted, but kube-vip still uses admin.conf, so kube-vip gets an error when trying to acquire its lease resource.
There is a workaround here: kube-vip/kube-vip#684 (comment)
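
In short: from kubeadm v1.29 the admin.conf generated during init only receives full cluster-admin rights after bootstrap (a separate super-admin.conf gets them immediately on the first control-plane node), while the kube-vip static pod still mounts admin.conf, so it cannot update its lease and the VIP never comes up. The workaround discussed in that kube-vip issue is to point the manifest at super-admin.conf for the initial kubeadm init and switch it back afterwards. A rough sketch, assuming KubeKey wrote the static pod manifest to /etc/kubernetes/manifests/kube-vip.yaml and that it references the kubeconfig by the path /etc/kubernetes/admin.conf:

    # On the first control-plane node, before/during kubeadm init:
    sed -i 's#/etc/kubernetes/admin.conf#/etc/kubernetes/super-admin.conf#' /etc/kubernetes/manifests/kube-vip.yaml

    # Once the control plane is healthy, revert so kube-vip uses admin.conf again:
    sed -i 's#/etc/kubernetes/super-admin.conf#/etc/kubernetes/admin.conf#' /etc/kubernetes/manifests/kube-vip.yaml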

@graphenn

Same issue: #2375
