During the initial installation of a cluster using RKE2 version 1.27.1+rke2r1 with kube-vip and Cilium, and with kube-proxy disabled, the first node is stuck in the NotReady state, preventing the cluster installation from completing.
The workaround I found:
- Connect to the first server over SSH
- Manually add the $rke2_api_ip address to the network interface: ip a a 192.0.2.20 dev ens224
- Restart the rke2 service: systemctl restart rke2-server.service
I am not sure yet why this happens; it is possibly due to kube-proxy being disabled.
Until I find the root cause, so I can identify the appropriate conditions for a proper patch, I have applied a temporary workaround by integrating it into the pre_tasks of my playbook:
---
- hosts: kubernetes_masters
  gather_facts: true
  remote_user: ubuntu
  become: true
  pre_tasks:
    # https://docs.cilium.io/en/v1.13/operations/system_requirements/#systemd-based-distributions
    - name: Do not manage foreign routes
      ansible.builtin.blockinfile:
        path: /etc/systemd/networkd.conf
        insertafter: "^\\[Network\\]"
        block: |
          ManageForeignRoutes=no
          ManageForeignRoutingPolicyRules=no
      register: networkd_patch

    - name: Force systemd to reread configs
      ansible.builtin.systemd:
        daemon_reload: true
      when: networkd_patch.changed

    # https://github.com/lablabs/ansible-role-rke2/issues/157
    - name: Check if {{ rke2_api_ip }} is pingable
      ansible.builtin.shell: "ping -c 1 {{ rke2_api_ip }}"
      register: ping_result
      ignore_errors: yes

    - name: Add the {{ rke2_api_ip }} address to the first node if no ICMP reply
      ansible.builtin.shell: "ip addr add {{ rke2_api_ip }}/32 dev {{ rke2_interface }}"
      when:
        - ping_result.failed
        - inventory_hostname == groups[rke2_servers_group_name].0

  roles:
    - ansible-role-rke2
It's a chicken-or-the-egg scenario: Cilium without kube-proxy needs to talk to the kube-apiserver, which is load-balanced by kube-vip, which in turn needs a working CNI to reach the kube-apiserver's internal Kubernetes service. As a workaround I set up Cilium with:
k8sServiceHost: <first RKE2 server's external IP>
and once the cluster is up and running you can change it back to:
k8sServiceHost: <kube-vip IP>
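For illustration, a minimal sketch of that two-step workaround expressed as Cilium Helm values (Cilium 1.13-style options; the IP addresses below are placeholders, not values taken from this issue):

# Phase 1 (bootstrap): point Cilium directly at the first server's kube-apiserver,
# since the kube-vip VIP is not reachable until a CNI is running.
kubeProxyReplacement: strict
k8sServiceHost: 192.0.2.21   # placeholder: first RKE2 server's external IP
k8sServicePort: 6443

# Phase 2 (once the cluster is up): switch back to the kube-vip address and redeploy.
# k8sServiceHost: 192.0.2.20  # placeholder: kube-vip VIP (rke2_api_ip)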
Summary
During the initial installation of a cluster using RKE2 version 1.27.1+rke2r1 with kube-vip and Cilium, and with kube-proxy disabled, the first node is stuck in the NotReady state, preventing the cluster installation from completing.
The workaround I found:
ip a a 192.0.2.20 dev ens224
systemctl restart rke2-server.service
I am not sure yet why this happens; it is possibly due to kube-proxy being disabled.
Issue Type
Bug Report
Ansible Version
Ansible 2.14.8
Steps to Reproduce
Deploy RKE2 with the following variables:
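The variable values themselves are not reproduced in this report. As an illustration only, a minimal set for this scenario could look roughly like the following; the variable names are assumed from the role's documentation and the IPs are placeholders:

rke2_version: v1.27.1+rke2r1
rke2_ha_mode: true
rke2_ha_mode_kubevip: true      # assumed: enables the kube-vip API load balancer
rke2_api_ip: 192.0.2.20         # kube-vip VIP
rke2_interface: ens224
rke2_cni: cilium
rke2_disable_kube_proxy: true   # assumed name for setting disable-kube-proxy in the RKE2 config
rke2_custom_manifests:
  - rke2-cilium-proxy.yaml      # the HelmChartConfig shown below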
Here is the content of rke2-cilium-proxy.yaml:
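The content of that file was not captured here either; a typical HelmChartConfig for running the bundled rke2-cilium chart with kube-proxy replacement, sketched under that assumption with placeholder values, would be along these lines:

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    kubeProxyReplacement: strict   # Cilium 1.13-style value
    k8sServiceHost: 192.0.2.20     # placeholder: kube-vip VIP (rke2_api_ip)
    k8sServicePort: 6443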
Expected Results
The first server should eventually reach the Ready state so that the installation of the cluster succeeds.
Actual Results