[Arktos-Mizar-Integration] Nginx Pods scheduled to the new worker node unable to enter in Running state in two-node Arktos scale-up cluster #562
Comments
This issue does not occur with Mizar in upstream code (1.21.0), so it is an Arktos-specific issue.
A similar issue happened on a k8s 1.21 cluster set up with kubeadm. Setting up k8s 1.21 with kubeadm: Success criteria:
To check case 4 - adding a worker node to an existing Arktos cluster, I did the following steps:
Logs for both daemonset pods are accessible. The worker daemon log is normal. The master daemon log has an error:
Note: if there is any issue, use https://github.com/q131172019/arktos/blob/CarlXie_singleNodeArktosCluster/docs/setup-guide/multi-node-dev-scale-up-cluster.md as a reference.
One more test: the steps are the same as above, but one pod on the worker could not start. Error log in the daemonset:
This should be working now. Closing as per discussion in the Dec 29th Network SIG meeting.
What happened:
During the Arktos and Mizar integration, the Arktos team wants to test this case: worker node joining. A new worker node should be able to join the cluster, and basic pod connectivity should be provided.
In a two-node Arktos scale-up cluster with Mizar, the new worker node is able to join the cluster and enters the Ready state. However, when the nginx application is deployed, pods scheduled to the new worker node are unable to enter the Running state and remain stuck in the ContainerCreating state. The kubelet log on the new worker node shows a CNI problem.
What you expected to happen:
The nginx application pods scheduled to the new worker node should enter the Running state.
How to reproduce it (as minimally and precisely as possible):
Create a single-node Arktos cluster with Mizar using the procedure at https://github.com/Click2Cloud-Centaurus/arktos/blob/default-cni-mizar/docs/setup-guide/arktos-with-mizar-cni.md and apply PR 1114 (Support for Mizar CNI in arktos-up, arktos#1114); then verify health status with the procedure at https://github.com/CentaurusInfra/mizar/wiki/Mizar-Cluster-Health-Criteria.
Create a worker node running Ubuntu 18.04 on AWS. SSH/SCP should work on the master and worker nodes, and the corresponding ports should be opened in the security group on both nodes. Upgrade the kernel to 5.6-rc2, clone the Arktos repository, and install the required dependencies. Then follow step 3 and step 4 at https://github.com/q131172019/arktos/blob/CarlXie_singleNodeArktosCluster/docs/setup-guide/multi-node-dev-scale-up-cluster.md to join the cluster.
Check the status of the two nodes.
grep nginx-5d79788459-48l6q /tmp/kubelet.worker.log |tail -3
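To find which pods are stuck (for example in ContainerCreating) before grepping the kubelet log for them, a small filter along these lines may help. This is only a sketch: the function name not_running is hypothetical, and it assumes the pod listing has a header row containing a STATUS column with single-token columns before it.

```shell
# Hypothetical helper (not part of Arktos or Mizar): print the name and
# status of every pod whose STATUS column is not "Running".
# Reads pod-listing output (with a header row) on stdin.
not_running() {
  awk '
    # Header row: find which column holds STATUS, then skip the row.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "STATUS") col = i; next }
    # Data rows: report pods that are not Running.
    col && $col != "Running" { print $1, $col }
  '
}
```

Usage, assuming kubectl (or whichever kubectl wrapper the cluster provides) is on the path: kubectl get pods -o wide | not_running. The STATUS column is located by header name rather than by position because the column layout can differ between kubectl builds.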
Anything else we need to know?:
Environment:
Version (kubectl version): 0.9.0
OS (cat /etc/os-release): Ubuntu 18.04
Kernel (uname -a): 5.6.0-rc2