
fix --wait's failure to work on coredns pods #19748

Merged
merged 2 commits into kubernetes:master on Jan 8, 2025

Conversation

ComradeProgrammer
Member

@ComradeProgrammer ComradeProgrammer commented Oct 3, 2024

Fixes #19288
Before: minikube start --wait=all may finish while coredns is not yet ready
After: minikube start --wait=all waits until coredns is completely ready

The situation mentioned in #19288 was actually introduced by the HA-cluster PR. By default, the coredns deployment consists of 2 coredns pods. However, in pkg/minikube/node/start.go:158 it is manually scaled down to 1. This happens before we start to wait for the essential nodes (minikube waits for nodes at line 236).
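
For illustration, here is a minimal client-go sketch of this kind of scale-down; the function, flow, and kubeconfig handling are my own illustration, not the actual code in start.go:

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// scaleCoreDNS scales the kube-system/coredns deployment to the given number
// of replicas, mirroring the idea of the scale-down described above.
// Illustrative only: not the code minikube actually uses.
func scaleCoreDNS(ctx context.Context, client kubernetes.Interface, replicas int32) error {
	deployments := client.AppsV1().Deployments("kube-system")
	scale, err := deployments.GetScale(ctx, "coredns", metav1.GetOptions{})
	if err != nil {
		return fmt.Errorf("getting scale: %w", err)
	}
	scale.Spec.Replicas = replicas
	_, err = deployments.UpdateScale(ctx, "coredns", scale, metav1.UpdateOptions{})
	return err
}

func main() {
	// Assumes a reachable cluster via the default kubeconfig (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	if err := scaleCoreDNS(context.Background(), kubernetes.NewForConfigOrDie(config), 1); err != nil {
		panic(err)
	}
}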

When minikube waits for system pods, there are two checks that verify the system pods' status:

  • WaitExtra (pkg/minikube/bootstrapper/bsutil/kverify/pod_ready.go) lists all pods with the given labels and checks whether they are Ready.
  • ExpectAppsRunning (pkg/minikube/bootstrapper/bsutil/kverify/system_pods.go:91, in function ExpectAppsRunning) lists all the pods in the kube-system namespace and checks whether there is at least one Running pod for each essential label. The bug is that it only checks the Running state and does not check the Ready state (see the sketch after this list).
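
The difference between the two checks comes down to pod phase versus the PodReady condition. A minimal sketch of the distinction, assuming client-go types (the helper names are illustrative, not minikube's):

package kverifysketch

import (
	corev1 "k8s.io/api/core/v1"
)

// isPodRunning is roughly what ExpectAppsRunning tested before this fix:
// it only looks at the pod phase.
func isPodRunning(pod *corev1.Pod) bool {
	return pod.Status.Phase == corev1.PodRunning
}

// isPodReady is the stricter test: a pod can be Running while its containers
// (e.g. coredns) have not yet passed their readiness probes; the PodReady
// condition only turns True once they have.
func isPodReady(pod *corev1.Pod) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == corev1.PodReady {
			return c.Status == corev1.ConditionTrue
		}
	}
	return false
}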

After the HA-cluster PR was introduced, when minikube runs the WaitExtra function (the 1st check), one of the coredns pods can have status phase Succeeded. WaitExtra doesn't recognize this state, so it prints an error and breaks out of the checking loop. This logic is at pkg/minikube/bootstrapper/bsutil/kverify/pod_ready.go, lines 99 and 69.

The error I see is:

E1003 22:37:04.296076   17140 pod_ready.go:66] WaitExtra: waitPodCondition: pod "coredns-7db6d8ff4d-nljst" in "kube-system" namespace has status phase "Succeeded" (skipping!): {Phase:Succeeded Conditions:[{Type:PodReadyToStartContainers Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2024-10-03 22:37:04 +0200 CEST Reason: Message:} {Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2024-10-03 22:36:53 +0200 CEST Reason:PodCompleted Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2024-10-03 22:36:53 +0200 CEST Reason:PodCompleted Message:} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2024-10-03 22:36:53 +0200 CEST Reason:PodCompleted Message:} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2024-10-03 22:36:53 +0200 CEST Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:192.168.49.2 HostIPs:[{IP:192.168.49.2}] PodIP: PodIPs:[] StartTime:2024-10-03 22:36:53 +0200 CEST InitContainerStatuses:[] ContainerStatuses:[{Name:coredns State:{Waiting:nil Running:nil Terminated:&ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2024-10-03 22:36:54 +0200 CEST,FinishedAt:2024-10-03 22:37:04 +0200 CEST,ContainerID:docker://281a8c3106510cdc16a6d3f91ad2e9f5d7aa1609fac4f0e8f7494af67cb8b5d6,}} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:registry.k8s.io/coredns/coredns:v1.11.1 ImageID:docker-pullable://registry.k8s.io/coredns/coredns@sha256:1eeb4c7316bacb1d4c8ead65571cd92dd21e27359f0d4917f1a5822a73b75db1 ContainerID:docker://281a8c3106510cdc16a6d3f91ad2e9f5d7aa1609fac4f0e8f7494af67cb8b5d6 Started:0x14001e58e00 AllocatedResources:map[] Resources:nil VolumeMounts:[]}] QOSClass:Burstable EphemeralContainerStatuses:[] Resize: ResourceClaimStatuses:[]}

Then, when minikube arrives at ExpectAppsRunning (the 2nd check), it doesn't check the Ready state, so it believes that all pods are OK. This causes #19288.

So the fix is to make ExpectAppsRunning (the 2nd check) check the Ready state as well.

(The reason why I didn't make the 1st check (WaitExtra) recognize the Succeeded state is that, if for some reason there is a job (e.g. an init job for some containers) in the kube-system namespace and we changed WaitExtra's logic to reject the Succeeded state, there would be problems.)
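
As a rough illustration of the direction of this fix, here is a sketch of a check that requires, for each essential label, at least one pod that is both Running and Ready (assuming client-go; the selector list and function name are illustrative, not the actual kverify code):

package kverifysketch

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// expectAppsReady checks that, for every given label selector, kube-system has
// at least one pod that is both Running and Ready. The selector list is an
// input here; minikube derives its own list (kverify.CorePodsLabels).
func expectAppsReady(ctx context.Context, client kubernetes.Interface, selectors []string) error {
	for _, sel := range selectors {
		pods, err := client.CoreV1().Pods("kube-system").List(ctx, metav1.ListOptions{LabelSelector: sel})
		if err != nil {
			return fmt.Errorf("listing pods for %q: %w", sel, err)
		}
		found := false
		for _, pod := range pods.Items {
			if pod.Status.Phase != corev1.PodRunning {
				continue // only Running pods can also be Ready
			}
			for _, c := range pod.Status.Conditions {
				if c.Type == corev1.PodReady && c.Status == corev1.ConditionTrue {
					found = true
				}
			}
		}
		if !found {
			return fmt.Errorf("no Running and Ready pod found for %q", sel)
		}
	}
	return nil
}

With standard kubeadm labels, selectors such as "k8s-app=kube-dns" (coredns) or "component=etcd" would be plausible inputs.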

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 3, 2024
@ComradeProgrammer
Member Author

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Oct 3, 2024

@prezha
Contributor

prezha commented Oct 6, 2024

hey @ComradeProgrammer thanks for looking into this

here's a bit more context that might help you:

the wait for system-critical pods' Ready condition was intentionally implemented alongside the Running checks: the former is called only if wait is explicitly requested (and adds some startup delay), whereas the latter is almost always called, for the quickest startup time

so, since ExpectAppsRunning is called by WaitForAppsRunning, which in turn is called by WaitForNode (which is always called, except for the first HA control plane), adding the Ready check to ExpectAppsRunning means we'd effectively always wait, which is not the intention

on the other hand, WaitExtra is called only if wait is required, either by WaitForNode (before WaitForAppsRunning) or by restartPrimaryControlPlane

also, for HA + wait, iirc the idea was that we'd be ok with waiting for at least one coredns pod to be Ready, as the kube-dns service would take care of routing requests to the pod(s) that can process them

the example from #19288 shows only one coredns pod, so it's not an HA cluster, but it makes a very good point:

sometimes the list of pods is pulled before some of the pods have even been created, resulting in them not being in the waiting check

i think the fix should be made in WaitExtra, where we could, e.g., invert the logic: instead of listing all pods once and then looping through that list waiting for each pod that's also on the system-critical list to become Ready (as we do now), wait until all system-critical pods (a fixed list: kverify.CorePodsLabels) have become Ready, re-fetching each pod's status as needed

as for the Succeeded status phase, that means "All containers in the Pod have terminated in success, and will not be restarted", so it is handled - by skipping it (ie, it will never become Ready, so there is no point waiting for it)
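
A rough sketch of that inverted approach, assuming client-go; the selector list, namespace, polling interval, and function name are placeholders, not the actual WaitExtra code:

package kverifysketch

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// waitAllCriticalReady re-fetches the pods behind each selector until every
// selector has at least one Ready pod, or the timeout expires. Pods in the
// Succeeded phase are skipped, since they will never become Ready.
func waitAllCriticalReady(ctx context.Context, client kubernetes.Interface, selectors []string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		allReady := true
		for _, sel := range selectors {
			pods, err := client.CoreV1().Pods("kube-system").List(ctx, metav1.ListOptions{LabelSelector: sel})
			if err != nil {
				return fmt.Errorf("listing pods for %q: %w", sel, err)
			}
			selectorReady := false
			for _, pod := range pods.Items {
				if pod.Status.Phase == corev1.PodSucceeded {
					continue // terminated successfully, will never be Ready
				}
				for _, c := range pod.Status.Conditions {
					if c.Type == corev1.PodReady && c.Status == corev1.ConditionTrue {
						selectorReady = true
					}
				}
			}
			if !selectorReady {
				allReady = false
			}
		}
		if allReady {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out after %s waiting for system-critical pods to become Ready", timeout)
		}
		time.Sleep(2 * time.Second)
	}
}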

@spowelljr spowelljr left a comment
Member

I just tried this but coredns still wasn't ready.

$ minikube start --wait=all
😄  minikube v1.34.0 on Debian rodete (kvm/amd64)
✨  Automatically selected the docker driver
📌  Using Docker driver with root privileges
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🚜  Pulling base image v0.0.45-1727731891-master ...
🔥  Creating docker container (CPUs=2, Memory=26100MB) ...
🐳  Preparing Kubernetes v1.31.1 on Docker 27.3.1 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

$ kubectl get pods -A
NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE
kube-system   coredns-7c65d6cfc9-qjrwh           0/1     Running   0          9s
kube-system   etcd-minikube                      1/1     Running   0          14s
kube-system   kube-apiserver-minikube            1/1     Running   0          14s
kube-system   kube-controller-manager-minikube   1/1     Running   0          15s
kube-system   kube-proxy-c8mdp                   1/1     Running   0          9s
kube-system   kube-scheduler-minikube            1/1     Running   0          16s
kube-system   storage-provisioner                1/1     Running   0          9s

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 11, 2024

@minikube-pr-bot

Here are the top 10 failed tests in each environment with the lowest flake rate.

Environment Test Name Flake Rate
Docker_macOS (1 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
Docker_Linux (1 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
Docker_Linux_containerd (1 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
Docker_Linux_crio (3 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
Docker_Linux_crio_arm64 (5 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
Docker_Linux_crio_arm64 (5 failed) TestFunctional/parallel/PersistentVolumeClaim(gopogh) 1.10% (chart)
Docker_Linux_crio_arm64 (5 failed) TestScheduledStopUnix(gopogh) 2.20% (chart)
KVM_Linux_crio (10 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
KVM_Linux_crio (10 failed) TestStartStop/group/newest-cni/serial/SecondStart(gopogh) 0.00% (chart)
Docker_Linux_docker_arm64 (1 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
KVM_Linux_containerd (3 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
KVM_Linux_containerd (3 failed) TestStartStop/group/no-preload/serial/SecondStart(gopogh) 0.00% (chart)
KVM_Linux_containerd (3 failed) TestStartStop/group/default-k8s-diff-port/serial/SecondStart(gopogh) 0.00% (chart)
Hyper-V_Windows (9 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)
Hyper-V_Windows (9 failed) TestPause/serial/VerifyDeletedResources(gopogh) 0.00% (chart)
Docker_Linux_containerd_arm64 (1 failed) TestMultiControlPlane/serial/StartCluster(gopogh) 0.00% (chart)

Besides, the following environments also have failed tests:

To see the flake rates of all tests by environment, click here.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 22, 2024

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 22, 2024
@ComradeProgrammer
Member Author

I just tried this but coredns still wasn't ready.

could you please try again to see if it works now? thanks

@ComradeProgrammer
Member Author

/test pull-minikube-build

@medyagh
Member

medyagh commented Dec 30, 2024

/ok-to-test


@minikube-pr-bot

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 19748) |
+----------------+----------+---------------------+
| minikube start | 54.6s    | 53.3s               |
| enable ingress | 17.9s    | 19.3s               |
+----------------+----------+---------------------+

Times for minikube start: 57.0s 54.0s 51.9s 54.8s 55.1s
Times for minikube (PR 19748) start: 52.8s 52.7s 53.1s 53.8s 54.0s

Times for minikube ingress: 17.3s 18.2s 17.2s 17.7s 19.3s
Times for minikube (PR 19748) ingress: 21.3s 18.1s 19.7s 17.7s 19.7s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 19748) |
+----------------+----------+---------------------+
| minikube start | 24.9s    | 23.8s               |
| enable ingress | 13.0s    | 13.0s               |
+----------------+----------+---------------------+

Times for minikube start: 25.4s 25.4s 24.8s 23.5s 25.5s
Times for minikube (PR 19748) start: 22.4s 23.0s 26.1s 21.7s 25.6s

Times for minikube ingress: 12.4s 13.4s 13.5s 13.4s 12.4s
Times for minikube (PR 19748) ingress: 13.4s 13.3s 12.4s 13.4s 12.5s

docker driver with containerd runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 19748) |
+----------------+----------+---------------------+
| minikube start | 23.5s    | 22.1s               |
| enable ingress | 22.9s    | 22.9s               |
+----------------+----------+---------------------+

Times for minikube start: 25.0s 23.9s 22.3s 24.6s 21.5s
Times for minikube (PR 19748) start: 21.7s 22.1s 20.9s 21.6s 24.1s

Times for minikube ingress: 22.9s 22.8s 22.9s 22.8s 22.9s
Times for minikube (PR 19748) ingress: 22.9s 22.9s 22.9s 22.8s 23.0s

@medyagh
Member

medyagh commented Jan 6, 2025

@ComradeProgrammer Docker_Linux_containerd seems weird: there is no failure in gopogh, but it exited as a failure. Could the non-zero exit code be related to waiting for DNS? https://storage.googleapis.com/minikube-builds/logs/19748/37784/Docker_Linux_containerd.html

It shows as
Docker_Linux_containerd — Jenkins: completed with 0 / 282 failures in 120.01 minutes.
but it is listed as red (the overall test run exited with a non-zero code).

@medyagh
Member

medyagh commented Jan 8, 2025

/lgtm

@medyagh medyagh merged commit 3fef3ea into kubernetes:master Jan 8, 2025
24 of 37 checks passed
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 8, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ComradeProgrammer, medyagh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 8, 2025
Successfully merging this pull request may close these issues:

--wait flag sometimes misses pods