This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

K8s agents entering a NotReady State #2153

Closed
gardlt opened this issue Jan 25, 2018 · 2 comments

Comments

gardlt commented Jan 25, 2018

Is this a request for help?: Yes


Is this an ISSUE or FEATURE REQUEST? (choose one): ISSUE


What version of acs-engine?: 0.11.*


Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes 1.8.2

What happened:
At some point yesterday we noticed that agent nodes in two of our clusters had started entering the NotReady state. We tried restarting the VMs, but that did not bring the nodes back up.

When looking for information via the kubectl command we could not find anything.
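
For anyone trying the same kubectl check, here is a minimal Python sketch of one way to pull the Ready condition for every node. It assumes `kubectl` is on the PATH and pointed at the affected cluster; the helper names `not_ready_nodes` and `fetch_nodes` are made up for illustration and are not part of any tool.

```python
import json
import subprocess


def not_ready_nodes(node_list):
    """Return (name, status, reason, message) for each node whose
    Ready condition is missing or not "True"."""
    results = []
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None:
            # No Ready condition reported at all.
            results.append((name, "Missing", "", ""))
        elif ready.get("status") != "True":
            # Status is "False" or "Unknown"; keep the reported reason and message.
            results.append(
                (name, ready.get("status"), ready.get("reason", ""), ready.get("message", ""))
            )
    return results


def fetch_nodes():
    """Run `kubectl get nodes -o json` and parse the result (assumes kubectl is configured)."""
    out = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Usage, against a live cluster:
#   for row in not_ready_nodes(fetch_nodes()):
#       print(*row, sep="\t")
```

The reason and message fields of the Ready condition are where the node status usually records why a node is considered NotReady, so they are worth checking before digging into the kubelet logs.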

Digging deeper into the kubelet logs, we started to see these messages:

Jan 24 17:09:48 k8s-master-39231420-0 docker[2407]: E0124 17:09:48.611047    2483 fsHandler.go:121] failed to collect filesystem stats - rootDiskErr: du command failed on /var/lib/docker/overlay2/23e67a280e772edd18c1903137e72e10c30c144df6e6fdefc80fbf5023aee4a1 with output stdout: 496176        /var/lib/docker/overlay2/23e67a280e772edd18
Jan 24 17:09:48 k8s-master-39231420-0 docker[2407]: , stderr: du: cannot access '/var/lib/docker/overlay2/23e67a280e772edd18c1903137e72e10c30c144df6e6fdefc80fbf5023aee4a1/merged/proc/5319/task/5319/fd/4': No such file or directory
Jan 24 17:09:48 k8s-master-39231420-0 docker[2407]: du: cannot access '/var/lib/docker/overlay2/23e67a280e772edd18c1903137e72e10c30c144df6e6fdefc80fbf5023aee4a1/merged/proc/5319/task/5319/fdinfo/4': No such file or directory
Jan 24 17:09:48 k8s-master-39231420-0 docker[2407]: du: cannot access '/var/lib/docker/overlay2/23e67a280e772edd18c1903137e72e10c30c144df6e6fdefc80fbf5023aee4a1/merged/proc/5319/fd/3': No such file or directory
Jan 24 17:09:48 k8s-master-39231420-0 docker[2407]: du: cannot access '/var/lib/docker/overlay2/23e67a280e772edd18c1903137e72e10c30c144df6e6fdefc80fbf5023aee4a1/merged/proc/5319/fdinfo/3': No such file or directory
Jan 24 17:09:48 k8s-master-39231420-0 docker[2407]:  - exit status 1, rootInodeErr: <nil>, extraDiskErr: <nil>
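
To make messages like these easier to scan across a longer log, here is a rough Python sketch that counts the `du: cannot access` errors per overlay2 layer and notes whether the missing path sits under `merged/proc/`. The regular expression is based only on the lines pasted above, and `summarize_du_errors` is a made-up helper name, so treat it as an assumption rather than a general parser.

```python
import re
from collections import Counter

# Matches the "du: cannot access '<path>': No such file or directory" fragments
# seen in the log excerpt; the layer directory is assumed to be a lowercase hex ID.
DU_ERROR = re.compile(
    r"du: cannot access '/var/lib/docker/overlay2/(?P<layer>[0-9a-f]+)/(?P<rest>[^']*)'"
    r": No such file or directory"
)


def summarize_du_errors(lines):
    """Count du errors keyed by (layer ID, whether the path is under merged/proc/)."""
    counts = Counter()
    for line in lines:
        for match in DU_ERROR.finditer(line):
            under_proc = match.group("rest").startswith("merged/proc/")
            counts[(match.group("layer"), under_proc)] += 1
    return counts
```

In the excerpt above, every missing path is a per-process `fd` or `fdinfo` entry under `merged/proc/5319`. Entries like that can disappear while `du` is walking the tree, which may be why the command exits with status 1, but that is a guess and not a confirmed root cause.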

We still do not know what the root cause is.
I have an ACS deployment created through the Azure portal and have seen this issue on it.
If there are any other logs that could help resolve this issue, I will do my best to get them.

Thank you in advance.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Deploy the environment and wait for the cluster to enter this state.

Anything else we need to know:

jamoham commented Jan 30, 2018

Is this a duplicate of #863?

gardlt commented Jan 31, 2018

Closing as a duplicate.

@gardlt gardlt closed this as completed Jan 31, 2018