Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing e2e: restart kube-controller-manager fails #68246

Closed
msau42 opened this issue Sep 4, 2018 · 3 comments
Closed

Failing e2e: restart kube-controller-manager fails #68246

msau42 opened this issue Sep 4, 2018 · 3 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/storage Categorizes an issue or PR as relevant to SIG Storage.

Comments

@msau42
Copy link
Member

msau42 commented Sep 4, 2018

Is this a BUG REPORT or FEATURE REQUEST?:
@kubernetes/sig-storage-bugs
@kubernetes/sig-auth-bugs

What happened:
Since 8/31 10am, the following test case fails. It kills kube-controller-manager and waits for it to restart. However, it seems like kube-controller-manager failed to come back up:
[sig-storage] NFSPersistentVolumes[Disruptive][Flaky] when kube-controller-manager restarts should delete a bound PVC from a clientPod, restart the kube-control-manager, and ensure the kube-controller-manager does not crash

Testgrid
Logs

kube-controller-manager failed to come up with this error:

I0904 17:53:07.775832       1 serving.go:293] Generated self-signed cert (/var/run/kubernetes/kube-controller-manager.crt, /var/run/kubernetes/kube-controller-manager.key)
failed to get delegated authentication kubeconfig: failed to get delegated authentication kubeconfig: failed to read token file "/var/run/secrets/kubernetes.io/serviceaccount/token": open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. kind/bug Categorizes issue or PR as related to a bug. sig/auth Categorizes an issue or PR as relevant to SIG Auth. labels Sep 4, 2018
@liggitt
Copy link
Member

liggitt commented Sep 4, 2018

likely #64149

/assign @sttts

@msau42 msau42 changed the title Flaky e2e: restart kube-controller-manager fails Failing e2e: restart kube-controller-manager fails Sep 4, 2018
@msau42
Copy link
Member Author

msau42 commented Sep 4, 2018

The other test cases that test kubelet restart also become very flaky at the same time. I suspect those tests will fail setup if they are run after this kube-controller restart test case

@liggitt liggitt added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Sep 4, 2018
@sttts
Copy link
Contributor

sttts commented Sep 5, 2018

It looks like the kube-controller-manager is started as static pod in the gci e2e test. When it first starts, there is no kubernetes service and the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT env vars are not set inside the container.

When the kube-controller-manager is restarted, the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT env vars are set, but the static pod has no service account, i.e. /var/run/secrets/kubernetes.io/serviceaccount/token does not exist. We made the later fatal in rest.InClusterConfig and its use to setup delegated authn/z.

k8s-github-robot pushed a commit that referenced this issue Sep 5, 2018
Automatic merge from submit-queue (batch tested with PRs 68265, 68273). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

apiserver: make InClusterConfig errs for delegated authn/z non-fatal

Fixes #68246:

Background:

In gci e2e tests the kube-controller-manager is started as static pod. When it first starts, there is no kubernetes service and the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT env vars are not set inside the container.

When the kube-controller-manager is restarted, the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT env vars are set, but the static pod has no service account, i.e. /var/run/secrets/kubernetes.io/serviceaccount/token does not exist. We made the later fatal in rest.InClusterConfig and its use to setup delegated authn/z.
k8s-publishing-bot added a commit to kubernetes/apiserver that referenced this issue Sep 5, 2018
Automatic merge from submit-queue (batch tested with PRs 68265, 68273). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

apiserver: make InClusterConfig errs for delegated authn/z non-fatal

Fixes kubernetes/kubernetes#68246:

Background:

In gci e2e tests the kube-controller-manager is started as static pod. When it first starts, there is no kubernetes service and the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT env vars are not set inside the container.

When the kube-controller-manager is restarted, the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT env vars are set, but the static pod has no service account, i.e. /var/run/secrets/kubernetes.io/serviceaccount/token does not exist. We made the later fatal in rest.InClusterConfig and its use to setup delegated authn/z.

Kubernetes-commit: 2c933695fa61d57d1c6fa5defb89caed7d49f773
sttts pushed a commit to sttts/apiserver that referenced this issue Sep 5, 2018
Automatic merge from submit-queue (batch tested with PRs 68265, 68273). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

apiserver: make InClusterConfig errs for delegated authn/z non-fatal

Fixes kubernetes/kubernetes#68246:

Background:

In gci e2e tests the kube-controller-manager is started as static pod. When it first starts, there is no kubernetes service and the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT env vars are not set inside the container.

When the kube-controller-manager is restarted, the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT env vars are set, but the static pod has no service account, i.e. /var/run/secrets/kubernetes.io/serviceaccount/token does not exist. We made the later fatal in rest.InClusterConfig and its use to setup delegated authn/z.

Kubernetes-commit: 2c933695fa61d57d1c6fa5defb89caed7d49f773
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/storage Categorizes an issue or PR as relevant to SIG Storage.
Projects
None yet
Development

No branches or pull requests

4 participants