Kube-router unable to connect API server on start because nodelocaldns can't access Service IP provided by kube-router #6175

rearden-steel · 2020-05-21T18:50:32Z

After deploying a new cluster we hit a strange problem: the kube-router pod on some nodes gets stuck in CrashLoopBackOff.
The kube-router log shows a timeout connecting to the API server:

E0521 09:25:04.217633    1733 reflector.go:205] github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Node: Get https://localhost:6443/api/v1/nodes?resourceVersion=0: dial tcp: i/o timeout

Running strace on kube-router shows that it tries to resolve localhost by querying nodelocaldns on its IP address and gets a timeout.
The nodelocaldns logs show that it, in turn, is trying to reach the DNS Service IPs provided by kube-router.

The problem goes away if 127.0.0.1 is specified instead of localhost in the kube-router kubeconfig, for example by setting this in the inventory:

kube_apiserver_endpoint: https://127.0.0.1:6443
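
The workaround helps because an IP literal needs no name lookup, while a name such as localhost has to go through the resolver. A minimal sketch of that distinction follows. The function name `endpoint_needs_name_resolution` is hypothetical and not part of Kubespray or kube-router; it only classifies the endpoint string and performs no network access.

```python
import ipaddress
from urllib.parse import urlsplit


def endpoint_needs_name_resolution(endpoint: str) -> bool:
    """Return True if the endpoint's host is a name rather than an IP literal.

    Hypothetical helper for illustration only.
    """
    host = urlsplit(endpoint).hostname
    if host is None:
        raise ValueError(f"no host in endpoint: {endpoint!r}")
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # A name such as "localhost" must go through the resolver.
        return True
    # An IP literal can be dialed without any lookup.
    return False


if __name__ == "__main__":
    for endpoint in ("https://localhost:6443", "https://127.0.0.1:6443"):
        print(endpoint, endpoint_needs_name_resolution(endpoint))
```

A check like this could be run against the value of kube_apiserver_endpoint before deployment to flag endpoints that depend on working name resolution inside the kube-router container.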

Environment:

Cloud provider or hardware configuration:
Bare metal k8s cluster
OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):

Linux 4.18.0-147.8.1.el8_1.x86_64 x86_64
NAME="CentOS Linux"
VERSION="8 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="8"

Version of Ansible (ansible --version):
Version of Python (python --version):

Kubespray version (commit) (git rev-parse --short HEAD):
2.13.0
01dbc90

Network plugin used:
kube-router

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

Command used to invoke ansible:

Output of ansible run:

Anything else we need to know:


mikesmitty · 2020-05-25T22:54:14Z

I ran into this issue as well, with version 2.13.0 on Ubuntu 18.04 nodes.

qingkunl · 2020-07-27T04:46:51Z

I hit the same issue as well.

qingkunl · 2020-07-27T07:59:47Z

This happens because kube-router uses Alpine as its base image, which does not include /etc/nsswitch.conf. As a result, localhost cannot be resolved from /etc/hosts. I submitted cloudnativelabs/kube-router#957 to work around this issue.
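
To illustrate the role of /etc/nsswitch.conf, here is a rough sketch that reads the `hosts:` line and reports whether the hosts file is consulted before DNS. The helper names are hypothetical, and the sample content `hosts: files dns` is an assumption about a typical file, not a quote from the PR. What a resolver does when the file is missing depends on the implementation; the sketch only models "no configured order" as an empty list.

```python
import re
from typing import List, Optional


def hosts_sources(nsswitch_text: Optional[str]) -> List[str]:
    """Return the lookup sources named on the `hosts:` line, in order.

    Pass None to model a missing /etc/nsswitch.conf; the result is then
    empty, meaning the resolver falls back to its own built-in default.
    """
    if nsswitch_text is None:
        return []
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def hosts_file_consulted_first(nsswitch_text: Optional[str]) -> bool:
    """True only if an explicit `hosts:` line lists `files` first."""
    sources = hosts_sources(nsswitch_text)
    return bool(sources) and sources[0] == "files"


if __name__ == "__main__":
    print(hosts_sources("hosts: files dns\n"))  # assumed typical content
    print(hosts_file_consulted_first(None))     # missing file
```

With no file present there is no explicit instruction to check /etc/hosts first, which is consistent with the strace observation that the localhost lookup went to nodelocaldns.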

qingkunl · 2020-08-06T04:45:23Z

My kube-router PR has been merged and released in v1.0.1, and #6479 has updated kube-router to v1.0.1 in Kubespray, so this issue should now be fixed.

fejta-bot · 2020-11-04T05:28:52Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot · 2020-12-04T06:13:28Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot · 2021-01-03T06:58:48Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot · 2021-01-03T06:58:54Z

@fejta-bot: Closing this issue.


rearden-steel added the kind/bug Categorizes issue or PR as related to a bug. label May 21, 2020

qingkunl mentioned this issue Jul 27, 2020

add /etc/nsswitch.conf in Dockerfile cloudnativelabs/kube-router#957

Merged

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 4, 2020

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 4, 2020

k8s-ci-robot closed this as completed Jan 3, 2021
