Unable to connect to kind API server when using Kind inside Kubernetes #3622

Closed
jayesh-srivastava opened this issue May 20, 2024 · 9 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@jayesh-srivastava
Member

What happened: I am creating a test setup similar to the Cluster API and provider repos, where I want to run e2e cluster jobs inside test pods using prow jobs. The test starts by creating a kind management cluster. I am able to create the kind cluster, but I am unable to connect to its API server. I get the error "The connection to the server 127.0.0.1:43357 was refused - did you specify the right host or port?". I have gone through #303, mounted these paths, and changed the dnsPolicy to Default, but the error still persists.

 - mountPath: /lib/modules
   name: modules
   readOnly: true
 - mountPath: /sys/fs/cgroup
   name: cgroup
 - name: docker-root
   mountPath: /var/lib/docker

What you expected to happen: I expected to be able to connect to the kind cluster.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • kind version: (use kind version): v0.23.0
  • Runtime info: (use docker info, podman info or nerdctl info):
Client:
 Version:    25.0.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.8.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx

Server:
 Containers: 3
  Running: 3
  Paused: 0
  Stopped: 0
 Images: 3
 Server Version: 23.0.3
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 4e1fe7492b9df85914c389d1f15a3ceedbb280ac
 runc version: a916309fff0f838eb94e928713dbc3c0d0ac7aa4
 init version: fec3683b971d9c3ef73f284f176672c44b448662
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.146+
 Operating System: Container-Optimized OS from Google
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.36GiB
 Name: prow-abcd
 ID: 8c0d8a83-b2df-4429-8ab3-7bed1934ae0b
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  10.0.0.0/8
  127.0.0.0/8
 Registry Mirrors:
  https://mirror.gcr.io/
 Live Restore Enabled: true
  • OS (e.g. from /etc/os-release): alpine
  • Kubernetes version: (use kubectl version): v1.30.0
  • Any proxies or other special environment settings?:
@jayesh-srivastava jayesh-srivastava added the kind/bug Categorizes issue or PR as related to a bug. label May 20, 2024
@BenTheElder BenTheElder added kind/support Categorizes issue or PR as a support question. and removed kind/bug Categorizes issue or PR as related to a bug. labels May 20, 2024
@BenTheElder
Member

First of all: I do NOT recommend running Kubernetes inside Kubernetes (kind or otherwise), as there's a lot of confusing behavior when attempting to nest them. That said, the Kubernetes project does it for ... reasons.

> I get the error "The connection to the server 127.0.0.1:43357 was refused - did you specify the right host or port?"

This is not nearly enough information to debug.

You might not be able to connect because the client isn't on the same network (note that 127.0.0.1 is local to a network namespace, so here it refers to wherever the container runtime hosting the kind nodes is running), or the cluster might not be up, or ...
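For anyone hitting the same error, the missing details can be collected with a short script. This is a minimal sketch, not something shipped with kind: `kubeconfig_server` and `diagnose_kind` are hypothetical helper names, it assumes `KUBECONFIG` points at a single file, and it assumes the default cluster name `kind`.

```shell
#!/bin/sh
# Print the host:port of the first "server:" entry in a kubeconfig file.
kubeconfig_server() {
  sed -n 's|^[[:space:]]*server:[[:space:]]*https\{0,1\}://\([^/[:space:]]*\).*$|\1|p' "$1" | head -n 1
}

# Collect the basics: which address kubectl will dial, whether the cluster
# and its node containers exist, and where the API server port is published.
# Run this in the same container where "kind create cluster" was run.
diagnose_kind() {
  cluster="${1:-kind}"                               # assumed default cluster name
  kubeconfig="${KUBECONFIG:-$HOME/.kube/config}"     # assumes a single file
  echo "API server in kubeconfig: $(kubeconfig_server "$kubeconfig")"
  kind get clusters
  docker ps --filter "label=io.x-k8s.kind.cluster=$cluster"
  docker port "${cluster}-control-plane" 6443/tcp
  kubectl cluster-info --context "kind-$cluster"
}
```

Calling `diagnose_kind` with no arguments checks the default cluster. If the address printed by `docker port` does not match the one in the kubeconfig, or `docker ps` shows no node containers, the client is probably talking to a different Docker daemon or network namespace than the one hosting the nodes.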

@jayesh-srivastava
Member Author

@BenTheElder I understand the network disparity here. I was also looking at #523, where a comment is provided (#523 (comment)). According to this, it looks like IP forwarding is required.
But providing a config to kind and then forwarding the IP seem like manual steps. Is there any way to avoid manual intervention? I have the same use case as the Cluster API and other provider repos, which run their e2e tests using kind inside Kubernetes.

@BenTheElder
Member

> According to this, it looks like IP forwarding is required.
> But providing a config to kind and then forwarding the IP seem like manual steps. Is there any way to avoid manual intervention? I have the same use case as the Cluster API and other provider repos, which run their e2e tests using kind inside Kubernetes.

There's no additional forwarding there because they run all the other steps in the same network namespace as the node containers, i.e. the container running dind is also running the Cluster API scripts.

@jayesh-srivastava
Member Author

Hi @BenTheElder, I was playing around, trying to find a workaround. I can exec into the nested Docker container (kind's control plane) and get /etc/kubernetes/admin.conf, but I want to access the cluster from outside that nested container, in the host container. I haven't been able to figure this out. Is there a way I can do that? To clarify again, this host container, along with some other containers, is part of a pod in a GKE cluster.

And I was getting this error when trying to access the nested container (kind control-plane) from the host container:

E0522 19:50:56.903340   17267 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:34335/api?timeout=32s": dial tcp 127.0.0.1:34335: connect: connection refused
E0522 19:50:56.904786   17267 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:34335/api?timeout=32s": dial tcp 127.0.0.1:34335: connect: connection refused
E0522 19:50:56.905070   17267 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:34335/api?timeout=32s": dial tcp 127.0.0.1:34335: connect: connection refused
E0522 19:50:56.906505   17267 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:34335/api?timeout=32s": dial tcp 127.0.0.1:34335: connect: connection refused
The connection to the server 127.0.0.1:34335 was refused - did you specify the right host or port?

@BenTheElder
Member

> I can exec into the nested Docker container (kind's control plane) and get /etc/kubernetes/admin.conf, but I want to access the cluster from outside that nested container, in the host container.

Wherever you ran kind create cluster should be able to access it, unless it's using a remote docker daemon or something.

In cluster API's CI, it looks like:

host => container running dind + kind + cluster API CI scripts => kind nodes => kind pods

That "just works" with the normally exported KUBECONFIG from kind create cluster because the cluster API CI steps are running in the same container as dind / the nodes / ...
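As a rough illustration of that layout, the cluster creation and the test steps can live in one script that runs inside the dind container. This is a sketch under assumptions, not Cluster API's actual CI script: the function name `e2e_in_same_container`, the `run` helper with its `DRY_RUN` switch, the cluster name, the kubeconfig path, and the `kubectl get nodes` stand-in for real tests are all hypothetical.

```shell
#!/bin/sh
# Echo the command instead of executing it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create the cluster and run the test step in the SAME container, so the
# 127.0.0.1:<port> address written to the kubeconfig is reachable.
e2e_in_same_container() {
  cluster="${1:-e2e}"
  kubeconfig="${2:-/tmp/${cluster}.kubeconfig}"
  run kind create cluster --name "$cluster" --kubeconfig "$kubeconfig" --wait 5m || return 1
  # Stand-in for the real e2e steps.
  run kubectl --kubeconfig "$kubeconfig" get nodes
  status=$?
  run kind delete cluster --name "$cluster"
  return "$status"
}
```

Running `DRY_RUN=1; e2e_in_same_container demo` prints the commands without needing Docker, which is handy for checking the wiring before trying it in the pod.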

> To clarify again, this host container, along with some other containers, is part of a pod in a GKE cluster.
> And I was getting this error when trying to access the nested container (kind control-plane) from the host container:

I can't quite tell but it sounds like your layout is more like:
host => container running dind / kind => kind nodes => kind pods
host => CI steps

That's a LOT more complicated and really out of scope for us / not recommended. You will have to either operate something like an SSH tunnel or configure the kind cluster to expose the API server on something other than localhost (which we don't recommend for security purposes); the localhost addresses are not going to be accessible between different pods / containers.

https://kind.sigs.k8s.io/docs/user/configuration/#api-server
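For reference, the linked configuration section covers the `networking.apiServerAddress` and `networking.apiServerPort` fields. The sketch below writes such a config from a script; `write_kind_config` is a hypothetical helper, and the address and port you pass are your own choices. As noted above, exposing the API server beyond localhost is not recommended for security reasons, so treat this as an experiment rather than a suggested setup.

```shell
#!/bin/sh
# write_kind_config ADDRESS PORT FILE
# Writes a kind cluster config that publishes the API server on ADDRESS:PORT.
write_kind_config() {
  address="$1"
  port="$2"
  out="$3"
  cat > "$out" <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  # Address the API server is published on (kind's default is 127.0.0.1).
  apiServerAddress: "${address}"
  # Fixed host port instead of a randomly chosen one.
  apiServerPort: ${port}
EOF
}
```

A possible usage, assuming a hypothetical `POD_IP` variable holds an address the Docker daemon can bind to: `write_kind_config "$POD_IP" 6443 kind-config.yaml && kind create cluster --config kind-config.yaml`.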

@jayesh-srivastava
Member Author

jayesh-srivastava commented May 22, 2024

@BenTheElder
So my layout looks like:
test-pod => container running kind + CI scripts after exporting kind kubeconfig

> Wherever you ran kind create cluster should be able to access it, unless it's using a remote docker daemon or something.

Yes this is what still bothers me.

> or configure the kind cluster to expose the API server on something other than localhost (which we don't recommend for security purposes); the localhost addresses are not going to be accessible between different pods / containers.

Right, I get the security aspect of this.

But kubectl is simply refused when it connects to the local address in my kubeconfig. I have also tried passing kind a config with apiServerAddress: 0.0.0.0, but that doesn't work either.

@BenTheElder
Member

> test-pod => container running kind + CI scripts after exporting kind kubeconfig

To be clear: is that from kind export kubeconfig, or the config exported by kind create cluster?

I ask because kind export kubeconfig is meant to be local to where docker is running; it has no idea where a remote instance might be.

If it's from kind create cluster, running in the same dind container, and you can't access it, something is broken with the networking in this environment and you'll have to debug that. You could do a more minimal test without kind by just running any container with a networked service and a docker port forward and getting that part to work in your dind environment.
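The minimal test described here could look something like the following. It is only a sketch: the `nginx` image, the container name, the host port, the `run` helper with its `DRY_RUN` switch, and the availability of `curl` are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Echo the command instead of executing it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Start a container with a networked service, publish it on localhost,
# and check that the published port answers. No kind involved.
portforward_smoke_test() {
  name="${1:-pf-smoke}"
  host_port="${2:-18080}"
  run docker run -d --rm --name "$name" -p "127.0.0.1:${host_port}:80" nginx || return 1
  # Give the server a moment to start before probing it.
  run sleep 2
  run curl -fsS -o /dev/null "http://127.0.0.1:${host_port}/"
  status=$?
  # Clean up the test container regardless of the result.
  run docker rm -f "$name" >/dev/null
  return "$status"
}
```

If `portforward_smoke_test` fails with "connection refused" in the same place where kubectl fails, the problem lies in the dind networking rather than in kind.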

@jayesh-srivastava
Member Author

@BenTheElder I mean kind create cluster.

@BenTheElder
Member

BenTheElder commented May 23, 2024

I would start debugging from just a container with a minimal docker port forward to hello-world and see what it takes to get that working in the dind environment.
