Cant create a kind cluster after delete cluster in a docker in docker vscode devcontainer #3370

KieranJeffreySmart · 2023-09-26T09:13:58Z

What happened:
I am trying to create a kind cluster in a VS Code devcontainer. I am working on Windows with Docker Desktop and have been using a Docker in Docker template.

When the container is first constructed, I am able to create a cluster using kind create cluster from a terminal within the container, and this works successfully.

However, if I delete the cluster and try to create it again, it fails.

This doesn't happen when I repeat the process on the host Windows machine; there it creates the cluster every time.

This is to be used in a script, so I need it to be repeatable: delete cluster, then create cluster.

kind-control-plane.zip

Thanks in advance for any assistance

What you expected to happen:
A new cluster is created

How to reproduce it (as minimally and precisely as possible):

1. Create a new devcontainer in VS Code from the New Dev Container... menu option
2. Create from the Docker in Docker template
3. Add features for kind, kubectl and node
4. Create the devcontainer
5. Open a terminal
6. Enter the command kind create cluster
7. Enter the command kind delete cluster
8. Enter the command kind create cluster
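
The three terminal commands in these steps could be scripted roughly as follows. This is only a sketch: the helper names `run_command` and `reproduce` are made up for illustration, and the only commands used are the two kind invocations given in the steps. The runner is injectable so the sequence can be exercised without kind installed.

```python
import subprocess
from typing import Callable, List, Sequence

# The two kind commands from the reproduction steps.
CREATE = ["kind", "create", "cluster"]
DELETE = ["kind", "delete", "cluster"]


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code (output goes to the terminal)."""
    return subprocess.run(list(cmd)).returncode


def reproduce(run: Callable[[Sequence[str]], int] = run_command) -> List[int]:
    """Run create, delete, create and return the exit code of each step run.

    Stops at the first non-zero exit code so the failing step is the last entry.
    """
    codes: List[int] = []
    for cmd in (CREATE, DELETE, CREATE):
        code = run(cmd)
        codes.append(code)
        if code != 0:
            break
    return codes
```

Calling `reproduce()` with no arguments inside the devcontainer terminal runs the real commands. If the behavior described in this report occurs, the third entry of the returned list would be non-zero.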

Anything else we need to know?:

Environment:
Windows 11
Docker Desktop 4.23.0
Dev Container Features

"ghcr.io/devcontainers/features/node:1": {},
"ghcr.io/mpriscella/features/kind:1": {},
"ghcr.io/devcontainers-contrib/features/kubectl-asdf:2": {}
Docker Info from inside Dev Container:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /home/vscode/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  2.21.0-1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 23.0.6+azure-2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
 runc version: ccaecfcbc907d70a7aa870a6650887b901b25b82
 init version: 
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.10.102.1-microsoft-standard-WSL2
 Operating System: Debian GNU/Linux 11 (bullseye) (containerized)
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 15.5GiB
 Name: 76da64a73ada
 ID: ee13f67f-b2b0-4995-8883-dd3c59c7f619
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support
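
The cgroup lines in this output turn out to matter later in the thread. As a rough sketch, they can be pulled out of saved `docker info` text like this; the function name `parse_cgroup_settings` is hypothetical and the sample is copied from the output above.

```python
from typing import Dict


def parse_cgroup_settings(docker_info: str) -> Dict[str, str]:
    """Extract the 'Cgroup Driver' and 'Cgroup Version' values from `docker info` text."""
    settings: Dict[str, str] = {}
    for line in docker_info.splitlines():
        # Each line looks like " Key: value"; split on the first colon only.
        key, sep, value = line.strip().partition(":")
        if sep and key in ("Cgroup Driver", "Cgroup Version"):
            settings[key] = value.strip()
    return settings


# Excerpt of the output reported above.
SAMPLE = """\
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Registry: https://index.docker.io/v1/
"""
```

For the environment in this report, the parsed values would be the `cgroupfs` driver and cgroup version `1`.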


BenTheElder · 2023-09-26T17:22:39Z

> I am trying to create a kind cluster in a VS Code devcontainer. I am working on Windows with Docker Desktop and have been using a Docker in Docker template.

We don't recommend this and it may be a bug in the docker in docker environment.

Please avoid adding additional nesting, it's a real headache to debug.

mboutet · 2023-10-06T22:37:46Z

@BenTheElder, I see that you have replied this way to a lot of similar issues lately, but I just want to say that using kind within an already containerized environment is a totally acceptable use case. Two important use cases:

Dev containers. My team extensively leverages this technology to ensure repeatable dev environments that are uniform across all developers. In my team, we develop K8s operators in dev containers using kind.
Containerized CI runners. It is really common to leverage containerized ephemeral runners to ensure repeatability of CI jobs, as well as to scale those runners horizontally on K8s with ease. In my team, we run integration tests in our CI using kind for our in-house operators.

I understand that this adds complexity on your end and makes debugging more difficult, but I just want to make sure you're aware of the valid use cases of running kind in containerized environments. Those use cases won't go away.

@KieranJeffreySmart, see #3283 (comment). TL;DR, you likely need to enable cgroup v2 on the VM on which Docker runs for kind v0.20+ to work properly.
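
One way to see which cgroup version an environment is using, sketched under the assumption that cgroups are mounted at the usual `/sys/fs/cgroup`: on a cgroup v2 unified hierarchy the mount root contains a `cgroup.controllers` file, while cgroup v1 has per-controller directories there instead. The function name `cgroup_version` is made up for this example, and hybrid setups would be reported as v1.

```python
import os
from typing import Optional


def cgroup_version(root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Guess the cgroup version from the layout of the cgroup mount point.

    Returns 2 if `cgroup.controllers` exists at the root (unified hierarchy),
    1 if the directory exists without it, and None if `root` is not a directory.
    """
    if not os.path.isdir(root):
        return None
    if os.path.isfile(os.path.join(root, "cgroup.controllers")):
        return 2
    return 1
```

Running `cgroup_version()` inside the dev container and on the Docker host VM should give the same answer, since containers share the host kernel's cgroup setup.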

BenTheElder · 2023-10-12T22:41:48Z

I'm aware of the use cases, but we have limited bandwidth to provide support. kind is available as a static Go binary, and you can't containerize Docker itself either.

We'll happily review proposed fixes from contributors but I just cannot justify spending my own time debugging this versus steering people towards more debuggable alternatives.

Kind is already running containers in containers, which is unfortunately insecure and error-prone but similarly useful. I highly recommend avoiding doing this again with another layer.

See also #303 for additional footguns running nested inside of another Kubernetes cluster.

BenTheElder · 2023-10-12T22:44:40Z

For Windows specifically: see #1529, nobody has contributed work on CI for Windows.
aojea and I don't use Windows for development, so we depend on community contributions to keep the WSL2 docs up to date and to identify fixes for us to review, or sometimes implement without being able to directly verify them ourselves.

... let alone adding container nesting on Windows.

dboreham · 2024-01-07T18:21:13Z

> ... let alone adding container nesting on Windows.

Quick note for the audience with no Windows exposure: containers/docker on Windows (except for actual Windows containers which nobody uses) runs in a Linux kernel and for the most part behaves the same as if it were running on a bare metal Linux box. Although it's convenient, you don't need to run Docker Desktop on Windows -- regular Linux docker, or podman will work fine inside WSL2. Therefore the issues with nesting containers are essentially the same as for stock Linux.

BenTheElder · 2024-01-08T18:00:59Z

> Therefore the issues with nesting containers are essentially the same as for stock Linux.

We tell people to avoid running kind in docker-in-docker on Linux. It's generally not necessary (it's no more secure than just passing the host dockerd socket, and more effort) and creates a lot of additional problems. There are some use cases where it makes sense, but adding another layer of nested containers is very "here be dragons".

BenTheElder · 2024-01-08T18:04:55Z

Also, the environment in WSL2 is different from Linux run elsewhere, e.g. it often has a custom init system, and we don't have easy access to reproduce and debug. Nor do we really have the time or inclination: there's so much to do, OSS developers could use Linux, we don't use Windows ourselves, and it isn't really supported for developing kubernetes/kubernetes (https://kind.sigs.k8s.io/docs/contributing/project-scope/).

(Difference in the init, Kernel => different cgroups management => impact on containers)

dboreham · 2024-01-08T18:54:28Z

init is out of scope here though, since we're running inside a container.

btw it turns out nested kind works just fine now, provided the container has the necessary secret sauce. The stock docker:dind container is an example of such a thing, albeit Alpine so...not for everyone. There is an Ubuntu equivalent image that also works: https://github.com/cruizba/ubuntu-dind

You can start that container, install kind (or k3d) and create a cluster. It can be used as an existence proof from which to generate your own image for CI and so on.

BenTheElder · 2024-01-08T19:07:53Z

> init is out of scope here though, since we're running inside a container.

It's not, the init is responsible for setting up cgroups amongst other things and we're sharing that along with the rest of the kernel from the host since we're using containers instead of VMs. Privileged containers like kind nodes are "leakier" than normal containers but all containers are influenced by the host's init.

dboreham · 2024-01-08T19:18:11Z

Well, I've tested stock WSL2 on x86 and it works. I'll try arm64 and report back...

dboreham · 2024-01-08T21:37:22Z

> Well, I've tested stock WSL2 on x86 and it works. I'll try arm64 and report back...

Reporting back: ARM WSL2 doesn't work :(

dboreham · 2024-01-22T00:54:09Z

Fwiw, the issue where kind delete cluster followed by kind create cluster fails when running in dind (the original problem reported above) occurs on regular x64 Ubuntu too, so it is unrelated to WSL2.

BenTheElder · 2024-04-17T17:42:39Z

This sort of problem is likely eliminated on hosts with cgroup v2 + cgroupns. cgroup v1 is being moved into maintenance mode by Kubernetes (kubernetes/enhancements#4572) and will soon be deprecated by various ecosystem projects (like OCI, systemd).

On cgroup v1 we started forcing cgroupns=private on kind nodes which may help with some of these problems.

dkirrane · 2024-12-06T13:16:03Z

@BenTheElder I spent hours looking at this. Thanks for the hint.
I set this in %UserProfile%\.wslconfig and restarted WSL2.

[wsl2]
kernelCommandLine = cgroup_no_v1=all

My kind devcontainer then worked!
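
For anyone who wants to apply the same setting from a script, here is a rough sketch using Python's standard `configparser`. The function name `add_kernel_arg` is hypothetical, and only the `[wsl2]` section, the `kernelCommandLine` key and the `cgroup_no_v1=all` value come from the comment above. Note the assumptions: `configparser` drops comments when it rewrites the file, and an existing `kernelCommandLine` is extended rather than replaced.

```python
import configparser
from pathlib import Path


def add_kernel_arg(path: Path, arg: str = "cgroup_no_v1=all") -> None:
    """Ensure `arg` is present in [wsl2] kernelCommandLine of a .wslconfig file."""
    # Only "=" is a delimiter, so values such as Windows paths with ":" survive.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key case, e.g. kernelCommandLine
    if path.exists():
        parser.read(path, encoding="utf-8")
    if not parser.has_section("wsl2"):
        parser.add_section("wsl2")
    existing = parser.get("wsl2", "kernelCommandLine", fallback="")
    if arg not in existing.split():
        parser.set("wsl2", "kernelCommandLine", f"{existing} {arg}".strip())
    # Caution: rewriting the file this way discards any comments it contained.
    with path.open("w", encoding="utf-8") as handle:
        parser.write(handle)
```

On Windows this might be called as `add_kernel_arg(Path.home() / ".wslconfig")`, followed by a WSL restart (for example with `wsl --shutdown`) so the new kernel command line takes effect.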

BenTheElder · 2024-12-06T17:17:48Z

awesome! maybe we should add this as a hint to https://kind.sigs.k8s.io/docs/user/using-wsl2/

KieranJeffreySmart added the kind/bug Categorizes issue or PR as related to a bug. label Sep 26, 2023
