Cant create a kind cluster after delete cluster in a docker in docker vscode devcontainer #3370

Open
KieranJeffreySmart opened this issue Sep 26, 2023 · 15 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@KieranJeffreySmart

What happened:
I am trying to create a kind cluster in a vscode devcontainer. I am working on windows with docker desktop and have been using a docker inside docker template.

When the container is first built, I can create a cluster by running kind create cluster from a terminal inside the container, and this works successfully.

However, if I delete the cluster and try to create it again, it fails.

This doesn't happen when I repeat the process on the Windows host machine; there the cluster is created every time.

This is to be used in a script, so I need the sequence to be repeatable: delete the cluster, then create it again.

kind-control-plane.zip

Thanks in advance for any assistance

What you expected to happen:
A new cluster is created

How to reproduce it (as minimally and precisely as possible):

  1. Create a new devcontainer in VSCode from New Dev Container... menu option
  2. Create from Docker in Docker template
  3. Add features for kind, kubectl and node
  4. create devcontainer
  5. open a terminal
  6. enter command kind create cluster
  7. enter command kind delete cluster
  8. enter command kind create cluster
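
Steps 6 to 8 above can be scripted so the failure is easy to re-run. The following is a minimal sketch, not part of the original report: the function names `default_run` and `repro_cycle` are hypothetical, and it assumes `kind` is on the PATH inside the dev container. It stops at the first command that exits non-zero and returns the exit codes collected so far.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple


def default_run(cmd: Sequence[str]) -> int:
    """Run a command, let its output go to the terminal, return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def repro_cycle(
    run: Callable[[Sequence[str]], int] = default_run,
    name: str = "kind",  # "kind" is the default cluster name used by the kind CLI
) -> List[Tuple[str, int]]:
    """Run create -> delete -> create and record each command's exit code."""
    steps = [
        ["kind", "create", "cluster", "--name", name],
        ["kind", "delete", "cluster", "--name", name],
        ["kind", "create", "cluster", "--name", name],
    ]
    results: List[Tuple[str, int]] = []
    for cmd in steps:
        code = run(cmd)
        results.append((" ".join(cmd), code))
        if code != 0:
            # Stop at the first failure so the broken state can be inspected.
            break
    return results
```

Calling `repro_cycle()` in the dev container and printing the returned pairs shows which of the three commands failed. The `run` parameter exists only so the sequence can be exercised without a real cluster.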

Anything else we need to know?:

Environment:
Windows 11
Docker Desktop 4.23.0
Dev Container Features

"ghcr.io/devcontainers/features/node:1": {},
"ghcr.io/mpriscella/features/kind:1": {},
"ghcr.io/devcontainers-contrib/features/kubectl-asdf:2": {}
Docker Info from inside Dev Container:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /home/vscode/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  2.21.0-1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 23.0.6+azure-2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
 runc version: ccaecfcbc907d70a7aa870a6650887b901b25b82
 init version: 
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.10.102.1-microsoft-standard-WSL2
 Operating System: Debian GNU/Linux 11 (bullseye) (containerized)
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 15.5GiB
 Name: 76da64a73ada
 ID: ee13f67f-b2b0-4995-8883-dd3c59c7f619
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support
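
The output above reports `Cgroup Version: 1`, which becomes relevant later in the thread. As a small sketch (the helper name `cgroup_version` is hypothetical, and it assumes the plain-text `docker info` layout shown above), the value can be pulled out of captured output like this:

```python
import re
from typing import Optional


def cgroup_version(docker_info: str) -> Optional[int]:
    """Return the number from the 'Cgroup Version:' line, or None if the line is absent."""
    match = re.search(r"^\s*Cgroup Version:\s*(\d+)\s*$", docker_info, re.MULTILINE)
    return int(match.group(1)) if match else None
```

The text passed in would typically be the captured standard output of `docker info` run inside the dev container.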
@KieranJeffreySmart KieranJeffreySmart added the kind/bug Categorizes issue or PR as related to a bug. label Sep 26, 2023
@BenTheElder
Member

I am trying to create a kind cluster in a vscode devcontainer. I am working on windows with docker desktop and have been using a docker inside docker template.

We don't recommend this and it may be a bug in the docker in docker environment.

Please avoid adding additional nesting, it's a real headache to debug.

@mboutet

mboutet commented Oct 6, 2023

@BenTheElder, I see that you have given this reply on a lot of similar issues lately, but I just want to say that using kind within an already containerized environment is a totally acceptable use case. Two important use cases:

  • Dev containers. My team extensively leverages this technology to ensure repeatable dev environments that are uniform across all developers. In my team, we develop K8s operators in dev containers using kind.
  • Containerized CI runners. It is really common to leverage containerized ephemeral runners to ensure repeatability of CI jobs, as well as to scale those runners horizontally on K8s with ease. In my team, we run integration tests in our CI using kind for our in-house operators.

I understand that this adds complexity on your end and makes debugging more difficult, but I just want to make sure you're aware of the valid use cases of running kind in containerized environments. Those use cases won't go away.

@KieranJeffreySmart, see #3283 (comment). TL;DR, you likely need to enable cgroup v2 on the VM on which Docker runs for kind v0.20+ to work properly.

@BenTheElder
Member

BenTheElder commented Oct 12, 2023

I'm aware of the use cases, but we have limited bandwidth to provide support and it's available as a static Go binary; you can't containerize docker itself either.

We'll happily review proposed fixes from contributors but I just cannot justify spending my own time debugging this versus steering people towards more debuggable alternatives.

Kind is already running containers in containers, which is unfortunately insecure and error-prone but similarly useful. I highly recommend not doing this again with another layer.

See also #303 for additional footguns running nested inside of another Kubernetes cluster.

@BenTheElder
Member

For Windows specifically, see #1529: nobody has contributed work on CI for Windows.
aojea and I don't use Windows for development, so we depend on community contributions to keep the WSL2 docs up to date and to identify fixes for us to review, or sometimes implement, without being able to verify them directly ourselves.

... let alone adding container nesting on Windows.

@dboreham

dboreham commented Jan 7, 2024

... let alone adding container nesting on Windows.

Quick note for the audience with no Windows exposure: containers/docker on Windows (except for actual Windows containers which nobody uses) runs in a Linux kernel and for the most part behaves the same as if it were running on a bare metal Linux box. Although it's convenient, you don't need to run Docker Desktop on Windows -- regular Linux docker, or podman will work fine inside WSL2. Therefore the issues with nesting containers are essentially the same as for stock Linux.

@BenTheElder
Member

BenTheElder commented Jan 8, 2024

Therefore the issues with nesting containers are essentially the same as for stock Linux.

We tell people to avoid running kind in docker-in-docker on Linux. It's generally not necessary (it's no more secure than just passing the host dockerd socket, and more effort) and creates a lot of additional problems. There are some use cases where it makes sense, but adding another layer of nested containers is very "here be dragons".

@BenTheElder
Member

Also the environment in WSL2 is different from Linux run elsewhere, e.g. it often has a custom init system, and we don't have easy access to reproduce and debug (or the time / inclination really, there's so much to do and OSS developers could use Linux and we don't use Windows ourselves, nor is it really supported for developing Kubernetes/Kubernetes https://kind.sigs.k8s.io/docs/contributing/project-scope/)

(Difference in the init, Kernel => different cgroups management => impact on containers)

@dboreham

dboreham commented Jan 8, 2024

init is out of scope here though, since we're running inside a container.

btw it turns out nested kind works just fine now, provided the container has the necessary secret sauce. The stock docker:dind container is an example of such a thing, albeit Alpine so...not for everyone. There is an Ubuntu equivalent image that also works: https://github.com/cruizba/ubuntu-dind

You can start that container, install kind (or k3d) and create a cluster. It can be used as an existence proof from which to generate your own image for CI and so on.

@BenTheElder
Member

init is out of scope here though, since we're running inside a container.

It's not, the init is responsible for setting up cgroups amongst other things and we're sharing that along with the rest of the kernel from the host since we're using containers instead of VMs. Privileged containers like kind nodes are "leakier" than normal containers but all containers are influenced by the host's init.

@dboreham

dboreham commented Jan 8, 2024

Well, I've tested stock WSL2 on x86 and it works. I'll try arm64 and report back...

@dboreham

dboreham commented Jan 8, 2024

Well, I've tested stock WSL2 on x86 and it works. I'll try arm64 and report back...

Reporting back: ARM WSL2 doesn't work :(

@dboreham

Fwiw, the original problem reported above, where kind delete cluster followed by kind create cluster fails when running in dind, also occurs on regular x64 Ubuntu, so it is unrelated to WSL2.

@BenTheElder
Member

This sort of problem is likely eliminated on hosts with cgroup v2 and cgroupns. cgroup v1 is being moved into maintenance mode by Kubernetes (kubernetes/enhancements#4572) and will soon be deprecated by various ecosystem projects (like OCI, systemd).

On cgroup v1 we started forcing cgroupns=private on kind nodes which may help with some of these problems.
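
To tell which case a given host or container falls into, one common check is whether the cgroup mount exposes a `cgroup.controllers` file at its root, which is present on the cgroup v2 unified hierarchy and absent on v1. A minimal sketch, assuming the conventional `/sys/fs/cgroup` mount point (the function name `is_cgroup_v2` is hypothetical):

```python
from pathlib import Path


def is_cgroup_v2(root: Path = Path("/sys/fs/cgroup")) -> bool:
    """Return True if the cgroup root looks like a cgroup v2 unified hierarchy."""
    # cgroup v2 exposes cgroup.controllers at the root of the mount;
    # a cgroup v1 layout has per-controller directories instead.
    return (root / "cgroup.controllers").is_file()
```

On a hybrid setup the v2 hierarchy may be mounted elsewhere, so a False result here only says that the given path is not a pure v2 mount.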

@dkirrane

dkirrane commented Dec 6, 2024

@BenTheElder I spent hours looking at this. Thanks for the hint.
I set this in %UserProfile%\.wslconfig and restarted WSL2.

[wsl2]
kernelCommandLine = cgroup_no_v1=all

My kind devcontainer then worked!
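
For anyone applying the same setting from a setup script, here is a hedged sketch using Python's standard `configparser`. The section and key names come from the comment above; the function name `set_kernel_command_line` is hypothetical, and the path to `.wslconfig` is left for the caller to supply. One caveat: `configparser` does not preserve comments when it rewrites a file, so an existing `.wslconfig` with comments would lose them.

```python
import configparser
from pathlib import Path


def set_kernel_command_line(path: Path, value: str = "cgroup_no_v1=all") -> None:
    """Set kernelCommandLine under [wsl2] in the given config file, keeping other keys."""
    parser = configparser.ConfigParser(interpolation=None)
    # Keep key names case-sensitive; the default would lowercase kernelCommandLine.
    parser.optionxform = str
    if path.exists():
        parser.read(path, encoding="utf-8")
    if not parser.has_section("wsl2"):
        parser.add_section("wsl2")
    parser.set("wsl2", "kernelCommandLine", value)
    with path.open("w", encoding="utf-8") as handle:
        parser.write(handle)
```

WSL2 still has to be restarted afterwards for the change to take effect, as described in the comment above.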

@BenTheElder
Member

awesome! maybe we should add this as a hint to https://kind.sigs.k8s.io/docs/user/using-wsl2/
