Rancher-Desktop [Alpine] can't create cluster with v0.20.0 [Previously Also Colima] #3277

Closed
pmalek opened this issue Jun 15, 2023 · 73 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@pmalek

pmalek commented Jun 15, 2023

What happened:

After updating to v0.20.0 I cannot create a cluster anymore.

I'm using a Mac with colima.

Creating cluster "colima" ...
 βœ“ Ensuring node image (kindest/node:v1.27.2) πŸ–Ό
 βœ— Preparing nodes πŸ“¦
Deleted nodes: ["colima-control-plane"]
ERROR: failed to create cluster: command "docker run --name colima-control-plane --hostname colima-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER --detach --tty --label io.x-k8s.kind.cluster=colima --net kind --restart=on-failure:1 --init=false --cgroupns=private --publish=127.0.0.1:52490:6443/TCP -e KUBECONFIG=/etc/kubernetes/admin.conf kindest/node:v1.27.2@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72" failed with error: exit status 125
Command Output: 3236752928bc442ebdaf6bd3b6b164643987d45b1a120ec3cd20ca14cc7f5dd7
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "cgroup" to rootfs at "/sys/fs/cgroup": mount cgroup:/sys/fs/cgroup/openrc (via /proc/self/fd/7), flags: 0xe, data: openrc: invalid argument: unknown.

What you expected to happen:

No error; the cluster is created successfully.

How to reproduce it (as minimally and precisely as possible):

  1. Try to create a cluster with kind v0.20.0 (rough commands sketched below)
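
For completeness, a rough sketch of the commands involved on my setup (colima defaults; the cluster name is just my local choice):

    $ colima start                      # docker runtime is the default
    $ kind create cluster --name colima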

Environment:

  • kind version: (use kind version): v0.20.0

  • Runtime info: (use docker info or podman info):

    Client: Docker Engine - Community
     Version:    24.0.2
     Context:    default
     Debug Mode: false
     Plugins:
      buildx: Docker Buildx (Docker Inc.)
        Version:  v0.10.5
        Path:     /usr/local/lib/docker/cli-plugins/docker-buildx
      compose: Docker Compose (Docker Inc.)
        Version:  v2.18.1
        Path:     /usr/local/lib/docker/cli-plugins/docker-compose
      dev: Docker Dev Environments (Docker Inc.)
        Version:  v0.1.0
        Path:     /usr/local/lib/docker/cli-plugins/docker-dev
      extension: Manages Docker extensions (Docker Inc.)
        Version:  v0.2.19
        Path:     /usr/local/lib/docker/cli-plugins/docker-extension
      init: Creates Docker-related starter files for your project (Docker Inc.)
        Version:  v0.1.0-beta.4
        Path:     /usr/local/lib/docker/cli-plugins/docker-init
      sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
        Version:  0.6.0
        Path:     /usr/local/lib/docker/cli-plugins/docker-sbom
      scan: Docker Scan (Docker Inc.)
        Version:  v0.26.0
        Path:     /usr/local/lib/docker/cli-plugins/docker-scan
      scout: Command line tool for Docker Scout (Docker Inc.)
        Version:  v0.12.0
        Path:     /usr/local/lib/docker/cli-plugins/docker-scout
    
    Server:
     Containers: 0
      Running: 0
      Paused: 0
      Stopped: 0
     Images: 1
     Server Version: 23.0.6
     Storage Driver: overlay2
      Backing Filesystem: extfs
      Supports d_type: true
      Using metacopy: false
      Native Overlay Diff: true
      userxattr: false
     Logging Driver: json-file
     Cgroup Driver: cgroupfs
     Cgroup Version: 1
     Plugins:
      Volume: local
      Network: bridge host ipvlan macvlan null overlay
      Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
     Swarm: inactive
     Runtimes: io.containerd.runc.v2 runc
     Default Runtime: runc
     Init Binary: docker-init
     containerd version: 1fbd70374134b891f97ce19c70b6e50c7b9f4e0d
     runc version: 860f061b76bb4fc671f0f9e900f7d80ff93d4eb7
     init version: 
     Security Options:
      seccomp
       Profile: builtin
     Kernel Version: 6.1.29-0-virt
     Operating System: Alpine Linux v3.18
     OSType: linux
     Architecture: aarch64
     CPUs: 6
     Total Memory: 7.754GiB
     Name: colima
     ID: c67ab9db-07cd-4788-8cbe-b016d3bead80
     Docker Root Dir: /var/lib/docker
     Debug Mode: false
     Username: patrykmalekkonghq
     Experimental: false
     Insecure Registries:
      127.0.0.0/8
     Live Restore Enabled: false
    
  • OS (e.g. from /etc/os-release): macOS with a colima VM. /etc/os-release from within the VM that hosts the docker daemon:

    cat /etc/os-release
    NAME="Alpine Linux"
    ID=alpine
    VERSION_ID=3.18.0
    PRETTY_NAME="Alpine Linux v3.18"
    HOME_URL="https://alpinelinux.org/"
    BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"
    BUILD_ID=""
    VARIANT_ID="clm"
    
@pmalek pmalek added the kind/bug Categorizes issue or PR as related to a bug. label Jun 15, 2023
@aojea
Contributor

aojea commented Jun 15, 2023

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "cgroup" to rootfs at "/sys/fs/cgroup": mount cgroup:/sys/fs/cgroup/openrc (via /proc/self/fd/7), flags: 0xe, data: openrc: invalid argument: unknown.

@BenTheElder @AkihiroSuda ^^^

@BenTheElder
Member

BenTheElder commented Jun 15, 2023

EDIT: updating this early comment to note that Colima is fixed via #3277 (comment); just upgrade to colima v0.6.0.


This is an issue with the host environment, presumably with --cgroupns=private.

colima is @abiosoft's project.

@BenTheElder BenTheElder changed the title Can't create cluster with v0.20.0 Colima can't create cluster with v0.20.0 Jun 15, 2023
@BenTheElder
Member

I still don't recommend Alpine / OpenRC for container hosts versus essentially any distro with systemd.

It's unfortunate that we can't even start the container with these options.

You could probably work around this more immediately by using lima with an Ubuntu guest VM.

@wzshiming
Member

wzshiming commented Jun 15, 2023

Oh, I'm having the same problem. My environment is a GitHub Actions job that uses colima to start docker on a macOS runner.

https://github.com/kubernetes-sigs/kwok/actions/runs/5279627795/jobs/9551621894?pr=654#step:14:95

@pmalek
Author

pmalek commented Jun 15, 2023

@BenTheElder I've tried with the Ubuntu layer (colima has a --layer flag for it) and I'm getting this:

$ colima ssh cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=23.04
DISTRIB_CODENAME=lunar
DISTRIB_DESCRIPTION="Ubuntu 23.04"
$colima ssh -- uname -a
Linux colima 6.1.29-0-virt #1-Alpine SMP Wed, 17 May 2023 14:22:15 +0000 aarch64 aarch64 aarch64 GNU/Linux
$ docker run --name colima-control-plane --hostname colima-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER --detach --tty --label io.x-k8s.kind.cluster=colima --net kind --restart=on-failure:1 --init=false --cgroupns=private --publish=127.0.0.1:54688:6443/TCP -e KUBECONFIG=/etc/kubernetes/admin.conf kindest/node:v1.27.2@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
9cc1f3da207bb97b37630eb842cc5137ac52c714ff20b6fecfc1e824e5d0d0b6
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "cgroup" to rootfs at "/sys/fs/cgroup": mount cgroup:/sys/fs/cgroup/openrc (via /proc/self/fd/7), flags: 0xe, data: openrc: invalid argument: unknown.
$ docker info
Client:
 Version:    24.0.2
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.5
    Path:     /usr/local/lib/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.18.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-compose
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.19
    Path:     /usr/local/lib/docker/cli-plugins/docker-extension
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.4
    Path:     /usr/local/lib/docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-sbom
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-scan
  scout: Command line tool for Docker Scout (Docker Inc.)
    Version:  v0.12.0
    Path:     /usr/local/lib/docker/cli-plugins/docker-scout

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 1
 Server Version: 23.0.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1fbd70374134b891f97ce19c70b6e50c7b9f4e0d
 runc version: 860f061b76bb4fc671f0f9e900f7d80ff93d4eb7
 init version: 
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 6.1.29-0-virt
 Operating System: Alpine Linux v3.18
 OSType: linux
 Architecture: aarch64
 CPUs: 6
 Total Memory: 7.754GiB
 Name: colima
 ID: b3c96bfd-b99b-44bc-b950-9b9109012530
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: USER
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

These are the cgroup mounts inside the VM:

mount | grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,size=4096k,nr_inodes=1024,mode=755,inode64)
openrc on /sys/fs/cgroup/openrc type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc)
cpuset on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cpu on /sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)
cpuacct on /sys/fs/cgroup/cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct)
blkio on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
memory on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
devices on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
freezer on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
net_cls on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
perf_event on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
net_prio on /sys/fs/cgroup/net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio)
hugetlb on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pids on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup_root on /host/sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755,inode64)
openrc on /host/sys/fs/cgroup/openrc type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc)
none on /host/sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cpuset on /host/sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cpu on /host/sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)
cpuacct on /host/sys/fs/cgroup/cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct)
blkio on /host/sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
memory on /host/sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
devices on /host/sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
freezer on /host/sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
net_cls on /host/sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
perf_event on /host/sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
net_prio on /host/sys/fs/cgroup/net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio)
hugetlb on /host/sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pids on /host/sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
tmpfs on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,relatime,size=4096k,nr_inodes=1024,mode=755,inode64)
openrc on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/openrc type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc)
cpuset on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cpu on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)
cpuacct on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct)
blkio on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
memory on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
devices on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
freezer on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
net_cls on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
perf_event on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
net_prio on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio)
hugetlb on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pids on /host/run/containerd/io.containerd.runtime.v2.task/colima/2b274e7b947011e0f0513278d0245b6644c1760edc6cd81af8a72f172b2c4652/rootfs/sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
none on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
tmpfs on /sys/fs/cgroup/systemd type tmpfs (rw,nosuid,nodev,noexec,relatime,inode64)

@BenTheElder
Member

BenTheElder commented Jun 15, 2023

uname is still showing the Alpine kernel, and openrc is still showing up even though Ubuntu doesn't use it. I don't think that flag is changing the guest VM.

@BenTheElder BenTheElder added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jun 15, 2023
@BenTheElder
Member

From the lima FAQ I think it only provides an Ubuntu userspace environment and doesn't allow customizing the underlying Guest OS / kernel / ...
https://github.com/abiosoft/colima/blob/main/docs/FAQ.md#is-another-distro-supported

So I think colima will unfortunately always be Alpine / OpenRC and subject to bugs like this.

See also past discussion abiosoft/colima#291 (comment) abiosoft/colima#163 ...

I think https://github.com/lima-vm/lima/blob/master/examples/docker-rootful.yaml would be an Ubuntu + typical docker host env on lima.
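
A rough sketch of that path, assuming lima is installed (the instance name defaults to the template name, and the DOCKER_HOST pattern follows lima's docker templates; adjust if yours differs):

    $ limactl start template://docker-rootful
    $ export DOCKER_HOST=$(limactl list docker-rootful --format 'unix://{{.Dir}}/sock/docker.sock')
    $ kind create cluster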

@BenTheElder
Member

BenTheElder commented Jun 15, 2023

I'd also strongly recommend moving to a guest environment that uses cgroup v2 sooner than later, as the ecosystem is poised to drop v1 (I'd guess in the next year or so) and we can't do much about that.

Ubuntu, Debian, Docker desktop, Fedora, ... most linux environments have switched for some time now.

If we can't get this resolved with some patch to colima to enable working cgroupns=private containers, we can consider reverting to not require cgroupns=private, but that adds back a third, much more broken cgroups nesting environment (cgroup v1 with host cgroupns) that we'd otherwise planned to phase out, now that docker has supported cgroupns=private for a few years (podman likewise, and it's the default on cgroups v2).
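
A quick way to check whether a given VM can even start a privileged container with a private cgroup namespace, independent of kind (a hedged probe, not kind's own check; on the affected Alpine/OpenRC VMs it presumably fails with the same runc/openrc mount error, while on systemd hosts it prints the container's namespaced cgroup paths):

    $ docker run --rm --privileged --cgroupns=private busybox cat /proc/self/cgroup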

@AkihiroSuda
Member

From the lima FAQ I think it only provides an Ubuntu userspace environment and doesn't allow customizing the underlying Guest OS / kernel / ...

typo: s/lima/colima/ πŸ™‚

as the ecosystem is poised to drop v1 (I'd guess in the next year or so)

The ecosystem of runc, containerd, etc. isn't likely to drop v1 before 2029 (EL8 EOL).

@BenTheElder
Member

BenTheElder commented Jun 16, 2023

typo: s/lima/colima/ πŸ™‚

sorry, yes!

same comment suggests lima with ubuntu / docker guest πŸ˜…

The ecosystem of runc, containerd, etc. isn't likely to drop v1 before 2029 (EL8 EOL).

Kubernetes has been discussing it already, and I believe systemd has as well, but it's good to know some of the others won't. 😅

@AkihiroSuda
Member

Kubernetes has been discussing it already

Is there a KEP?

@ryancurrah

We also have a lot of DNS issues with Lima due to its use of Alpine. I really wish they would move away from a musl-based operating system.

@afbjorklund
Contributor

afbjorklund commented Jun 17, 2023

We also have a lot of DNS issues with Lima due to its use of Alpine. I really wish they would move away from a musl-based operating system.

Lima defaults to Ubuntu...

limactl start template://docker

Using Alpine is a choice made downstream, mostly for size reasons. I don't know of an apk distro using systemd/glibc instead of openrc/musl, but I suppose it is possible (or maybe use Debian, which is also small).

@pmalek
Author

pmalek commented Jun 17, 2023

I remember spending a lot of hours with lima due to network issues.

For instance, trying to figure out if I can use lima now instead of colima: I create the VM from one of the examples that contain docker (https://github.com/lima-vm/lima/tree/master/examples) or via the above-mentioned limactl start template://docker.

This works, and I can create a kind cluster when the docker socket is forwarded to the host.

For full context: I use metallb for LoadBalancer services (with some custom route and iptables commands so that host traffic is forwarded to the VM and then to kind's node).

Now, I'm not sure why (I haven't found the place in the code that would explain the difference between lima and colima), but when I create the VM with colima and then create the kind cluster inside it, I can see the kind network created:

details...
# uname -a
Linux colima 6.1.29-0-virt #1-Alpine SMP Wed, 17 May 2023 14:22:15 +0000 aarch64 aarch64 aarch64 GNU/Linux
$ docker inspect kind
[
    {
        "Name": "kind",
        "Id": "58c6efc261888b451fbf9bfbf0c53da9bd4f6bb48c74a45f8ffdfa56946da376",
        "Created": "2023-06-17T10:50:00.781737055Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": true,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                },
                {
                    "Subnet": "fc00:f853:ccd:e793::/64",
                    "Gateway": "fc00:f853:ccd:e793::1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "7d7ac41ea6b906f18b4fd2fcc49caed4c541abc30012094718ab3e1886d9c8f9": {
                "Name": "test-control-plane",
                "EndpointID": "9b603f5e6fcd776515e6eacafb2a87c9cafd0d3e81d73a28d7497283833c11cf",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": "fc00:f853:ccd:e793::2/64"
            }
        },
        "Options": {
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]

and the underlying network interface br-58c6efc26188 using the 172.18.0.1/16 network (this can then be used by metallb to allocate IPs, and I'll get traffic routed to the desired service; a rough sketch of the host-side route follows the output below):

ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:55:55:38:aa:84 brd ff:ff:ff:ff:ff:ff
    inet 192.168.5.15/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5055:55ff:fe38:aa84/64 scope link
       valid_lft forever preferred_lft forever
3: col0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:55:55:e7:7d:6d brd ff:ff:ff:ff:ff:ff
    inet 192.168.106.2/24 scope global col0
       valid_lft forever preferred_lft forever
    inet6 fd63:1468:4f87:231a:5055:55ff:fee7:7d6d/64 scope global dynamic flags 100
       valid_lft 2590839sec preferred_lft 603639sec
    inet6 fe80::5055:55ff:fee7:7d6d/64 scope link
       valid_lft forever preferred_lft forever
4: br-58c6efc26188: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:37:28:dd:56 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global br-58c6efc26188
       valid_lft forever preferred_lft forever
    inet6 fc00:f853:ccd:e793::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::42:37ff:fe28:dd56/64 scope link
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:41:5a:79:67 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
7: veth471fc84@if6: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue master br-58c6efc26188 state UP
    link/ether 6e:e3:f6:39:c8:05 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::6ce3:f6ff:fe39:c805/64 scope link
       valid_lft forever preferred_lft forever
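
(Aside: the host-side route mentioned above is roughly the following on macOS, pointing kind's subnet at the VM's col0 address shown above; the iptables part inside the VM is omitted, and setups vary:)

    $ sudo route -n add -net 172.18.0.0/16 192.168.106.2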

With lima I don't get that interface even though the kind network is created exactly the same way:

details...
$ uname -a   # I've tried with Ubuntu 23.04 using kernel 6.2 as well, with the same result
Linux lima-docker 5.15.0-72-generic #79-Ubuntu SMP Tue Apr 18 16:53:43 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
$ docker inspect kind
[
    {
        "Name": "kind",
        "Id": "199d499b093a18902d1cba537d7a30f6f83fbd9d3bf6c79f07b25a72c6d1d969",
        "Created": "2023-06-17T12:07:55.999693706Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": true,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                },
                {
                    "Subnet": "fc00:f853:ccd:e793::/64"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "a1b869e75ea64adc53e59195c2f773f6fb08c2dee7cb01ce9e7981a76476a1fa": {
                "Name": "kong-test-control-plane",
                "EndpointID": "bc061de959a24e50bb8abbeac116b0f55472f8b682f37de9f19f688cff67e695",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": "fc00:f853:ccd:e793::2/64"
            }
        },
        "Options": {
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:55:55:9a:a5:90 brd ff:ff:ff:ff:ff:ff
    altname enp0s2
    inet 192.168.5.15/24 metric 100 brd 192.168.5.255 scope global dynamic eth0
       valid_lft 85593sec preferred_lft 85593sec
    inet6 fec0::5055:55ff:fe9a:a590/64 scope site dynamic mngtmpaddr noprefixroute
       valid_lft 86322sec preferred_lft 14322sec
    inet6 fe80::5055:55ff:fe9a:a590/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:2f:ab:dc:63 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

This way I can't get traffic into the cluster using the 172.18.0.1 network.

EDIT: the reason for this is most likely docker in the lima Ubuntu VM using cgroup v2, which causes the kind network to land in a separate net namespace (but that's a guess). Not sure how I could then make the traffic get routed into kind's network (and then its container).

$ sudo lsns --type=net
        NS TYPE NPROCS   PID USER      NETNSID NSFS COMMAND
4026531840 net     118     1 root   unassigned      /sbin/init
4026532237 net      12  3820 lima   unassigned      /proc/self/exe --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=bu
4026532314 net      30  4404 lima   unassigned      /sbin/init
4026532406 net       1  5492 lima   unassigned      registry serve /etc/docker/registry/config.yml
4026532472 net       1  5628 lima   unassigned      registry serve /etc/docker/registry/config.yml
4026532543 net       2  6176 165534 unassigned      /pause
4026532602 net       2  6144 165534 unassigned      /pause
4026532665 net       2  6216 165533 unassigned      /pause
4026532724 net       2  6215 165534 unassigned      /pause
$ sudo nsenter -n --target 3820 ip a s br-ae7cbfeb3d9b
4: br-ae7cbfeb3d9b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:e8:51:b5:1f brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global br-ae7cbfeb3d9b
       valid_lft forever preferred_lft forever
    inet6 fc00:f853:ccd:e793::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::42:e8ff:fe51:b51f/64 scope link
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever

@pmalek
Author

pmalek commented Jun 17, 2023

As for the issue at hand:

I understand that with #3241 the ship might have already sailed, but perhaps we could still consider using the provider info Cgroup2 field and setting the --cgroupns flag only when cgroup v2 is available?
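
(For reference, the distinction such a check would key off is already visible from docker itself; a hedged sketch using the CgroupVersion/CgroupDriver fields of docker info:)

    $ docker info --format '{{.CgroupVersion}} ({{.CgroupDriver}})'
    1 (cgroupfs)    # what the failing colima/Alpine VMs above report; typical systemd hosts report 2 (systemd)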

@matteosilv

The same error happens with Rancher Desktop, which uses lima under the hood.

@marcofranssen

Experiencing the same on Rancher Desktop. Downgrading to kind 0.19.0 fixes the issue for now.

Would be great to get a fix for 0.20.0.

The issue I see on Rancher Desktop using Kind 0.20.0 is the following:

$ kind create cluster --name test-cluster --image kindest/node:v1.27.3
Boostrapping cluster…
Creating cluster "test-cluster" ...
 βœ“ Ensuring node image (kindest/node:v1.27.3) πŸ–Ό
 βœ— Preparing nodes πŸ“¦  
Deleted nodes: ["eks-cluster-control-plane"]
ERROR: failed to create cluster: command "docker run --name test-cluster-control-plane --hostname test-cluster-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER --detach --tty --label io.x-k8s.kind.cluster=test-cluster --net kind --restart=on-failure:1 --init=false --cgroupns=private --publish=127.0.0.1:50566:6443/TCP -e KUBECONFIG=/etc/kubernetes/admin.conf kindest/node:v1.27.3" failed with error: exit status 125
Command Output: 82623b67d511c7e10ed075323e621ec66befa9047e3c7b56647ca99fd78e0db6
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "cgroup" to rootfs at "/sys/fs/cgroup": mount cgroup:/sys/fs/cgroup/openrc (via /proc/self/fd/7), flags: 0xe, data: openrc: invalid argument: unknown.

@BenTheElder
Member

Inability to create a container with this docker 20.10.0 feature from 2020-12-08 is still considered a bug in colima / rancher desktop. I'd like to hear a response from those projects before we revert anything. Ensuring private cgroupns is a big benefit for the project.

@BenTheElder
Member

I understand that with #3241 the ship might have already sailed, but perhaps we could still consider using the provider info Cgroup2 field and setting the --cgroupns flag only when cgroup v2 is available?

The point of setting this flag is to ensure that this is set on cgroupv1 hosts. cgroupv2 hosts already default to this.

cgroupv1 hosts are the problem. On hosts other than alpine/colima/rancher desktop this works great. Alpine and colima / rancher desktop use an unusual init system that doesn't seem to set this up properly.

@acuteaura

acuteaura commented Jul 18, 2023

the reason for this is most likely docker in the lima Ubuntu VM using cgroup v2, which causes the kind network to land in a separate net namespace (but that's a guess).

You may have some eBPF component in the path (attached to cgroup2) which, without unsharing the cgroup namespace, will attach bits to your host namespace that were meant to go on the nodes, thus creating incidental routability. I had a similar issue forwarding ports in kind with Cilium.

@williamokano-dh

Yeah, same issue here. brew install doesn't support kind@0.19.0, so I had to install it via the Go toolchain. Running go install sigs.k8s.io/kind@v0.19.0 seems to have temporarily fixed the issue.
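
(If you'd rather not build from source, grabbing the release binary also works; URL pattern per the kind quick-start docs, adjust OS/arch as needed:)

    $ curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.19.0/kind-darwin-arm64
    $ chmod +x ./kind && sudo mv ./kind /usr/local/bin/kind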

@marcofranssen

Yup did same.

@BenTheElder BenTheElder pinned this issue Sep 7, 2023
ack-prow bot pushed a commit to aws-controllers-k8s/test-infra that referenced this issue Sep 26, 2023
Addresses: aws-controllers-k8s/community#1903

Description of changes:
- Bump kind version to `0.19.0` [avoiding `0.20.0` for now - [multiple users reported a bug when using in dind ](kubernetes-sigs/kind#3277)]
- Bump `k8s` to `1.28.0`
- Rebuild and publish a new integration image (containing the `kind` binary)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
@abiosoft

Colima v0.6.0 supports kind: https://github.com/abiosoft/colima/releases/tag/v0.6.0

@BenTheElder BenTheElder changed the title Colima, Rancher-Desktop [Alpine] can't create cluster with v0.20.0 Rancher-Desktop [Alpine] can't create cluster with v0.20.0 [Previously Also Colima] Nov 13, 2023
@BenTheElder
Member

Thanks @abiosoft!

@marcofranssen

@abiosoft does this mean it now also works with latest Rancher Desktop?

@jandubois

@marcofranssen No, it does not. colima switched from Alpine to Ubuntu to avoid the issue, but Rancher Desktop still uses Alpine.

The best you can do on Rancher Desktop right now is to use k3d instead of kind. It should provide very similar functionality, but uses k3s instead of kubeadm internally.
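
(A rough multi-node equivalent with k3d, assuming k3d is installed; the cluster name and node counts are just examples:)

    $ k3d cluster create test --servers 1 --agents 2
    $ kubectl get nodes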

@AkihiroSuda
Member

The best you can do on Rancher Desktop right now is to use k3d instead of kind. It should provide very similar functionality, but uses k3s instead of kubeadm internally.

Off-topic question, but why not use Rancher Desktop's Kubernetes? πŸ˜„
What is missing in Rancher Desktop's Kubernetes? (Setting custom feature gates, etc.?)

@jandubois

Off-topic question, but why not use Rancher Desktop's Kubernetes? πŸ˜„

For me the only reason to use k3d is when I want to have a multi-node cluster to play around with pod placement strategies like taints and affinity, to make sure the manifests work as expected.

Eventually there should be a config setting in Rancher Desktop to allow multiple nodes. Personally I've also wanted a mixed-architecture cluster with both amd64 and arm64 nodes, but that is more for fun than actual need... πŸ˜„

@BenTheElder
Member

Multi-node is one of the common reasons I see versus the bundled k8s in containers-in-a-VM solutions; the other is more control over the k8s version used.

@jandubois

the other is more control over the k8s version used.

You can pick any k8s (k3s) version you want in Rancher Desktop and you can also upgrade to any new version and see how it affects your deployed workloads:

[screenshot: Rancher Desktop's Kubernetes version selector]

I'm not actually sure if versions prior to 1.19 still work properly, but all the more recent releases should be fully functional.

@mattfarina

To add one more data point to the issues with Alpine (under Rancher Desktop), this is the output that I get from kind after it fails to work...

INFO: ensuring we can execute mount/umount even with userns-remap
INFO: remounting /sys read-only
INFO: making mounts shared
INFO: detected cgroup v1
INFO: detected cgroupns
INFO: clearing and regenerating /etc/machine-id
Initializing machine ID from random generator.
INFO: faking /sys/class/dmi/id/product_name to be "kind"
INFO: faking /sys/class/dmi/id/product_uuid to be random
INFO: faking /sys/devices/virtual/dmi/id/product_uuid as well
INFO: setting iptables to detected mode: legacy
INFO: detected IPv4 address: 172.18.0.2
INFO: detected IPv6 address: fc00:f853:ccd:e793::2
INFO: starting init
Inserted module 'autofs4'
Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
[!!!!!!] Failed to mount API filesystems.
Exiting PID 1...

@BenTheElder
Member

BenTheElder commented Nov 28, 2023

Right, there's discussion of this above. We should have permission to mount at /sys/fs/cgroup in this privileged container, so... something is odd/broken in that environment.

I can't run Rancher Desktop at work (VM policy), so I'd appreciate others that use Rancher Desktop debugging this issue.

@BenTheElder
Member

Er, and to clarify: we have code specifically to ensure things run smoothly on non-systemd hosts:

# workaround for hosts not running systemd

However, on these particular Alpine-based hosts we seem to be unable to make mounts, which doesn't make sense. With cgroupns enabled we get our own view of cgroups, and with privileged we should have permission to make mounts (see e.g. the remount of /sys read-only earlier in the logs). It's possible we can't make this mount in any environment and only receive it as a function of systemd being on the host elsewhere; this requires more root-cause debugging.

I still haven't had time to dig into this myself, currently focused on some follow-ups around https://kubernetes.io/blog/2023/08/31/legacy-package-repository-deprecation/, and this is somewhat outside of @aojea's usual wheelhouse.

In the meantime I recommend lima w/ ubuntu docker profile or colima as free alternatives to docker desktop that work with kind.

I would appreciate help in investigating this bug.

cgroupns will be the default on cgroups v2 hosts under all major container runtimes and is enabled for good reasons, so just reverting cgroupns in an attempt to unbreak Alpine isn't a very good option (note: Rancher Desktop is on v2 with cgroupns enabled by default now anyhow), but I'd love to see other suggested fixes or debugging work from anyone else invested in this support.

@jandubois

Just wanted to give a quick heads-up that the issue seems to be fixed by Alpine 3.19 (most likely due to the update to OpenRC 0.51+, which has fixed the "unified" cgroups layout):

$ kind create cluster
Creating cluster "kind" ...
 βœ“ Ensuring node image (kindest/node:v1.27.3) πŸ–Ό
 βœ“ Preparing nodes πŸ“¦
 βœ“ Writing configuration πŸ“œ
 βœ“ Starting control-plane πŸ•ΉοΈ
 βœ“ Installing CNI πŸ”Œ
 βœ“ Installing StorageClass πŸ’Ύ
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Have a nice day! πŸ‘‹

$ k get no
NAME                 STATUS     ROLES           AGE   VERSION
kind-control-plane   NotReady   control-plane   11s   v1.27.3

So this issue can probably be closed, unless you want to wait until a version of Rancher Desktop with Alpine 3.19 is out for verification. That is probably not going to happen until early March though.
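
(For anyone verifying on their own VM, a quick check of the cgroup layout: a single cgroup2 mount on /sys/fs/cgroup indicates the unified layout, versus the tmpfs plus per-controller v1 mounts shown earlier in this thread.)

    $ mount | grep cgroup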

@aojea
Contributor

aojea commented Jan 29, 2024

So this issue can probably be closed, unless you want to wait until a version of Rancher Desktop with Alpine 3.19

/close

Let's close it here; there is nothing else we can do, and you provided a solution.

@k8s-ci-robot
Contributor

@aojea: Closing this issue.

In response to this:

So this issue can probably be closed, unless you want to wait until a version of Rancher Desktop with Alpine 3.19

/close

Let's close it here; there is nothing else we can do, and you provided a solution.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@marcindulak
Contributor

This issue is closed, but there is still an open issue in Rancher Desktop. It's hidden in the collapsed comments, so linking it here again: rancher-sandbox/rancher-desktop#5092

@BenTheElder
Member

Circling back, we have reports of rancher desktop + kind v0.23 working in https://kubernetes.slack.com/archives/CEKK1KTN2/p1723583621985329?thread_ts=1723579586.749849&cid=CEKK1KTN2

FYI @jandubois πŸŽ‰

NOTE: you may still run into issues from https://kind.sigs.k8s.io/docs/user/known-issues/; in this case, with many clusters, tuning inotify limits was required: https://kind.sigs.k8s.io/docs/user/known-issues/#pod-errors-due-to-too-many-open-files
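
(Concretely, the tuning from that known-issues page is along these lines, run inside the VM:)

    $ sudo sysctl fs.inotify.max_user_watches=524288
    $ sudo sysctl fs.inotify.max_user_instances=512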

(it might? be reasonable to bump the defaults in rancher desktop πŸ˜…)

@BenTheElder BenTheElder unpinned this issue Aug 13, 2024