
Kind can't create clusters in F35 with Podman #2689

Closed
hhemied opened this issue Mar 23, 2022 · 31 comments
Labels
area/provider/podman (Issues or PRs related to podman)
kind/support (Categorizes issue or PR as a support question.)

Comments

@hhemied

hhemied commented Mar 23, 2022

What happened:

[root@fedora ~]# kind create cluster
enabling experimental podman provider
Creating cluster "kind" ...
 βœ“ Ensuring node image (kindest/node:v1.23.4) πŸ–Ό 
 βœ— Preparing nodes πŸ“¦  
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"

What you expected to happen:
kind could create a cluster.

How to reproduce it (as minimally and precisely as possible):

  • Install Fedora 35 Server edition
  • Install Podman 4.0.2
  • Run kind create cluster (see the command sketch below)
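
A minimal command sketch of those steps (the kind download URL follows kind's documented release pattern and the package name assumes the Fedora 35 repos; adjust as needed):

# on a fresh Fedora 35 Server install, as root
dnf install -y podman
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.12.0/kind-linux-amd64
chmod +x ./kind && mv ./kind /usr/local/bin/kind
kind create cluster    # picks the experimental podman provider when docker is absent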

Environment:

  • kind version: (use kind version):
kind v0.12.0 go1.17.8 linux/amd64
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:58:47Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info):
    I use podman
Client:       Podman Engine
Version:      4.0.2
API Version:  4.0.2
Go Version:   go1.16.14

Built:      Thu Mar 10 21:26:05 2022
OS/Arch:    linux/amd64
  • OS (e.g. from /etc/os-release):
Fedora release 35 (Thirty Five)
@hhemied added the kind/bug label (Categorizes issue or PR as related to a bug.) on Mar 23, 2022
@aojea
Contributor

aojea commented Mar 23, 2022

Can you check that you are using the image from the release notes?

kindest/node:v1.23.4@sha256:0e34f0d0fd448aa2f2819cfd74e99fe5793a6e4938b328f657c8e3f81ee0dfb9

@hhemied
Author

hhemied commented Mar 23, 2022

I did, and here is the output

[root@fedora ~]# kind create cluster --image kindest/node:v1.23.4@sha256:0e34f0d0fd448aa2f2819cfd74e99fe5793a6e4938b328f657c8e3f81ee0dfb9
enabling experimental podman provider
Creating cluster "kind" ...
 βœ“ Ensuring node image (kindest/node:v1.23.4) πŸ–Ό 
 βœ— Preparing nodes πŸ“¦  
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"

@aojea
Contributor

aojea commented Mar 23, 2022

Can you run it again with -v 7 and paste the output?

@hhemied
Author

hhemied commented Mar 23, 2022

[root@fedora ~]# kind create cluster --image kindest/node:v1.23.4@sha256:0e34f0d0fd448aa2f2819cfd74e99fe5793a6e4938b328f657c8e3f81ee0dfb9 -v 7
enabling experimental podman provider
Creating cluster "kind" ...
DEBUG: podman/images.go:58] Image: docker.io/kindest/node@sha256:0e34f0d0fd448aa2f2819cfd74e99fe5793a6e4938b328f657c8e3f81ee0dfb9 present locally
 βœ“ Ensuring node image (kindest/node:v1.23.4) πŸ–Ό 
 βœ— Preparing nodes πŸ“¦  
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"
Stack Trace: 
sigs.k8s.io/kind/pkg/errors.Errorf
	sigs.k8s.io/kind/pkg/errors/errors.go:41
sigs.k8s.io/kind/pkg/cluster/internal/providers/common.WaitUntilLogRegexpMatches
	sigs.k8s.io/kind/pkg/cluster/internal/providers/common/cgroups.go:84
sigs.k8s.io/kind/pkg/cluster/internal/providers/podman.createContainerWithWaitUntilSystemdReachesMultiUserSystem
	sigs.k8s.io/kind/pkg/cluster/internal/providers/podman/provision.go:378
sigs.k8s.io/kind/pkg/cluster/internal/providers/podman.planCreation.func2
	sigs.k8s.io/kind/pkg/cluster/internal/providers/podman/provision.go:101
sigs.k8s.io/kind/pkg/errors.UntilErrorConcurrent.func1
	sigs.k8s.io/kind/pkg/errors/concurrent.go:30
runtime.goexit
	runtime/asm_amd64.s:1581

@stmcginnis
Contributor

It looks like for the docker info output above you actually ran podman version. Can you run podman info and paste the output?

From the client version it looks like it was compiled for amd64, but I am wondering if you are running on arm64. There was a similar error reported recently in Slack: https://kubernetes.slack.com/archives/CEKK1KTN2/p1646907106345039?thread_ts=1646907067.117919&cid=CEKK1KTN2
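
A quick way to rule out an architecture mismatch (the podman info Go-template path is an assumption; the plain podman info output shows the same field):

uname -m                                  # host CPU architecture, e.g. x86_64 or aarch64
kind version                              # shows the platform kind was built for
podman info --format '{{.Host.Arch}}'     # architecture reported by the podman engine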

@hhemied
Author

hhemied commented Mar 23, 2022

You are right, my bad.

[root@fedora ~]# podman info
host:
  arch: amd64
  buildahVersion: 1.24.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpus: 6
  distribution:
    distribution: fedora
    version: "35"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.16.16-200.fc35.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 7363661824
  memTotal: 9196961792
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.4.3-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.3
      commit: 61c9600d1335127eba65632731e2d72bc3f0b9e8
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 1m 52.2s
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 6
    paused: 0
    running: 0
    stopped: 6
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 5
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.0.2
  Built: 1646943965
  BuiltTime: Thu Mar 10 21:26:05 2022
  GitCommit: ""
  GoVersion: go1.16.14
  OsArch: linux/amd64
  Version: 4.0.2

@aojea
Contributor

aojea commented Mar 23, 2022

OK, try this:

kind create cluster --retain --image kindest/node:v1.23.4@sha256:0e34f0d0fd448aa2f2819cfd74e99fe5793a6e4938b328f657c8e3f81ee0dfb9

and then run kind export logs so it exports the info to a folder; create a tarball and attach it here.
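
Spelled out as a command sequence (the log directory name is arbitrary):

kind create cluster --retain --image kindest/node:v1.23.4@sha256:0e34f0d0fd448aa2f2819cfd74e99fe5793a6e4938b328f657c8e3f81ee0dfb9
kind export logs ./kind-logs           # collects node serial/journal logs into the folder
tar czf kind-logs.tar.gz ./kind-logs
kind delete cluster                    # removes the retained node containers afterwards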

@hhemied
Author

hhemied commented Mar 23, 2022

Here is the logs folder
3808958841.zip

@aojea
Contributor

aojea commented Mar 23, 2022

I can see the log line, and I can't see a 30-second delay ... so why is it failing?
Something related to the stdin/stdout?
@hhemied, any "unusual" configuration or environment in your setup?

@AkihiroSuda , have you seen something similar?

@BenTheElder added the area/provider/podman label (Issues or PRs related to podman) on Mar 23, 2022
@hhemied
Author

hhemied commented Mar 23, 2022

Nothing suspicious; it is actually a clean install to test kind with Podman 4.
Additional info:

[root@fedora ~]# podman network ls
NETWORK ID    NAME        DRIVER
faed16303522  kind        bridge
2f259bab93aa  podman      bridge

@aojea
Contributor

aojea commented Mar 23, 2022

Does podman 3 work?

@dlipovetsky
Contributor

dlipovetsky commented Mar 23, 2022

I am also on Fedora 35, and am affected by the same issue. I was able to create a kind cluster yesterday, but this morning I updated my kernel to 5.16.16. This kernel version appears in the report above.

If I fall back to the 5.16.15 kernel, I no longer have this issue.

FWIW, I am using the docker provider, so I suspect this issue is related in some way to the kernel, not podman. I may be wrong here, because I see nothing related in either the Fedora 5.16.16 changelog, or the upstream 5.16.16 changelog.

Although I don't have time to investigate the root cause right now, I can open a docs PR.
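
For anyone who wants to test the kernel theory, a rough sketch of pinning the previous kernel on Fedora with grubby (the vmlinuz path is illustrative; use whatever 5.16.15 build is still installed):

sudo grubby --info=ALL | grep -E '^(kernel|title)'                # list installed kernels
sudo grubby --set-default /boot/vmlinuz-5.16.15-200.fc35.x86_64   # illustrative path
sudo reboot
uname -r                                                          # confirm the running kernel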

@hhemied
Author

hhemied commented Mar 24, 2022

Nope, it seems it is not connected to the kernel.
I have tested again with a fresh F35 install and skipped any updates.

  • Kernel
5.14.10-300.fc35.x86_64
  • Podman version
Version:      3.4.0
API Version:  3.4.0
Go Version:   go1.16.8
Built:        Thu Sep 30 21:32:16 2021
OS/Arch:      linux/amd64
  • kind version
kind v0.12.0 go1.17.8 linux/amd64

And I am still getting the same error:

[root@fedora ~]# kind create cluster
enabling experimental podman provider
Creating cluster "kind" ...
 βœ“ Ensuring node image (kindest/node:v1.23.4) πŸ–Ό 
 βœ— Preparing nodes πŸ“¦  
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"

@hhemied
Author

hhemied commented Mar 24, 2022

Here is the output if I remove Podman and install Docker

kind create cluster
Creating cluster "kind" ...
 βœ“ Ensuring node image (kindest/node:v1.23.4) πŸ–Ό 
 βœ“ Preparing nodes πŸ“¦  
 βœ“ Writing configuration πŸ“œ 
 βœ“ Starting control-plane πŸ•ΉοΈ 
 βœ“ Installing CNI πŸ”Œ 
 βœ“ Installing StorageClass πŸ’Ύ 
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Not sure what to do next? πŸ˜…  Check out https://kind.sigs.k8s.io/docs/user/quick-start/

@subnetmarco

subnetmarco commented Mar 24, 2022

Same problem happens on macOS Monterey (v12) with Apple M1:

$ kind create cluster
Creating cluster "kind" ...
 βœ“ Ensuring node image (kindest/node:v1.23.4) πŸ–Ό
 βœ— Preparing nodes πŸ“¦
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"

$ kind version
kind v0.12.0 go1.17.8 darwin/arm64
$ docker version
Client:
 Cloud integration: v1.0.22
 Version:           20.10.13
 API version:       1.41
 Go version:        go1.16.15
 Git commit:        a224086
 Built:             Thu Mar 10 14:08:43 2022
 OS/Arch:           darwin/arm64
 Context:           default
 Experimental:      true

Server: Docker Desktop 4.6.0 (75818)
 Engine:
  Version:          20.10.13
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.15
  Git commit:       906f57f
  Built:            Thu Mar 10 14:05:37 2022
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.5.10
  GitCommit:        2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc:
  Version:          1.0.3
  GitCommit:        v1.0.3-0-gf46b6ba
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:31:32Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.9-gke.1002", GitCommit:"f87f9d952767b966e72a4bd75afea25dea187bbf", GitTreeState:"clean", BuildDate:"2022-02-25T18:12:32Z", GoVersion:"go1.16.12b7", Compiler:"gc", Platform:"linux/amd64"}

@AkihiroSuda
Member

I am also on Fedora 35, and am affected by the same issue. I was able to create a kind cluster yesterday, but this morning I updated my kernel to 5.16.16. This kernel version appears in the report above.

If I fall back to the 5.16.15 kernel, I no longer have this issue.

FWIW, I am using the docker provider, so I suspect this issue is related in some way to the kernel, not podman. I may be wrong here, because I see nothing related in either the Fedora 5.16.16 changelog, or the upstream 5.16.16 changelog.

Although I don't have time to investigate the root cause right now, I can open a docs PR.

Could you try the latest kernel 5.16.17-200.fc35? I don't see any issue with this kernel.
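
On Fedora 35 that would be roughly (assuming the 5.16.17 build has already reached the updates repo):

sudo dnf upgrade --refresh kernel
sudo reboot
uname -r    # should then report 5.16.17-200.fc35.x86_64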

my podman info

[root@fedora ~]# KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster
using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
Creating cluster "kind" ...
 βœ“ Ensuring node image (kindest/node:v1.23.4) πŸ–Ό 
 βœ“ Preparing nodes πŸ“¦  
 βœ“ Writing configuration πŸ“œ 
 βœ“ Starting control-plane πŸ•ΉοΈ 
 βœ“ Installing CNI πŸ”Œ 
 βœ“ Installing StorageClass πŸ’Ύ 
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Not sure what to do next? πŸ˜…  Check out https://kind.sigs.k8s.io/docs/user/quick-start/

[root@fedora ~]# kind version
kind v0.12.0 go1.17.8 linux/amd64

[root@fedora ~]# podman info
host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.0-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpus: 2
  distribution:
    distribution: fedora
    variant: cloud
    version: "35"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.16.17-200.fc35.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 111788032
  memTotal: 4103704576
  ociRuntime:
    name: crun
    package: crun-1.4.3-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.3
      commit: 61c9600d1335127eba65632731e2d72bc3f0b9e8
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 4102287360
  swapTotal: 4103073792
  uptime: 4m 7.84s
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 1
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 1638999907
  BuiltTime: Wed Dec  8 21:45:07 2021
  GitCommit: ""
  GoVersion: go1.16.8
  OsArch: linux/amd64
  Version: 3.4.4

@dlipovetsky
Contributor

Could you try the latest kernel 5.16.17-200.fc35? I don't see any issue with this kernel.

I tried 5.16.18-200.fc35, and I have no issues.

It seems it might still affect other systems. I'm curious what the cause is. (I suppose that systemd isn't reaching the Multi-User System target?)
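
One way to check that theory against a node container retained with --retain (the container name assumes the default cluster name "kind"; the same commands work with docker instead of podman):

podman exec kind-control-plane systemctl is-active multi-user.target
podman exec kind-control-plane systemctl list-jobs                     # anything still queued?
podman exec kind-control-plane journalctl -b --no-pager | tail -n 50   # last boot messages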

@BenTheElder
Member

#2718 tentatively seems unrelated given the odd workaround there.

Is this still an issue on current fedora kernels?

@hhemied
Copy link
Author

hhemied commented May 8, 2022

Unfortunately, the issue still remains.
I tested with the latest kernel and latest version.

@BenTheElder
Member

Just noticed this is with xfs, do we detect and mount devmapper correctly?

func mountDevMapper() bool {

If you run create cluster with --retain it won't delete the container(s) on failure and we can inspect the node logs etc (kind export logs).
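
For example, with the failed node still around (container name assumes the default cluster name "kind"):

podman ps -a --filter name=kind-control-plane        # the retained node container
podman inspect kind-control-plane | grep -i mapper   # is /dev/mapper among the mounts?
kind export logs ./kind-logs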

@BenTheElder
Member

Just noticed this is with xfs, do we detect and mount devmapper correctly?

We appear to, based on the results from #2689 (comment): /dev/mapper shows up in the volumes in the container inspect.

"[ OK ] Reached target Multi-User System." is in the node logs and should have matched the regex πŸ˜•
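
A quick way to double-check that locally against the exported logs (the path assumes the default cluster name and the usual kind export logs layout):

grep -E 'Reached target .*Multi-User System.*|detected cgroup v1' ./kind-logs/kind-control-plane/serial.log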

@tty47

tty47 commented May 30, 2022

same problem here...

ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"

kind version:

kind v0.14.0 go1.18.2 darwin/arm64

docker version:

Client:
 Cloud integration: v1.0.24
 Version:           20.10.14
 API version:       1.41
 Go version:        go1.16.15
 Git commit:        a224086
 Built:             Thu Mar 24 01:49:20 2022
 OS/Arch:           darwin/arm64
 Context:           default
 Experimental:      true

Server: Docker Desktop 4.8.2 (79419)
 Engine:
  Version:          20.10.14
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.15
  Git commit:       87a90dc
  Built:            Thu Mar 24 01:45:44 2022
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.5.11
  GitCommit:        3df54a852345ae127d1fa3092b95168e4a88e2f8
 runc:
  Version:          1.0.3
  GitCommit:        v1.0.3-0-gf46b6ba
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

@aojea
Contributor

aojea commented May 30, 2022

@jrmanes this issue is about podman and Fedora, which seems to be a problem in the kernel.
The one you reported seems related to #2718, because you are using Docker on a Mac with ARM architecture; please check whether the environment variable is your problem, as described in the linked issue.

@tty47

tty47 commented May 30, 2022

hello @aojea
Thank you so much!
Checking the other one ;)

@hhemied
Author

hhemied commented Jun 19, 2022

The issue still exists.
In my current setup:

 ➜ kind create cluster --name test --config cluster-ha-demo.yaml
using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
Creating cluster "test" ...
 βœ“ Ensuring node image (kindest/node:v1.24.0) πŸ–Ό
 βœ“ Preparing nodes πŸ“¦ πŸ“¦ πŸ“¦ πŸ“¦ πŸ“¦ πŸ“¦
 βœ“ Configuring the external load balancer βš–οΈ
 βœ“ Writing configuration πŸ“œ
 βœ— Starting control-plane πŸ•ΉοΈ
ERROR: failed to create cluster: failed to init node with kubeadm: command "podman exec --privileged test-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I0619 17:44:28.743732     106 initconfiguration.go:255] loading configuration from "/kind/kubeadm.conf"
W0619 17:44:28.746570     106 initconfiguration.go:332] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.24.0
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0619 17:44:28.767327     106 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0619 17:44:28.965046     106 certs.go:522] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost test-control-plane test-external-load-balancer] and IPs [10.96.0.1 10.89.0.6 0.0.0.0]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0619 17:44:29.394052     106 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I0619 17:44:29.568655     106 certs.go:522] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I0619 17:44:29.684652     106 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I0619 17:44:29.807459     106 certs.go:522] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost test-control-plane] and IPs [10.89.0.6 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost test-control-plane] and IPs [10.89.0.6 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0619 17:44:30.757394     106 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0619 17:44:30.839032     106 kubeconfig.go:103] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0619 17:44:30.973013     106 kubeconfig.go:103] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0619 17:44:31.214993     106 kubeconfig.go:103] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0619 17:44:31.306636     106 kubeconfig.go:103] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
I0619 17:44:31.510689     106 kubelet.go:65] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0619 17:44:31.851460     106 manifests.go:99] [control-plane] getting StaticPodSpecs
I0619 17:44:31.852437     106 certs.go:522] validating certificate period for CA certificate
I0619 17:44:31.853146     106 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0619 17:44:31.853961     106 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0619 17:44:31.854111     106 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0619 17:44:31.854642     106 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0619 17:44:31.855205     106 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
I0619 17:44:31.863183     106 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0619 17:44:31.868517     106 manifests.go:99] [control-plane] getting StaticPodSpecs
...
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
        - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
        cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
        cmd/kubeadm/app/cmd/init.go:153
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
        vendor/github.com/spf13/cobra/command.go:856
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
        vendor/github.com/spf13/cobra/command.go:974
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
        vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
        cmd/kubeadm/app/kubeadm.go:50
main.main
        cmd/kubeadm/kubeadm.go:25
runtime.main
        /usr/local/go/src/runtime/proc.go:250
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1571
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:235
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
        cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
        cmd/kubeadm/app/cmd/init.go:153
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
        vendor/github.com/spf13/cobra/command.go:856
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
        vendor/github.com/spf13/cobra/command.go:974
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
        vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
        cmd/kubeadm/app/kubeadm.go:50
main.main
        cmd/kubeadm/kubeadm.go:25
runtime.main
        /usr/local/go/src/runtime/proc.go:250
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1571

OS:

[root@localhost ~]# cat /etc/redhat-release
Fedora release 36 (Thirty Six)

Here is also the file system

[root@localhost ~]# lsblk -f
NAME FSTYPE FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sr0
vda
β”œβ”€vda1
β”‚
β”œβ”€vda2
β”‚    vfat   FAT16 EFI-SYSTEM
β”‚                       6CEF-1B1F
β”œβ”€vda3
β”‚    ext4   1.0   boot  a71809e0-8212-4321-9c28-bc736ac25184  226.1M    29% /boot
└─vda4
     xfs          root  786374f5-cd31-4aa2-b76f-b101250fd984   95.2G     4% /var/lib/containers/storage/overlay
                                                                            /var
                                                                            /sysroot/ostree/deploy/fedora-coreos/var
                                                                            /usr
                                                                            /etc
                                                                            /
                                                                            /sysroot

This is rootful podman
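
The cluster-ha-demo.yaml contents are not shown here; for context, an illustrative HA layout that produces six nodes plus kind's automatically created external load balancer looks roughly like this (a sketch written to a hypothetical ha-example.yaml, not the author's actual file):

cat <<'EOF' > ha-example.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker
EOF
kind create cluster --name test --config ha-example.yaml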

@BenTheElder
Member

I've been out. @hhemied, can you share the kind export logs from a create with --retain? Also, can you please try a minimal test with just kind create cluster --retain; kind export logs; kind delete cluster (versus configuring many nodes, so we can get a minimal reproduction and not hit resource issues)?

@glitchcrab

I'm also seeing the same on Arch Linux with both kind 0.13 and 0.14 with kernel 5.19.2-arch1-2. My docker dir is xfs-backed, if that matters.

Logs are here kind-logs.tar.gz

@jimdevops19

jimdevops19 commented Oct 27, 2022

Seeing the same issue with v1.25.3, running on Ubuntu 22.04.

Creating cluster "test-cluster" ...
 βœ“ Ensuring node image (kindest/node:v1.25.3) πŸ–Ό 
 βœ— Preparing nodes πŸ“¦  
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"

BUT: when I removed podman and docker and reinstalled docker, it worked like a charm.

@hhemied
Author

hhemied commented Nov 1, 2022

It's working for me now using another method: I use podman machine.
I can create clusters with single and multiple nodes.
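
For reference, the podman machine route looks roughly like this (sizes are illustrative; --rootful matches the rootful setup used earlier in the thread and needs podman >= 4.1):

podman machine init --cpus 4 --memory 8192
podman machine set --rootful        # run while the machine is stopped
podman machine start
KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster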

@hhemied
Author

hhemied commented Jan 13, 2023

For me it looks like a resource issue.
I am going to close it, as with more resources I can achieve what I need.
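
If the limit is on the host side, one common resource ceiling for multi-node kind clusters on Linux is inotify; the values below are the ones suggested in kind's known-issues page ("Pod errors due to too many open files"):

sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512
# add the same keys under /etc/sysctl.d/ to persist them across reboots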

@hhemied closed this as completed on Jan 13, 2023
@BenTheElder
Member

Thanks for following up!

@BenTheElder added the kind/support label (Categorizes issue or PR as a support question.) and removed the kind/bug label (Categorizes issue or PR as related to a bug.) on Jan 17, 2023