Kind can't create clusters in F35 with Podman #2689
Comments
can you check you are using the image in the release notes?
|
I did, and here is the output:
[root@fedora ~]# kind create cluster --image kindest/node:v1.23.4@sha256:0e34f0d0fd448aa2f2819cfd74e99fe5793a6e4938b328f657c8e3f81ee0dfb9
enabling experimental podman provider
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.23.4) 🖼
 ✗ Preparing nodes 📦
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1" |
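For context, this error means kind gave up waiting for the node container's console output to show either "Reached target ... Multi-User System" or "detected cgroup v1"; in other words, systemd inside the node never reported finishing boot. A quick sanity check on the host (a sketch, nothing kind-specific) is to confirm what podman reports for cgroups:
$ podman info | grep -i cgroup
# on Fedora 35 this should show cgroupManager: systemd and cgroupVersion: v2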
can you execute adding |
|
From the client version output above it looks like kind was compiled for amd64, but I'm wondering if you are running on arm64. There was a similar error reported recently in slack: https://kubernetes.slack.com/archives/CEKK1KTN2/p1646907106345039?thread_ts=1646907067.117919&cid=CEKK1KTN2 |
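A quick way to verify whether the kind binary and the host architecture actually match (a sketch using standard tools, nothing specific to this report):
$ kind version       # prints the platform kind was built for, e.g. linux/amd64
$ uname -m           # host architecture, e.g. x86_64 or aarch64
$ podman info | grep -w arch
# the three should agree (amd64 corresponds to x86_64, arm64 to aarch64)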
You are right, my bad.
[root@fedora ~]# podman info
host:
arch: amd64
buildahVersion: 1.24.1
cgroupControllers:
- cpuset
- cpu
- io
- memory
- hugetlb
- pids
- misc
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.1.0-2.fc35.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.1.0, commit: '
cpus: 6
distribution:
distribution: fedora
version: "35"
eventLogger: journald
hostname: fedora
idMappings:
gidmap: null
uidmap: null
kernel: 5.16.16-200.fc35.x86_64
linkmode: dynamic
logDriver: journald
memFree: 7363661824
memTotal: 9196961792
networkBackend: netavark
ociRuntime:
name: crun
package: crun-1.4.3-1.fc35.x86_64
path: /usr/bin/crun
version: |-
crun version 1.4.3
commit: 61c9600d1335127eba65632731e2d72bc3f0b9e8
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
remoteSocket:
exists: true
path: /run/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: false
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: true
serviceIsRemote: false
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.1.12-2.fc35.x86_64
version: |-
slirp4netns version 1.1.12
commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
libslirp: 4.6.1
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.3
swapFree: 8589930496
swapTotal: 8589930496
uptime: 1m 52.2s
plugins:
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
volume:
- local
registries:
search:
- registry.fedoraproject.org
- registry.access.redhat.com
- docker.io
- quay.io
store:
configFile: /etc/containers/storage.conf
containerStore:
number: 6
paused: 0
running: 0
stopped: 6
graphDriverName: overlay
graphOptions:
overlay.mountopt: nodev,metacopy=on
graphRoot: /var/lib/containers/storage
graphStatus:
Backing Filesystem: xfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "true"
imageCopyTmpDir: /var/tmp
imageStore:
number: 5
runRoot: /run/containers/storage
volumePath: /var/lib/containers/storage/volumes
version:
APIVersion: 4.0.2
Built: 1646943965
BuiltTime: Thu Mar 10 21:26:05 2022
GitCommit: ""
GoVersion: go1.16.14
OsArch: linux/amd64
Version: 4.0.2
|
ok, try this
and then |
Here is the logs folder |
I can see the log line and I don't see a 30-second delay ... so why is it failing? @AkihiroSuda, have you seen something similar? |
Nothing suspicious; it is actually a clean install to test kind with podman 4.
[root@fedora ~]# podman network ls
NETWORK ID NAME DRIVER
faed16303522 kind bridge
2f259bab93aa podman bridge |
Does podman 3 work? |
I am also on Fedora 35 and am affected by the same issue. I was able to create a kind cluster yesterday, but this morning I updated my kernel to 5.16.16, the version that appears in the report above. If I fall back to the 5.16.15 kernel, I no longer have this issue. FWIW, I am using the docker provider, so I suspect this issue is related in some way to the kernel, not podman. I may be wrong here, because I see nothing related in either the Fedora 5.16.16 changelog or the upstream 5.16.16 changelog. Although I don't have time to investigate the root cause right now, I can open a docs PR. |
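For anyone who wants to repeat the kernel-rollback experiment on Fedora, the previous kernel can be selected with grubby (a sketch; the boot-entry index and exact kernel versions depend on what is installed locally):
$ sudo grubby --info=ALL | grep -E '^(index|title)'   # list boot entries
$ sudo grubby --set-default-index=1                   # hypothetical index: pick the 5.16.15 entry from the list above
$ sudo reboot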
Nope, it does not seem to be connected to the kernel,
and I am still getting the same error.
|
Here is the output if I remove Podman and install Docker:
kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.23.4) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Not sure what to do next? 😅
Check out https://kind.sigs.k8s.io/docs/user/quick-start/ |
Same problem happens on macOS Monterey (v12) with Apple M1:
$ kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.23.4) 🖼
 ✗ Preparing nodes 📦
ERROR: failed to create cluster: could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1"
$ kind version
kind v0.12.0 go1.17.8 darwin/arm64
$ docker version
Client:
Cloud integration: v1.0.22
Version: 20.10.13
API version: 1.41
Go version: go1.16.15
Git commit: a224086
Built: Thu Mar 10 14:08:43 2022
OS/Arch: darwin/arm64
Context: default
Experimental: true
Server: Docker Desktop 4.6.0 (75818)
Engine:
Version: 20.10.13
API version: 1.41 (minimum version 1.12)
Go version: go1.16.15
Git commit: 906f57f
Built: Thu Mar 10 14:05:37 2022
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: 1.5.10
GitCommit: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
runc:
Version: 1.0.3
GitCommit: v1.0.3-0-gf46b6ba
docker-init:
Version: 0.19.0
GitCommit: de40ad0
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:31:32Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.9-gke.1002", GitCommit:"f87f9d952767b966e72a4bd75afea25dea187bbf", GitTreeState:"clean", BuildDate:"2022-02-25T18:12:32Z", GoVersion:"go1.16.12b7", Compiler:"gc", Platform:"linux/amd64"}
|
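Since earlier comments in this thread pointed at an amd64/arm64 mismatch, one thing worth ruling out on Apple Silicon is whether the node image tag actually resolves to an arm64 variant (a sketch; docker manifest is marked experimental in some CLI versions, though the Experimental: true line above suggests it is enabled here):
$ docker manifest inspect kindest/node:v1.23.4 | grep -A2 '"platform"'
# look for an entry with "architecture": "arm64"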
Could you try the latest kernel 5.16.17-200.fc35? I don't see any issue with this kernel. my
|
I tried 5.16.18-200.fc35, and I have no issues. It seems it might still affect other systems, though, and I'm curious what the cause is. (I suppose that systemd isn't reaching the multi-user target.) |
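If the failure shows up again, one way to see whether systemd inside the node ever reaches that target is to keep the failed node around and query it directly (a sketch assuming the podman provider and the default node name kind-control-plane; substitute docker exec for the docker provider):
$ kind create cluster --retain                                          # keep the node container on failure
$ podman exec kind-control-plane systemctl is-active multi-user.target
$ podman exec kind-control-plane systemctl list-jobs                    # units still queued at boot, if any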
#2718 tentatively seems unrelated given the odd workaround there. Is this still an issue on current fedora kernels? |
Unfortunately, the issue still remains. |
Just noticed this is with xfs; do we detect and mount devmapper correctly?
If you run create cluster with --retain it won't delete the container(s) on failure, and we can inspect the node logs etc. (kind export logs).
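Concretely, that retain-and-inspect flow looks roughly like this with the podman provider (default cluster and node names assumed):
$ kind create cluster --retain       # keep the node container(s) on failure
$ kind export logs ./kind-logs       # collects serial output, journals and inspect data
$ podman ps -a                       # the failed node should still be listed
$ podman logs kind-control-plane     # the console output kind was watching for the target line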
We appear to, based on the results from #2689 (comment): /dev/mapper shows up in the volumes in the container inspect output.
|
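To double-check that on an xfs host, the mounts can be pulled straight out of the container inspect data (a sketch; assumes jq is installed and the default node name):
$ podman inspect kind-control-plane --format '{{ json .Mounts }}' | jq .
# a bind mount of /dev/mapper into the node should appear here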
same problem here...
kind version:
docker version:
|
@jrmanes this issue is about podman and fedora; it seems to be a problem in the kernel. |
hello @aojea |
The issue still exists.
$ kind create cluster --name test --config cluster-ha-demo.yaml
using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
Creating cluster "test" ...
 ✓ Ensuring node image (kindest/node:v1.24.0) 🖼
 ✓ Preparing nodes 📦 📦 📦 📦 📦 📦
 ✓ Configuring the external load balancer ⚖️
 ✓ Writing configuration 📜
 ✗ Starting control-plane 🕹️
ERROR: failed to create cluster: failed to init node with kubeadm: command "podman exec --privileged test-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I0619 17:44:28.743732 106 initconfiguration.go:255] loading configuration from "/kind/kubeadm.conf"
W0619 17:44:28.746570 106 initconfiguration.go:332] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.24.0
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0619 17:44:28.767327 106 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0619 17:44:28.965046 106 certs.go:522] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost test-control-plane test-external-load-balancer] and IPs [10.96.0.1 10.89.0.6 0.0.0.0]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0619 17:44:29.394052 106 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I0619 17:44:29.568655 106 certs.go:522] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I0619 17:44:29.684652 106 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I0619 17:44:29.807459 106 certs.go:522] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost test-control-plane] and IPs [10.89.0.6 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost test-control-plane] and IPs [10.89.0.6 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0619 17:44:30.757394 106 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0619 17:44:30.839032 106 kubeconfig.go:103] creating kubeconfig file for admin.conf
[kubeconfig] Writing "admin.conf" kubeconfig file
I0619 17:44:30.973013 106 kubeconfig.go:103] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0619 17:44:31.214993 106 kubeconfig.go:103] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0619 17:44:31.306636 106 kubeconfig.go:103] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
I0619 17:44:31.510689 106 kubelet.go:65] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0619 17:44:31.851460 106 manifests.go:99] [control-plane] getting StaticPodSpecs
I0619 17:44:31.852437 106 certs.go:522] validating certificate period for CA certificate
I0619 17:44:31.853146 106 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0619 17:44:31.853961 106 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0619 17:44:31.854111 106 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0619 17:44:31.854642 106 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0619 17:44:31.855205 106 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
I0619 17:44:31.863183 106 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0619 17:44:31.868517 106 manifests.go:99] [control-plane] getting StaticPodSpecs
...
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
cmd/kubeadm/app/cmd/init.go:153
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:856
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:974
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1571
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:235
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
cmd/kubeadm/app/cmd/init.go:153
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:856
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:974
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1571
OS:
[root@localhost ~]# cat /etc/redhat-release
Fedora release 36 (Thirty Six)
Here is also the file system:
[root@localhost ~]# lsblk -f
NAME   FSTYPE FSVER LABEL      UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
sr0
vda
├─vda1
├─vda2 vfat   FAT16 EFI-SYSTEM 6CEF-1B1F
├─vda3 ext4   1.0   boot       a71809e0-8212-4321-9c28-bc736ac25184  226.1M    29% /boot
└─vda4 xfs          root       786374f5-cd31-4aa2-b76f-b101250fd984   95.2G     4% /var/lib/containers/storage/overlay
                                                                                   /var
                                                                                   /sysroot/ostree/deploy/fedora-coreos/var
                                                                                   /usr
                                                                                   /etc
                                                                                   /
                                                                                   /sysroot
This is rootful podman. |
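For the kubeadm init failure above, the checks kubeadm suggests have to be run inside the node container; from the rootful podman host that looks roughly like this (node name taken from the failing command above):
$ podman exec test-control-plane systemctl status kubelet --no-pager
$ podman exec test-control-plane journalctl -u kubelet --no-pager | tail -n 50
$ podman exec test-control-plane crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a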
I've been out, @hhemied can you share the |
I'm also seeing the same on Arch Linux with both kind 0.13 and 0.14, with kernel
Logs are here: kind-logs.tar.gz |
Seeing the same issue with v1.25.3.
BUT: when I removed podman and docker and reinstalled docker, it worked like a charm. |
It's working for me now using another method |
For me it looks like a resource issue. |
Thanks for following up! |
What happened:
What you expected to happen:
kind could create a cluster.
How to reproduce it (as minimally and precisely as possible):
Environment:
- kind version: (use kind version):
- Kubernetes version: (use kubectl version):
- Docker version: (use docker info): I use podman
- OS (e.g. from /etc/os-release):