WSL environment needs additional cgroupv2 configuration #3685

Closed
tppalani opened this issue Jul 15, 2024 · 28 comments · Fixed by #3689
Labels
area/provider/podman Issues or PRs related to podman kind/documentation Categorizes issue or PR as related to documentation.

Comments

@tppalani

What happened:

I created a kind cluster from a config.yaml using a kindest/node base image. The control plane and nodes are in the Ready state, but the application pods report errors (see the logs below). I don't see this error with the older image kindest/node:v1.27.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

logs

combined from similar events): Liveness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "d755ebc8e8c9df8392a0819a9a9873f8b3dc92bc8adc2e0cb78a373effb85c2d": OCI runtime exec failed: exec failed: unable to start container process: error adding pid 655569 to cgroups: failed to write 655569: openat2 /sys/fs/cgroup/unified/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-pod9aca007f_efe0_4ef4_98f7_17751468c3e5.slice/cri-containerd-1872cb535b10d6ae6b00f2e0891

Environment:

  • kind version: (use kind version): kind v0.18.0 go1.20.2 windows/amd64
  • Runtime info: (use docker info, podman info or nerdctl info):
host:
  arch: amd64
  buildahVersion: 1.36.0
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.10-1.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 83.06
    systemPercent: 7.27
    userPercent: 9.67
  cpus: 12
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: container
    version: "40"
  eventLogger: journald
  freeLocks: 2042
  hostname: LDD4C6G3
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.15.153.1-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: journald
  memFree: 407486464
  memTotal: 16566644736
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.11.0-1.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.11.0
    package: netavark-1.11.0-1.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.11.0
  ociRuntime:
    name: crun
    package: crun-1.15-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240607.g8a83b53-1.fc40.x86_64
    version: |
      pasta 0^20240607.g8a83b53-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: true
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 4220379136
  swapTotal: 4294967296
  uptime: 8h 40m 51.00s (Approximately 0.33 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 3
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.imagestore: /usr/lib/containers/storage
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 1081101176832
  graphRootUsed: 24112721920
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 8
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 5.1.1
  Built: 1717459200
  BuiltTime: Tue Jun  4 05:30:00 2024
  GitCommit: ""
  GoVersion: go1.22.3
  Os: linux
  OsArch: linux/amd64
  Version: 5.1.1

  • OS (e.g. from /etc/os-release): windows 11
  • Kubernetes version: (use kubectl version): Client Version: v1.28.2 Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3 Server Version: v1.30.0
  • Any proxies or other special environment settings?: NA
@tppalani tppalani added the kind/bug Categorizes issue or PR as related to a bug. label Jul 15, 2024
@aojea
Contributor

aojea commented Jul 15, 2024

Please use the latest stable version of kind and report back. It also seems you are using cgroup v1, which has known issues:

#3558 (comment)

@tppalani
Author

May I know what the stable version is?

@aojea
Contributor

aojea commented Jul 15, 2024

May I know what the stable version is?

The latest one listed at https://github.com/kubernetes-sigs/kind/releases, which is 0.23.0 in this case.

@BenTheElder
Member

OS (e.g. from /etc/os-release): windows 11

WSL2 + rootless podman is uncharted territory for us, but see these ~user contributed guides as well:
https://kind.sigs.k8s.io/docs/user/using-wsl2/
https://kind.sigs.k8s.io/docs/user/rootless/

@BenTheElder BenTheElder added the area/provider/podman Issues or PRs related to podman label Jul 15, 2024
@stmcginnis
Contributor

I think cgroupv1 is going to be an issue, right?

cgroupVersion: v1
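
The cgroupVersion field quoted above can also be checked from a script. A minimal Python sketch, assuming podman info prints the YAML-style text shown in this issue; the helper name cgroup_version is hypothetical and not part of podman or kind:

```python
import re
from typing import Optional


def cgroup_version(podman_info_text: str) -> Optional[str]:
    """Return the cgroupVersion value (for example "v1" or "v2"), or None if absent."""
    # Look for a line such as "  cgroupVersion: v1" anywhere in the output.
    match = re.search(
        r"^\s*cgroupVersion:\s*(\S+)\s*$", podman_info_text, re.MULTILINE
    )
    return match.group(1) if match else None


# Excerpt in the same shape as the podman info output in this issue.
sample = """host:
  cgroupManager: cgroupfs
  cgroupVersion: v1
"""
print(cgroup_version(sample))  # prints: v1
```

To use it against a real machine, the text passed in would be the captured output of podman info.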

@tppalani
Author

Hi @stmcginnis, yes. I also tried the other image suggested by @aojea (image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e), but I still see the same issue:

Startup probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "9f88c516c2affe0d0a6255ed2b4a7ba3400a453753e21c4c8550227d1bbb4332": OCI runtime exec failed: exec failed: unable to start container process: error adding pid 7115 to cgroups: failed to write 7115: openat2 /sys/fs/cgroup/unified/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-besteffort.slice/kubelet-kubepods-besteffort-pod77edb64a_8046_4a7d_a69c_a57762b51e74.slice/cri-containerd-012ad1d06d571fd9c144addb4de8ab3c2eba9fa5490132fb0fa1e191bd11ab9f.scope/cgroup.procs: no such file or directory: unknown

@stmcginnis
Contributor

OK, that does look like it may be part of the issue. I believe you will need to set up your WSL environment to be using cgroupv2.

@tppalani
Author

tppalani commented Jul 15, 2024

I'm really sorry, I'm using a Windows machine. How can I set up the WSL environment with cgroup v2?

$ wsl -l -v
  NAME                      STATE           VERSION
* podman-machine-default    Running         2
  podman-net-usermode       Running         2

@tppalani
Author

And here is the data from inside podman machine ssh:

# fstab intentionally empty for containers
/run/user/1000/podman/podman.sock /mnt/wsl/podman-sockets/podman-machine-default/podman-user.sock none noauto,user,bind,defaults 0 0

@stmcginnis
Contributor

Sorry, no idea as I haven't used WSL or Windows for a number of years now, but this looks like it may have some useful information: https://github.com/spurin/wsl-cgroupsv2

@BenTheElder
Member

cgroup v1 can work but needs cgroupns support.

I would suggest using something like lima or docker desktop with docker instead; follow https://kind.sigs.k8s.io/docs/user/using-wsl2/

@BenTheElder
Member

The failure to write cgroups isn't in kind, that's coming from podman after we ask it to create the container.

@BenTheElder
Member

I would guess cgroupns issues on this linux guest environment

@tppalani
Author

Hi @BenTheElder do I need to change any configuration from my side to make it work?

@tppalani
Author

Hi @BenTheElder @aojea

I don't think this is a cgroup issue, because when I use the image below, all the pods are up and running without any error message. Do you have any idea why? I'm still using cgroup v1 only, according to the podman info output above.

kindest/node:v1.27.1@sha256:c44686bf1f422942a21434e5b4070fc47f3c190305be2974f91444cd34909f1b

@rbngzlv

rbngzlv commented Jul 17, 2024

It seems I hit the same problem when trying to create a cluster on WSL 2 using the image kindest/node:v1.30, and fixed it by switching WSL to cgroup v2, as pointed out in the comments (although I'm using docker and not podman).

Thank y'all for the hints.

@BenTheElder
Member

Hi @BenTheElder do I need to change any configuration from my side to make it work?

#3685 (comment)

I recommend using a better supported platform than kind-on-podman-on-wsl2

Kubernetes primarily uses docker on Linux; some contributors use it on macOS.

podman on WSL2 with cgroup v1 is probably the worst supported combination of options in the ecosystem, and I can't personally replicate this. I'm not a Windows user, and no Windows users have helped us figure out a workable CI approach (e.g. previously we tried actions but could not run docker or podman in that environment).

@BenTheElder
Member

I don't think this is a cgroup issue, because when I use the image below, all the pods are up and running without any error message. Do you have any idea why? I'm still using cgroup v1 only, according to the podman info output above.

This is difficult to debug over GitHub when we receive partial information. For example, you say you're using this image but not which kind version or environment, and you include only excerpts from the logs.

Have you looked at the suggestions above, including e.g. the complete guide for using WSL2? #3685 (comment)

It seems I hit the same problem when trying to create a cluster on WSL 2 using the image kindest/node:v1.30, and fixed it by switching WSL to cgroup v2, as pointed out in the comments (although I'm using docker and not podman).

Yes, I would highly recommend this. You can't create cgroup v2 clusters with Kubernetes < 1.19 but that's long out of support anyhow. Cgroup v2 is maturing and will be the focus for the ecosystem going forward, and in particular makes nested containers a lot more straightforward by typically having cgroupns enabled by default + the unified hierarchy.

@tppalani
Author

It seems I hit the same problem when trying to create a cluster on WSL 2 using the image kindest/node:v1.30, and fixed it by switching WSL to cgroup v2, as pointed out in the comments (although I'm using docker and not podman).

Thank y'all for the hints.

How did you fix it? Are you using a MacBook or a Windows system?

@rbngzlv

rbngzlv commented Jul 18, 2024

How did you fix it? Are you using a MacBook or a Windows system?

I was able to create a cluster configuring WSL 2 to use cgroup v2 instead of cgroup v1, following the instructions in the readme of the repo shared in #3685 (comment).

@tppalani
Author

We are good to close this ticket; the issue has been resolved.

$ podman run -it --rm spurin/wsl-cgroupsv2:latest
Success: cgroup type is cgroup2fs

AL44469@LDD4C6G3 MINGW64 ~/AWSCLI
$ podman info
host:
  arch: amd64
  buildahVersion: 1.36.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v2
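
The check reported above (cgroup type is cgroup2fs) can be approximated without the container image. A minimal Python sketch, assuming a Linux guest that exposes /proc/mounts; it is not necessarily what the spurin/wsl-cgroupsv2 image does, and the names cgroup_mode and current_cgroup_mode are hypothetical. The /sys/fs/cgroup/unified path in the earlier error messages is consistent with a hybrid layout, where cgroup2 is mounted below /sys/fs/cgroup rather than directly at it.

```python
from pathlib import Path


def cgroup_mode(mounts_text: str, root: str = "/sys/fs/cgroup") -> str:
    """Classify the cgroup layout from /proc/mounts-style text.

    Returns "v2" when cgroup2 is mounted directly at root (unified hierarchy),
    "v1-or-hybrid" when cgroup or cgroup2 filesystems appear only below root,
    and "unknown" when no cgroup mounts are found.
    """
    found_below = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        mount_point, fs_type = fields[1], fields[2]
        if mount_point == root and fs_type == "cgroup2":
            return "v2"
        if mount_point.startswith(root + "/") and fs_type in ("cgroup", "cgroup2"):
            found_below = True
    return "v1-or-hybrid" if found_below else "unknown"


def current_cgroup_mode() -> str:
    # Reads the live mount table; only meaningful on a Linux host or guest.
    return cgroup_mode(Path("/proc/mounts").read_text())
```

Calling current_cgroup_mode() inside the WSL distribution (for example after podman machine ssh) should report "v2" once the unified hierarchy is in use.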

@tppalani
Author

Thanks for all the contributions and guidelines

@stmcginnis
Contributor

Glad you got it working!

/close

@k8s-ci-robot
Contributor

@stmcginnis: Closing this issue.


@BenTheElder
Member

Awesome!

We should probably add a note pointing to the WSL2 cgroupv2 guide in the WSL2 page?

@BenTheElder
Member

Thanks all

@stmcginnis
Contributor

We should probably add a note pointing to the WSL2 cgroupv2 guide in the WSL2 page?

Good point, we really should capture that.

/reopen
/retitle WSL environment needs additional cgroupv2 configuration
/remove-kind bug
/kind documentation

@k8s-ci-robot k8s-ci-robot changed the title from "kindest/node:v1.30 cgroups: failed to write" to "WSL environment needs additional cgroupv2 configuration" Jul 18, 2024
@k8s-ci-robot k8s-ci-robot added kind/documentation Categorizes issue or PR as related to documentation. and removed kind/bug Categorizes issue or PR as related to a bug. labels Jul 18, 2024
@k8s-ci-robot
Contributor

@stmcginnis: Reopened this issue.


6 participants