Systemd container unit works but not reliably on startup #19740

Closed
mattventura opened this issue Aug 24, 2023 · 5 comments
Labels
kind/bug, locked - please file new issue/PR

Comments


mattventura commented Aug 24, 2023

Issue Description

I created a systemd unit file to automatically start a container as a user using rootless podman. Due to other open issues such as #12778, I am using a simple Type=exec unit. It is modified from the podman generate systemd output to work around those issues:

[Unit]
Description=Podman container-my-service.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=%T/containers
StartLimitIntervalSec=300
StartLimitBurst=5
Type=idle

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=always
TimeoutStopSec=70
ExecStart=/usr/bin/podman run \
        --cgroups=no-conmon \
        --sdnotify=conmon \
        --replace \
        -p 8080:8080 \
        --name my-service my-service-deploy
ExecStop=/usr/bin/podman stop \
        --ignore -t 10 \
        my-service
ExecStopPost=/usr/bin/podman rm \
        -f \
        --ignore -t 10 \
        my-service
Type=exec
NotifyAccess=all
User=opc
Group=opc
WorkingDirectory=/home/opc
Restart=on-failure
RestartSec=30s

[Install]
WantedBy=default.target

This works fine when I run systemctl daemon-reload and systemctl start container-my-service. The container comes up, and I see it in podman container list if I log in as the appropriate user. Then I run systemctl enable container-my-service so that it starts automatically. But when I reboot, it sometimes fails to come up, and I see this in journalctl:

Aug 24 19:02:09 inst-1 systemd[1]: container-my-service.service: About to execute /usr/bin/podman run --cgroups=no-conmon --sdnotify=conmon --replace -p 8080:8080 --name my-service my-service-deploy
Aug 24 19:02:09 inst-1 systemd[1]: container-my-service.service: Forked /usr/bin/podman as 2039
Aug 24 19:02:09 inst-1 systemd[1]: container-my-service.service: Changed dead -> start
Aug 24 19:02:09 inst-1 systemd[1]: Starting Podman container-my-service.service...
Aug 24 19:02:10 inst-1 systemd[2039]: container-my-service.service: Executing: /usr/bin/podman run --cgroups=no-conmon --sdnotify=conmon --replace -p 8080:8080 --name my-service my-service-deploy
Aug 24 19:02:10 inst-1 systemd[1]: container-my-service.service: User lookup succeeded: uid=1000 gid=1000
Aug 24 19:02:10 inst-1 systemd[1]: container-my-service.service: got exec-fd event
Aug 24 19:02:10 inst-1 systemd[1]: container-my-service.service: Got EOF on exec-fd
Aug 24 19:02:10 inst-1 systemd[1]: container-my-service.service: Changed start -> running
Aug 24 19:02:10 inst-1 systemd[1]: container-my-service.service: Job 367 container-my-service.service/start finished, result=done
Aug 24 19:02:10 inst-1 systemd[1]: Started Podman container-my-service.service.
Aug 24 19:02:12 inst-1 podman[2039]: time="2023-08-24T19:02:12Z" level=warning msg="RunRoot is pointing to a path (/run/user/1000/containers) which is not writable. Most likely podman will fail."
Aug 24 19:02:12 inst-1 podman[2039]: Error: default OCI runtime "crun" not found: invalid argument
Aug 24 19:02:12 inst-1 systemd[1]: container-my-service.service: Child 2039 belongs to container-my-service.service.
Aug 24 19:02:12 inst-1 systemd[1]: container-my-service.service: Main process exited, code=exited, status=125/n/a

But then if I manually restart the service with systemctl restart container-my-service, it comes up successfully. I tried adding sufficient delay to the restart attempts to make sure it wasn't just starting up too early in the boot process, but that seems to not be the case.

This vaguely sounds like item 31 in the troubleshooting guide, but I'm not sure if that is applicable to something already being managed by systemd.

podman version:

Client:       Podman Engine
Version:      4.4.1
API Version:  4.4.1
Go Version:   go1.19.10
Built:        Wed Aug  9 19:48:36 2023
OS/Arch:      linux/amd64

Steps to reproduce the issue

  1. Create service file as above
  2. Make sure the service works when manually started
  3. Reboot repeatedly until the service fails to come up

Describe the results you received

Sometimes (maybe 30-40% of the time), the container fails to start as described above.

Describe the results you expected

If it starts successfully when manually started with systemctl start, then it should also automatically start successfully on boot.

podman info output

host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-1.el9_2.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: ee2f8dd0a09933610c92940874094961cd55a4bf'
  cpuUtilization:
    idlePercent: 97.89
    systemPercent: 1.04
    userPercent: 1.07
  cpus: 2
  distribution:
    distribution: '"ol"'
    variant: server
    version: "9.2"
  eventLogger: journald
  hostname: inst-2
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.15.0-103.114.4.el9uek.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 65069056
  memTotal: 981934080
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.4-1.el9_2.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.4
      commit: 5a8fa99a5e41facba2eda4af12fa26313918805b
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-3.el9.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 1576157184
  swapTotal: 1962930176
  uptime: 0h 28m 44.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - container-registry.oracle.com
  - docker.io
store:
  configFile: /home/opc/.config/containers/storage.conf
  containerStore:
    number: 14
    paused: 0
    running: 1
    stopped: 13
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/opc/.local/share/containers/storage
  graphRootAllocated: 31630573568
  graphRootUsed: 8107454464
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 15
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/opc/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1691610516
  BuiltTime: Wed Aug  9 19:48:36 2023
  GitCommit: ""
  GoVersion: go1.19.10
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

cat /etc/oracle-release: Oracle Linux Server release 9.2
uname -a: Linux *** 5.15.0-103.114.4.el9uek.x86_64 #2 SMP Mon Jun 26 10:09:23 PDT 2023 x86_64 x86_64 x86_64 GNU/Linux

$ getsebool container_manage_cgroup
container_manage_cgroup --> on

Additional information

Described above - intermittent issue

mattventura added the kind/bug label Aug 24, 2023
@vrothberg
Member

Thanks for reaching out, @mattventura.

Editing the output of generate systemd is not something we can support. The generated units were carefully crafted (and tested). Changing the unit leaves tested territory and is hence unsupported.

Due to other open issues such as #12778

I think #12778 explains in some detail that systemd's User= primitive is not supported together with user namespaces created by the workload (as Podman does).

Does a simple unit generated with podman generate systemd --new work on boot? Just a simple container running sleep or top. The below log looks suspicious and I wonder if you can reproduce with a simple (unaltered) unit:

Aug 24 19:02:12 inst-1 podman[2039]: time="2023-08-24T19:02:12Z" level=warning msg="RunRoot is pointing to a path (/run/user/1000/containers) which is not writable. Most likely podman will fail."

@Luap99
Member

Luap99 commented Aug 25, 2023

Rootless containers must be run in the user systemd session in order to function properly.

Aug 24 19:02:12 inst-1 podman[2039]: time="2023-08-24T19:02:12Z" level=warning msg="RunRoot is pointing to a path (/run/user/1000/containers) which is not writable. Most likely podman will fail."

This indicates to me that the unit is started before the user systemd session is created, so it just fails because the path does not exist.
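
One way to test this theory is to check, as the target user, whether the runtime directory is present and writable at the moment the unit runs. The following is only a minimal sketch: runtime_dir_status is a hypothetical helper, not a podman or systemd command, and the /run/user base path is taken from the warning in the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of podman or systemd): report whether a user's
# runtime directory exists and is writable by the calling user.
# $1 = numeric uid, $2 = base directory (assumed default: /run/user).
runtime_dir_status() {
    uid="$1"
    base="${2:-/run/user}"
    dir="$base/$uid"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "ok: $dir"
    else
        echo "missing or not writable: $dir"
        return 1
    fi
}

# Example: check the directory for the user running this script.
runtime_dir_status "$(id -u)" || true
```

Run early in boot as the service user (uid 1000 in the log above), a "missing or not writable" result would be consistent with the RunRoot warning.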

@vrothberg
Member

This indicates to me that the unit is started before the user systemd session is created, so it just fails because the path does not exist.

Good thinking! Adding After=systemd-user-sessions.service may resolve it.

@rhatdan
Member

rhatdan commented Aug 25, 2023

As always, we recommend using quadlets for running jobs under systemd.
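
For illustration, a rootless quadlet for the container in this issue might look roughly like the sketch below. The image name, container name, port mapping, and restart settings are copied from the unit file in the report. The file name my-service.container and the choice of keys are assumptions, and the script is meant to be run as the target user (opc in the report), not as root.

```shell
#!/bin/sh
# Sketch only: write a quadlet .container file into the per-user quadlet
# directory. An alternative directory can be passed as $1.
set -e
unit_dir="${1:-${XDG_CONFIG_HOME:-$HOME/.config}/containers/systemd}"
mkdir -p "$unit_dir"

# Values below mirror the unit file from the issue; adjust as needed.
cat > "$unit_dir/my-service.container" <<'EOF'
[Unit]
Description=my-service container

[Container]
Image=my-service-deploy
ContainerName=my-service
PublishPort=8080:8080

[Service]
Restart=on-failure
RestartSec=30s

[Install]
WantedBy=default.target
EOF

echo "wrote $unit_dir/my-service.container"
```

After writing the file, systemctl --user daemon-reload should generate my-service.service, which can then be started with systemctl --user start my-service.service. For the user manager to come up at boot without a login, lingering would also need to be enabled for that user, for example with loginctl enable-linger opc run as root.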

@mattventura
Author

This indicates to me that the unit is started before the user systemd session is created, so it just fails because the path does not exist.

Good thinking! Adding After=systemd-user-sessions.service may resolve it.

Unfortunately, that did not fix it, but I think it's on the right track. I just had an occurrence where it failed at first, but the automatic restart got it running. But this only seems to work after I log in. If I don't log in, it just keeps failing.
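
The dependence on logging in suggests that lingering is not enabled for the user: without it, the user manager and /run/user/<uid> normally exist only while the user has a session. A rough way to check is sketched below. linger_enabled is a hypothetical helper, and it relies on systemd-logind recording lingering as a marker file under /var/lib/systemd/linger.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a linger marker file exists for the user.
# $1 = user name, $2 = marker directory (assumed default: /var/lib/systemd/linger).
linger_enabled() {
    user="$1"
    dir="${2:-/var/lib/systemd/linger}"
    [ -e "$dir/$user" ]
}

# Example for the user from this issue.
if linger_enabled opc; then
    echo "linger enabled for opc"
else
    echo "linger not enabled for opc; as root: loginctl enable-linger opc"
fi
```

Enabling lingering may not be enough on its own for a system unit that sets User=, since that setup is described as unsupported earlier in this thread. It matters mainly when the container runs as a user unit.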

I went back to an unmodified generated unit file for a simple service, and it worked. But as soon as I added User=, it didn't.

I will take a look at quadlets. It seems to be what I'm looking for, thanks!

github-actions bot added the locked - please file new issue/PR label Nov 24, 2023
github-actions bot locked as resolved and limited conversation to collaborators Nov 24, 2023