In case of PINP, the Outer Container Requires SIGKILL to Stop When Using setcap for Inner Container Execution #25062

mohaa7 · 2025-01-20T21:53:54Z


The outer Podman container does not stop gracefully on SIGTERM once the file capabilities cap_setuid and cap_setgid are set to allow machinectl to start inner containers. Without these capabilities, machinectl commands fail with errors from newuidmap and newgidmap. With them, podman stop has to fall back to SIGKILL once the default stop timeout expires, which defeats the purpose of a graceful shutdown.

Steps to Reproduce:

Run the Outer Container:

systemd-run --scope --user \
    podman --runtime=crun run -d \
    outer-container-image

Run the Inner Container from the Outer Container:

machinectl shell --uid=user .host /usr/bin/env \
    podman run -t --name inner-container \
    image

Attempt to Stop the Outer Container with podman stop outer-container. The errors observed depend on whether setcap was applied:

Without setcap:

ERRO[0000] running /usr/bin/newuidmap 85 0 <user id> 1 1 100000 65536: newuidmap: write to uid_map failed: Operation not permitted
Error: cannot set up namespace using "/usr/bin/newuidmap": should have setuid or have filecaps setuid: exit status 1

With setcap:

WARN[0010] StopSignal (15) failed to stop container outer-container in 10 seconds, resorting to SIGKILL

Adding cap_setuid+ep to newuidmap and cap_setgid+ep to newgidmap enables the inner container setup but introduces the stopping issue.

Impact:

- Without capabilities: Inner containers cannot be managed due to namespace errors.
- With capabilities: The outer container cannot be gracefully stopped using SIGTERM.
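For reference, the capability setup described above can be sketched as follows. The helper names run and grant_idmap_caps and the DRY_RUN switch are illustrative assumptions, not part of the original setup; only the setcap and getcap calls and the two binary paths come from this report.

```shell
#!/bin/sh
# Sketch only: helper names and the DRY_RUN switch are assumptions.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Grant the file capabilities that let rootless Podman write the
# uid_map/gid_map of the inner container's user namespace.
grant_idmap_caps() {
    run setcap cap_setuid+ep /usr/bin/newuidmap
    run setcap cap_setgid+ep /usr/bin/newgidmap
    # List the resulting file capabilities for verification.
    run getcap /usr/bin/newuidmap /usr/bin/newgidmap
}
```

Calling grant_idmap_caps prints the three commands for review. Running it with DRY_RUN=0 as root inside the outer container applies them.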

Workaround:

The only workaround found so far is to pass --stop-signal SIGKILL when running the outer container. This avoids waiting for the timeout but terminates the container abruptly, so its processes get no chance to shut down cleanly.
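A minimal sketch of that workaround, combined with the run command from the reproduction steps. The function name start_outer, the APPLY switch, and the --name outer-container flag are assumptions. The name is inferred from the podman stop outer-container call and does not appear in the original run command.

```shell
#!/bin/sh
# Sketch only: start_outer, APPLY and --name are illustrative assumptions.
# Prints the command by default; set APPLY=1 to execute it.
start_outer() {
    image="${1:-outer-container-image}"
    # Reuse the function's positional parameters to hold the command line.
    set -- systemd-run --scope --user \
        podman --runtime=crun run -d \
        --name outer-container --stop-signal=SIGKILL "$image"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}
```

With this stop signal configured, podman stop outer-container sends SIGKILL right away, so no SIGTERM is ever delivered to the container's processes.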

Environment:

        OS Kernel: 4.18.0-553.34.1.el8_10.x86_64
        Podman Version: 4.9.4-rhel
        Container Runtime: crun
        Outer Image: rockylinux:8

Expected Behavior:

The outer container should gracefully stop with SIGTERM, propagating signals to its processes, regardless of whether cap_setuid and cap_setgid are set.

Actual Behavior:

SIGTERM fails to stop the outer container, requiring SIGKILL after the timeout.
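One way to confirm the fallback is to time the stop command. The helper below is a hypothetical sketch and its name elapsed_seconds is an assumption. It runs any command and prints how many whole seconds it took.

```shell
#!/bin/sh
# Sketch only: elapsed_seconds is a hypothetical helper.
# Runs the given command, discards its output, and prints the
# wall-clock duration in whole seconds.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}
```

For example, elapsed_seconds podman stop outer-container returning roughly the stop timeout (10 seconds in the warning shown earlier) would indicate that SIGTERM was ignored and the SIGKILL fallback was used.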

Attempts to Resolve:

- Increasing stop-timeout: podman run --stop-timeout=60 ...
  Outcome: The container still does not stop on SIGTERM. After the longer timeout, Podman resorts to SIGKILL.
- Releasing capabilities post-setup: after starting the inner container, I revoked cap_setuid and cap_setgid by running setcap cap_setuid-ep /usr/bin/newuidmap and setcap cap_setgid-ep /usr/bin/newgidmap.
  Outcome: machinectl then fails with the following error:

ERRO[0000] running /usr/bin/newuidmap 83 0 <user id> 1 1 100000 65536: newuidmap: write to uid_map failed: Operation not permitted

Request for Assistance:

Why does setting cap_setuid and cap_setgid affect the signal handling of the outer container?
Is there a way to allow the inner container setup without compromising the graceful stopping of the outer container?

I can provide additional logs, configurations, or a minimal reproducer script if needed.
