Replies: 81 comments 33 replies
-
This is a limitation on the systemd side. For security reasons, systemd only accepts notifications or PID files that are sent or created by root, even if the User and Group of the unit file are explicitly set to start the process as a non-root user. Their recommendation was to start the container as a user service of the user in question via |
-
Previous discussion: #9642 |
-
Thank you both. For now I've worked around it by managing the service under the user's systemd, which is clunky to say the least. I don't understand systemd's security argument: if the process is run as a given user, why would systemd not allow that user's process to send sd_notify? Who else could? But I guess this is no flaw of podman. #9642 mentions some code changes that need to happen to podman for sd_notify. What are those, and have they progressed since March? I guess you could close this issue or use it to track progress.
-
Yes, there is some progress. The main PID is now communicated via sd notify but there are still some remaining issues. For instance, |
-
I think the next big thing to tackle is finding a way to lift the |
-
But even that is rejected by systemd, as seen in the logs above. |
-
I fear there's not much Podman can do at the moment. |
-
Only after this problem is solved can it become truly rootless. So I have to keep using the root account for now.
-
Is there a quick overview of what the best approach or workaround currently is for starting podman containers with systemd as a specific non-root user? Furthermore, if a container is run as root, is there a way to change the ownership of files and directories created inside the container (in a bound volume) to a specific host user?
-
use use |
-
The services need to be started and managed as the specific non-root user. Using the |
-
For the moment my workaround is to run such containers in a systemd --user. This means that for every system service I want to run as a rootless container, I need to create a separate system user, enable linger, and run a separate systemd --user instance for that user. It works but it's clunky, e.g. restarting Nginx is Inside these rootless containers root is mapped to the system user, which is a different uid for each service. If something inside the containers runs as non-root, that gets mapped to a high-numbered host uid by default. However with some magic on the host you can map a specific non-root uid in the container to a host uid of your choice, which can then be mapped to a different non-root uid in a different container running under a different user. I should probably document my setup one of these days... |
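The per-service setup described in this comment could be sketched roughly as follows. This is an illustration under assumptions, not the commenter's actual setup: the user name `svc-nginx` and the subordinate ID range are hypothetical, and whether your distribution assigns subuid ranges to system users automatically varies. The commands need root, so the `run` helper (also hypothetical) only prints them when `DRY_RUN=1` is set.

```shell
#!/bin/sh
# Sketch: provision one dedicated system user per rootless container service.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

provision_service_user() {
    user=$1          # hypothetical name, e.g. svc-nginx
    subid_range=$2   # hypothetical range, e.g. 200000-265535

    # Dedicated system account with a home directory for container storage.
    run useradd --system --create-home "$user"
    # Subordinate UID/GID ranges that rootless Podman maps into the container.
    run usermod --add-subuids "$subid_range" --add-subgids "$subid_range" "$user"
    # Start this user's systemd instance at boot, without an interactive login.
    run loginctl enable-linger "$user"
}
```

For example, `DRY_RUN=1 provision_service_user svc-nginx 200000-265535` prints the three commands without touching the system.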
-
@Gchbg If you are running a recent systemd version (for instance by running Fedora 35), I think you could run
No need to set |
-
Is that
How does that relate to what @Gchbg and @eriksjolund wrote above? Do I have to run several instances of systemd or is there another way? For systemd beginners like me, it is quite difficult to understand the various layers of abstraction and user permission between systemd, host processes and containers. After all, I would assume that this is the use case for 80 % of the users: run some container service that gets restarted automatically when the machine boots and that is as restricted as possible (by means of user permissions). |
-
I got it wrong: modifying UID and GID via env requires entrypoint.sh. https://docs.docker.com/engine/security/userns-remap/
-
@markstos Not really, just documenting the fact that you need to use |
-
This is also what I need, and what seems to be a pretty valid use case. Additionally, for persistent state one can still use I actually almost got it working but ran into a brick wall with this issue, i.e. Hacking away with things like (I got the normal |
-
I don't think this was mentioned yet: you should probably not do that, because the Podman process can be killed while the container keeps running (see #9642 (reply in thread)).
-
I'm still confused why we're all having problems with this; clearly using User= is not the recommended/supported approach. So the recommended/supported approach really is to run containers as root? Am I missing something and people generally think that's ok? Non-containerised services wouldn't be running as root right? So why is it ok to run containerised services as root? |
-
Personally, I have taken to just running it as a user service with lingering enabled. It still lets me start it on boot, and manage/observe it via systemctl and journalctl. |
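A rough sketch of that workflow, assuming a unit file already exists under `~/.config/systemd/user/`. The unit name passed in is hypothetical, and the `run` helper only prints the commands when `DRY_RUN=1` is set; everything else is standard `loginctl`, `systemctl --user`, and `journalctl --user` usage, run as the service's own user.

```shell
#!/bin/sh
# Sketch: manage a rootless container as a systemd user service.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

enable_user_service() {
    unit=$1   # hypothetical unit name, e.g. container-nginx.service

    # Keep the user's systemd instance running across logouts and reboots.
    run loginctl enable-linger "$(id -un)"
    # Pick up new or changed unit files in ~/.config/systemd/user/.
    run systemctl --user daemon-reload
    # Start the service now and on every boot.
    run systemctl --user enable --now "$unit"
}

show_user_service() {
    unit=$1
    # Observe state and logs through the user instance, not the system one.
    run systemctl --user status "$unit"
    run journalctl --user -u "$unit" --no-pager
}
```

From a root shell, recent systemd versions also accept `systemctl --user -M someuser@ status ...` to reach another user's instance, which avoids switching accounts.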
-
I think at this point we should change this to a discussion. User= causes lots of issues with running podman, and rootless support is fairly easy. I also recommend that people look at using rootful with --userns=auto, which will run each of your containers in a unique user namespace.
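To see what `--userns=auto` does, one could print the UID mapping from inside such a container. This is a minimal sketch: the image name is an assumption, the `run` helper is hypothetical and only prints the command when `DRY_RUN=1` is set, and as far as I know rootful Podman needs a `containers` entry in `/etc/subuid` and `/etc/subgid` to allocate the ranges from.

```shell
#!/bin/sh
# Sketch: inspect the user namespace that --userns=auto allocates.
# Set DRY_RUN=1 to print the command instead of executing it (it needs root).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

container_uid_map() {
    image=${1:-docker.io/library/alpine}   # assumed image, any small one works
    # Each container gets its own automatically sized UID range;
    # /proc/self/uid_map shows container UID, host UID, and range length.
    run podman run --rm --userns=auto "$image" cat /proc/self/uid_map
}
```

Running `container_uid_map` twice should show two different host UID ranges while both containers are alive, which is the isolation the comment refers to.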
-
If you set up a directory that is setgid to a GID (foobar), and the user running podman is in the foobar group, then you can leak the foobar group into the container with --group-add keep-groups. This should allow any containers run this way to write to that directory.
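The setgid approach could look roughly like this. It is a sketch under assumptions: the directory path, the group, and the image are placeholders, the `run` helper is hypothetical and only prints the podman command when `DRY_RUN=1` is set, and `keep-groups` depends on the OCI runtime supporting it (the Podman documentation names crun).

```shell
#!/bin/sh
# Sketch: share a setgid directory with a rootless container via keep-groups.
# Set DRY_RUN=1 to print the podman command instead of executing it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

make_shared_dir() {
    dir=$1     # hypothetical path, e.g. /srv/shared
    group=$2   # group name or GID, e.g. foobar

    # Mode 2775: group-writable, and the setgid bit makes new files
    # inherit the directory's group.
    mkdir -p "$dir" &&
    chgrp "$group" "$dir" &&
    chmod 2775 "$dir"
}

run_with_host_groups() {
    dir=$1
    # keep-groups passes the calling user's supplementary groups
    # through to the container process.
    run podman run --rm --group-add keep-groups -v "$dir":/data \
        docker.io/library/alpine touch /data/written-by-container
}
```

On SELinux systems the volume may additionally need a relabel option such as `:z`; that is omitted here to keep the sketch minimal.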
-
@giuseppe WDYT of allowing something like:
podman run --userns=auto --gidmap=1000:1000:1 alpine cat /proc/self/uid_map
Error: --userns and --uidmap/--gidmap/--subuidname/--subgidname are mutually exclusive
-
I tried some more with
I don't know how robust the solution is but at least something is working. /etc/systemd/system/example3.service
/etc/systemd/system/example3.socket
Note that rootless podman runs the nginx container with socket activation (port 80) without being blocked by the
I added this as Example 3 in |
-
In case it helps anyone else trying to use rootless podman to run systemd user units, configured by ansible, on Rocky8 or Rocky9 I did a bit of experimenting here: https://github.com/sjpb/systemd-podman-experiments. The conclusion I drew is that actually this is still pretty user-unfriendly for my use-case and I think it'd still be really nice to just be able to set |
-
I've been working on setting up a bare-metal environment that runs exclusively in RAM on FCOS and boots over PXE with an Ignition file. My goal is to go from zero to a running GitHub Actions container / Terraform Cloud agent container securely, without manual intervention. Admittedly I am pretty unfamiliar with Linux internals and systemd, but I am trying to familiarize myself. Ignition only supports systemd for launching services, so understanding this is pretty crucial. At this point I've been able to figure out every step of my config, but getting rootless containers working has been a thorn in my side. Trying to set the I'm a bit stumped here and am reverting to running as root for now, but I'm open to any pointers (including on this Ignition process as a whole):
-
+1 As quadlets were created with the idea of "integrating better with systemd", not supporting the systemd way of managing rootless services does not make sense. |
-
What's the next-best approach, easy and repeatable, for running a container as a systemd service under a non-root user?
I would be happy if all uids in the container were collapsed to a single user and I could specify that user with |
-
I was able to run rootless podman containers using systemd units with "User=" without using I just use normal service commands like Podman was run with the following options, where 1111 is the serviceuser's uid:
Systemd unit file has the following lines (I posted only the most important lines):
Subuids were configured for the serviceuser:
Linger was also enabled for the serviceuser account:
The key option is
@eriksjolund posted this option in his message. |
-
This seems to be the magic combination that makes
-
Is this a BUG REPORT or FEATURE REQUEST?
/kind bug
Description
I want to have a systemd system service that runs a rootless container under an isolated user, but systemd rejects the sd_notify call and terminates the service.
A similar problem was mentioned in #5572, which seems to have been closed without a resolution.
Happy to help tracking this down.
Steps to reproduce the issue:
podman generate systemd --new:
Describe the results you received:
Describe the results you expected:
Nginx runs until the end of time.
Output of podman version:
Output of podman info --debug:
Package info (e.g. output of apt list podman):
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)
Yes and yes.
Additional environment details (AWS, VirtualBox, physical, etc.):
Machine is a VM.