Since the mitigation for CVE-2019-5736 was amended in #1984 to use a read-only bind mount of /proc/self/exe, every runc start/run/exec causes a mount and an unmount event.
Those events are picked up by systemd, which creates and then removes a mount unit for each one. This can be seen in the system journal:
(for some reason, only the unmount is shown when the default log level is used).
First, this puts load on the system -- systemd re-reads mountinfo on every event (later versions may rate-limit the reading, but I have not checked).
Second, with older systemd and some setups, due to a bug in systemd (fixed in 2018 by systemd/systemd#10980, but not backported to certain distros), this eventually results in systemd unit table reaching its maximum size. Once this happens, systemd is not able to start or stop any more units, which is A Very Bad Thing.
I understand that this is a systemd (rather than runc) issue, but perhaps we can work around it in some way?
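To make the mechanism concrete, here is a minimal sketch of what "re-reading mountinfo" involves: parse two snapshots of /proc/self/mountinfo and diff them by mount ID. This is not systemd's code; the `Mount` type, the function names and the sample lines (including the `/tmp/example-runc` mount point) are made up for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Mount(NamedTuple):
    mount_id: int
    root: str
    mount_point: str
    options: str
    fs_type: str


def parse_mountinfo(text: str) -> Dict[int, Mount]:
    """Parse the contents of a mountinfo file into {mount_id: Mount}."""
    mounts: Dict[int, Mount] = {}
    for line in text.splitlines():
        # Optional fields end at a lone "-"; the filesystem type follows it.
        left, sep, right = line.partition(" - ")
        fields = left.split()
        fs_fields = right.split()
        if not sep or len(fields) < 6 or not fs_fields:
            continue  # skip blank or malformed lines
        mount = Mount(
            mount_id=int(fields[0]),
            root=fields[3],
            mount_point=fields[4],
            options=fields[5],
            fs_type=fs_fields[0],
        )
        mounts[mount.mount_id] = mount
    return mounts


def diff_mounts(
    before: Dict[int, Mount], after: Dict[int, Mount]
) -> Tuple[List[Mount], List[Mount]]:
    """Return (added, removed) mounts between two snapshots."""
    added = [after[i] for i in sorted(after.keys() - before.keys())]
    removed = [before[i] for i in sorted(before.keys() - after.keys())]
    return added, removed


def read_mountinfo(path: str = "/proc/self/mountinfo") -> str:
    with open(path) as f:
        return f.read()


# Hypothetical sample data, not taken from a real system.
SAMPLE_BEFORE = """\
22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
23 22 0:21 / /proc rw,nosuid shared:2 - proc proc rw
"""
SAMPLE_AFTER = SAMPLE_BEFORE + (
    "500 22 8:1 /usr/bin/runc /tmp/example-runc ro,relatime shared:1"
    " - ext4 /dev/sda1 rw\n"
)

if __name__ == "__main__":
    added, removed = diff_mounts(
        parse_mountinfo(SAMPLE_BEFORE), parse_mountinfo(SAMPLE_AFTER)
    )
    for m in added:
        print("mounted:", m.mount_point, m.options)
    for m in removed:
        print("unmounted:", m.mount_point)
```

The runc bind mount is short-lived, so comparing two snapshots taken by hand will usually miss it. A watcher has to wait for change notifications on the file instead (the mount table files under /proc are pollable), and each notification costs a full re-parse like the one above.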
I once suggested documenting how users could set up a sealed memfd bind-mount themselves, but from memory I don't think that's really possible. I'll need to look into it a bit more.
Unfortunately, most other protections (the immutable bit, letting the host do the bind-mount itself, etc.) have minor downsides, because there are reasonable cases where they might not work: a container with extra capabilities, a host update that clears some entries from the mount table for some reason, and so on.