Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple setting cgroup device rules from cgroup manager #3452

Merged
merged 3 commits into from
May 25, 2022

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Apr 8, 2022

This is an alternative to #3421, which tries to remove the functionality of setting cgroup device access rules from the cgroup managers.

Motivation

This should help runc/libct/cg users who are

  1. Worried that adding libct/cg as a dependency increases the binary size considerably (because of cilium/ebpf dependency).
  2. Not interested in managing device rules for cgroups (such as kubelet, which only uses libct/cg to create parent/pod cgroups). Previously we have added two workarounds: SkipDevices and SkipFreezeOnSet for such users, and with this code SkipDevices can be omitted entirely in case the sources do not include github.com/opencontainers/runc/libcontainer/cgroups/devices package.

Implementation

This

  • Moves the functionality to set cgroup device rules to a separate package, libct/cg/devices. Those are setV1 and setV2. Same for generating systemd cgroup device properties (systemdProperties).
  • If the above package is included, it autoconfigures cgroup managers to manage devices; if not, they will return an error if any devices rules are specified. So, a user can choose to have a cgroup manager with or without device management functionality (by adding or not adding an import statement).

As a result:

  • By default, cgroup managers are not able to set any device rules (and will error out if any devices rules are set in config)
  • If device rules management is needed, one has to do something like
+import _ "github.com/opencontainers/runc/libcontainer/cgroups/devices"

Caveats

  • It seems impossible to get rid of SkipFreezeOnSet for systemd v1 cgroup manager since it can manage an existing systemd unit which has some device rules set, and any update of such unit results in rules being reloaded, so we have to freeze it before set.

@kolyshkin kolyshkin force-pushed the separate-devices branch 2 times, most recently from c08172a to 9833a73 Compare April 8, 2022 22:25
@kolyshkin
Copy link
Contributor Author

@AkihiroSuda PTAL 🙏🏻 (if you like this more or less than #3421). It is not finished, but it is more or less clear where this is going.

Personally I think this is better than the build tag, because we can use the same library with and without device management (i.e. a runtime switch is better than the compile time)

// DeviceFilter returns eBPF device filter program and its license string
func DeviceFilter(rules []*devices.Rule) (asm.Instructions, string, error) {
// deviceFilter returns eBPF device filter program and its license string.
func deviceFilter(rules []*devices.Rule) (asm.Instructions, string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably no need to unexpose DeviceFilter, Emulator, etc.
Or this could be another PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a separate commit so I can just drop it.

Or, if moving everything in a single package is not that good (let's see what @cyphar says), this can be undone, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving it to a single package is fine. Honestly I probably should've kept it in this package in the first place (I wish Go had project-level internal symbols that didn't require you to restructure everything to use internal/).

AkihiroSuda
AkihiroSuda previously approved these changes Apr 14, 2022
@kolyshkin
Copy link
Contributor Author

@cyphar PTAL

@cdoern
Copy link
Contributor

cdoern commented Apr 27, 2022

@kolyshkin is there any movement on this?

@kolyshkin
Copy link
Contributor Author

@cyphar @opencontainers/runc-maintainers PTAL

AkihiroSuda
AkihiroSuda previously approved these changes May 17, 2022
@cdoern
Copy link
Contributor

cdoern commented May 17, 2022

@kolyshkin @AkihiroSuda @cyphar is there any chance of getting this merged this week so work on containers/common#936 can resume?

@kolyshkin
Copy link
Contributor Author

@kolyshkin @AkihiroSuda @cyphar is there any chance of getting this merged this week so work on containers/common#936 can resume?

Being the author of this, I can not really influence its acceptance (although I am a maintainer here, I can not accept my own PRs, and so I rely on other maintainers for this, the same way they rely on me).

You can still work on that using my repo fork; I guess that until a tagged runc release is made, you can't include this anyway so the fork is as good as the unreleased version.

@kolyshkin
Copy link
Contributor Author

OK, I have simplified the code a bit, removed all the new methods from cgroup managers, replacing those with 3 functions variables added to cgroups package, and added init to libct/cg/dev which sets those three variables to its own implementations. This should simplify the review as well. Still a big change, but mostly it's just moving the code around without changing it.

cyphar
cyphar previously requested changes May 18, 2022
libcontainer/cgroups/cgroups.go Outdated Show resolved Hide resolved
libcontainer/factory_linux.go Show resolved Hide resolved
libcontainer/cgroups/systemd/freeze_test.go Show resolved Hide resolved
This moves the functionality related to devices, SkipDevices, and
SkipFreezeOnSet to a separate file, in preparation for the next commit.

No code changes.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This commit separates the functionality of setting cgroup device
rules out of libct/cgroups to libct/cgroups/devices package. This
package, if imported, sets the function variables in libct/cgroups and
libct/cgroups/systemd, so that a cgroup manager can use those to manage
devices. If those function variables are nil (when libct/cgroups/devices
are not imported), a cgroup manager returns the ErrDevicesUnsupported
in case any device rules are set in Resources.

It also consolidates the code from libct/cgroups/ebpf and
libct/cgroups/ebpf/devicefilter into libct/cgroups/devices.

Moved some tests in libct/cg/sd that require device management to
libct/sd/devices.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
These are only used from inside the package, and we don't want them to
be public.

The only two methods left are Enable and Disable.

While at it, fix or suppress found lint-extra warnings.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@AkihiroSuda AkihiroSuda merged commit 016a0d2 into opencontainers:main May 25, 2022
cdoern pushed a commit to cdoern/common that referenced this pull request Jun 6, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern pushed a commit to cdoern/common that referenced this pull request Jun 6, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern pushed a commit to cdoern/common that referenced this pull request Jun 7, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern added a commit to cdoern/common that referenced this pull request Jun 7, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern added a commit to cdoern/common that referenced this pull request Jun 7, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern added a commit to cdoern/common that referenced this pull request Jun 7, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern added a commit to cdoern/common that referenced this pull request Jun 7, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern added a commit to cdoern/common that referenced this pull request Jun 7, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern added a commit to cdoern/common that referenced this pull request Jun 7, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern added a commit to cdoern/common that referenced this pull request Jun 8, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
cdoern added a commit to cdoern/common that referenced this pull request Jun 8, 2022
switch c/common to use runc cgroup creation so that we can use resource limits

This entails importing the newly refactored runc code to manage reading from and writing to cgroup.
vendoring in directly an unreleased runc commit from opencontainers/runc#3452

Signed-off-by: cdoern <cdoern@redhat.com>
@lifubang lifubang mentioned this pull request Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants