Proposal: define an "ALL_CAPS" pseudo-capability to grant all capabilities #1071
Comments
I think a standard way to ask the runtime what it supports might be better. The runtime could return a JSON document listing everything it supports, and the runtime should always use a subset.
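A minimal sketch of how a caller might consume such a document, assuming the runtime prints JSON with a top-level "capabilities" array. That field name, the sample document, and the helper `usable_capabilities` are hypothetical; they are not a format defined by the specification or by any particular runtime.

```python
import json

# Hypothetical document a runtime might return when asked what it supports.
# The field name "capabilities" is an assumption made for illustration.
RUNTIME_FEATURES_JSON = """
{
  "capabilities": ["CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_NET_ADMIN"]
}
"""


def usable_capabilities(features_json, kernel_caps):
    """Return the kernel capabilities the runtime also reports, in kernel order."""
    runtime_caps = set(json.loads(features_json).get("capabilities", []))
    return [cap for cap in kernel_caps if cap in runtime_caps]


# Capabilities the caller detected on the host kernel (illustrative subset).
kernel_caps = ["CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_NET_ADMIN", "CAP_BPF"]
granted = usable_capabilities(RUNTIME_FEATURES_JSON, kernel_caps)
print(granted)
```

The caller only ever passes the intersection, so a capability that the kernel knows but the runtime does not (CAP_BPF here) is dropped rather than rejected.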
Alternative idea: what do you think about supporting the capability value in addition to its name? For example:
"capabilities": {
    "bounding": [
        "CAP_CHOWN",
        "1",
        "CAP_DAC_READ",
        ...
The higher-level runtimes could read the maximum value from
Yes, I think numeric values would work (at the cost of not being very human-readable, but perhaps that's not the biggest concern 🤔)
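To make the numeric-value idea concrete, here is a rough sketch of how a runtime might accept either a name or a decimal string. The helper `resolve_capability` and its `last_cap` parameter (the highest capability number the kernel supports, however the caller obtains it) are assumptions for illustration; the name table is a partial copy of the kernel's capability numbering.

```python
# Partial name-to-number table, following the Linux kernel's capability numbering.
KNOWN_NAMES = {
    "CAP_CHOWN": 0,
    "CAP_DAC_OVERRIDE": 1,
    "CAP_DAC_READ_SEARCH": 2,
    "CAP_PERFMON": 38,
    "CAP_BPF": 39,
}


def resolve_capability(entry, last_cap):
    """Map a config entry (a name or a decimal string) to a capability number."""
    if entry.isdecimal():
        number = int(entry)
    elif entry in KNOWN_NAMES:
        number = KNOWN_NAMES[entry]
    else:
        raise ValueError(f"unknown capability name: {entry}")
    if number > last_cap:
        raise ValueError(f"capability {entry} exceeds kernel maximum {last_cap}")
    return number


# A number is accepted even when this table has no name for it yet.
print(resolve_capability("40", last_cap=40))
print(resolve_capability("CAP_CHOWN", last_cap=40))
```

With this shape, a runtime whose name table lags behind the kernel can still grant a newer capability when it is given by number, which is the property the comment above is after.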
Having runc report it isn't bad, but I think in practice it is not very usable for this case. runc's update cycle is very different from higher level runtimes, so we can:
It would be nice not to depend on a library for an up-to-date listing of names, if for no other reason than this discrepancy in update cycles.
Related: opencontainers/runc#3296, which implemented a
While the list of capabilities in the kernel has been relatively stable, new capabilities were recently added (CAP_PERFMON, CAP_BPF, and CAP_CHECKPOINT_RESTORE).
This proved to be a challenge: docker, for example, was updated to be aware of these new capabilities (and detects whether the kernel it's running on supports them), but the current runc release (and possibly other runtimes) does not yet recognize them.
The specification currently defines that, in order to grant capabilities to a container process,
the container configuration has to specify those capabilities:
In most situations, this is not a problem. For example, if I'm running on a 5.8+ kernel and want to grant my container the CAP_BPF capability, I start the container with --cap-add CAP_BPF. Attempting to do the same on an older kernel version will produce an error (generated either by dockerd or by runc).
However, when granting a container all capabilities (for example, when using --cap-add=ALL, or when running a container with --privileged), things become problematic.
In this situation, dockerd generates a list of all capabilities supported by the host's kernel, and sets those capabilities in the container configuration. On a 5.8+ kernel, this will include the new capabilities (CAP_PERFMON, CAP_BPF, and CAP_CHECKPOINT_RESTORE).
Docker has no way to detect which capabilities are supported by the runtime, and runc (or another runtime), for its part, processes the list of capabilities and produces an error for any "unknown" capability.
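The mismatch can be sketched as follows. This is an illustration of the failure mode only, not runc's actual validation code: the set `RUNTIME_KNOWN`, the helper `validate_capabilities`, and the capability subsets are made up for the example.

```python
# Capabilities an older runtime build recognizes (illustrative subset).
RUNTIME_KNOWN = {"CAP_CHOWN", "CAP_NET_ADMIN", "CAP_SYS_ADMIN"}


def validate_capabilities(requested, known):
    """Reject the whole list if any entry is unknown to the runtime."""
    unknown = [cap for cap in requested if cap not in known]
    if unknown:
        raise ValueError("unknown capabilities: " + ", ".join(unknown))
    return list(requested)


# What a higher-level engine might generate from a 5.8+ kernel for "all".
from_kernel = ["CAP_CHOWN", "CAP_NET_ADMIN", "CAP_SYS_ADMIN", "CAP_PERFMON", "CAP_BPF"]

try:
    validate_capabilities(from_kernel, RUNTIME_KNOWN)
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

The engine's list is correct for the kernel, yet the container fails to start because the two components disagree about which names exist.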
While docker could account for the runtime not supporting certain capabilities (which is what's currently done as a temporary solution in moby/moby#41563), doing so is undesirable, as it would tightly couple docker to the runtime (and would complicate using alternative runtimes, such as crun, gVisor (runsc), or others).
Proposal
My proposal is to delegate generation of the "all capabilities" list to the runtime, and to include a special ALL_CAPS value (just a suggestion, I'm not attached to the name) in the specification.
- Runtimes that do not support the ALL_CAPS special value consider it an "unknown capability", and will produce an error (as defined by the specification).
- Runtimes that support the ALL_CAPS special value will materialize the list of capabilities, and add all capabilities that the runtime (and active kernel) supports.
- When combining ALL_CAPS with other capabilities (e.g. ALL_CAPS and CAP_CHMOD), ALL_CAPS must take precedence. Alternatively, this situation could be considered ambiguous, and an error can be produced (we should consider what's more future-proof in case additional "special" values are to be added in future).
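The rules above could be sketched roughly like this on the runtime side. It is an illustration under assumptions, not a reference implementation: `expand_capabilities`, its `strict` switch (which selects the "treat as ambiguous" alternative), and the two input collections are hypothetical names.

```python
ALL_CAPS = "ALL_CAPS"  # placeholder name taken from the proposal


def expand_capabilities(requested, runtime_known, kernel_supported, strict=False):
    """Return the concrete capability list for one capability set.

    runtime_known and kernel_supported are hypothetical inputs; a real
    runtime would derive them from its own tables and the running kernel.
    """
    # "All" means everything both the runtime and the active kernel support.
    supported = [cap for cap in kernel_supported if cap in runtime_known]
    if ALL_CAPS in requested:
        others = [cap for cap in requested if cap != ALL_CAPS]
        if strict and others:
            raise ValueError("ALL_CAPS combined with: " + ", ".join(others))
        # ALL_CAPS takes precedence: any other entries are ignored.
        return supported
    unknown = [cap for cap in requested if cap not in supported]
    if unknown:
        raise ValueError("unknown capabilities: " + ", ".join(unknown))
    return list(requested)


runtime_known = {"CAP_CHOWN", "CAP_KILL", "CAP_NET_ADMIN"}
kernel_supported = ["CAP_CHOWN", "CAP_KILL", "CAP_NET_ADMIN", "CAP_BPF"]
everything = expand_capabilities(["ALL_CAPS"], runtime_known, kernel_supported)
print(everything)
```

Because the runtime materializes the list itself, a kernel capability it does not yet know (CAP_BPF in this example) is simply left out instead of causing an error, while explicitly requested unknown capabilities are still rejected.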
Compatibility and downsides
Ideally, docker would be able to detect what version of the runtime-spec is supported
by a runtime, but this is likely a separate discussion to have.
As described above, runtimes that do not support the ALL_CAPS special value will produce an error. This could be considered a breaking change; on the other hand, the current situation already fails to handle new capabilities being added to the list.
Having an ALL_CAPS capability makes the container configuration "non-declarative"; the meaning of "all" capabilities will depend on the runtime, and the kernel on which it's running. I don't think that's worse than the current situation, in which the same applies, only at a higher level (dockerd or containerd supporting the new capabilities).