Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding cgroups path to the Spec. #137

Merged
merged 1 commit into from
Sep 10, 2015
Merged

Conversation

vishh
Copy link
Contributor

@vishh vishh commented Sep 2, 2015

Related to #114
This change should be useful for exec'ing new processes in an existing container.

@@ -130,6 +130,13 @@ Next parameters can be specified:
Also known as cgroups, they are used to restrict resource usage for a container and handle
device access. cgroups provide controls to restrict cpu, memory, IO, and network for
the container. For more information, see the [kernel cgroups documentation](https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt).
The path to the cgroups can be specified in the Spec. The path is expected to be relative
to the cgroups mount point. The Spec does not support split hierarchy. The cgroups will be
created if they don't exist.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be one line per sentence. I'll file a PR suggesting we move stuff like that from the README to a CONTRIBUTING.md

@wking
Copy link
Contributor

wking commented Sep 2, 2015

This might be worth floating on the list first 1 ;). Reviewing
different proposals (e.g. your path in be6ec5f vs. my array of
objects in 2) is probably too much to wedge into GitHub's inline
comments :p.

@wking wking mentioned this pull request Sep 2, 2015
@vishh
Copy link
Contributor Author

vishh commented Sep 3, 2015

@wking: Apologies for not being updated on the new contribution workflows.

@mrunalp: Any thoughts on systemd compatibility?

@@ -21,6 +21,9 @@ type LinuxRuntime struct {
// Resources contain cgroup information for handling resource constraints
// for the container
Resources Resources `json:"resources"`
// CgroupsPath specifies the path to cgroups that are created and/or joined by the container.
// The path is expected to be relative to the cgroups mountpoint.
CgroupsPath string `json:"cgroupsPath"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vishh It will be either CgroupsPath or Resources, right? We should either clarify that in the spec or have a layer of indirection.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree that it should either be path or resources as it doesnt really make sense ( or even wrong ) to join with existing paths and then modify those resources.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrunalp: How about creating the cgroups for the user and applying the limits expressed via Resources?
@dqminh: Why would it be wrong? AFAIK the Spec doesn't try to limit how the existing Linux functionalities can be consumed right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vishh There could be two interpretations. One is simply joining the cgroups (that were configured by some other tool or by another container) and other is using that path and then going in and modifying the values under that path. The first is necessary for us to support exec like functionality. Does that work for your use cases?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm actually i withdraw my comment about enforcing the limitation of cgroupPaths vs resources. I think there can be usecases where having that capabilities is helpful ( like launching an agent container that modifies the cgroup resources at boot time and/or during the agent runtime maybe ? ).

I think that clarifying the order of resources modification is good enough i.e, if cgroups path is specified then we join the paths, otherwise we create new paths, and then apply the resources limit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrunalp: Your reasoning is sound! I see use cases for both exec and resource updates. Would making Resources optional resolve the issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Tue, Sep 08, 2015 at 03:23:20PM -0700, Daniel, Dao Quang Minh wrote:

I think that clarifying the order of resources modification is good
enough i.e, if cgroups path is specified then we join the paths,
otherwise we create new paths, and then apply the resources limit.

See also comments in #155 and #158, where we're having the same
discussion about namespaces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrunalp: Your reasoning is sound! I see use cases for both exec and resource updates. Would making Resources optional resolve the issue?

@vishh Sounds good to me!

@mrunalp
Copy link
Contributor

mrunalp commented Sep 4, 2015

@vishh I think that this should work with systemd as well. The only difference is the terminology AFAICS.

For e.g.

10:blkio:/system.slice/sshd.service
9:hugetlb:/
8:perf_event:/
7:devices:/system.slice/sshd.service
6:freezer:/
5:memory:/system.slice/sshd.service
4:net_cls,net_prio:/
3:cpuset:/
2:cpu,cpuacct:/system.slice/sshd.service
1:name=systemd:/system.slice/sshd.service

# mount
...
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
...

@vbatts vbatts added this to the draft milestone Sep 4, 2015
@philips
Copy link
Contributor

philips commented Sep 9, 2015

nit: reformat to one sentence per line please

@mrunalp
Copy link
Contributor

mrunalp commented Sep 9, 2015

Needs rebase as well.

@vishh
Copy link
Contributor Author

vishh commented Sep 9, 2015

@mrunalp @crosbymichael: WDYT of making cgroups path required? Without that, the options are cgroups can created under under root /, or under a user specified hierarchy (which requires configuration).

@mrunalp
Copy link
Contributor

mrunalp commented Sep 9, 2015

@vishh I think we could default to / if cgroups path is not specified and a user specified path otherwise.

@vishh
Copy link
Contributor Author

vishh commented Sep 9, 2015

What will be the identifier we use for the cgroups? Will it be the base
name of the bundle?

On Wed, Sep 9, 2015 at 4:49 PM, Mrunal Patel notifications@github.com
wrote:

@vishh https://github.com/vishh I think we could default to / if
cgroups path is not specified and a user specified path otherwise.


Reply to this email directly or view it on GitHub
#137 (comment).

@vbatts
Copy link
Member

vbatts commented Sep 9, 2015

That is a common thought.
On Sep 9, 2015 19:55, "Vish Kannan" notifications@github.com wrote:

What will be the identifier we use for the cgroups? Will it be the base
name of the bundle?

On Wed, Sep 9, 2015 at 4:49 PM, Mrunal Patel notifications@github.com
wrote:

@vishh https://github.com/vishh I think we could default to / if
cgroups path is not specified and a user specified path otherwise.


Reply to this email directly or view it on GitHub
<#137 (comment)
.


Reply to this email directly or view it on GitHub
#137 (comment).

@mrunalp
Copy link
Contributor

mrunalp commented Sep 10, 2015

@vishh runc uses the --id or the base name if not specified.

@wking
Copy link
Contributor

wking commented Sep 10, 2015

On Wed, Sep 09, 2015 at 04:55:45PM -0700, Vish Kannan wrote:

What will be the identifier we use for the cgroups? Will it be the
base name of the bundle?

Does this need to be spec'ed out? There's not currently much language
around container IDs. We just list them in state.json, and I've
pointed out that I'm not clear on why 1. It makes sense that we'd
want some sort of handle for tracking “the same container” across a
checkpoint/restore migration (which is likely to change the PID), but
I don't see a need to bind the container ID in more tightly than that.
I think we should leave the cgroup/namespace name generation to the
runtime, and have consumers query (via the PID) to figure out the
cgroup/namespace name if they need it for something.

 Message-ID: <20150905041822.GA25638@odin.tremily.us>

@vishh vishh force-pushed the cgroups-path branch 2 times, most recently from 0ae67b0 to 1a36827 Compare September 10, 2015 00:10
@vishh
Copy link
Contributor Author

vishh commented Sep 10, 2015

@wking: My understanding is that it is recommended for runtimes to manage cgroups. But for a good dev experience, we can default to creating cgroups under / with an easy to use identifier (base name of the bundle).

@vishh
Copy link
Contributor Author

vishh commented Sep 10, 2015

@mrunalp @vbatts @dqminh Updated docs based on comments and rebased. PTAL.

@vbatts
Copy link
Member

vbatts commented Sep 10, 2015

Though base of the bundle would not be unique enough for more than 1
container runtime. Let's leave this id for the runtime. Won't the paths be
referencable through the /run/opencontainers/ path anyways.
On Sep 9, 2015 20:13, "Vish Kannan" notifications@github.com wrote:

@wking https://github.com/wking: My understanding is that it is
recommended for runtimes to manage cgroups. But for a good dev experience,
we can default to creating cgroups under / with an easy to use identifier
(base name of the bundle).


Reply to this email directly or view it on GitHub
#137 (comment).

@vishh
Copy link
Contributor Author

vishh commented Sep 10, 2015

Yeah. The cgroup paths can be retrieved from the state.

@wking
Copy link
Contributor

wking commented Sep 10, 2015

On Wed, Sep 09, 2015 at 05:13:32PM -0700, Vish Kannan wrote:

My understanding is that it is recommended for runtimes to manage
cgroups. But for a good dev experience, we can default to creating
cgroups under / with an easy to use identifier (base name of the
bundle).

If folks can spawn new containers without setting ‘cgroupsPath’, then
folks using a runtime shouldn't be guessing “/{container.id}” (or
whatever) as a cgroup path, they'll have to look it up for each
container (which I think is a good thing).

So since users will have to lookup a process's cgroups path anyway,
why should we bother specifying a rule to auto-generate paths? If
different runtimes pick different rules for the default cgroup path,
what use-cases will suffer?

Wed, Sep 09, 2015 at 06:43:27PM -0700, Vish Kannan:

Yeah. The cgroup paths can be retrieved from the state.

Not directly. Folks need to use the PID to get them (and that seems
fine to me), at least with the spec that landed via #87, see 1.

@@ -131,6 +131,58 @@ Also known as cgroups, they are used to restrict resource usage for a container
cgroups provide controls to restrict cpu, memory, IO, pids and network for the container.
For more information, see the [kernel cgroups documentation](https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt).

The path to the cgroups can to be specified in the Spec via `cgroupsPath`.
`cgroupsPath` is expected to be relative to the cgroups mount point.
If not specified, cgroups will be created under '/' with the base name of the bundle.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more reasonable / simpler to error out here rather than continuing and generate a path based on base name of the bundle. cgroupsPath should be a required data right ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Thu, Sep 10, 2015 at 01:54:07AM -0700, Daniel, Dao Quang Minh wrote:

+If not specified, cgroups will be created under '/' with the base name of the bundle.

I think it's more reasonable / simpler to error out here rather than
continuing and generate a path based on base name of the
bundle. cgroupsPath should be a required data right ?

I often want cgroups for my containers, but don't care what the path
is. I think we do want the ability to tell the runtime “I don't care,
just make something up”, which is what leaving cgroupsPath out does.
But I'd rather leave the “make something up” algorithm up to the
runtime 1. See also the “several containers from one bundle” use
case 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of defaulting to sub-containers of root. Ping @crosbymichael @mrunalp @vbatts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Thu, Sep 10, 2015 at 10:46:59AM -0700, Vish Kannan wrote:

+If not specified, cgroups will be created under '/' with the base name of the bundle.

I'm in favor of defaulting to sub-containers of root.

If anyone disagrees with that, I think it's a whole nother can of
worms. I think the disagreement so far has been about what the
sub-container is going to be called, not where it will be rooted.

@vishh vishh force-pushed the cgroups-path branch 2 times, most recently from 461ad52 to ef882c5 Compare September 10, 2015 18:14
@mrunalp
Copy link
Contributor

mrunalp commented Sep 10, 2015

LGTM

Signed-off-by: Vishnu Kannan <vishnuk@google.com>
@crosbymichael
Copy link
Member

LGTM

The path to the cgroups can to be specified in the Spec via `cgroupsPath`.
`cgroupsPath` is expected to be relative to the cgroups mount point.
If not specified, cgroups will be created under '/'.
Implementations of the Spec can choose to name cgroups in any manner.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only true if cgroupsPath is unset. So maybe rephrase as “In that case, implementations can choose to name cgroups in any manner.” to tie it into the previous ”If not specified…” sentence.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's clear enough for now.

On Thu, Sep 10, 2015 at 2:55 PM, W. Trevor King notifications@github.com
wrote:

In runtime-config-linux.md
#137 (comment):

@@ -131,6 +131,60 @@ Also known as cgroups, they are used to restrict resource usage for a container
cgroups provide controls to restrict cpu, memory, IO, pids and network for the container.
For more information, see the kernel cgroups documentation.

+The path to the cgroups can to be specified in the Spec via cgroupsPath.
+cgroupsPath is expected to be relative to the cgroups mount point.
+If not specified, cgroups will be created under '/'.
+Implementations of the Spec can choose to name cgroups in any manner.

This is only true if cgroupsPath is unset. So maybe rephrase as “In that
case, implementations can choose to name cgroups in any manner.” to tie it
into the previous ”If not specified…” sentence.


Reply to this email directly or view it on GitHub
https://github.com/opencontainers/specs/pull/137/files#r39198980.

@vbatts
Copy link
Member

vbatts commented Sep 10, 2015

LGTM

vbatts added a commit that referenced this pull request Sep 10, 2015
Adding cgroups path to the Spec.
@vbatts vbatts merged commit 88b0fc5 into opencontainers:master Sep 10, 2015
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Nov 16, 2015
This should help clarify the cgroupsPath setting added in opencontainers#137, which
was the subject of some confusion in opencontainers/runc#397.  Issues
I'm trying to clarify here:

* If you specify a cgroupsPath, is the container added to that path or
  a sub-cgroup underneath it [1]?  (This commit rules in favor of
  "added to that path")

* If you specify a cgroupsPath, can the runtime modify that cgroup
  [2]?  (This commit rules "yes, if 'resources' is specified",
  following [3] and the Go comment from opencontainers#137 [4]).

To help make the distinctions clearer, I've added a facet list to help
folks think about the difference between cgroup creation, process
assignment, and resource configuration.  cgroupsPath is just about
cgroup creation and process assignment.  'resources' is just about
resource configuration.  I've listed out Mrunal's first three cases
[3] to be even clearer.  I stayed away from the "neither are set"
case, since I covered that fairly directly in opencontainers#237, which that was
punted back to the list [5] and has seen no further interest.  So I'm
not clear on what the intended semantics are there, although Mrunal's
wording in [4] seems to agree with the proposal in opencontainers#237.

[1]: opencontainers/runc#397 (comment)
[2]: opencontainers/runc#397 (comment)
[3]: opencontainers/runc#397 (comment)
[4]: opencontainers@429f936#diff-34c30be66233f08b447fb608ea0e66bbR30
[5]: https://groups.google.com/a/opencontainers.org/d/msg/dev/qWHoKs8Fsrk/c9mv6qXtDAAJ
     Message-ID: <20151029194427.GA30073@odin.tremily.us>

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Dec 16, 2015
This should help clarify the cgroupsPath setting added in opencontainers#137, which
was the subject of some confusion in opencontainers/runc#397.  Issues
I'm trying to clarify here:

* If you specify a cgroupsPath, is the container added to that path or
  a sub-cgroup underneath it [1]?  (This commit rules in favor of
  "added to that path")

* If you specify a cgroupsPath, can the runtime modify that cgroup
  [2]?  (This commit rules "yes, if 'resources' is specified",
  following [3] and the Go comment from opencontainers#137 [4]).

To help make the distinctions clearer, I've added a facet list to help
folks think about the difference between cgroup creation, process
assignment, and resource configuration.  cgroupsPath is just about
cgroup creation and process assignment.  'resources' is just about
resource configuration.  I've listed out Mrunal's first three cases
[3] to be even clearer.  I stayed away from the "neither are set"
case, since I covered that fairly directly in opencontainers#237, which that was
punted back to the list [5] and has seen no further interest.  So I'm
not clear on what the intended semantics are there, although Mrunal's
wording in [4] seems to agree with the proposal in opencontainers#237.

[1]: opencontainers/runc#397 (comment)
[2]: opencontainers/runc#397 (comment)
[3]: opencontainers/runc#397 (comment)
[4]: opencontainers@429f936#diff-34c30be66233f08b447fb608ea0e66bbR30
[5]: https://groups.google.com/a/opencontainers.org/d/msg/dev/qWHoKs8Fsrk/c9mv6qXtDAAJ
     Message-ID: <20151029194427.GA30073@odin.tremily.us>

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Dec 16, 2015
This should help clarify the cgroupsPath setting added in opencontainers#137, which
was the subject of some confusion in opencontainers/runc#397.  Issues
I'm trying to clarify here:

* If you specify a cgroupsPath, is the container added to that path or
  a sub-cgroup underneath it [1]?  (This commit rules in favor of
  "added to that path")

* If you specify a cgroupsPath, can the runtime modify that cgroup
  [2]?  (This commit rules "yes, if 'resources' is specified",
  following [3] and the Go comment from opencontainers#137 [4]).

To help make the distinctions clearer, I've added a facet list to help
folks think about the difference between cgroup creation, process
assignment, and resource configuration.  cgroupsPath is just about
cgroup creation and process assignment.  'resources' is just about
resource configuration.  I've listed out Mrunal's first three cases
[3] to be even clearer.  I stayed away from the "neither are set"
case, since I covered that fairly directly in opencontainers#237, which that was
punted back to the list [5] and has seen no further interest.  So I'm
not clear on what the intended semantics are there, although Mrunal's
wording in [4] seems to agree with the proposal in opencontainers#237.

[1]: opencontainers/runc#397 (comment)
[2]: opencontainers/runc#397 (comment)
[3]: opencontainers/runc#397 (comment)
[4]: opencontainers@429f936#diff-34c30be66233f08b447fb608ea0e66bbR30
[5]: https://groups.google.com/a/opencontainers.org/d/msg/dev/qWHoKs8Fsrk/c9mv6qXtDAAJ
     Message-ID: <20151029194427.GA30073@odin.tremily.us>

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request May 9, 2017
The only discussion related to this is in [1,2], where the
relationship between oomScoreAdj and disableOOMKiller is raised. But
since 429f936 (Adding cgroups path to the Spec, 2015-09-02, opencontainers#137)
resources has been tied to cgroups, and oomScoreAdj is not about
cgroups.  For example, we currently have (in config-linux.md):

  You can configure a container's cgroups via the resources field of
  the Linux configuration.

I suggested we move the property from linux.resources.oomScoreAdj to
linux.oomScoreAdj so config authors and runtimes don't have to worry
about what cgroupsPath means if the only entry in resources is
oomScoreAdj.  Michael responded with [4]:

  If anything it should probably go on the process

So that's what this commit does.

[1]: opencontainers#236
[2]: opencontainers#292
[3]: opencontainers#137
[4]: opencontainers#782 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request May 9, 2017
The only discussion related to this is in [1,2], where the
relationship between oomScoreAdj and disableOOMKiller is raised. But
since 429f936 (Adding cgroups path to the Spec, 2015-09-02, opencontainers#137)
resources has been tied to cgroups, and oomScoreAdj is not about
cgroups.  For example, we currently have (in config-linux.md):

  You can configure a container's cgroups via the resources field of
  the Linux configuration.

I suggested we move the property from linux.resources.oomScoreAdj to
linux.oomScoreAdj so config authors and runtimes don't have to worry
about what cgroupsPath means if the only entry in resources is
oomScoreAdj.  Michael responded with [4]:

  If anything it should probably go on the process

So that's what this commit does.

I've gone with the four-space indents here to keep Pandoc happy (see
7795661 (runtime.md: Fix sub-bullet indentation, 2016-06-08, opencontainers#495),
but have left the existing entries in this list unchanged to reduce
churn.

[1]: opencontainers#236
[2]: opencontainers#292
[3]: opencontainers#137
[4]: opencontainers#782 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request May 9, 2017
The only discussion related to this is in [1,2], where the
relationship between oomScoreAdj and disableOOMKiller is raised. But
since 429f936 (Adding cgroups path to the Spec, 2015-09-02, opencontainers#137)
resources has been tied to cgroups, and oomScoreAdj is not about
cgroups.  For example, we currently have (in config-linux.md):

  You can configure a container's cgroups via the resources field of
  the Linux configuration.

I suggested we move the property from linux.resources.oomScoreAdj to
linux.oomScoreAdj so config authors and runtimes don't have to worry
about what cgroupsPath means if the only entry in resources is
oomScoreAdj.  Michael responded with [4]:

  If anything it should probably go on the process

So that's what this commit does.

I've gone with the four-space indents here to keep Pandoc happy (see
7795661 (runtime.md: Fix sub-bullet indentation, 2016-06-08, opencontainers#495),
but have left the existing entries in this list unchanged to reduce
churn.

[1]: opencontainers#236
[2]: opencontainers#292
[3]: opencontainers#137
[4]: opencontainers#782 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Jul 12, 2017
It's backed by memory.oom_control, so this commit moves it in with
the rest of the memory-controller config.

Looking at the history, the initial request landing a setting for this
in the Docker/OCI ecosystem seems to be [1], which added
Cgroup.OomKillDisable.  That commit was carried from libcontainer into
runC [2] where it is now Resources.OomKillDisable [3].  From runC it
was carried into this repo (with some renaming) in [4].  Subsequent
early doc updates landed in [5,6].  In none of those can I find
discussion about why the setting is not already under memory.  I
expect the reason is that the runC structures are flat, so "under
memory" is not a thing there.  But in this spec, resources has
per-controller sub-properties.  The fact that disableOOMKiller
belonged to the memory controller may have been overlooked in [4] and
never revisited until now.

[1]: docker-archive/libcontainer#417
     Subject: cgroups: add support for oom control
[2]: opencontainers/runc@295c708
     Subject: cgroups: add support for oom control
[3]: https://github.com/opencontainers/runc/blob/v1.0.0-rc3/libcontainer/configs/cgroup_unix.go#L113-L114
[4]: opencontainers#51
     Subject: Add Go types for specification
[5]: opencontainers#137
     Subject: Adding cgroups path to the Spec.
[6]: opencontainers#199
     Subject: runtime: config: linux: add cgroups informations

Signed-off-by: W. Trevor King <wking@tremily.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants