Skip to content
This repository has been archived by the owner on Feb 24, 2020. It is now read-only.

Expose relative cpu shares #3242

Closed
yifan-gu opened this issue Sep 23, 2016 · 12 comments
Closed

Expose relative cpu shares #3242

yifan-gu opened this issue Sep 23, 2016 · 12 comments
Labels
Milestone

Comments

@yifan-gu
Copy link
Contributor

yifan-gu commented Sep 23, 2016

Today, k8s API expose cpu limit and cpu request for cpu resource restriction.

The kubelet translates cpu limits to cpu-quota/cpu-period, and translates cpu request to relative cpu shares here

Then, it passes runtime the cpu share, cpu quota, and cpu period.

But for rkt app add we only have --cpu which accepts millicores, and under the hood, it's setting the cfs-cpu-quota, not the the relative cpu shares.

We need to add another flag (e.g. --cpu-share) to expose the relative CPUShares option in the app unit

On the other hand, --cpu feels ambiguous, as a user I can't tell whether it's specifying the relative cpu share or cfs cpu quota until I dug into the implementation. So maybe we can rework the flag name as well, for example:

  • Deprecate --cpu flag, and use --cpu-quota to take a percentage number to set cfs cpu quotas.
  • Add --cpu-share flag to take a number (2 to 262144) to set the relative cpu shares.

cc @squeed @lucab @iaguis @s-urbaniak

@s-urbaniak s-urbaniak self-assigned this Oct 5, 2016
@s-urbaniak s-urbaniak added this to the v1.17.0 milestone Oct 5, 2016
@s-urbaniak
Copy link
Contributor

This sounds good to me, and is (nearly) en par with the expected interface wrt CRI.

@s-urbaniak
Copy link
Contributor

@yifan-gu as discussed OOB with @jonboulle this seems to be mainly motivated by docker CLI options [1]. In fact, looking at the k8s code [2] setting cpuShares/cpuQuota the source for setting those knobs is cpuLimit/cpuRequest anyways.

In the case of rkt the existing CPU isolator already understands the user-facing cpu requests/limits so I agree that introducing yet another indirection seems unnecessary. Can we suggest to move the code in [2] to a docker specific CRI runtime implementation, and simply pass-through the existing CPU request/limit options in https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/api/v1alpha1/runtime/api.proto#L330?

[1] https://docs.docker.com/engine/reference/run/#/runtime-constraints-on-resources
[2] https://github.com/kubernetes/kubernetes/blob/3f4a66f3d6892b8d8831a8a60b91fd1afbefee4d/pkg/kubelet/kuberuntime/kuberuntime_container.go#L193-L214

@squeed
Copy link
Contributor

squeed commented Oct 7, 2016

Regardless of the source of these options, they definitely have distinct behavior and, IMO, should be exposed separately.

The big difference between cpu-shares and cpu-quota is that the quota is a hard percentage limit enforced by the scheduler, whereas cpu-shares is a relative value that is then normalized.

If Kubernetes presents these settings distinctly, then we should not commingle them.

@s-urbaniak
Copy link
Contributor

@squeed @jonboulle Agreed, I was initially assuming both settings are being equivalent, but they are not, hence it does make sense to include setting.

@jonboulle
Copy link
Contributor

sorry, I started typing a reply and never sent it.

The reasoning was that they are not exposed in the Kubernetes API (ref: http://kubernetes.io/docs/user-guide/compute-resources/#resource-requests-and-limits-of-pod-and-container), just in CRI. Basically what Kubernetes exposes to users is its abstraction notion of a CPU, and then the kubelet does this process of converting it into shares. So the question was whether we could push this down into the runtime and simplify the CRI.

@squeed
Copy link
Contributor

squeed commented Oct 11, 2016

Hmm, this part of the doc seems to imply that both Shares and Quota are exposed in the podspec?

@jonboulle
Copy link
Contributor

Thanks for correcting me. I need better reading glasses today.
I maintain that that's particularly lazy/stupid behaviour, but looks like
the cat's out of the bag.

On 11 October 2016 at 19:13, Casey Callendrello notifications@github.com
wrote:

Hmm, this part of the doc
http://kubernetes.io/docs/user-guide/compute-resources/#how-pods-with-resource-limits-are-run
seems to imply that both Shares and Quota are exposed in the podspec?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#3242 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ACewN7TJaiek00oW_LNOqwhGgb_XVbAjks5qy8OxgaJpZM4KFcEP
.

@s-urbaniak
Copy link
Contributor

Bumping to the next release. I have a nearly working version, but systemd refuses to set the cpu.shares cgroup for some reason. I need to investigate this further to find the root cause.

@s-urbaniak s-urbaniak modified the milestones: v1.18.0, v1.17.0 Oct 12, 2016
@iaguis
Copy link
Member

iaguis commented Oct 12, 2016

Bumping to the next release. I have a nearly working version, but systemd refuses to set the cpu.shares cgroup for some reason. I need to investigate this further to find the root cause.

You need to remount the relevant knobs RW like we do with the other CPU/memory isolators :)

@s-urbaniak
Copy link
Contributor

@iaguis thanks, that was it 👍

s-urbaniak pushed a commit to s-urbaniak/rkt that referenced this issue Oct 12, 2016
Currently we have func .*V[1|2].* named function names in the cgroup
package.

This factors those functions into two separate packages cgroup/v1, and
cgroup/v2.

This is a prerequisite for rkt#3242
s-urbaniak pushed a commit to s-urbaniak/rkt that referenced this issue Oct 12, 2016
Currently we have func .*V[1|2].* named function names in the cgroup
package.

This factors those functions into two separate packages cgroup/v1, and
cgroup/v2.

This is a prerequisite for rkt#3242
@jonboulle jonboulle reopened this Oct 20, 2016
s-urbaniak pushed a commit to s-urbaniak/rkt that referenced this issue Oct 20, 2016
s-urbaniak pushed a commit to s-urbaniak/rkt that referenced this issue Oct 25, 2016
@lucab lucab modified the milestones: v1.19.0, v1.18.0 Oct 25, 2016
s-urbaniak pushed a commit to s-urbaniak/rkt that referenced this issue Nov 1, 2016
@s-urbaniak
Copy link
Contributor

  • Deprecate --cpu flag, and use --cpu-quota to take a percentage number to set cfs cpu quotas.

As discussed with @lucab OOB we will stick for now with the --cpu CLI. We do a mechanical conversion from cpu-millis to percent as per [1], hence deprecating this flag doesn't make a huge difference.

  • Add --cpu-share flag to take a number (2 to 262144) to set the relative cpu shares.

This was addressed in #3299, and is merged.

@lucab I hope I summarized the current state correctly, therefore I am inclined to close this issue, unless we have a consensus that deprecating --cpu is an idea we want to pursue.

[1] https://github.com/coreos/rkt/blob/9b4b356/stage1/init/common/units.go#L532

@lucab
Copy link
Member

lucab commented Nov 7, 2016

I think that for the time being we are fine with --cpu and --cpu-share. We may revisit --cpu semantics if we need at some point in the future, but I don't see a clear need for doing it now. I consider the original report fixed by #3299, thus closing.

@lucab lucab closed this as completed Nov 7, 2016
indradhanush pushed a commit to kinvolk-archives/appc-spec that referenced this issue Jan 24, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

6 participants