-
Notifications
You must be signed in to change notification settings - Fork 761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CPU share calculation #5324
Fix CPU share calculation #5324
Conversation
.../Microsoft.Extensions.Diagnostics.ResourceMonitoring/Linux/LinuxUtilizationParserCgroupV2.cs
Outdated
Show resolved
Hide resolved
.../Microsoft.Extensions.Diagnostics.ResourceMonitoring/Linux/LinuxUtilizationParserCgroupV2.cs
Outdated
Show resolved
Hide resolved
.../Microsoft.Extensions.Diagnostics.ResourceMonitoring/Linux/LinuxUtilizationParserCgroupV2.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
.../Microsoft.Extensions.Diagnostics.ResourceMonitoring/Linux/LinuxUtilizationParserCgroupV2.cs
Show resolved
Hide resolved
.../Microsoft.Extensions.Diagnostics.ResourceMonitoring/Linux/LinuxUtilizationParserCgroupV2.cs
Show resolved
Hide resolved
cpuUnits = cpuPodShare; | ||
// The formula to calculate CPU pod weight (measured in millicores) from CPU share: | ||
// y = (1 + ((x - 2) * 9999) / 262142), | ||
// where y is the CPU pod weight (e.g. cpuPodWeight) and x is the CPU share of cgroup v1 (e.g. cpuUnits). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, since we're in V2 class, should the comment say "cgroup v2" too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPU share is a setting from cgroups v1 only and it is not necessarily present in cgroups v2 files, though it might happen, so we are trying to take advantage of that and calculate CPU request based on the CPU share value (if it exists).
} | ||
|
||
_ = GetNextNumber(cpuPodWeightBuffer, out long cpuPodWeight); | ||
|
||
if (cpuPodWeight == -1) | ||
{ | ||
Throw.InvalidOperationException($"Could not parse '{_cpuPodWeight}'. Expected to get an integer but got: '{cpuPodWeight}'."); | ||
Throw.InvalidOperationException( | ||
$"Could not parse '{_cpuPodWeight}' content. Expected to get an integer but got: '{cpuPodWeight}'."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this include the buffer? The value here will always be -1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @danmoseley, I totally missed this one.
This, however, reinforces the need to have tests that validate exception messages (see #5324 (comment)).
@evgenyfedorov2 if you could add tests - that'd be great. The best way to have such tests in my experience is to use Verify to assert blobs (e.g., text). This way we don't have to maintain/update test assertions manually whenever text changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thank you
// where y is the CPU pod weight (e.g. cpuPodWeight) and x is the CPU share of cgroup v1 (e.g. cpuUnits). | ||
// https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2254-cgroup-v2#phase-1-convert-from-cgroups-v1-settings-to-v2 | ||
// We invert the formula to calculate CPU share from CPU pod weight: | ||
#pragma warning disable S109 // Magic numbers should not be used - using the formula, forgive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just extract local consts for the values then you can name them and eliminate the warnings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I deliberately don't want to do that, prefer to use the formula as is because it is more readable :)
I am sending a PR to fix this: #5334 |
@@ -533,6 +533,9 @@ private static bool TryGetCpuUnitsFromCgroups(IFileSystem fileSystem, out float | |||
|
|||
private static bool TryGetCgroupRequestCpu(IFileSystem fileSystem, out float cpuUnits) | |||
{ | |||
const long CpuPodWeightPossibleMax = 10_000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@evgenyfedorov2 I am being curious here, pod, for me is a K8s concept and as far as I understand this parser will calculate the utilization of resources on a cgroup on a container regardless of the use of k8s or not.
In the future would it be worth it to rename pod to container to not confuse that this class and logic is not K8s exclussive?
Fixes #5323
Microsoft Reviewers: Open in CodeFlow