-
Notifications
You must be signed in to change notification settings - Fork 371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
don't report CPU metrics if it's negative value #2538
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #2538 +/- ##
========================================
Coverage 71.98% 71.99%
========================================
Files 102 102
Lines 15197 15202 +5
Branches 2404 2406 +2
========================================
+ Hits 10940 10945 +5
Misses 3776 3776
Partials 481 481
Continue to review full report at Codecov.
|
if 'track_throttled_time' in kwargs and kwargs['track_throttled_time']: | ||
tracked.append(MetricValue(MetricsCategory.CPU_CATEGORY, MetricsCounter.THROTTLED_TIME, self.name, self.get_throttled_time())) | ||
throttled_time = self.get_throttled_time() | ||
if cpu_usage >= float(0) and throttled_time >= float(0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, very likely throttled time could have wrong value too. So, I decided not to report both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we do not need to check for the 2 conditions, if the single condition cpu_usage < 0 is true, we should not report the 2 metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, if cpu_usage < 0 is true then it won't check second condition. I added second condition for rare case.
MetricValue(MetricsCategory.CPU_CATEGORY, MetricsCounter.PROCESSOR_PERCENT_TIME, self.name, self.get_cpu_usage()), | ||
] | ||
tracked = [] | ||
cpu_usage = self.get_cpu_usage() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updating previous/current values happens inside these get functions on every poll, so no explicit logic needed to reset the values.
@@ -374,3 +374,44 @@ def test_extension_telemetry_not_sent_for_empty_perf_metrics(self, *args): # py | |||
metrics = CGroupsTelemetry.poll_all_tracked() | |||
self.assertEqual(0, len(metrics)) | |||
|
|||
@patch("azurelinuxagent.common.cgroup.MemoryCgroup.get_max_memory_usage") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too many mocks here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I followed old test cases. Actually, we don't need to mock memory usage as this test/change only for CPU values. I will remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, do you need the mock for is_active?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needed otherwise it won't track cpu metrics on subsequent polls
@patch("azurelinuxagent.common.cgroup.CpuCgroup.get_cpu_usage") | ||
@patch("azurelinuxagent.common.cgroup.CpuCgroup.get_throttled_time") | ||
@patch("azurelinuxagent.common.cgroup.CGroup.is_active") | ||
def test_cgroup_telemetry_should_not_report_cpu_negative_value(self, patch_is_active, path_get_throttled_time, patch_get_cpu_usage, patch_get_memory_usage, patch_get_memory_max_usage, *args):# pylint: disable=unused-argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't disable unused-argument, use "_" as the argument name instead
Description
The cpu usage can be negative value if we notice less current cpu ticks from last time and this could happen when service restart between two polls.
Fix is not to report telemetry if cpu usage is negative value.
Issue #
PR information
Quality of Code and Contribution Guidelines