-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aggregate perf metrics #2611
Aggregate perf metrics #2611
Conversation
d2eb985
to
cb52f40
Compare
Tested with 8 cores, 60 containers, 5 perf events. configuration file:
Aggregated perf events Per CPU perf events @wacuuu Could you test this pull request in your environment? |
baseline for disable_metrics is set to To measure the parameters i run the following command on the node that is running cadvisor in container:
So the conclusion would be that |
ea453cb
to
d370486
Compare
@dashpole it is ready for review. |
docs/runtime_options.md
Outdated
- `--disable_metrics="percpu"` - core perf events are aggregated | ||
- `--disable_metrics=""` - core perf events are exposed per CPU. | ||
|
||
Aggregated form of core perf events significantly decrease volume of data. For aggregated form of core perf events scaling ratio (`container_perf_metric_scaling ratio`) indicates average scaling ratio for specific event. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO "average" is no useful comparing to the worst case when scaling happened
after aggregation with average equal 90% one doesnt know if the all cpus were scaled with by 10% or most of them were 100% and one was running only 5% and scaled 20 times (quite significant error)
You should add or replace (avg) with min scaling ratio accross all cpus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok so I'll replace average with min
/retest |
@katarzyna-z is it possible to unit test this? |
Add documentation about core perf events aggregation Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
821ed4b
to
538a6d5
Compare
@@ -44,6 +44,21 @@ func TestPrometheusCollector(t *testing.T) { | |||
testPrometheusCollector(t, reg, "testdata/prometheus_metrics") | |||
} | |||
|
|||
func TestPrometheusCollectorWithPerfAggregated(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dashpole Unit tests are available in this file. Do you think that I should add something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Signed-off-by: Katarzyna Kujawa katarzyna.kujawa@intel.com
I want to test it more and add more unit tests.
Related to #2608