Memory numa stats #2621

Merged
merged 2 commits into from
Aug 13, 2020

Conversation

katarzyna-z
Collaborator

Signed-off-by: Katarzyna Kujawa katarzyna.kujawa@intel.com

This pull request introduces information from memory.numa_stat as Prometheus metrics.
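
For context, a minimal sketch of how one line of a cgroup v1 memory.numa_stat file could be parsed, assuming the kernel's documented layout of a counter name followed by per-node fields (for example "file=20 N0=15 N1=5"). The function name parseNumaStatLine and the map[uint8]uint64 result type are hypothetical and are not taken from this pull request.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNumaStatLine is a hypothetical helper. It parses one line such as
// "file=20 N0=15 N1=5" and returns the counter name ("file") together with
// the page count reported for each NUMA node. The aggregate value after the
// counter name is ignored, because it can be recomputed from the nodes.
func parseNumaStatLine(line string) (string, map[uint8]uint64, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return "", nil, fmt.Errorf("empty line")
	}
	head := strings.SplitN(fields[0], "=", 2)
	if len(head) != 2 {
		return "", nil, fmt.Errorf("malformed field %q", fields[0])
	}
	perNode := make(map[uint8]uint64)
	for _, f := range fields[1:] {
		kv := strings.SplitN(f, "=", 2)
		if len(kv) != 2 || !strings.HasPrefix(kv[0], "N") {
			return "", nil, fmt.Errorf("malformed field %q", f)
		}
		// The node index follows the "N" prefix.
		node, err := strconv.ParseUint(kv[0][1:], 10, 8)
		if err != nil {
			return "", nil, err
		}
		pages, err := strconv.ParseUint(kv[1], 10, 64)
		if err != nil {
			return "", nil, err
		}
		perNode[uint8(node)] = pages
	}
	return head[0], perNode, nil
}

func main() {
	name, perNode, err := parseNumaStatLine("file=20 N0=15 N1=5")
	if err != nil {
		panic(err)
	}
	fmt.Println(name, perNode[0], perNode[1])
}
```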

Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
values := make(metricValues, 0)

values = append(values, getNumaStatsPerNode(s.Memory.ContainerData.NumaStats.Total,
[]string{"total", "container"}, s.Timestamp)...)
Collaborator

The general rule of thumb for when to use a label vs when to add a new metric is that the sum of a metric across all dimensions should be meaningful. "total" isn't a great dimension to have, as we would expect the sum of dimensions to be the "total". So we can either calculate the "other" portion, or make ...pages_total a separate metric.

Collaborator Author

@katarzyna-z katarzyna-z Jul 27, 2020

The documentation states that the "total" count is the sum of file + anon + unevictable, so I'll remove the metrics with "type"="total".
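
The rule of thumb discussed above can be illustrated with a small sketch. The sample type and the page counts are made up for illustration. The point is only that summing over the "type" label double counts once "total" is exported next to the parts that make it up.

```go
package main

import "fmt"

// sample is a hypothetical, simplified stand-in for one exported series.
type sample struct {
	typ   string // value of the "type" label
	pages uint64
}

// sumAcrossTypes adds up all series, as a query that sums over the
// "type" label would.
func sumAcrossTypes(samples []sample) uint64 {
	var sum uint64
	for _, s := range samples {
		sum += s.pages
	}
	return sum
}

func main() {
	// Illustrative values only: file + anon + unevictable = 100 pages.
	parts := []sample{{"file", 20}, {"anon", 70}, {"unevictable", 10}}
	// The same series with an extra "total" dimension added.
	withTotal := append(append([]sample{}, parts...), sample{"total", 100})

	fmt.Println(sumAcrossTypes(parts))     // 100: a meaningful total
	fmt.Println(sumAcrossTypes(withTotal)) // 200: every page counted twice
}
```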

[]string{"unevictable", "container"}, s.Timestamp)...)

values = append(values, getNumaStatsPerNode(s.Memory.HierarchicalData.NumaStats.Total,
[]string{"total", "hierarchy"}, s.Timestamp)...)
Collaborator

I suspect hierarchy vs container may need to be separate metrics by the same logic above.

Collaborator Author

I followed the pattern used for the "container_memory_failures_total" metric, see this. It has a "scope" label with the values "container" or "hierarchy".

name: "container_memory_numa_pages",
help: "Memory usage per numa node",
valueType: prometheus.GaugeValue,
extraLabels: []string{"type", "scope", "node"},
Collaborator

I think we only add two labels to metrics below?

Collaborator

I think this comment is still relevant

Collaborator

Ah, I see now.
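
One way the three extraLabels ("type", "scope", "node") can line up with calls that pass only two label values is for the per-node helper to append the node index itself. The sketch below assumes that behavior. The metricValue type and the numaStatsPerNode function are simplified, hypothetical stand-ins, not the code from this pull request, and the timestamp argument is left out.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// metricValue is a hypothetical, simplified value type: one number plus
// the label values that go with it.
type metricValue struct {
	value  float64
	labels []string
}

// numaStatsPerNode emits one value per NUMA node. The caller supplies the
// "type" and "scope" label values; the "node" label value is appended here.
func numaStatsPerNode(perNode map[uint8]uint64, labels []string) []metricValue {
	nodes := make([]int, 0, len(perNode))
	for n := range perNode {
		nodes = append(nodes, int(n))
	}
	sort.Ints(nodes) // map iteration order is random, so sort for stable output

	values := make([]metricValue, 0, len(nodes))
	for _, n := range nodes {
		// Copy the caller's labels so every value owns its own slice.
		l := append(append([]string{}, labels...), strconv.Itoa(n))
		values = append(values, metricValue{value: float64(perNode[uint8(n)]), labels: l})
	}
	return values
}

func main() {
	file := map[uint8]uint64{0: 15, 1: 5} // illustrative page counts
	var values []metricValue
	values = append(values, numaStatsPerNode(file, []string{"file", "container"})...)
	values = append(values, numaStatsPerNode(file, []string{"file", "hierarchy"})...)
	for _, v := range values {
		fmt.Println(v.labels, v.value)
	}
}
```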

if includedMetrics.Has(container.MemoryNumaMetrics) {
c.containerMetrics = append(c.containerMetrics, []containerMetric{
{
name: "container_memory_numa_pages",
Collaborator

Is this measured in pages or bytes? The help text should specify the units, and the suffix of the metric should be _

Collaborator Author

Here it is written that memory.numa_stat contains page counts, and I see that runc only reads the values from the file. I'll improve the help text.
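
Since the values are page counts, a consumer who wants bytes has to multiply by a page size. A minimal sketch, assuming the counts refer to pages of the system's base page size as reported by os.Getpagesize. The helper name pagesToBytes is hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// pagesToBytes converts a page count into bytes for the given page size.
func pagesToBytes(pages uint64, pageSize int) uint64 {
	return pages * uint64(pageSize)
}

func main() {
	pageSize := os.Getpagesize() // base page size of the running system
	pages := uint64(25)          // illustrative value
	fmt.Printf("%d pages = %d bytes (page size %d)\n", pages, pagesToBytes(pages, pageSize), pageSize)
}
```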

Improve help text for prometheus metric

Signed-off-by: Katarzyna Kujawa <katarzyna.kujawa@intel.com>
@katarzyna-z
Collaborator Author

@dashpole Could you check whether the new metric names are better?

Collaborator

@dashpole dashpole left a comment

lgtm

@dashpole dashpole merged commit 90f391f into google:master Aug 13, 2020