Use working set bytes if usage bytes is zero #25428
Pinging @elastic/integrations (Team:Integrations) |
/test |
Hey @brianharwell, thanks for opening this PR! Could you please also add a changelog entry in |
@jsoriano Done. I can help test if you want. |
@brianharwell I have been thinking about this and I have conflicting thoughts about what to do with these fields. I don't think that we can consider the working set to be the same as the usage memory. These are not reported as the same by docker or kubernetes, and Metricbeat is not reporting them as the same in this case or the docker case. We had similar discussions when adding these fields to the docker module, in #12172. But if a pod is killed when its working set grows over its limit, I guess that we can use this value to calculate the percentage. Do you know if that is the case? Is a pod killed if its working set grows over its limit? |
CHANGELOG.asciidoc
Outdated
@@ -46,6 +46,7 @@ https://github.com/elastic/beats/compare/v7.12.0...v7.12.1[View commits]
*Metricbeat*

- Ignore unsupported derive types for filesystem metricset. {issue}22501[22501] {pull}24502[24502]
- Use working set bytes when memory usage is not reported. {pull}25428[25428]
This line should be added in CHANGELOG.next.asciidoc.
Whoops! Sorry!
I read through that discussion and I understand your point. This comment stuck out to me: #12172 (comment) Kubernetes supports limits for Windows pods.
Yes, I have test cases that prove that Kubernetes will terminate and restart a pod when the pod goes over a memory threshold. I created a console application that adds 1M guids to a list, waits a few seconds, and adds another 1M guids and continues until it consumes as much memory as possible. It appears that the pod is terminated when the Here are the console logs of my test app... And here is the memory usage chart from Kibana... And here is output from describing the pod...
Based on this, in Windows the working set bytes may not be synonymous with usage bytes. But the biggest issue, and the reason for this PR is I have no way to chart or alert on memory usage compared to the limit. So I am open to ideas. |
Thanks for the investigation, then I think that it makes sense to calculate the percentages based on the working set. I would propose to do the following:
|
I agree.
I am not sure about this. Ideally we would be able to tell if this was a Linux pod or a Windows pod and then show fields based on that. That would make things clearer and less confusing. If (or when) a change is made on the Kubernetes side some of these values may start to appear. For example, Isn't this a breaking change? Would the removal of these fields potentially break usages in charts, queries, and watchers? |
Well, other modules report metrics only when they are available, to differentiate this from zero values. But you are right, this could be breaking for some cases, and in any case this change wouldn't be needed to solve the lack of percentages, so we can leave this by now. |
I see you assigned it to yourself, does that mean you are going to make the required changes? Or do the changes I made cover this? |
No, sorry for the confusion, we sometimes do this to indicate to the team that someone is helping and reviewing community contributions 🙂 I have assigned it to you too. I can follow up on this. The pending thing I see would be to not override |
Yeah I like that better, I'm on it |
I made a slight variation...
I added |
I added && usageMem == 0 because I think usageMem is the more appropriate field, and if it is reported I think it should take precedence. What do you think?
Looks good to me.
If current code solves the problem you were having I think we can go on with it.
This pull request is now in conflicts. Could you fix it? 🙏
|
…s-memory-usage-bytes-take-2
Thanks!
/test |
In Windows native containers usage memory is reported using the workingSet. Use this value to calculate memory usage percentage. (cherry picked from commit 381e062)
What version will this be in? 7.14? |
We are not accepting new features in 7.13. But I think this can be considered a bugfix, let me backport this to 7.13 too. |
Sweet thanks! I'd love to implement as soon as I can. 😀 |
@jsoriano Any idea when the v7.13 release will be cut? |
@brianharwell we don't have closed dates for releases. But as an estimation, 7.13.0 will likely happen in two or three weeks. |
@brianharwell if you want to try with your own build, the 7.13 branch is already open. |
A similar issue has been reported with container metrics: #25657 |
What does this PR do?
This will use workingSetBytes for the memory usage if usageMem is zero.
Why is it important?
We have Windows containers running in Kubernetes, but the kubelet reports only the working set bytes for a pod, not the memory usage bytes. As a result, the field kubernetes.pod.memory.usage.limit.pct is reported as 0 even though the pod has a memory limit.
This is important because without kubernetes.pod.memory.usage.limit.pct we cannot alert on or monitor how close the pod's memory usage is to its memory limit.
The memory usage bytes value is reported for Linux pods.
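The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names effectiveUsage and limitPct are hypothetical, and the byte values in main are invented. It assumes a usage value of zero means "not reported", which is the situation seen with Windows pods.

```go
package main

import "fmt"

// effectiveUsage returns the byte count to use for percentage calculations.
// A reported usage value takes precedence; the working set is used only as a
// fallback when usage is zero. (Hypothetical helper, not Metricbeat's API.)
func effectiveUsage(usageBytes, workingSetBytes uint64) uint64 {
	if usageBytes == 0 && workingSetBytes > 0 {
		return workingSetBytes
	}
	return usageBytes
}

// limitPct returns memory usage as a fraction of the limit.
// The second return value is false when no limit is set, so that callers can
// skip the field instead of reporting a misleading zero.
func limitPct(usageBytes, workingSetBytes, limitBytes uint64) (float64, bool) {
	if limitBytes == 0 {
		return 0, false
	}
	used := effectiveUsage(usageBytes, workingSetBytes)
	return float64(used) / float64(limitBytes), true
}

func main() {
	// Windows-like case: usage is not reported, only the working set.
	pct, ok := limitPct(0, 256<<20, 512<<20)
	fmt.Println(pct, ok)

	// Linux-like case: usage is reported and takes precedence.
	pct, ok = limitPct(128<<20, 256<<20, 512<<20)
	fmt.Println(pct, ok)
}
```

Keeping the precedence rule in one small function means the percentage calculation itself does not need to know which platform produced the numbers.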
Here is a sample from the kubelet. Metricbeat uses this json to report pod metrics.
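As a rough illustration of how such a payload might be decoded, the sketch below reads two memory fields from a small JSON document. The struct layout is a simplification, the field names usageBytes and workingSetBytes are assumed to match the kubelet stats summary, and the JSON value in main is invented for illustration; it is not the sample referred to above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// memoryStats holds the two memory fields discussed in this PR.
// The JSON names are assumed to follow the kubelet stats summary.
type memoryStats struct {
	UsageBytes      uint64 `json:"usageBytes"`
	WorkingSetBytes uint64 `json:"workingSetBytes"`
}

// podStats is a simplified, hypothetical view of one pod entry.
type podStats struct {
	Memory memoryStats `json:"memory"`
}

// parsePodMemory decodes the memory section of a single pod entry.
func parsePodMemory(raw []byte) (memoryStats, error) {
	var p podStats
	if err := json.Unmarshal(raw, &p); err != nil {
		return memoryStats{}, err
	}
	return p.Memory, nil
}

func main() {
	// Invented payload: only the working set is present, as described for
	// Windows pods. The missing usageBytes field decodes to zero.
	raw := []byte(`{"memory": {"workingSetBytes": 268435456}}`)

	m, err := parsePodMemory(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(m.UsageBytes, m.WorkingSetBytes)
}
```

With plain uint64 fields, a missing value and a reported zero both decode to 0, which is why the check in this PR treats zero usage as "not reported".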
Checklist
I have zero experience with Go. I didn't want to ask someone else to make the change because this seemed rather simple.
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Author's Checklist
How to test this PR locally
Test using a Linux container and a Windows container. I can assist with both of these.
Related issues
Elastic Support Ticket #00711427
Use cases
N/A
Screenshots
Logs