-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Metricbeat] Expand docker/blkio data #16638
[Metricbeat] Expand docker/blkio data #16638
Conversation
@fearful-symmetry what do you mean by According to the documentation of blkio-controller Linux kernel returns the total time for all IO operations for |
@unixsurfer docker reports these stats per device/driver, hence why we also have major and minor numbers:
Metricbeat sums these down to a total per |
Well, if these times are quantities of time spent on something, I think it is ok to sum them, the same as we sum the bytes read or written. |
@fearful-symmetry they are not being populated in my system, on what distro do you see these metrics? maybe they require cgroups v2?
If these metrics are only exposed on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codewise it looks good to me, some possibly pending things:
- Confirm that these metrics are reported at least with "op"
Total
, so they are counted ingetNewStats
. - Document in the description of the fields what are the requirements for these metrics to work. cgroups v2?
- Changelog entry for the new fields.
Those metrics are from a different field, I'm just going on documentation and code, which suggests they're all the same. It appears that the time fields are broken down the same as the other fields.
Yah, I have...no idea. |
To sum them per device? IMHO, we can simply return the time per operation and don't return the sum for all operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good, any chance you can update existing tests to include the changes? It will also need a changelog
0a6615c
to
3322351
Compare
jenkins, test this |
* initial commit of blkio enhancements * add test coverage * add changelog (cherry picked from commit ee2d287)
What does this PR do?
See #13798. This passes along additional cgroup metrics already gathered but not used by metricbeat. Those new fields are
wait_time
,service_time
andqueued
.I'm still not 100% sure about this. We treat the new time fields the same way we do everything else--we sum them and treat them as a total. I'm not sure if this is a way to go, if we should be using average time instead, or if there's a different field type we should be using of
long
. Also, I'm having trouble finding docker hosts where any of these new fields are even populated in the underlying docker stats, which isn't helping things.How to test this PR locally
Pull down on a macos/linux docker host, run
TestData