Collect more CPU/disk/memory metrics #410
/assign @wangzhen127
@xueweiz: GitHub didn't allow me to request PR reviews from the following users: kewu1992, vaibhav-rustagi. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.
6fe74c5 to 7c69e9b
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: wangzhen127, xueweiz
Thanks a lot for the review, Zhen!
This PR adds more CPU/disk/memory metrics:
- `cpu_runnable_task_count`: helps detect CPU pressure by measuring the number of "active" tasks on the node
- `cpu_usage_time`: helps detect CPU pressure by measuring the CPU usage state breakdown (`system` vs. `user` vs. `irq` ...)
- `disk_operation_count`: helps detect disk IO pressure patterns (together with the 3 metrics below)
- `disk_merged_operation_count`
- `disk_operation_bytes_count`
- `disk_operation_time`
- `disk_bytes_used`: helps detect insufficient disk space
- `memory_bytes_used`: helps detect memory pressure by breaking down memory usage by state (buffer, cache, slab...)
- `memory_anonymous_used`: helps detect swap-related problems (which could be related to disk IO pressure and memory pressure)
- `memory_page_cache_used`: helps detect page-cache-related problems (slow reads, disk IO pressure, memory pressure)
- `memory_unevictable_used`: helps detect swap-related problems
- `memory_dirty_used`: helps detect filesystem and disk IO pressure

The details are documented in the README for system stats monitor. See a preview here.