Provide pre-aggregated metrics in /stats API #119560
Comments
Pinging @elastic/kibana-core (Team:Core)
Copying over the proposed aggregations from the Node clustering RFC:
{
// ...
"process": {
"memory": {
"heap": {
"total_bytes": 533581824, // sum of coordinator + workers
"used_bytes": 296297424, // sum of coordinator + workers
"size_limit": 4345298944 // sum of coordinator + workers
},
"resident_set_size_bytes": 563625984 // sum of coordinator + workers
},
"pid": 52646, // pid of the coordinator
"event_loop_delay": 0.22967800498008728, // max of coordinator + workers
"uptime_ms": 1706021.930404 // uptime of the coordinator
},
// ...
}
It seems this is pretty much in line with @matschaffer's recommendations, just including a few missing fields that we'll need to retain in order to prevent this from being a breaking change. The snippet above is also missing […]. This change will need to happen whenever we introduce the ability for Kibana to run multiple processes; see #68626.
#68626 has been closed, so I will close this one too. If we ever get back to the node clustering discussions, this topic should come up again naturally.
Part of #68626
Proposal
#104124 proposes removal of the `process` field. I'd like to propose another option: that we keep the `process` field around but fill it with meaningful aggregates of the `processes: []` data.
I see two benefits from this:
- Any current computations on `process` (for example https://github.com/elastic/kibana/blob/main/x-pack/plugins/monitoring/server/lib/kibana/get_kibanas_for_clusters.ts#L129-L138) can continue functioning as-is, so mixing pre/post node-clustering data should be no problem.
- It provides simple/fast ways to query for overview data. This is similar in practice to things like `docker.cpu.total.norm.pct`, where we pre-compute the total CPU percentage based on the current number of cores at ingest time. We could potentially also do this at query time, but it makes any such query quite complex.
Proposed aggregations
As a starting point, here are aggregates that I think might be useful. Please feel free to edit as the story develops:
- `memory.heap.total_in_bytes`: sum of all processes
- `memory.heap.used_in_bytes`: sum of all processes
- `memory.heap.size_limit`: sum of all processes
- `memory.resident_set_size_in_bytes`: sum of all processes
- `event_loop_delay`: max of all processes
- `event_loop_delay_histogram`: same stats, but aggregated over all processes
The example doc I have seems to be missing cpu, which I'm sure we'll want in there as well.
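To make the sum/max rules above concrete, here is a rough sketch of how the `process` aggregate could be derived from the `processes: []` entries. The `ProcessStats` interface and the `aggregateProcessStats` function are hypothetical names used only for illustration, not existing Kibana code. `event_loop_delay_histogram` and cpu are left out because the way to combine them is not settled in this issue.

```typescript
// Hypothetical per-process shape; field names follow the list above.
interface ProcessStats {
  memory: {
    heap: {
      total_in_bytes: number;
      used_in_bytes: number;
      size_limit: number;
    };
    resident_set_size_in_bytes: number;
  };
  event_loop_delay: number;
}

// Assumption: every entry of `processes: []` exposes the fields above.
function aggregateProcessStats(processes: ProcessStats[]): ProcessStats {
  if (processes.length === 0) {
    throw new Error('expected at least one process');
  }

  // Sum one numeric field across all processes.
  const sum = (pick: (p: ProcessStats) => number): number =>
    processes.reduce((acc, p) => acc + pick(p), 0);

  return {
    memory: {
      heap: {
        total_in_bytes: sum((p) => p.memory.heap.total_in_bytes),
        used_in_bytes: sum((p) => p.memory.heap.used_in_bytes),
        size_limit: sum((p) => p.memory.heap.size_limit),
      },
      resident_set_size_in_bytes: sum((p) => p.memory.resident_set_size_in_bytes),
    },
    // Report the worst (largest) delay seen by any process.
    event_loop_delay: Math.max(...processes.map((p) => p.event_loop_delay)),
  };
}
```

With a single process the aggregate equals that process's own stats, which is what would let existing consumers of `process` keep working on both pre- and post-clustering documents.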