Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide pre-aggregated metrics in /stats API #119560

Closed
matschaffer opened this issue Nov 24, 2021 · 3 comments
Closed

Provide pre-aggregated metrics in /stats API #119560

matschaffer opened this issue Nov 24, 2021 · 3 comments
Labels
Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@matschaffer
Copy link
Contributor

Part of #68626

Proposal

#104124 proposes removal of the process field. I'd like to propose another option: That we keep the process field around but fill it with meaningful aggregates of the processes: [] data.

I see two benefits from this:

  1. Any current computations on process (for example https://github.com/elastic/kibana/blob/main/x-pack/plugins/monitoring/server/lib/kibana/get_kibanas_for_clusters.ts#L129-L138) can continue functioning as-is. So mixing pre/post node-clustering data should be no problem.

  2. It provides simple/fast ways to query for overview data. This is similar in practice to things like docker.cpu.total.norm.pct where we pre-compute the total CPU percentage based on the current number of cores at ingest time. We could potentially also do this at query time, but it makes any such query quite complex.

Proposed aggregations

As a starting point, here are aggregates that I think might be useful. Please feel free to edit as the story develops:

  • memory.heap.total_in_bytes: sum of all processes

  • memory.heap.used_in_bytes: sum of all processes

  • memory.heap.size_limit: sum of all processes

  • memory.resident_set_size_in_bytes: sum of all processes

  • event_loop_delay: max of all processes

  • event_loop_delay_histogram: same stats, but aggregated over all proceses

The example doc I have seems to be missing cpu, which I'm sure we'll want in there as well.

@matschaffer matschaffer added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Nov 24, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@lukeelmers
Copy link
Member

Copying over the proposed aggregations from the Node clustering RFC:

{
  // ...
  "process": {
    "memory": {
      "heap": {
        "total_bytes": 533581824, // sum of coordinator + workers
        "used_bytes": 296297424, // sum of coordinator + workers
        "size_limit": 4345298944 // sum of coordinator + workers
      },
      "resident_set_size_bytes": 563625984 // sum of coordinator + workers
    },
    "pid": 52646, // pid of the coordinator
    "event_loop_delay": 0.22967800498008728, // max of coordinator + workers
    "uptime_ms": 1706021.930404 // uptime of the coordinator
  },
  // ...
}

It seems this is pretty much in-line with @matschaffer's recommendations, just including a few missing fields that we'll need to retain in order to prevent this from being a breaking change. The snippet above is also missing event_loop_delay_histogram which was added after the original RFC.

This change will need to happen whenever introduce the ability for Kibana to run multiple processes, see #68626

@pgayvallet
Copy link
Contributor

#68626 has been closed, so I will close this one too. If we ever get back to the node clustering discussions, this should get back naturally in the discussions

@pgayvallet pgayvallet closed this as not planned Won't fix, can't repro, duplicate, stale Jul 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Projects
None yet
Development

No branches or pull requests

4 participants