Calculate system metrics per node #792

Closed
4 tasks done
danielmitterdorfer opened this issue Oct 14, 2019 · 0 comments · Fixed by #803
Labels: enhancement (Improves the status quo), :Metrics (How metrics are stored, calculated or aggregated), :Reporting (Command line reporting)

danielmitterdorfer commented Oct 14, 2019

Rally captures not only request metrics (such as service time, latency and throughput) and cluster-wide metrics (such as GC times) but also system metrics per node (e.g. number of bytes written). Assuming an Elasticsearch metrics store, raw metrics are written to rally-metrics. Once the node has been shut down, the reporting component running on the coordinator node aggregates the raw results and stores them in rally-races and rally-results.

With our move to a different execution model where nodes can be managed independently of the actual benchmark (see #697), this is no longer feasible because there is no communication link between the mechanic managing individual nodes and the coordinator. Therefore, we need to aggregate per-node metrics on each node. Consequently, these metrics will not be available in the command line report: the report might need to be displayed before the node is even shut down, because shutdown will be coordinated by a component that is separate from Rally.
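To illustrate the idea of aggregating on each node, here is a minimal sketch. It is not Rally's actual implementation: the function name `aggregate_node_metrics`, the sample layout (`node`, `name`, `value`) and the metadata keys are assumptions made for this example only. It sums the raw samples recorded on a node and enriches the result with race metadata when that metadata is available.

```python
from collections import defaultdict


def aggregate_node_metrics(raw_samples, race_metadata=None):
    """Sum raw samples per (node, metric name) and attach race metadata.

    Hypothetical helper for illustration; the sample and metadata layouts
    are assumptions, not Rally's real document format.
    """
    totals = defaultdict(int)
    for sample in raw_samples:
        totals[(sample["node"], sample["name"])] += sample["value"]

    # Without race metadata (e.g. it could not be retrieved), emit the
    # aggregated values without enrichment instead of failing.
    meta = race_metadata or {}
    return [
        {**meta, "node": node, "name": name, "value": value}
        for (node, name), value in sorted(totals.items())
    ]


samples = [
    {"node": "node-0", "name": "bytes_written", "value": 100},
    {"node": "node-0", "name": "bytes_written", "value": 50},
    {"node": "node-1", "name": "bytes_written", "value": 70},
]
results = aggregate_node_metrics(samples, {"race-id": "example-race"})
```

Each node would run such a step locally and write its own result documents, so no link to the coordinator is needed.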

Subtasks:

  • Split calculation / aggregation of global and system metrics
  • Persist race metadata early so that it is available for enriching system metrics (at the end of a race)
  • Ensure we fall back gracefully with an in-memory metrics store (race metadata cannot be retrieved on remote nodes)
  • Adapt chart generator to use sum instead of median aggregation (there are now multiple samples, one per node, instead of one sample for system metrics)
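
The last subtask can be illustrated with a small example. The node names and values below are invented for illustration. With one sample per node, a median returns a single node's value, while a sum gives the cluster-wide total that the single sample used to represent.

```python
import statistics

# Hypothetical per-node samples for one system metric (bytes written).
per_node_bytes_written = {"node-0": 1200, "node-1": 800, "node-2": 1000}
values = list(per_node_bytes_written.values())

# Sum aggregation: the total across all nodes.
cluster_total = sum(values)  # 3000

# Median aggregation: one node's value, which understates the total.
median_of_nodes = statistics.median(values)  # 1000
```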
danielmitterdorfer added the enhancement, :Metrics and :Reporting labels on Oct 14, 2019
danielmitterdorfer added this to the 1.4.0 milestone on Oct 14, 2019
danielmitterdorfer self-assigned this on Oct 14, 2019