Enterprise Search Stack Monitoring #114303
Looks pretty good! Small tweaks.
x-pack/plugins/monitoring/public/application/pages/enterprise_search/overview.tsx
x-pack/plugins/monitoring/public/components/enterprise_search/overview/overview.tsx
x-pack/plugins/monitoring/public/components/enterprise_search/overview/overview.tsx
Guessing if we address @phillipb's concerns here, we can merge this.
@elasticmachine merge upstream
LGTM!
@elasticmachine merge upstream
💚 Build Succeeded
* Added enterprise search panel, corrected queries
* Update the index pattern for Enterprise Search
* Typescript error ignore
* Our timestamp fields are called @timestamp (per ECS)
* Adjust Enterprise Search index patterns with the rest of monitoring plugin patterns (including CCS, etc)
* Initial implementation of the Enterprise Search overview panel (health only)
* Add a basic stub for enterprise search response fields
* Cleanup aggs configs
* Bring back a file deleted by mistake
* Started working on the overview page
* Correctly use heap_max as the total heap
* Ent search breadcrumbs
* Simple overview
* Allow the cluster_uuid filter to be skipped while fetching metrics
* Cleanup
* Switch to module-level uuid field and use both types of events
* Add stats-based product usage metrics + apply filter paths to reduce traffic
* Change the name of the ent search overview class
* Move the standalone cluster hack into the internal function
* Change the overview page to show product usage metrics + introduce enterprise search stats in addition to metrics (they are fetched differently and allow us to reuse the stats code we have for the main page panel)
* Cluster UUID is at the module level now
* Simplify ent search pages structure, only have one overview page
* Fix ent search icon
* Add total instances
* Product usage metric graphs
* Simplify metrics loading in the overview page since we load all metrics anyways
* Add more enterprise search overview metrics
* Avoid duplicate labels
* linting
* Revert "Simplify metrics loading in the overview page since we load all metrics anyways" This reverts commit 4bd67ab.
* Switch to multiple timeseries per graph
* Reorder graphs and metrics for better experience
* Typescript fixes
* i18n fixes
* Added a couple more JVM metrics
* Completely covered JVM metrics
* Convert Enterprise Search component to Typescript
* Switch config setting back
* Remove the nodes link since it raises more questions than it solves
* Update jest snapshots with the new metrics
* Remove console statement
* Properly handle cases when aggregations return no data for Enterprise Search
* Add a functional test for the Enterprise search cluster list panel
* Add a functional test for Enterprise Search overview page
* Update multicluster API response fixture with the new enterprise search response key
* Default uptime value is 0
* update overview fixture
* More fixture updates
* Remove fixmes
* Fix imports
* Properly export type
* Maybe fix the type checking error
* PR Feedback
* TS fixes

Co-authored-by: cdelgado <carlos.delgado@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Jason Stoltzfus <jason.stoltzfus@elastic.co>
💚 Backport successful
This backport PR will be merged automatically after passing CI.
* Added enterprise search panel, corrected queries
* Update the index pattern for Enterprise Search
* Typescript error ignore
* Our timestamp fields are called @timestamp (per ECS)
* Adjust Enterprise Search index patterns with the rest of monitoring plugin patterns (including CCS, etc)
* Initial implementation of the Enterprise Search overview panel (health only)
* Add a basic stub for enterprise search response fields
* Cleanup aggs configs
* Bring back a file deleted by mistake
* Started working on the overview page
* Correctly use heap_max as the total heap
* Ent search breadcrumbs
* Simple overview
* Allow the cluster_uuid filter to be skipped while fetching metrics
* Cleanup
* Switch to module-level uuid field and use both types of events
* Add stats-based product usage metrics + apply filter paths to reduce traffic
* Change the name of the ent search overview class
* Move the standalone cluster hack in the the internal function
* Change the overview page to show product usage metrics + introduce enterprise search stats in addition to metrics (they are fetched differently and allow us to reuse the stats code we have for the main page panel)
* Cluster UUID is at the module level now
* Simplify ent search pages structure, only have one overview page
* Fix ent search icon
* Add total instances
* Product usage metric graphs
* Simplify metrics loading in the overview page since we load all metrics anyways
* Add more enterprise search overview metrics
* Avoid duplicate labels
* linting
* Revert "Simplify metrics loading in the overview page since we load all metrics anyways" This reverts commit 4bd67ab.
* Switch to multiple timeseries per graph
* Reorder graphs and metrics for better experience
* Typescript fixes
* i18n fixes
* Added a couple more JVM metrics
* Completely covered JVM metrics
* Convert Enterprise Search component to Typescript
* Switch config setting back
* Remove the nodes link since it raises more questions than it solves
* Update jest snapshots with the new metrics
* Remove console statement
* Properly handle cases when aggregations return no data for Enterprise Search
* Add a functional test for the Enterprise search cluster list panel
* Add a functional test for Enterprise Search overview page
* Update multicluster API response fixture with the new enterprise search response key
* Default uptime value is 0
* update overview fixture
* More fixture updates
* Remove fixmes
* Fix imports
* Properly export type
* Maybe fix the type checking error
* PR Feedback
* TS fixes

Co-authored-by: cdelgado <carlos.delgado@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Jason Stoltzfus <jason.stoltzfus@elastic.co>
Co-authored-by: Oleksiy Kovyrin <oleksiy@kovyrin.net>
Summary
This PR adds support for Enterprise Search to the Stack Monitoring plugin. It relies on the new Metricbeat module we're shipping in 7.16 (already merged; there is also a PR to improve the metricsets), which will be integrated into the Enterprise Search solution by default (running as a sidecar, controlled via the solution config).
The code in this PR is based primarily on the patterns and style of the APM, Beats, and Logstash modules, and we tried to keep the changes extremely contained to avoid conflicts with any of the de-angularization work that is ongoing within the plugin. The overview page has been built with React (hence the react flag being enabled within the PR; we'll remove it before merging) to align with the new direction for the monitoring plugin.
Our team is planning to support and keep developing this code going forward, and we're ready to make whatever changes are necessary to align it with the conventions followed by other parts of Stack Monitoring. If any help is needed with testing the changes, please let us know.
Event Structure
One thing of note in this PR is that Enterprise Search monitoring events do not have a cluster_uuid field at the root level, unlike all other events used by Stack Monitoring. Since the events are generated by Metricbeat, and the Elasticsearch Metricbeat module has already added a cluster_uuid into the global schema as an alias for its own field, we cannot use the same approach, and we did not want to add the field to the global schema since it is not compatible with ECS. Instead, we had to change the Stack Monitoring logic for fetching time series so that a flag can be passed to skip the implicit cluster_uuid filter applied to all queries. You can see the changes in get_metrics.ts and get_series.ts.
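To illustrate the idea of an opt-out flag for the implicit cluster_uuid filter, here is a minimal sketch. It is not the code from get_metrics.ts or get_series.ts: the function name `buildSeriesFilters`, the option `skipClusterUuidFilter`, the `extraFilters` parameter, and the module-level field name used in the usage note are all assumptions made for illustration. The only details taken from this PR are that the root-level cluster_uuid filter can be skipped and that Enterprise Search events use @timestamp.

```typescript
// Hypothetical sketch only: these names are illustrative and are not
// the Stack Monitoring plugin's actual API.
type QueryFilter = Record<string, unknown>;

interface TimeRange {
  min: number; // epoch millis, inclusive
  max: number; // epoch millis, inclusive
}

interface BuildFilterOptions {
  clusterUuid: string;
  timeRange: TimeRange;
  // Assumed flag name: when true, the root-level cluster_uuid term
  // filter is not added to the query.
  skipClusterUuidFilter?: boolean;
  // Assumed escape hatch so a caller can filter on its own fields.
  extraFilters?: QueryFilter[];
}

function buildSeriesFilters(options: BuildFilterOptions): QueryFilter[] {
  const {
    clusterUuid,
    timeRange,
    skipClusterUuidFilter = false,
    extraFilters = [],
  } = options;

  // The time range filter is always applied.
  const filters: QueryFilter[] = [
    {
      range: {
        '@timestamp': {
          gte: timeRange.min,
          lte: timeRange.max,
          format: 'epoch_millis',
        },
      },
    },
  ];

  // The cluster filter is implicit by default and only dropped on request.
  if (!skipClusterUuidFilter) {
    filters.push({ term: { cluster_uuid: clusterUuid } });
  }

  return [...filters, ...extraFilters];
}
```

A caller whose events lack the root-level field could then pass `skipClusterUuidFilter: true` and supply its own filter through `extraFilters`, for example a term filter on a module-level field such as `enterprisesearch.cluster_uuid` (an assumed field name). Keeping the filter on by default means existing callers are unaffected.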
Feature Progress
Screenshots
Main page
Enterprise Search Overview
Checklist
Delete any items that are not applicable to this PR.
Risk Matrix
Delete this section if it is not applicable to this PR.
Before closing this PR, invite QA, stakeholders, and other developers to identify risks that should be tested prior to the change/feature release.
When forming the risk matrix, consider some of the following examples and how they may potentially impact the change:
For maintainers