Implemented Alerting health status pusher by using task manager and status pooler for Kibana status plugins 'kibanahost/api/status' #79056
…tatus pooler for Kibana status plugins 'kibanahost/api/status'
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)
…k-health-status # Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit.
…ed correct health task implementation
I left a bunch of comments, which suggest some different shape changes on things - which may not be right, not sure :-). Perhaps would be worth a call to go over, I may be misunderstanding the design.
@@ -464,7 +465,7 @@ describe('utils', () => {
status: 'error',
lastExecutionDate: foundRule.executionStatus.lastExecutionDate,
error: {
- reason: 'read',
+ reason: AlertExecutionStatusErrorReasons.Read,
Thanks for the cleanup here and above at https://github.com/elastic/kibana/pull/79056/files#diff-f6140d9a8de26e26d80f53651169d01af3446e59c854324b05f492fb708405cbR61 !
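For readers following the string-to-enum cleanup, here is a minimal sketch of what such an enum could look like. Only the `Read` member and its previous string value `'read'` appear in the diff; the other members, their values, and the `isDecryptionFailure` helper are assumptions added for illustration and may not match the merged code.

```typescript
// Sketch only: `Read` / 'read' come from the diff in this PR.
// Every other member and value below is an assumption for illustration.
enum AlertExecutionStatusErrorReasons {
  Read = 'read',
  Decrypt = 'decrypt', // assumed member
  Execute = 'execute', // assumed member
  Unknown = 'unknown', // assumed member
}

interface ExecutionStatusError {
  reason: AlertExecutionStatusErrorReasons;
  message: string;
}

// Hypothetical helper: comparing against the enum member means a typo
// in the reason is caught by the compiler, unlike a raw string literal.
function isDecryptionFailure(error: ExecutionStatusError): boolean {
  return error.reason === AlertExecutionStatusErrorReasons.Decrypt;
}
```

Because it is a string enum, the serialized value stays `'read'`, so stored documents would not need to change under this assumption.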
…k-health-status # Conflicts: # x-pack/plugins/triggers_actions_ui/public/application/sections/alerts_list/components/alerts_list.test.tsx
LGTM, left a few comments/questions
) {
try {
const interval = (await config).healthCheck.interval;
await taskManager.ensureScheduled({
I was anxious to see this running, so hacked an alert executor to throw an error, and then the config to run every minute :-). Makes me wonder if we want to schedule an additional one-time call at startup (maybe 1 minute after startup), so we can get the latest data shortly after startup, without having to wait for the hourly task to run. (could be another issue/PR)
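A rough sketch of that suggestion, assuming a scheduler with an `ensureScheduled` method like the one in the diff. The `TaskInstance` shape, the `InMemoryScheduler` stand-in, the task ids, and the one-minute delay are hypothetical and only illustrate scheduling a one-time run next to the recurring one; the task type name `alerting_health_check` is taken from the PR description.

```typescript
// Hypothetical, simplified task shape; not the real task manager types.
interface TaskInstance {
  id: string;
  taskType: string;
  schedule?: { interval: string }; // recurring tasks
  runAt?: Date; // one-time tasks
  state: Record<string, unknown>;
  params: Record<string, unknown>;
}

interface TaskScheduler {
  ensureScheduled(task: TaskInstance): Promise<TaskInstance>;
}

// In-memory stand-in so the sketch runs on its own: scheduling the same id
// twice keeps the first instance.
class InMemoryScheduler implements TaskScheduler {
  readonly tasks = new Map<string, TaskInstance>();

  async ensureScheduled(task: TaskInstance): Promise<TaskInstance> {
    const existing = this.tasks.get(task.id);
    if (existing) return existing;
    this.tasks.set(task.id, task);
    return task;
  }
}

// Schedules the recurring health check plus a one-time run shortly after startup.
async function scheduleHealthChecks(
  scheduler: TaskScheduler,
  interval: string,
  now: Date = new Date(),
  startupDelayMs: number = 60000
): Promise<void> {
  await scheduler.ensureScheduled({
    id: 'alerting_health_check-recurring', // hypothetical id
    taskType: 'alerting_health_check',
    schedule: { interval },
    state: {},
    params: {},
  });
  await scheduler.ensureScheduled({
    id: 'alerting_health_check-startup', // hypothetical id
    taskType: 'alerting_health_check',
    runAt: new Date(now.getTime() + startupDelayMs),
    state: {},
    params: {},
  });
}
```

Whether the one-time run should be a separate task or a forced early run of the recurring one is left open here, as in the comment above.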
We're getting much closer :)
There are a few things that still bug me (some duplication of code and nit. stuff) but the thing that is still keeping me from approving is that I don't understand why getHealth is on the AlertsClient instead of the Plugin Contract.
It doesn't use any user specific privileges, requires a fake request to be used in the task, and feels like a Plugin level API, so I think it should be moved there...
…k-health-status # Conflicts: # x-pack/plugins/alerts/server/alerts_client/alerts_client.ts
LGTM!
I think some dead code has slipped through in `plugin.ts`, but other than that this is looking great! Well done :)
💚 Build Succeeded
…tatus pooler for Kibana status plugins 'kibanahost/api/status' (elastic#79056)
* Implemented Alerting health status pusher by using task manager and status pooler for Kibana status plugins 'kibanahost/api/status'
* Exposed health task registration to alerts plugin
* Fixed type error
* Extended health API endpoint with info about decryption failures, added correct health task implementation
* adjusted query
* Tested locally and got it working as expected, fixed tests and type check
* Added unit tests
* Changed AlertExecutionStatusErrorReasons to be enum
* Uppercase the enum
* Replaced string values to enum
* Fixed types
* Extended AlertsClient with getHealth method
* added return type to healthStatus$
* Added configurable health check interval and timestamps
* Extended update core status interval to 5mins
* Fixed failing tests
* Registered alerts config
* Fixed date for ok health state
* fixed jest test
* fixed task state
* Fixed due to comments, moved getHealth to a plugin level
* fixed type checks
* Added sorting to the latest Ok state last update
* adjusted error queries
* Fixed jest tests
* removed unused
* fixed type check
…tatus pooler for Kibana status plugins 'kibanahost/api/status' (#79056) (#82907)
The current PR includes the following features:
* `getHealth`, which uses the `executionStatus` alert property to verify whether any alert in the system has errors of a specific reason.
* `getHealth().hasDecryptionFailures`.
* `core.status.set`. The new core status gets the latest health state from the 'alerting_health_check' task execution result.
* `api/alerts/_health` with the new property `alertingFrameworkHeath`.

Resolves #75042
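
To illustrate the idea of deriving a health state from `executionStatus` error reasons, here is a minimal, self-contained sketch. The `AlertSummary` and `HealthEntry` shapes, the `'ok'`/`'warn'` values, the per-reason field names, and the `getHealthSketch` function are assumptions made for this example; they are not taken from the PR's code and the merged implementation queries saved objects rather than an in-memory array.

```typescript
// Assumed shapes for illustration only.
type ErrorReason = 'read' | 'decrypt' | 'execute' | 'unknown';

interface AlertSummary {
  executionStatus: {
    status: string;
    lastExecutionDate: string; // ISO 8601 timestamp
    error?: { reason: ErrorReason; message: string };
  };
}

interface HealthEntry {
  status: 'ok' | 'warn';
  timestamp: string;
}

interface FrameworkHealth {
  decryptionHealth: HealthEntry;
  executionHealth: HealthEntry;
  readHealth: HealthEntry;
}

// Reports 'warn' with the most recent failure time if any alert failed with
// the given reason; otherwise 'ok' stamped with the check time.
function healthFor(alerts: AlertSummary[], reason: ErrorReason, now: string): HealthEntry {
  let latest: string | undefined;
  for (const alert of alerts) {
    if (alert.executionStatus.error?.reason !== reason) continue;
    const date = alert.executionStatus.lastExecutionDate;
    // ISO 8601 strings in the same timezone compare correctly as plain strings.
    if (latest === undefined || date > latest) latest = date;
  }
  return latest === undefined
    ? { status: 'ok', timestamp: now }
    : { status: 'warn', timestamp: latest };
}

function getHealthSketch(alerts: AlertSummary[], now: string): FrameworkHealth {
  return {
    decryptionHealth: healthFor(alerts, 'decrypt', now),
    executionHealth: healthFor(alerts, 'execute', now),
    readHealth: healthFor(alerts, 'read', now),
  };
}
```

A periodic task could store a result like this in its state, and a status reporter could then map any `'warn'` entry to a degraded plugin status; that mapping is outside this sketch.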