-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Detection Engine] Adds Alert Suppression to ML Rules #181926
Conversation
This is mostly based on the current test plan. It's not wired up yet, nor are there any actual implementations.
These now have type errors, since ML rules don't yet accept suppression fields. We have our next task!
`node scripts/openapi/generate`
We're now asserting that suppression fields are present on the generated alerts, which they're not, because we haven't implemented them yet. That's the next step!
* Adds call getIsSuppressionActive in our rule executor, and necessary dependencies * Adds suppression fields to ML rule schema * Adds feature flag for ML suppression
I noticed that it doesn't look like we're including a lot of timing info in the ML executor; adding this to validate that, and document what we _are_ recording.
This will light up the paths that we need to implement. Next!
This adds all the parameters necessary to invoke this method (if relevant) in the ML rule executor. Given the relative simplicity of the ML rule type, I'm guessing that many of these values are irrelevant/unused in this case, but I haven't yet investigated that. Next step is to exercise this implementation against the FTR tests, and see if the behavior is what we expect. Once that's done, we can try to pare down what we need/use. I also added some TODOs in the course of this work to check some potential bugs I noticed.
Tests were failing as rules were being created without suppression params. Fixed!
We've got suppression fields making it into ML alerts for the first time! Now, to test the various suppression conditions.
I realized that most of these tests were using es_archiver to insert anomalies into an index, but our tests were only ever using a single one of those anomalies. In order to ensure these tests are independent of the data in that archive, I've created and leveraged a helper to delete all the persisted anomalies, and then use existing tooling to manually insert the anomalies needed for our tests. All of the current tests are green; there are just a few more permutations that still need to be implemented.
This tests all of the interesting permutations of alert suppression for ML rules, both with per-execution and interval suppression durations. I added a few TODOs noting unexpected (to me) behavior; we'll see what others think.
...tion_logic/trial_license_complete_tier/execution_logic/machine_learning_alert_suppression.ts
Outdated
Show resolved
Hide resolved
...tion_logic/trial_license_complete_tier/execution_logic/machine_learning_alert_suppression.ts
Outdated
Show resolved
Hide resolved
The behavior demonstrated in this test is in fact expected, as the suppression duration window applies to the alert creation time, not the original anomaly time.
/ci |
Most other rule types have both a "fill" task and a "fillAndContinue" task; this adds that pattern for ML rules on the Define step.
These are failing because I haven't yet enabled the suppression UI for ML rules. Once that's done, we can start validating these tests.
Since some of these fields won't be mapped in the alerts index, we can't always do the dynamic filter generation based on the suppression fields. Until we have direction on how to handle that, we can at least display the current alert by _id, and allow the analyst to expand the timeline from there.
This mainly just composes some existing hooks that were previously pieced together in the form itself (step_define_rule) into a new hook, which is agnostic of the form itself.
Not sure where this came from, whether it was a bad conflict or just some weird autocompletion that I missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ResponseOps changes LGTM. Only changes there were in our schema-change test, indicating new parameters in the rules, which could cause BWC / ZDT issues in rollbacks. Conversation in thread #181926 (comment) sounds like this will be handled appropriately.
💛 Build succeeded, but was flaky
Failed CI StepsMetrics [docs]Module Count
Async chunks
Page load bundle
Unknown metric groupsReferences to deprecated APIs
History
To update your PR or re-run it, just comment with: cc @rylnd |
...tion_logic/trial_license_complete_tier/execution_logic/machine_learning_alert_suppression.ts
Outdated
Show resolved
Hide resolved
x-pack/plugins/security_solution/public/detections/components/alerts_table/actions.tsx
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Investigations code owner changes. Nice work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, @rylnd
I tested suppression on rule interval and during rule execution and did not find issues.
Consider to add ftr tests as per #181926 (comment), since we had in past issues with enrichment not working with suppression.
Have you also had a chance to check whether tests flaky or not?
Since I see it's checked in description, but links lead to PR itself
Stability of new and changed tests is verified using the Flaky Test Runner in both ESS and Serverless. By default, use 200 runs for ESS and 200 runs for Serverless.
ESS - Cypress x 200
Serverless - Cypress x 200
ESS - API x 200
Serverless - API x 200
I also think that we should disable duration and missing fields checkboxes, when suppression fields controller is disabled. Screen.Recording.2024-07-02.at.16.38.16.mov |
As they rely on a feature flag to function.
@vitaliidm I'm looking into this but we need to revisit all of those suppression conditions in the rule form. I'm going to merge this PR as is, we can determine if there are any bugs with non-ML rules during testing tomorrow, and address the form fixes holistically as a followup. |
Flaky Test Runner Stats🟠 Some tests failed. - kibana-flaky-test-suite-runner#6447[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/trial_license_complete_tier/configs/ess.config.ts: 161/200 tests passed. |
Flaky Test Runner Stats🟠 Some tests failed. - kibana-flaky-test-suite-runner#6448[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/trial_license_complete_tier/configs/serverless.config.ts: 156/200 tests passed. |
Flaky Test Runner Stats🟠 Some tests failed. - kibana-flaky-test-suite-runner#6449[❌] Security Solution Detection Engine - Cypress: 173/200 tests passed. |
Flaky Test Runner Stats🟠 Some tests failed. - kibana-flaky-test-suite-runner#6450[❌] [Serverless] Security Solution Detection Engine - Cypress: 168/200 tests passed. |
This was requested during review of elastic#181926, and I'm circling back to that now.
## Summary This PR is a followup to #181926. It includes the following changes: - Refactoring some Rule Form logic with `useMemo` - Requested [in this discussion](#181926 (comment)) - Addressed in a5fcf4d - Adds FTR tests validating ML Suppression supports alert enrichment - Requested [during previous review](#181926 (comment)) - Addressed in d5aa551 - Disables ML Suppression fields as a group - Requested in [this comment](#181926 (comment)) - Addressed by 983945b ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [x] Any UI touched in this PR is usable by keyboard only (learn more about [keyboard accessibility](https://webaim.org/techniques/keyboard/)) - [x] Any UI touched in this PR does not create any new axe failures (run axe in browser: [FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/), [Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
## Summary This PR is a followup to elastic#181926. It includes the following changes: - Refactoring some Rule Form logic with `useMemo` - Requested [in this discussion](elastic#181926 (comment)) - Addressed in a5fcf4d - Adds FTR tests validating ML Suppression supports alert enrichment - Requested [during previous review](elastic#181926 (comment)) - Addressed in d5aa551 - Disables ML Suppression fields as a group - Requested in [this comment](elastic#181926 (comment)) - Addressed by 983945b ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [x] Any UI touched in this PR is usable by keyboard only (learn more about [keyboard accessibility](https://webaim.org/techniques/keyboard/)) - [x] Any UI touched in this PR does not create any new axe failures (run axe in browser: [FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/), [Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US)) (cherry picked from commit e2150de)
## Summary This PR is a followup to elastic#181926. It includes the following changes: - Refactoring some Rule Form logic with `useMemo` - Requested [in this discussion](elastic#181926 (comment)) - Addressed in a5fcf4d - Adds FTR tests validating ML Suppression supports alert enrichment - Requested [during previous review](elastic#181926 (comment)) - Addressed in d5aa551 - Disables ML Suppression fields as a group - Requested in [this comment](elastic#181926 (comment)) - Addressed by 983945b ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [x] Any UI touched in this PR is usable by keyboard only (learn more about [keyboard accessibility](https://webaim.org/techniques/keyboard/)) - [x] Any UI touched in this PR does not create any new axe failures (run axe in browser: [FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/), [Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
Summary
This PR introduces Alert Suppression for ML Detection Rules. This feature is behaviorally similar to alerting suppression for other Detection Engine Rule types, and nearly identical to the analogous features for EQL rules.
There are some additional UI behaviors introduced here as well, mainly intended to cover the shortcomings discovered in #183100. Those behaviors are:
See screenshots below for more info.
Intermediate Serverless Deployment
As per the "intermediate deployment" requirements for serverless, while the schema (and declared alert SO mappings) will be extended to allow this functionality, the user-facing features are currently hidden behind a feature flag. Once this is merged and released, we can issue a "final" deployment in which the feature flag is enabled, and the feature effectively released.
Screenshots
Steps to Review
by_field_name
andby_field_value
(which ends up not actually suppressing anything)by_field_name
(which suppresses all anomalies into a single alert)Related Issues
Checklist