[Security Solution][Detection Engine] ML Rule forms have incorrect autocomplete fields #183100

rylnd · 2024-05-09T21:31:09Z

Summary

There are several fields that leverage autocomplete on the rule forms:

Severity override
Risk Score override
Timestamp override
Rule name override
Custom highlighted fields
Alert Suppression "suppress by" fields

However, in the case of ML rules, there is no underlying "data source" associated with the rule (at least as far as the UI is concerned), which leads to some interesting behaviors:

When creating a new ML rule, any autocomplete components are populated with fields from the default security index pattern, since that's loaded/populated before the ML rule type is selected. At best, these fields are stale relative to the actual ML rule's data source(s), but at worst (and in the typical case) hese fields are absent from said data source.
When editing an ML rule, any autocomplete components no longer have "possible" values, and only contain the fields that were selected during rule creation.

In both of the above scenarios, workarounds are limited since it isn't possible to manually add a field to those autocomplete components (as far as I am aware).

Steps to reproduce:

Create an ML rule
Edit the rule
Attempt to add a field for Rule Name override

Expected behavior:
At the very least, I should be able to manually add fields to these components. Ideally, that process is aided by autocompleting from the ML indices directly, or else a static list of ECS fields.

Screenshots:

Any additional context:
This behavior looks to be as old as ML rules, which predate many of the autocomplete fields the rule creation/edit UI use.

This also relates to the recent issue with ML Rule Preview, which (with hindsight) I would now classify as a symptom of this broader issue: preview was based on the presence of an index argument (aka a data source), for which ML rules don't have an official concept.

The text was updated successfully, but these errors were encountered:

elasticmachine · 2024-05-09T21:31:21Z

Pinging @elastic/security-detection-engine (Team:Detection Engine)

These tests actually uncovered a deficiency in the ML rule creation flows, where autocomplete is not correct. This means that it's currently impossible to add/edit alert suppression for an existing ML rule (via the UI). Details in elastic#183100.

rylnd · 2024-05-10T20:41:19Z

Update: after some discussion we've elected to spike out the following: ML rules will use the alerts index fields for its autopopulation. If that's relatively straightforward, it will satisfy the second issue listed above, and we can split off a separate "ML autocomplete doesn't include anomaly fields" issue as needed.

rylnd · 2024-05-17T17:30:50Z

After some discussion I was pointed to the fact that the useRuleIndices hook already contains this logic (most of which was added in #133494), and is used in the expected places on the rule UI.

There are currently two issues with this implementation:

If the corresponding ML job is not installed, the hook currently falls back to using the default security index pattern. In the case of ML rules, this is the incorrect behavior as those indices have little/no relation to the anomaly indexes.
If the ML Job is not installed, we have no index to retrieve fields from, and no obvious fallback mechanism. This is potentially where @jgowdyelastic's "anomaly fields API" could be used.

rylnd · 2024-06-18T22:10:32Z

Update: we have addressed the majority of concerns in rylnd#9, and that will soon be merged to main via #181926.

@yctercero I think once that's merged we can close this, and perhaps open a broader ticket for a more formal integration with ML?

yctercero · 2024-06-19T00:19:40Z

Amazing! That sounds good. @approksiu and @ARWNightingale are working through new create rule flows. It may be great to sync up on the particularities of ML.

## Summary This PR introduces Alert Suppression for ML Detection Rules. This feature is behaviorally similar to alerting suppression for other Detection Engine Rule types, and nearly identical to the analogous features for EQL rules. There are some additional UI behaviors introduced here as well, mainly intended to cover the shortcomings discovered in #183100. Those behaviors are: 1. Populating the suppression field list with fields from the anomaly index(es). 1. Disabling the suppression UI if no selected ML jobs are running (because we cannot populate the list of fields on which they'll be suppressing). 1. Warning the user if _some_ selected ML jobs are not running (because the list of suppression fields may be incomplete). See screenshots below for more info. ### Intermediate Serverless Deployment As per the "intermediate deployment" requirements for serverless, while the schema (and declared alert SO mappings) will be extended to allow this functionality, the user-facing features are currently hidden behind a feature flag. Once this is merged and released, we can issue a "final" deployment in which the feature flag is enabled, and the feature effectively released. ## Screenshots * Overview of new UI fields <img width="1044" alt="Screenshot 2024-05-16 at 3 22 02 PM" src="https://github.com/elastic/kibana/assets/657252/8c07700d-5860-4d1e-a701-eac84fc35558"> * Example of Anomaly fields in suppression combobox <img width="881" alt="Screenshot 2024-06-06 at 5 14 17 PM" src="https://github.com/rylnd/kibana/assets/657252/9aa6ed99-1e02-44a0-ad1b-785136510d68"> * Suppression disabled due to no jobs running <img width="668" alt="Screenshot 2024-06-17 at 11 23 39 PM" src="https://github.com/elastic/kibana/assets/657252/a8636a52-31bd-4579-9bcd-d59d93c26984"> * Warning due to not all jobs running <img width="776" alt="Screenshot 2024-06-17 at 11 26 16 PM" src="https://github.com/elastic/kibana/assets/657252/f44c2400-570e-4fde-adce-e5841a2de08d"> ## Steps to Review 1. Review the Test Plan for an overview of behavior 2. Review Integration tests for an overview of implementation and edge cases 3. Review Cypress tests for an overview of UX changes 4. Testing on [Demo Instance](https://rylnd-pr-181926-ml-rule-alert-suppression.kbndev.co/) (elastic/changeme) 1. This instance has the relevant feature flag enabled, has some sample auditbeat data, as well as the [anomalies archive data](https://github.com/elastic/kibana/tree/main/x-pack/test/functional/es_archives/security_solution/anomalies) for the purposes of exercising an ML rule against "real" anomalies 1. There are a few example rules in the default space: 1. A simple [query rule](https://rylnd-pr-181926-ml-rule-alert-suppression.kbndev.co/app/security/rules/id/f6f5960d-7e4b-40c1-ae15-501112822130) against auditbeat data 1. An [ML rule](https://rylnd-pr-181926-ml-rule-alert-suppression.kbndev.co/app/security/rules/id/9122669e-b2e1-41ce-af25-eeae15aa9ece) with per-execution suppression on both `by_field_name` and `by_field_value` (which ends up not actually suppressing anything) 1. An [ML rule](https://rylnd-pr-181926-ml-rule-alert-suppression.kbndev.co/app/security/rules/id/0aabc280-00bd-42d4-82e6-65997c751797) with per-execution suppression on `by_field_name` (which suppresses all anomalies into a single alert) ## Related Issues - This feature was temporarily blocked by #183100, but those changes are now in this PR. ## Checklist - [x] Functional changes are hidden behind a feature flag. If not hidden, the PR explains why these changes are being implemented in a long-living feature branch. - [x] Functional changes are covered with a test plan and automated tests. * [Test Plan](elastic/security-team#9279) - [x] Stability of new and changed tests is verified using the [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner) in both ESS and Serverless. By default, use 200 runs for ESS and 200 runs for Serverless. * [ESS - Cypress x 200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6449) * [Serverless - Cypress x 200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6450) * [ESS - API x 200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6447) * [Serverless - API x 200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6448) - [ ] Comprehensive manual testing is done by two engineers: the PR author and one of the PR reviewers. Changes are tested in both ESS and Serverless. - [ ] Mapping changes are accompanied by a technical design document. It can be a GitHub issue or an RFC explaining the changes. The design document is shared with and approved by the appropriate teams and individual stakeholders. - [ ] (OPTIONAL) OpenAPI specs changes include detailed descriptions and examples of usage and are ready to be released on https://docs.elastic.co/api-reference. NOTE: This is optional because at the moment we don't have yet any OpenAPI specs that would be fully "documented" and "GA-ready" for publishing on https://docs.elastic.co/api-reference. - [ ] Functional changes are communicated to the Docs team. A ticket is opened in https://github.com/elastic/security-docs using the [Internal documentation request (Elastic employees)](https://github.com/elastic/security-docs/issues/new?assignees=&labels=&projects=&template=docs-request-internal.yaml&title=%5BRequest%5D+) template. The following information is included: feature flags used, target ESS version, planned timing for ESS and Serverless releases. --------- Co-authored-by: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com> Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>

rylnd · 2024-07-11T18:43:14Z

This was addressed (in part) by #181926, which added a new hook useRuleFields to retrieve the correct fields for a particular rule (including ML rules). It has only been applied to the ML Suppression fields, however, and so that new hook still needs to be leveraged by the other relevant UI fields.

rylnd added bug Fixes for quality problems that affect the customer experience Feature:ML Rule Security Solution Machine Learning rule type Feature:Rule Creation Security Solution Detection Rule Creation workflow Feature:Rule Edit Security Solution Detection Rule Editing workflow labels May 9, 2024

rylnd self-assigned this May 9, 2024

botelastic bot added the needs-team Issues missing a team label label May 9, 2024

rylnd added the Team:Detection Engine Security Solution Detection Engine Area label May 9, 2024

botelastic bot removed the needs-team Issues missing a team label label May 9, 2024

rylnd added the 8.15 candidate label May 14, 2024

rylnd mentioned this issue May 16, 2024

[Detection Engine] Adds Alert Suppression to ML Rules #181926

Merged

7 tasks

yctercero removed the 8.15 candidate label Aug 27, 2024

yctercero added the impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. label Sep 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Security Solution][Detection Engine] ML Rule forms have incorrect autocomplete fields #183100

[Security Solution][Detection Engine] ML Rule forms have incorrect autocomplete fields #183100

rylnd commented May 9, 2024

elasticmachine commented May 9, 2024

rylnd commented May 10, 2024

rylnd commented May 17, 2024

rylnd commented Jun 18, 2024

yctercero commented Jun 19, 2024

rylnd commented Jul 11, 2024

[Security Solution][Detection Engine] ML Rule forms have incorrect autocomplete fields #183100

[Security Solution][Detection Engine] ML Rule forms have incorrect autocomplete fields #183100

Comments

rylnd commented May 9, 2024

Summary

elasticmachine commented May 9, 2024

rylnd commented May 10, 2024

rylnd commented May 17, 2024

rylnd commented Jun 18, 2024

yctercero commented Jun 19, 2024

rylnd commented Jul 11, 2024