Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Detection Engine] ML Rule forms have incorrect autocomplete fields #183100

Open
rylnd opened this issue May 9, 2024 · 6 comments
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Feature:ML Rule Security Solution Machine Learning rule type Feature:Rule Creation Security Solution Detection Rule Creation workflow Feature:Rule Edit Security Solution Detection Rule Editing workflow impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. Team:Detection Engine Security Solution Detection Engine Area

Comments

@rylnd
Copy link
Contributor

rylnd commented May 9, 2024

Summary

There are several fields that leverage autocomplete on the rule forms:

  • Severity override
  • Risk Score override
  • Timestamp override
  • Rule name override
  • Custom highlighted fields
  • Alert Suppression "suppress by" fields

However, in the case of ML rules, there is no underlying "data source" associated with the rule (at least as far as the UI is concerned), which leads to some interesting behaviors:

  1. When creating a new ML rule, any autocomplete components are populated with fields from the default security index pattern, since that's loaded/populated before the ML rule type is selected. At best, these fields are stale relative to the actual ML rule's data source(s), but at worst (and in the typical case) hese fields are absent from said data source.
  2. When editing an ML rule, any autocomplete components no longer have "possible" values, and only contain the fields that were selected during rule creation.

In both of the above scenarios, workarounds are limited since it isn't possible to manually add a field to those autocomplete components (as far as I am aware).

Steps to reproduce:

  1. Create an ML rule
  2. Edit the rule
  3. Attempt to add a field for Rule Name override

Expected behavior:
At the very least, I should be able to manually add fields to these components. Ideally, that process is aided by autocompleting from the ML indices directly, or else a static list of ECS fields.

Screenshots:
Screenshot 2024-05-09 at 4 30 19 PM

Any additional context:
This behavior looks to be as old as ML rules, which predate many of the autocomplete fields the rule creation/edit UI use.

This also relates to the recent issue with ML Rule Preview, which (with hindsight) I would now classify as a symptom of this broader issue: preview was based on the presence of an index argument (aka a data source), for which ML rules don't have an official concept.

@rylnd rylnd added bug Fixes for quality problems that affect the customer experience Feature:ML Rule Security Solution Machine Learning rule type Feature:Rule Creation Security Solution Detection Rule Creation workflow Feature:Rule Edit Security Solution Detection Rule Editing workflow labels May 9, 2024
@rylnd rylnd self-assigned this May 9, 2024
@botelastic botelastic bot added the needs-team Issues missing a team label label May 9, 2024
@rylnd rylnd added the Team:Detection Engine Security Solution Detection Engine Area label May 9, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detection-engine (Team:Detection Engine)

@botelastic botelastic bot removed the needs-team Issues missing a team label label May 9, 2024
rylnd added a commit to rylnd/kibana that referenced this issue May 9, 2024
These tests actually uncovered a deficiency in the ML rule creation
flows, where autocomplete is not correct. This means that it's
currently impossible to add/edit alert suppression for an existing ML
rule (via the UI).

Details in elastic#183100.
@rylnd
Copy link
Contributor Author

rylnd commented May 10, 2024

Update: after some discussion we've elected to spike out the following: ML rules will use the alerts index fields for its autopopulation. If that's relatively straightforward, it will satisfy the second issue listed above, and we can split off a separate "ML autocomplete doesn't include anomaly fields" issue as needed.

@rylnd
Copy link
Contributor Author

rylnd commented May 17, 2024

After some discussion I was pointed to the fact that the useRuleIndices hook already contains this logic (most of which was added in #133494), and is used in the expected places on the rule UI.

There are currently two issues with this implementation:

  1. If the corresponding ML job is not installed, the hook currently falls back to using the default security index pattern. In the case of ML rules, this is the incorrect behavior as those indices have little/no relation to the anomaly indexes.
  2. If the ML Job is not installed, we have no index to retrieve fields from, and no obvious fallback mechanism. This is potentially where @jgowdyelastic's "anomaly fields API" could be used.

@rylnd
Copy link
Contributor Author

rylnd commented Jun 18, 2024

Update: we have addressed the majority of concerns in rylnd#9, and that will soon be merged to main via #181926.

@yctercero I think once that's merged we can close this, and perhaps open a broader ticket for a more formal integration with ML?

@yctercero
Copy link
Contributor

Amazing! That sounds good. @approksiu and @ARWNightingale are working through new create rule flows. It may be great to sync up on the particularities of ML.

rylnd added a commit that referenced this issue Jul 2, 2024
## Summary
This PR introduces Alert Suppression for ML Detection Rules. This
feature is behaviorally similar to alerting suppression for other
Detection Engine Rule types, and nearly identical to the analogous
features for EQL rules.

There are some additional UI behaviors introduced here as well, mainly
intended to cover the shortcomings discovered in
#183100. Those behaviors are:

1. Populating the suppression field list with fields from the anomaly
index(es).
1. Disabling the suppression UI if no selected ML jobs are running
(because we cannot populate the list of fields on which they'll be
suppressing).
1. Warning the user if _some_ selected ML jobs are not running (because
the list of suppression fields may be incomplete).

See screenshots below for more info.

### Intermediate Serverless Deployment
As per the "intermediate deployment" requirements for serverless, while
the schema (and declared alert SO mappings) will be extended to allow
this functionality, the user-facing features are currently hidden behind
a feature flag. Once this is merged and released, we can issue a "final"
deployment in which the feature flag is enabled, and the feature
effectively released.


## Screenshots
* Overview of new UI fields
<img width="1044" alt="Screenshot 2024-05-16 at 3 22 02 PM"
src="https://github.com/elastic/kibana/assets/657252/8c07700d-5860-4d1e-a701-eac84fc35558">
* Example of Anomaly fields in suppression combobox
<img width="881" alt="Screenshot 2024-06-06 at 5 14 17 PM"
src="https://github.com/rylnd/kibana/assets/657252/9aa6ed99-1e02-44a0-ad1b-785136510d68">
* Suppression disabled due to no jobs running
<img width="668" alt="Screenshot 2024-06-17 at 11 23 39 PM"
src="https://github.com/elastic/kibana/assets/657252/a8636a52-31bd-4579-9bcd-d59d93c26984">
* Warning due to not all jobs running
<img width="776" alt="Screenshot 2024-06-17 at 11 26 16 PM"
src="https://github.com/elastic/kibana/assets/657252/f44c2400-570e-4fde-adce-e5841a2de08d">

## Steps to Review
1. Review the Test Plan for an overview of behavior
2. Review Integration tests for an overview of implementation and edge
cases
3. Review Cypress tests for an overview of UX changes
4. Testing on [Demo
Instance](https://rylnd-pr-181926-ml-rule-alert-suppression.kbndev.co/)
(elastic/changeme)
1. This instance has the relevant feature flag enabled, has some sample
auditbeat data, as well as the [anomalies archive
data](https://github.com/elastic/kibana/tree/main/x-pack/test/functional/es_archives/security_solution/anomalies)
for the purposes of exercising an ML rule against "real" anomalies
    1. There are a few example rules in the default space:
1. A simple [query
rule](https://rylnd-pr-181926-ml-rule-alert-suppression.kbndev.co/app/security/rules/id/f6f5960d-7e4b-40c1-ae15-501112822130)
against auditbeat data
1. An [ML
rule](https://rylnd-pr-181926-ml-rule-alert-suppression.kbndev.co/app/security/rules/id/9122669e-b2e1-41ce-af25-eeae15aa9ece)
with per-execution suppression on both `by_field_name` and
`by_field_value` (which ends up not actually suppressing anything)
1. An [ML
rule](https://rylnd-pr-181926-ml-rule-alert-suppression.kbndev.co/app/security/rules/id/0aabc280-00bd-42d4-82e6-65997c751797)
with per-execution suppression on `by_field_name` (which suppresses all
anomalies into a single alert)

## Related Issues
- This feature was temporarily blocked by
#183100, but those changes are
now in this PR.

## Checklist
- [x] Functional changes are hidden behind a feature flag. If not
hidden, the PR explains why these changes are being implemented in a
long-living feature branch.
- [x] Functional changes are covered with a test plan and automated
tests.
    * [Test Plan](elastic/security-team#9279)
- [x] Stability of new and changed tests is verified using the [Flaky
Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner) in
both ESS and Serverless. By default, use 200 runs for ESS and 200 runs
for Serverless.
* [ESS - Cypress x
200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6449)
* [Serverless - Cypress x
200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6450)
* [ESS - API x
200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6447)
* [Serverless - API x
200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6448)
- [ ] Comprehensive manual testing is done by two engineers: the PR
author and one of the PR reviewers. Changes are tested in both ESS and
Serverless.
- [ ] Mapping changes are accompanied by a technical design document. It
can be a GitHub issue or an RFC explaining the changes. The design
document is shared with and approved by the appropriate teams and
individual stakeholders.
- [ ] (OPTIONAL) OpenAPI specs changes include detailed descriptions and
examples of usage and are ready to be released on
https://docs.elastic.co/api-reference. NOTE: This is optional because at
the moment we don't have yet any OpenAPI specs that would be fully
"documented" and "GA-ready" for publishing on
https://docs.elastic.co/api-reference.
- [ ] Functional changes are communicated to the Docs team. A ticket is
opened in https://github.com/elastic/security-docs using the [Internal
documentation request (Elastic
employees)](https://github.com/elastic/security-docs/issues/new?assignees=&labels=&projects=&template=docs-request-internal.yaml&title=%5BRequest%5D+)
template. The following information is included: feature flags used,
target ESS version, planned timing for ESS and Serverless releases.

---------

Co-authored-by: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
@rylnd
Copy link
Contributor Author

rylnd commented Jul 11, 2024

This was addressed (in part) by #181926, which added a new hook useRuleFields to retrieve the correct fields for a particular rule (including ML rules). It has only been applied to the ML Suppression fields, however, and so that new hook still needs to be leveraged by the other relevant UI fields.

@yctercero yctercero added the impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. label Sep 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fixes for quality problems that affect the customer experience Feature:ML Rule Security Solution Machine Learning rule type Feature:Rule Creation Security Solution Detection Rule Creation workflow Feature:Rule Edit Security Solution Detection Rule Editing workflow impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. Team:Detection Engine Security Solution Detection Engine Area
Projects
None yet
Development

No branches or pull requests

3 participants