[ML] Alerts as data integration for Anomaly Detection rule type #166349

Merged

@darnautov darnautov commented Sep 13, 2023

Summary

Part of #165958

Replaces usage of the deprecated alertFactory with the new alerts client and adds alerts-as-data integration for the Anomaly Detection alerting rule type.

Alert instances are stored in the .alerts-ml.anomaly-detection.alerts-default index and extend the common AlertSchema.

Result mappings
{
".internal.alerts-ml.anomaly-detection.alerts-default-000001": {
  "mappings": {
    "dynamic": "false",
    "_meta": {
      "namespace": "default",
      "kibana": {
        "version": "8.11.0"
      },
      "managed": true
    },
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "event": {
        "properties": {
          "action": {
            "type": "keyword"
          },
          "kind": {
            "type": "keyword"
          }
        }
      },
      "kibana": {
        "properties": {
          "alert": {
            "properties": {
              "action_group": {
                "type": "keyword"
              },
              "anomaly_score": {
                "type": "double"
              },
              "anomaly_timestamp": {
                "type": "date"
              },
              "case_ids": {
                "type": "keyword"
              },
              "duration": {
                "properties": {
                  "us": {
                    "type": "long"
                  }
                }
              },
              "end": {
                "type": "date"
              },
              "flapping": {
                "type": "boolean"
              },
              "flapping_history": {
                "type": "boolean"
              },
              "instance": {
                "properties": {
                  "id": {
                    "type": "keyword"
                  }
                }
              },
              "is_interim": {
                "type": "boolean"
              },
              "job_id": {
                "type": "keyword"
              },
              "last_detected": {
                "type": "date"
              },
              "maintenance_window_ids": {
                "type": "keyword"
              },
              "reason": {
                "type": "keyword"
              },
              "rule": {
                "properties": {
                  "category": {
                    "type": "keyword"
                  },
                  "consumer": {
                    "type": "keyword"
                  },
                  "execution": {
                    "properties": {
                      "uuid": {
                        "type": "keyword"
                      }
                    }
                  },
                  "name": {
                    "type": "keyword"
                  },
                  "parameters": {
                    "type": "flattened",
                    "ignore_above": 4096
                  },
                  "producer": {
                    "type": "keyword"
                  },
                  "revision": {
                    "type": "long"
                  },
                  "rule_type_id": {
                    "type": "keyword"
                  },
                  "tags": {
                    "type": "keyword"
                  },
                  "uuid": {
                    "type": "keyword"
                  }
                }
              },
              "start": {
                "type": "date"
              },
              "status": {
                "type": "keyword"
              },
              "time_range": {
                "type": "date_range",
                "format": "epoch_millis||strict_date_optional_time"
              },
              "top_influencers": {
                "type": "nested",
                "dynamic": "false",
                "properties": {
                  "influencer_field_name": {
                    "type": "keyword"
                  },
                  "influencer_field_value": {
                    "type": "keyword"
                  },
                  "influencer_score": {
                    "type": "double"
                  },
                  "initial_influencer_score": {
                    "type": "double"
                  },
                  "is_interim": {
                    "type": "boolean"
                  },
                  "job_id": {
                    "type": "keyword"
                  },
                  "timestamp": {
                    "type": "date"
                  }
                }
              },
              "top_records": {
                "type": "nested",
                "dynamic": "false",
                "properties": {
                  "actual": {
                    "type": "double"
                  },
                  "by_field_name": {
                    "type": "keyword"
                  },
                  "by_field_value": {
                    "type": "keyword"
                  },
                  "detector_index": {
                    "type": "integer"
                  },
                  "field_name": {
                    "type": "keyword"
                  },
                  "function": {
                    "type": "keyword"
                  },
                  "initial_record_score": {
                    "type": "double"
                  },
                  "is_interim": {
                    "type": "boolean"
                  },
                  "job_id": {
                    "type": "keyword"
                  },
                  "over_field_name": {
                    "type": "keyword"
                  },
                  "over_field_value": {
                    "type": "keyword"
                  },
                  "partition_field_name": {
                    "type": "keyword"
                  },
                  "partition_field_value": {
                    "type": "keyword"
                  },
                  "record_score": {
                    "type": "double"
                  },
                  "timestamp": {
                    "type": "date"
                  },
                  "typical": {
                    "type": "double"
                  }
                }
              },
              "url": {
                "type": "keyword",
                "index": false,
                "ignore_above": 2048
              },
              "uuid": {
                "type": "keyword"
              },
              "workflow_status": {
                "type": "keyword"
              },
              "workflow_tags": {
                "type": "keyword"
              }
            }
          },
          "space_ids": {
            "type": "keyword"
          },
          "version": {
            "type": "version"
          }
        }
      },
      "tags": {
        "type": "keyword"
      }
    }
  }
}
}

Checklist

@darnautov darnautov added the :ml, Feature:Alerting/RuleTypes, Team:ML, Feature:Alerting/Alerts-as-Data, and v8.11.0 labels Sep 13, 2023
@darnautov darnautov self-assigned this Sep 13, 2023
@darnautov darnautov marked this pull request as ready for review September 18, 2023 14:23
context: ANOMALY_DETECTION_AAD_INDEX_NAME,
mappings: {
fieldMap: {
[ALERT_ANOMALY_DETECTION_JOB_ID]: {
Contributor Author

@droberts195 could you please review the mappings for the alerts-as-data index? Only additive changes will be allowed after release, so I want to double-check that they are correct.

Contributor

I left some comments on the details but I don't understand enough about how these alerts will be searched to say with certainty what the correct answers are. There are very complicated tradeoffs to be made. Whatever is done, it's important that some comments are added to say why we've decided on those tradeoffs, as it's inevitable that somebody will complain about our decision in the future whichever way we go.

The decision on the mappings here needs to go with a best practice guide on how to search the resulting alerts.

[ALERT_ANOMALY_IS_INTERIM]: { type: ES_FIELD_TYPES.BOOLEAN, array: false, required: false },
[ALERT_ANOMALY_TIMESTAMP]: { type: ES_FIELD_TYPES.DATE, array: false, required: false },
[ALERT_TOP_RECORDS]: {
type: ES_FIELD_TYPES.NESTED,
Contributor

nested is a very complex type that allows the sub-objects to be searched individually. It results in a hidden document being indexed for each element of the array.

The other alternative here would be object. Whether nested is really needed here is probably the most important decision to make about these mappings before merging the PR. I don't know enough about how these alerts will be searched to know the answer.

There are more details in the "Arrays of objects" note on https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html and the linked docs for the nested type.

The same question applies to influencers below.

Contributor Author

We discussed it with @peteharverson and agreed it might be useful to fetch alerts with certain influencers, e.g.

GET .alerts-ml.anomaly-detection.alerts-default/_search
{
  "query": {
    "nested": {
      "path": "kibana.alert.top_influencers",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "kibana.alert.top_influencers.influencer_field_name": "key"
              }
            }
          ]
        }
      }
    }
  }
}

If I'm not mistaken it's only possible with the nested type.

Contributor

I agree with @droberts195 that we need to be especially cautious with using the nested type. There are tons of pitfalls with the nested field type per the official docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html. Additionally, Kibana does not handle nested fields well, so this will make consuming the data stored in these nested field types complicated as well.

If we use the object type, we'd still be able to perform the query to match the influencers. The biggest limitation is that we can't query for multiple fields that appear within the array because of the way ES flattens arrays: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html#nested-arrays-flattening-objects

Contributor

We're running into general limitations of Elasticsearch and Kibana functionality.

"Show me ML alerts where one of the influencer fields was airline" will be possible with an object mapping.

But "Show me ML alerts where one of the influencer fields was airline and its influencer score was greater than 90" will not be reliable with object. This is because influencer_field_name: airline and influencer_score > 90 might match on different array elements. So it might be that destination_airport was the influencer with the score above 90, not airline.

Given the difficulties of working with nested in Kibana, and other drawbacks with flattened, it sounds like object is the mapping to go for here. But this will restrict the search functionality that can be built on top of these alerts to some extent. If these limitations are likely to cause problems then we need to hold this functionality back for a later release and think through a more major redesign of the document structure.
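
To make the cross-element matching described above concrete, here is a minimal TypeScript sketch that models the two behaviours in memory. It is an illustration only and not Elasticsearch code: the function names `matchesAsObject` and `matchesAsNested` and the sample scores are assumptions made up for this example.

```typescript
// Simplified in-memory model of matching against an array of objects.
interface Influencer {
  influencer_field_name: string;
  influencer_score: number;
}

// `object` mapping: the values of each sub-field are pooled across all array
// elements, so each condition is checked against its pool independently.
function matchesAsObject(influencers: Influencer[], name: string, minScore: number): boolean {
  const names = influencers.map((i) => i.influencer_field_name);
  const scores = influencers.map((i) => i.influencer_score);
  return names.some((n) => n === name) && scores.some((s) => s > minScore);
}

// `nested` mapping: all conditions must hold within the same array element.
function matchesAsNested(influencers: Influencer[], name: string, minScore: number): boolean {
  return influencers.some(
    (i) => i.influencer_field_name === name && i.influencer_score > minScore
  );
}

// airline has a low score; destination_airport is the one above 90.
const alertInfluencers: Influencer[] = [
  { influencer_field_name: 'airline', influencer_score: 40 },
  { influencer_field_name: 'destination_airport', influencer_score: 95 },
];

const objectResult = matchesAsObject(alertInfluencers, 'airline', 90); // true: a false positive
const nestedResult = matchesAsNested(alertInfluencers, 'airline', 90); // false
```

The single-condition search ("one of the influencer fields was airline") gives the same answer under both models, which is why it stays reliable with object.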

Contributor Author

Many thanks for the feedback and suggestions @droberts195, @kobelb. Updated to the object type in 33d037c.

Support for queries like this should be sufficient, cc @peteharverson

GET .alerts-ml.anomaly-detection.alerts-default/_search
{
  "query": {
    "term": {
      "kibana.alert.top_influencers.influencer_field_name": {
        "value": "key"
      }
    }
  }
}

Contributor

Thanks Dima. We discussed this a bit and came to the conclusion that object was the best option of the 3, even though none of the options are perfect. So what you've done tallies with the conclusion we came to.

type: ES_FIELD_TYPES.NESTED,
array: true,
required: false,
dynamic: false,
Contributor

One thing to understand here is that when we write anomalies to the anomaly results indices we add the partition/by/over fields twice:

  1. We add them as partition_field_name/partition_field_value etc.
  2. We add them using their original field names, for example airline/AAL

In the anomaly results indices we try to map the original field names as well as the predictable ones. This means that you can search anomaly results with a terms search on the airline field and get hits.

Setting dynamic: false here means that these sub-objects won't be searchable by original field names. If the values have been populated by copying the _source of the original anomalies in full then the original fields will be visible in the _source of the alerts, but the alerts won't be searchable by these original field names.

Unfortunately, trying to create the original field names in the mappings in the anomaly results indices has turned out to be a very difficult problem, and has resulted in a lot of complicated code and support cases when things go wrong.

Therefore I think it would be best not to try and do the same for alerts. (In other words leave this part of the mappings as it is now.) But it's important to realise that, as a result, it won't be possible to search for alerts for airline/AAL, but instead it will be necessary to search for partition_field_name/airline and partition_field_value/AAL. This then ties back to the object vs nested decision, because you can only search for two fields together in arrays of objects if you use the nested type.
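
As a rough sketch of what a search on the predictable field names could look like, the hypothetical helper `buildPartitionQuery` below builds a request body that filters on both sub-fields. The field paths come from the mappings in the PR description; the helper name and the choice of a `bool` filter with two `term` clauses are assumptions for illustration.

```typescript
// Build a search body that looks for alerts whose top records mention a
// given partition field name and value (hypothetical helper for this sketch).
function buildPartitionQuery(fieldName: string, fieldValue: string) {
  return {
    query: {
      bool: {
        filter: [
          { term: { 'kibana.alert.top_records.partition_field_name': fieldName } },
          { term: { 'kibana.alert.top_records.partition_field_value': fieldValue } },
        ],
      },
    },
  };
}

// Equivalent of searching for airline/AAL in the anomaly results indices.
const partitionQuery = buildPartitionQuery('airline', 'AAL');
console.log(JSON.stringify(partitionQuery, null, 2));
```

With an object mapping the two clauses can be satisfied by different elements of top_records, so a hit does not guarantee that a single record carries both values.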

Contributor Author

Good point regarding searching by the original field names and allowing dynamic mappings. It only affects .alerts-ml.anomaly-detection.alerts-default indices so it shouldn't be an issue.
To clarify, will we be able to query for airline:AAL with the object type?

Contributor

Would it be possible to use the flattened mapping to be able to search for airline:AAL inside the copy of the original _source?

Contributor Author

I think it's important to know the actual field name. With the flattened mapping we can end up with similar field values coming from completely different fields if I understand it correctly. WDYT @droberts195?

Contributor

> To clarify, will we be able to query for airline:AAL with the object type?

No, only if there was an explicit mapping for airline within the object. And getting those explicit mappings in is extremely hard for fields that vary by job.

> Would it be possible to use the flattened mapping

flattened might be ideal if alert_top_records was a single object rather than an array. I am not sure how flattened will behave with an array of objects though - that would need some experimentation.

One more thing about flattened is that it treats all sub-values as keywords. So then it definitely won't be possible to do a search like, "Give me all alerts where one of the top records has a record score greater than 90."
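
The keyword point can be illustrated with plain string comparison. The sketch below is not Elasticsearch code, and the sample scores are made up. It only shows why comparing score values as strings, which is roughly what a range over keyword values amounts to, disagrees with numeric comparison.

```typescript
// Compare the same score values as numbers and as keyword-like strings.
const threshold = 90;
const scores = [100, 73.6, 95];

// Numeric comparison, as with a `double` mapping.
const numericMatches = scores.filter((s) => s > threshold);

// Lexicographic comparison of the string forms: "100" sorts before "90"
// because "1" sorts before "9", so the highest score is missed.
const keywordMatches = scores
  .map((s) => String(s))
  .filter((s) => s > String(threshold));

console.log(numericMatches); // [ 100, 95 ]
console.log(keywordMatches); // [ '95' ]
```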

It all comes back to how we expect users to be searching these alerts. If we have a feel for how people will want to search, filter and sort then it will be easier to create mappings that support those use cases (or at least say with certainty that it's impossible to support all of them and document the use cases we do support).

Contributor Author

Updated type to object in 33d037c

},
},
[ALERT_TOP_INFLUENCERS]: {
type: ES_FIELD_TYPES.NESTED,
Contributor

@XavierM Does the alerts table support KQL queries against nested fields? I don't believe we have any other nested fields in the AAD indices as of now.

@droberts195 droberts195 left a comment

The index mappings LGTM.

(I haven't reviewed the rest of the PR, only the index mappings.)

@darnautov
Contributor Author

@elasticmachine merge upstream

…ection-alerts-as-data

# Conflicts:
#	x-pack/plugins/ml/tsconfig.json

@jgowdyelastic jgowdyelastic left a comment

Code LGTM

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/alerts-as-data-utils | 29 | 30 | +1 |

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| ml | 32 | 33 | +1 |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| alerting | 19.8KB | 19.8KB | +4.0B |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/alerts-as-data-utils | 29 | 30 | +1 |

References to deprecated APIs

| id | before | after | diff |
| --- | --- | --- | --- |
| ml | 151 | 149 | -2 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @darnautov

@peteharverson peteharverson left a comment

LGTM.
Tested against some anomaly detection rules and inspected index mapping and docs.

@darnautov darnautov merged commit 3ad5add into elastic:main Sep 28, 2023
@kibanamachine kibanamachine added the backport:skip label Sep 28, 2023
@darnautov darnautov deleted the ml-165958-anomaly-detection-alerts-as-data branch September 28, 2023 13:46
ymao1 added a commit that referenced this pull request Sep 29, 2023
…tened alerts docs (#167439)

Resolves #166946

## Summary

The rule registry has traditionally written out AAD docs with flattened
keys, like

```
{
  "kibana.alert.rule.name": "test"
}
```

The framework alerts client has been writing out AAD docs as objects,
like

```
{
  "kibana": {
    "alert": {
      "rule": {
        "name": "test"
      }
    }
  }
}
```

We've identified a few places where we're updating the docs where having
this divergence makes things more difficult, so this is to switch the
framework to writing flattened alert docs before onboarding more rule
types.

This PR is targeted for 8.11, which is also when we onboarded the index
threshold rule type to FAAD. The only other rule type using FAAD to
write docs is ES query, which landed in 8.10 so there will be a followup
issue to handle the case of updating unflattened ES query AAD docs from
8.10
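
The difference between the two document shapes can be sketched with a small flattening function. This is a minimal illustration and not the alerting framework's implementation: `flattenDoc` and its `keepAsObject` parameter are hypothetical names. The choice to leave arrays and selected paths untouched is an assumption based on the sample documents, where `kibana.alert.rule.parameters` stays an object.

```typescript
type Doc = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Doc {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Flatten nested objects into dotted keys. Arrays and any path listed in
// `keepAsObject` are copied as-is (hypothetical parameter for this sketch).
function flattenDoc(doc: Doc, keepAsObject: string[] = [], prefix = ''): Doc {
  const out: Doc = {};
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    const path = prefix === '' ? key : `${prefix}.${key}`;
    if (isPlainObject(value) && keepAsObject.indexOf(path) === -1) {
      const nested = flattenDoc(value, keepAsObject, path);
      for (const nestedKey of Object.keys(nested)) {
        out[nestedKey] = nested[nestedKey];
      }
    } else {
      out[path] = value;
    }
  }
  return out;
}

const flattened = flattenDoc(
  { kibana: { alert: { rule: { name: 'test', parameters: { severity: 5 } } } } },
  ['kibana.alert.rule.parameters']
);
console.log(Object.keys(flattened));
// [ 'kibana.alert.rule.name', 'kibana.alert.rule.parameters' ]
```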

## To Verify

### ES Query and Index Threshold AaD

Create these rules that trigger alerts and verify that their AaD docs
are written out as flattened. For the ES Query rule type, select a
Metrics/Logs consumer and verify that they appear on the O11y alerts
table.

### ML alerts

ML alerts added in #166349 looked
like:

<details>
  <summary>Unflattened</summary>

```
{
	"kibana": {
		"alert": {
			"url": "/app/ml/explorer/?_g=(ml%3A(jobIds%3A!(rt-anomaly-mean-value))%2Ctime%3A(from%3A'2023-09-28T14%3A57%3A00.000Z'%2Cmode%3Aabsolute%2Cto%3A'2023-09-28T15%3A17%3A00.000Z'))&_a=(explorer%3A(mlExplorerFilter%3A(filterActive%3A!t%2CfilteredFields%3A!(key%2Cthird-key)%2CinfluencersFilterQuery%3A(bool%3A(minimum_should_match%3A1%2Cshould%3A!((match_phrase%3A(key%3Athird-key)))))%2CqueryString%3A'key%3A%22third-key%22')%2CmlExplorerSwimlane%3A()))",
			"reason": "Alerts are raised based on real-time scores. Remember that scores may be adjusted over time as data continues to be analyzed.",
			"job_id": "rt-anomaly-mean-value",
			"anomaly_score": 73.63508175828011,
			"is_interim": false,
			"anomaly_timestamp": 1695913620000,
			"top_records": [{
				"job_id": "rt-anomaly-mean-value",
				"record_score": 73.63516446528412,
				"initial_record_score": 73.63516446528412,
				"detector_index": 0,
				"is_interim": false,
				"timestamp": 1695913620000,
				"partition_field_name": "key",
				"partition_field_value": "third-key",
				"function": "mean",
				"actual": [
					3
				],
				"typical": [
					4.187715468532429
				]
			}],
			"top_influencers": [{
				"job_id": "rt-anomaly-mean-value",
				"influencer_field_name": "key",
				"influencer_field_value": "third-key",
				"influencer_score": 73.63508175828011,
				"initial_influencer_score": 73.63508175828011,
				"is_interim": false,
				"timestamp": 1695913620000
			}],
			"action_group": "anomaly_score_match",
			"flapping": false,
			"flapping_history": [
				true,
				false,
				false,
				false
			],
			"instance": {
				"id": "rt-anomaly-mean-value"
			},
			"maintenance_window_ids": [],
			"rule": {
				"category": "Anomaly detection alert",
				"consumer": "alerts",
				"execution": {
					"uuid": "e9e681d4-c8e4-43eb-82e5-a58bdf7ffe12"
				},
				"name": "rt-ad-alert-influencer",
				"parameters": {
					"severity": 5,
					"resultType": "influencer",
					"includeInterim": false,
					"jobSelection": {
						"jobIds": [
							"rt-anomaly-mean-value"
						],
						"groupIds": []
					},
					"lookbackInterval": null,
					"topNBuckets": null
				},
				"producer": "ml",
				"revision": 0,
				"rule_type_id": "xpack.ml.anomaly_detection_alert",
				"tags": [],
				"uuid": "9e1d6bc0-5e10-11ee-8416-3bf48cca0922"
			},
			"status": "active",
			"uuid": "c9c1f075-9985-4c55-8ff8-22349cb30269",
			"workflow_status": "open",
			"duration": {
				"us": "99021000000"
			},
			"start": "2023-09-28T15:07:12.868Z",
			"time_range": {
				"gte": "2023-09-28T15:07:12.868Z"
			}
		},
		"space_ids": [
			"default"
		],
		"version": "8.11.0"
	},
	"@timestamp": "2023-09-28T15:08:51.889Z",
	"event": {
		"action": "active",
		"kind": "signal"
	},
	"tags": []
}
```
</details>

Now they look like:

<details>
  <summary>Flattened</summary>

```
{
	"kibana.alert.url": "/app/ml/explorer/?_g=(ml%3A(jobIds%3A!(rt-anomaly-mean-value))%2Ctime%3A(from%3A'2023-09-28T15%3A03%3A00.000Z'%2Cmode%3Aabsolute%2Cto%3A'2023-09-28T15%3A23%3A00.000Z'))&_a=(explorer%3A(mlExplorerFilter%3A(filterActive%3A!t%2CfilteredFields%3A!(key%2Cthird-key)%2CinfluencersFilterQuery%3A(bool%3A(minimum_should_match%3A1%2Cshould%3A!((match_phrase%3A(key%3Athird-key)))))%2CqueryString%3A'key%3A%22third-key%22')%2CmlExplorerSwimlane%3A()))",
	"kibana.alert.reason": "Alerts are raised based on real-time scores. Remember that scores may be adjusted over time as data continues to be analyzed.",
	"kibana.alert.job_id": "rt-anomaly-mean-value",
	"kibana.alert.anomaly_score": 72.75515452061356,
	"kibana.alert.is_interim": false,
	"kibana.alert.anomaly_timestamp": 1695913980000,
	"kibana.alert.top_records": [{
		"job_id": "rt-anomaly-mean-value",
		"record_score": 72.75515452061356,
		"initial_record_score": 72.75515452061356,
		"detector_index": 0,
		"is_interim": false,
		"timestamp": 1695913980000,
		"partition_field_name": "key",
		"partition_field_value": "third-key",
		"function": "mean",
		"actual": [
			0.5
		],
		"typical": [
			4.138745343296527
		]
	}],
	"kibana.alert.top_influencers": [{
		"job_id": "rt-anomaly-mean-value",
		"influencer_field_name": "key",
		"influencer_field_value": "third-key",
		"influencer_score": 72.75515452061356,
		"initial_influencer_score": 72.75515452061356,
		"is_interim": false,
		"timestamp": 1695913980000
	}],
	"kibana.alert.rule.category": "Anomaly detection alert",
	"kibana.alert.rule.consumer": "alerts",
	"kibana.alert.rule.execution.uuid": "17fef3d3-d595-4362-837e-b2a73650169e",
	"kibana.alert.rule.name": "rt-ad-alert-influencer",
	"kibana.alert.rule.parameters": {
		"severity": 5,
		"resultType": "influencer",
		"includeInterim": false,
		"jobSelection": {
			"jobIds": [
				"rt-anomaly-mean-value"
			],
			"groupIds": []
		},
		"lookbackInterval": null,
		"topNBuckets": null
	},
	"kibana.alert.rule.producer": "ml",
	"kibana.alert.rule.revision": 0,
	"kibana.alert.rule.rule_type_id": "xpack.ml.anomaly_detection_alert",
	"kibana.alert.rule.tags": [],
	"kibana.alert.rule.uuid": "757c7610-5e11-11ee-8bc6-a95c3ced4757",
	"kibana.space_ids": [
		"default"
	],
	"@timestamp": "2023-09-28T15:14:52.057Z",
	"event.action": "active",
	"event.kind": "signal",
	"kibana.alert.action_group": "anomaly_score_match",
	"kibana.alert.flapping": false,
	"kibana.alert.flapping_history": [
		true,
		false,
		false,
		false
	],
	"kibana.alert.instance.id": "rt-anomaly-mean-value",
	"kibana.alert.maintenance_window_ids": [],
	"kibana.alert.status": "active",
	"kibana.alert.uuid": "ac1f0d7c-461b-4fc6-b4c3-04416ac876d3",
	"kibana.alert.workflow_status": "open",
	"kibana.alert.duration.us": "99028000000",
	"kibana.alert.start": "2023-09-28T15:13:13.028Z",
	"kibana.alert.time_range": {
		"gte": "2023-09-28T15:13:13.028Z"
	},
	"kibana.version": "8.11.0",
	"tags": []
}
```
</details>
ymao1 added a commit that referenced this pull request Oct 2, 2023
…tened alerts docs (#167691)

Resolves #166946

## PRs to this feature branch
* #167439
* #167583

## Summary

The rule registry has traditionally written out AAD docs with flattened
keys, like

```
{
  "kibana.alert.rule.name": "test"
}
```

The framework alerts client has been writing out AAD docs as objects,
like

```
{
  "kibana": {
    "alert": {
      "rule": {
        "name": "test"
      }
    }
  }
}
```

We've identified a few places where we're updating the docs where having
this divergence makes things more difficult, so this is to switch the
framework to writing flattened alert docs before onboarding more rule
types.

This PR is targeted for 8.11, which is also when we onboarded the index
threshold rule type and the ML anomaly detection rule type to FAAD. For
the ES query rule, which started writing unflattened AaD docs in 8.10,
this PR adds special handling to ensure that those unflattened docs are
correctly updated with flattened fields.

## To Verify

### ES Query and Index Threshold AaD

Create these rules that trigger alerts and verify that their AaD docs
are written out as flattened. For the ES Query rule type, select a
Metrics/Logs consumer and verify that they appear on the O11y alerts
table.

### ML alerts

ML alerts added in #166349 looked
like:

<details>
  <summary>Unflattened</summary>

```
{
	"kibana": {
		"alert": {
			"url": "/app/ml/explorer/?_g=(ml%3A(jobIds%3A!(rt-anomaly-mean-value))%2Ctime%3A(from%3A'2023-09-28T14%3A57%3A00.000Z'%2Cmode%3Aabsolute%2Cto%3A'2023-09-28T15%3A17%3A00.000Z'))&_a=(explorer%3A(mlExplorerFilter%3A(filterActive%3A!t%2CfilteredFields%3A!(key%2Cthird-key)%2CinfluencersFilterQuery%3A(bool%3A(minimum_should_match%3A1%2Cshould%3A!((match_phrase%3A(key%3Athird-key)))))%2CqueryString%3A'key%3A%22third-key%22')%2CmlExplorerSwimlane%3A()))",
			"reason": "Alerts are raised based on real-time scores. Remember that scores may be adjusted over time as data continues to be analyzed.",
			"job_id": "rt-anomaly-mean-value",
			"anomaly_score": 73.63508175828011,
			"is_interim": false,
			"anomaly_timestamp": 1695913620000,
			"top_records": [{
				"job_id": "rt-anomaly-mean-value",
				"record_score": 73.63516446528412,
				"initial_record_score": 73.63516446528412,
				"detector_index": 0,
				"is_interim": false,
				"timestamp": 1695913620000,
				"partition_field_name": "key",
				"partition_field_value": "third-key",
				"function": "mean",
				"actual": [
					3
				],
				"typical": [
					4.187715468532429
				]
			}],
			"top_influencers": [{
				"job_id": "rt-anomaly-mean-value",
				"influencer_field_name": "key",
				"influencer_field_value": "third-key",
				"influencer_score": 73.63508175828011,
				"initial_influencer_score": 73.63508175828011,
				"is_interim": false,
				"timestamp": 1695913620000
			}],
			"action_group": "anomaly_score_match",
			"flapping": false,
			"flapping_history": [
				true,
				false,
				false,
				false
			],
			"instance": {
				"id": "rt-anomaly-mean-value"
			},
			"maintenance_window_ids": [],
			"rule": {
				"category": "Anomaly detection alert",
				"consumer": "alerts",
				"execution": {
					"uuid": "e9e681d4-c8e4-43eb-82e5-a58bdf7ffe12"
				},
				"name": "rt-ad-alert-influencer",
				"parameters": {
					"severity": 5,
					"resultType": "influencer",
					"includeInterim": false,
					"jobSelection": {
						"jobIds": [
							"rt-anomaly-mean-value"
						],
						"groupIds": []
					},
					"lookbackInterval": null,
					"topNBuckets": null
				},
				"producer": "ml",
				"revision": 0,
				"rule_type_id": "xpack.ml.anomaly_detection_alert",
				"tags": [],
				"uuid": "9e1d6bc0-5e10-11ee-8416-3bf48cca0922"
			},
			"status": "active",
			"uuid": "c9c1f075-9985-4c55-8ff8-22349cb30269",
			"workflow_status": "open",
			"duration": {
				"us": "99021000000"
			},
			"start": "2023-09-28T15:07:12.868Z",
			"time_range": {
				"gte": "2023-09-28T15:07:12.868Z"
			}
		},
		"space_ids": [
			"default"
		],
		"version": "8.11.0"
	},
	"@timestamp": "2023-09-28T15:08:51.889Z",
	"event": {
		"action": "active",
		"kind": "signal"
	},
	"tags": []
}
```
</details>

Now they look like:

<details>
  <summary>Flattened</summary>

```
{
	"kibana.alert.url": "/app/ml/explorer/?_g=(ml%3A(jobIds%3A!(rt-anomaly-mean-value))%2Ctime%3A(from%3A'2023-09-28T15%3A03%3A00.000Z'%2Cmode%3Aabsolute%2Cto%3A'2023-09-28T15%3A23%3A00.000Z'))&_a=(explorer%3A(mlExplorerFilter%3A(filterActive%3A!t%2CfilteredFields%3A!(key%2Cthird-key)%2CinfluencersFilterQuery%3A(bool%3A(minimum_should_match%3A1%2Cshould%3A!((match_phrase%3A(key%3Athird-key)))))%2CqueryString%3A'key%3A%22third-key%22')%2CmlExplorerSwimlane%3A()))",
	"kibana.alert.reason": "Alerts are raised based on real-time scores. Remember that scores may be adjusted over time as data continues to be analyzed.",
	"kibana.alert.job_id": "rt-anomaly-mean-value",
	"kibana.alert.anomaly_score": 72.75515452061356,
	"kibana.alert.is_interim": false,
	"kibana.alert.anomaly_timestamp": 1695913980000,
	"kibana.alert.top_records": [{
		"job_id": "rt-anomaly-mean-value",
		"record_score": 72.75515452061356,
		"initial_record_score": 72.75515452061356,
		"detector_index": 0,
		"is_interim": false,
		"timestamp": 1695913980000,
		"partition_field_name": "key",
		"partition_field_value": "third-key",
		"function": "mean",
		"actual": [
			0.5
		],
		"typical": [
			4.138745343296527
		]
	}],
	"kibana.alert.top_influencers": [{
		"job_id": "rt-anomaly-mean-value",
		"influencer_field_name": "key",
		"influencer_field_value": "third-key",
		"influencer_score": 72.75515452061356,
		"initial_influencer_score": 72.75515452061356,
		"is_interim": false,
		"timestamp": 1695913980000
	}],
	"kibana.alert.rule.category": "Anomaly detection alert",
	"kibana.alert.rule.consumer": "alerts",
	"kibana.alert.rule.execution.uuid": "17fef3d3-d595-4362-837e-b2a73650169e",
	"kibana.alert.rule.name": "rt-ad-alert-influencer",
	"kibana.alert.rule.parameters": {
		"severity": 5,
		"resultType": "influencer",
		"includeInterim": false,
		"jobSelection": {
			"jobIds": [
				"rt-anomaly-mean-value"
			],
			"groupIds": []
		},
		"lookbackInterval": null,
		"topNBuckets": null
	},
	"kibana.alert.rule.producer": "ml",
	"kibana.alert.rule.revision": 0,
	"kibana.alert.rule.rule_type_id": "xpack.ml.anomaly_detection_alert",
	"kibana.alert.rule.tags": [],
	"kibana.alert.rule.uuid": "757c7610-5e11-11ee-8bc6-a95c3ced4757",
	"kibana.space_ids": [
		"default"
	],
	"@timestamp": "2023-09-28T15:14:52.057Z",
	"event.action": "active",
	"event.kind": "signal",
	"kibana.alert.action_group": "anomaly_score_match",
	"kibana.alert.flapping": false,
	"kibana.alert.flapping_history": [
		true,
		false,
		false,
		false
	],
	"kibana.alert.instance.id": "rt-anomaly-mean-value",
	"kibana.alert.maintenance_window_ids": [],
	"kibana.alert.status": "active",
	"kibana.alert.uuid": "ac1f0d7c-461b-4fc6-b4c3-04416ac876d3",
	"kibana.alert.workflow_status": "open",
	"kibana.alert.duration.us": "99028000000",
	"kibana.alert.start": "2023-09-28T15:13:13.028Z",
	"kibana.alert.time_range": {
		"gte": "2023-09-28T15:13:13.028Z"
	},
	"kibana.version": "8.11.0",
	"tags": []
}
```
</details>
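
For illustration only, the change from the nested to the flattened shape can be sketched as a recursive key-flattening step. This is not the alerts client's actual code: `flattenAlert` and `KEEP_AS_OBJECT` are hypothetical names, and the list of fields left as whole objects is an assumption read off the sample above, where `kibana.alert.rule.parameters` and `kibana.alert.time_range` stay nested and arrays are left untouched.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// Assumption: these paths are kept as whole objects, as in the sample document.
const KEEP_AS_OBJECT = new Set<string>([
  'kibana.alert.rule.parameters',
  'kibana.alert.time_range',
]);

// Hypothetical helper: turns nested objects into dotted keys.
// Arrays and the paths listed above are copied as-is.
function flattenAlert(doc: JsonObject, prefix = '', out: JsonObject = {}): JsonObject {
  for (const [key, value] of Object.entries(doc)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isPlainObject =
      value !== null && typeof value === 'object' && !Array.isArray(value);
    if (isPlainObject && !KEEP_AS_OBJECT.has(path)) {
      flattenAlert(value as JsonObject, path, out);
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Usage with a trimmed-down version of the nested sample
const nestedSample: JsonObject = {
  kibana: {
    alert: {
      job_id: 'rt-anomaly-mean-value',
      rule: { producer: 'ml', parameters: { severity: 5 } },
      time_range: { gte: '2023-09-28T15:07:12.868Z' },
    },
    version: '8.11.0',
  },
  event: { action: 'active', kind: 'signal' },
  tags: [],
};
const flattenedSample = flattenAlert(nestedSample);
```

With this input, `flattenedSample` has keys such as `kibana.alert.job_id` and `event.kind`, while `kibana.alert.rule.parameters` is still an object.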

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
darnautov added a commit that referenced this pull request Nov 10, 2023
## Summary

With the alerts-as-data integration added in
#166349, we can now
incorporate historical alert data into views in the ML UI to see how it
correlates with the anomaly results.

This PR adds alerts data to the Anomaly Explorer page. If the selected
anomaly detection jobs have associated alerting rules, we show a new
"Alerts" panel.
It contains: 

<img width="1675" alt="image"
src="https://github.com/elastic/kibana/assets/5236598/1945d1f1-7f12-4a03-8ebd-e0b36c8fce68">

#### A line chart with alerts count over time using the Lens embeddable

It supports cursor sync with the anomaly swim lane, making it easier to
align anomalous buckets with alert spikes.

<img width="1189" alt="image"
src="https://github.com/elastic/kibana/assets/5236598/343b9bcf-bfa4-479d-bf8f-c1572402aa42">

#### Summary of the alerting rules
Shows aggregated information for each alerting rule associated with
the current job selection:
  - An indicator of whether the alerting rule is active
  - Total number of alerts
  - Duration of the latest alert
  - Start time for active rules and recovery time for recovered rules
  
The rules summary is sorted in descending order by the following criteria:

- Number of active alerts for the rule
- Total number of alerts for the rule
- Duration of the most recent alert for the rule
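
A minimal sketch of that ordering, assuming the criteria are applied in the listed order, with each later criterion acting as a tie-breaker. The `RuleSummary` shape and its field names are hypothetical and are not taken from the PR code.

```typescript
// Hypothetical shape of one row in the rules summary.
interface RuleSummary {
  ruleName: string;
  activeAlerts: number;
  totalAlerts: number;
  latestAlertDurationUs: number;
}

// Descending comparator: a later criterion is consulted only when
// the earlier ones are equal (the difference is 0).
function compareRuleSummaries(a: RuleSummary, b: RuleSummary): number {
  return (
    b.activeAlerts - a.activeAlerts ||
    b.totalAlerts - a.totalAlerts ||
    b.latestAlertDurationUs - a.latestAlertDurationUs
  );
}

// Returns a sorted copy and leaves the input array untouched.
function sortRuleSummaries(rules: RuleSummary[]): RuleSummary[] {
  return [...rules].sort(compareRuleSummaries);
}
```

Sorting a copy keeps the caller's array stable, which is convenient when the same data also feeds other views.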

<img width="1032" alt="image"
src="https://github.com/elastic/kibana/assets/5236598/899f37f3-dd8c-4cb6-b7f6-263ed86d20ee">

#### Alert details 

It contains an alerts table provided by the `triggersActionsUI` plugin. For
each alert, the user can:
- Open the alert details page
- Attach the alert to a new case
- Attach the alert to an existing case

<img width="1177" alt="image"
src="https://github.com/elastic/kibana/assets/5236598/d3b7768a-bae2-404f-b364-ff7d7493cb9b">


#### Alert context menu 

When anomaly swim lane cells are selected and there are alerts
within the chosen time range, a context menu displaying alert details is
shown.

<img width="1202" alt="image"
src="https://github.com/elastic/kibana/assets/5236598/2b684c51-db5a-4f8c-bda9-c3e9aabde0d4">
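
One way to decide which alerts fall within the chosen time range is an interval-overlap check, sketched below. This is an assumption about the selection logic rather than the PR's implementation: `AlertInterval`, `TimeRange`, and `alertsInRange` are hypothetical names, and a missing `end` is treated as a still-active alert.

```typescript
// Hypothetical minimal alert shape; timestamps are epoch milliseconds.
interface AlertInterval {
  id: string;
  start: number;
  end?: number; // undefined while the alert is still active (assumption)
}

interface TimeRange {
  from: number;
  to: number;
}

// Keeps alerts whose [start, end] interval overlaps the selected range.
function alertsInRange(alerts: AlertInterval[], range: TimeRange): AlertInterval[] {
  return alerts.filter(
    (alert) =>
      alert.start <= range.to && (alert.end === undefined || alert.end >= range.from)
  );
}

// Usage: build a range from ISO strings such as those stored in kibana.alert.start
const selectedRange: TimeRange = {
  from: Date.parse('2023-09-28T15:00:00.000Z'),
  to: Date.parse('2023-09-28T15:15:00.000Z'),
};
const matchingAlerts = alertsInRange(
  [{ id: 'alert-1', start: Date.parse('2023-09-28T15:07:12.868Z') }],
  selectedRange
);
```

In this sketch the context menu would be shown only when `matchingAlerts` is non-empty.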


### Checklist

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [x] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [x] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [x] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)
Labels
backport:skip This commit does not require backporting Feature:Alerting/Alerts-as-Data Issues related to Alerts-as-data and RuleRegistry Feature:Alerting/RuleTypes Issues related to specific Alerting Rules Types :ml release_note:enhancement Team:ML Team label for ML (also use :ml) v8.11.0