From f830abb3c35d4241915b3f4920a16d8da60a2309 Mon Sep 17 00:00:00 2001 From: Kibana Machine <42973632+kibanamachine@users.noreply.github.com> Date: Mon, 24 May 2021 18:17:35 -0400 Subject: [PATCH] [Alerting] Update README (#100478) (#100502) * Updating readme * Updating readme * Fix plugin list docs Co-authored-by: ymao1 --- docs/developer/plugin-list.asciidoc | 2 +- x-pack/plugins/alerting/README.md | 386 ++++++++++++++++------------ 2 files changed, 216 insertions(+), 172 deletions(-) diff --git a/docs/developer/plugin-list.asciidoc b/docs/developer/plugin-list.asciidoc index a4e0a8f51490a..8b6217837644f 100644 --- a/docs/developer/plugin-list.asciidoc +++ b/docs/developer/plugin-list.asciidoc @@ -317,7 +317,7 @@ which will load the visualization's editor. |{kib-repo}blob/{branch}/x-pack/plugins/alerting/README.md[alerting] -|The Kibana alerting plugin provides a common place to set up alerts. You can: +|The Kibana Alerting plugin provides a common place to set up rules. You can: |{kib-repo}blob/{branch}/x-pack/plugins/apm/readme.md[apm] diff --git a/x-pack/plugins/alerting/README.md b/x-pack/plugins/alerting/README.md index eb64d71be565e..0d5b0bf415fed 100644 --- a/x-pack/plugins/alerting/README.md +++ b/x-pack/plugins/alerting/README.md @@ -1,21 +1,21 @@ -# Kibana alerting +# Kibana Alerting -The Kibana alerting plugin provides a common place to set up alerts. You can: +The Kibana Alerting plugin provides a common place to set up rules. 
You can: -- Register types of alerts -- List the types of registered alerts -- Perform CRUD actions on alerts +- Register types of rules +- List the types of registered rules +- Perform CRUD actions on rules ---- Table of Contents -- [Kibana alerting](#kibana-alerting) +- [Kibana Alerting](#kibana-alerting) - [Terminology](#terminology) - [Usage](#usage) - - [Alerts API keys](#alerts-api-keys) - - [Plugin status](#plugin-status) - - [Alert types](#alert-types) + - [Alerting API Keys](#alerting-api-keys) + - [Plugin Status](#plugin-status) + - [Rule Types](#rule-types) - [Methods](#methods) - [Executor](#executor) - [Action variables](#action-variables) @@ -24,52 +24,62 @@ Table of Contents - [Tests](#tests) - [Example](#example) - [Role Based Access-Control](#role-based-access-control) - - [Alert Navigation](#alert-navigation) - - [Experimental RESTful API](#restful-api) - - [`GET /api/alerts/alert/{id}/state`: Get alert state](#get-apialertidstate-get-alert-state) - - [`GET /api/alerts/alert/{id}/_instance_summary`: Get alert instance summary](#get-apialertidstate-get-alert-instance-summary) - - [`POST /api/alerts/alert/{id}/_update_api_key`: Update alert API key](#post-apialertidupdateapikey-update-alert-api-key) - - [Alert instance factory](#alert-instance-factory) - - [Templating actions](#templating-actions) + - [Alerting Navigation](#alert-navigation) + - [Internal HTTP APIs](#internal-http-apis) + - [`GET /internal/alerting/rule/{id}/state`: Get rule state](#get-internalalertingruleidstate-get-rule-state) + - [`GET /internal/alerting/rule/{id}/_alert_summary`: Get rule alert summary](#get-internalalertingruleidalertsummary-get-rule-alert-summary) + - [`POST /internal/alerting/rule/{id}/_update_api_key`: Update rule API key](#post-internalalertingruleidupdateapikey-update-rule-api-key) + - [Alert Factory](#alert-factory) + - [Templating Actions](#templating-actions) - [Examples](#examples) - ## Terminology -**Alert Type**: A function that takes parameters and 
executes actions to alert instances. +> Disclaimer: We are actively working to update the terminology of the Alerting Framework. While all user-facing terminology has been updated, much of the codebase is still a work in progress. + + +> References to `rule` and `rule type` entities are still named `AlertType` within the codebase. -**Alert**: A configuration that defines a schedule, an alert type w/ parameters, state information and actions. +> References to `alert` and `alert factory` entities are still named `AlertInstance` and `alertInstanceFactory` within the codebase. -**Alert Instance**: The instance(s) created from an alert type execution. +**Rule Type**: A function that takes parameters and executes actions on alerts. -A Kibana alert detects a condition and executes one or more actions when that condition occurs. Alerts work by going through the followings steps: +**Rule**: A configuration that defines a schedule, a rule type with parameters, state information and actions. -1. Run a periodic check to detect a condition (the check is provided by an Alert Type) -2. Convert that condition into one or more stateful Alert Instances -3. Map Alert Instances to pre-defined Actions, using templating -4. Execute the Actions +**Alert**: The alert(s) created from a rule execution. + +A Kibana rule detects a condition and executes one or more actions when that condition occurs. Rules work by going through the following steps: + +1. Run a periodic check to detect a condition (the check is provided by a rule type). +2. Convert that condition into one or more stateful alerts. +3. Map alerts to pre-defined actions, using templating. +4. Execute the actions. ## Usage -1. Develop and register an alert type (see alert types -> example). -2. Configure feature level privileges using RBAC -3. Create an alert using the RESTful API [Documentation](https://www.elastic.co/guide/en/kibana/master/alerts-api-update.html) (see alerts -> create). +1. 
Develop and register a rule type (see rule types -> example). +2. Configure feature level privileges using RBAC. +3. Create a rule using the RESTful API [Documentation](https://www.elastic.co/guide/en/kibana/master/alerting-apis.html) (see rules -> create). + +## Alerting API Keys -## Alerts API keys +When we create a rule, we generate a new API key. -When we create an alert, we generate a new API key. +When we update, enable, or disable a rule, we must invalidate the old API key and create a new one. -When we update, enable, or disable an alert, we must invalidate the old API key and create a new one. +To manage the invalidation process for API keys, we use the saved object type `api_key_pending_invalidation`. This saved object stores all API keys that were marked for invalidation anytime rules were updated, enabled or disabled. + +For security plugin invalidation, we schedule a task to check if the `api_key_pending_invalidation` saved object contains new API keys that were marked for invalidation earlier than the configured delay. The default schedule for running this task is every 5 minutes. -To manage the invalidation process for API keys, we use the saved object `api_key_pending_invalidation`. This object stores all API keys that were marked for invalidation when alerts were updated. -For security plugin invalidation, we schedule a task to check if the`api_key_pending_invalidation` saved object contains new API keys that are marked for invalidation earlier than the configured delay. The default value for running the task is 5 mins. To change the schedule for the invalidation task, use the kibana.yml configuration option `xpack.alerting.invalidateApiKeysTask.interval`. + To change the default delay for the API key invalidation, use the kibana.yml configuration option `xpack.alerting.invalidateApiKeysTask.removalDelay`. 
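As a rough illustration of the delay check described above, the sketch below selects pending API keys that were marked for invalidation earlier than a configured removal delay. The `PendingInvalidation` shape and the `selectKeysToInvalidate` helper are hypothetical names used only for this example; they are assumptions, not the plugin's actual saved object attributes or task code.

```typescript
// Hypothetical shape of a pending invalidation entry (an assumption for
// illustration, not the real `api_key_pending_invalidation` attributes).
interface PendingInvalidation {
  apiKeyId: string;
  // ISO timestamp of when the key was marked for invalidation
  createdAt: string;
}

// Returns the IDs of keys that were marked earlier than `now - removalDelayMs`.
function selectKeysToInvalidate(
  pending: PendingInvalidation[],
  removalDelayMs: number,
  now: Date = new Date()
): string[] {
  const cutoff = now.getTime() - removalDelayMs;
  return pending
    .filter((entry) => new Date(entry.createdAt).getTime() < cutoff)
    .map((entry) => entry.apiKeyId);
}

// Usage: with a one hour delay, only the key marked two hours ago is selected.
const now = new Date('2021-05-24T12:00:00.000Z');
const pending: PendingInvalidation[] = [
  { apiKeyId: 'old-key', createdAt: '2021-05-24T10:00:00.000Z' },
  { apiKeyId: 'fresh-key', createdAt: '2021-05-24T11:59:00.000Z' },
];
const expired = selectKeysToInvalidate(pending, 60 * 60 * 1000, now);
console.log(expired); // prints [ 'old-key' ]
```

Keeping the selection as a pure function of the pending entries, the delay, and the current time makes the delay behavior easy to test without running a scheduled task.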
-## Plugin status +## Plugin Status -The plugin status of an alert is customized by including information about checking failures for the framework decryption: -``` +The plugin status of the Alerting Framework is customized by including information about checking for failures during framework decryption: + +```js core.status.set( combineLatest([ core.status.derivedStatus$, @@ -85,9 +95,10 @@ core.status.set( ) ); ``` + To check for framework decryption failures, we use the task `alerting_health_check`, which runs every 60 minutes by default. To change the default schedule, use the kibana.yml configuration option `xpack.alerting.healthCheck.interval`. -## Alert types +## Rule Types ### Methods @@ -97,44 +108,59 @@ The following table describes the properties of the `options` object. |Property|Description|Type| |---|---|---| -|id|Unique identifier for the alert type. For convention purposes, ids starting with `.` are reserved for built in alert types. We recommend using a convention like `.mySpecialAlert` for your alert types to avoid conflicting with another plugin.|string| -|name|A user-friendly name for the alert type. These will be displayed in dropdowns when choosing alert types.|string| -|actionGroups|An explicit list of groups the alert type may schedule actions for, each specifying the ActionGroup's unique ID and human readable name. Alert `actions` validation will use this configuartion to ensure groups are valid. We highly encourage using `kbn-i18n` to translate the names of actionGroup when registering the AlertType. |Array<{id:string, name:string}>| -|defaultActionGroupId|Default ID value for the group of the alert type.|string| -|recoveryActionGroup|An action group to use when an alert instance goes from an active state, to an inactive one. This action group should not be specified under the `actionGroups` property. If no recoveryActionGroup is specified, the default `recovered` action group will be used. 
|{id:string, name:string}| -|actionVariables|An explicit list of action variables the alert type makes available via context and state in action parameter templates, and a short human readable description. Alert UI will use this to display prompts for the users for these variables, in action parameter editors. We highly encourage using `kbn-i18n` to translate the descriptions. |{ context: Array<{name:string, description:string}, state: Array<{name:string, description:string}>| -|validate.params|When developing an alert type, you can choose to accept a series of parameters. You may also have the parameters validated before they are passed to the `executor` function or created as an alert saved object. In order to do this, provide a `@kbn/config-schema` schema that we will use to validate the `params` attribute.|@kbn/config-schema| -|executor|This is where the code of the alert type lives. This is a function to be called when executing an alert on an interval basis. For full details, see executor section below.|Function| -|producer|The id of the application producing this alert type.|string| -|minimumLicenseRequired|The value of a minimum license. Most of the alerts are licensed as "basic".|string| +|id|Unique identifier for the rule type. By convention, IDs starting with `.` are reserved for built-in rule types. We recommend using a convention like `.mySpecialRule` for your rule types to avoid conflicting with another plugin.|string| +|name|A user-friendly name for the rule type. These will be displayed in dropdowns when choosing rule types.|string| +|actionGroups|An explicit list of groups the rule type may schedule actions for, each specifying the ActionGroup's unique ID and human readable name. Each rule type's `actions` validation will use this list to ensure configured groups are valid. We highly encourage using `kbn-i18n` to translate the names of actionGroup when registering the rule type. 
|Array<{id:string, name:string}>| +|defaultActionGroupId|ID value for the default action group for the rule type.|string| +|recoveryActionGroup|The action group to use when an alert goes from an active state to an inactive one. This action group should not be specified under the `actionGroups` property. If no recoveryActionGroup is specified, the default `recovered` action group will be used. |{id:string, name:string}| +|actionVariables|An explicit list of action variables that the rule type makes available via context and state in action parameter templates, and a short human readable description for each. The Alerting UI will use this to display prompts for the users for these variables, in action parameter editors. We highly encourage using `kbn-i18n` to translate the descriptions. |{ context: Array<{name:string, description:string}, state: Array<{name:string, description:string}>| +|validate.params|When developing a rule type, you can choose to accept a series of parameters. You may also choose to have the parameters validated before they are passed to the `executor` function or created as a saved object. In order to do this, provide a `@kbn/config-schema` schema that we will use to validate the `params` attribute.|@kbn/config-schema| +|executor|This is where the code for the rule type lives. This is a function to be called when executing a rule on an interval basis. For full details, see the executor section below.|Function| +|producer|The id of the application producing this rule type.|string| +|minimumLicenseRequired|The value of a minimum license. Most of the rules are licensed as "basic".|string| ### Executor -This is the primary function for an alert type. Whenever the alert needs to execute, this function will perform the execution. It receives a variety of parameters. The following table describes the properties the executor receives. +This is the primary function for a rule type. 
Whenever the rule needs to execute, this function will perform the execution. It receives a variety of parameters. The following table describes the properties the executor receives. **executor(options)** |Property|Description| |---|---| |services.scopedClusterClient|This is an instance of the Elasticsearch client. Use this to do Elasticsearch queries in the context of the user who created the alert when security is enabled.| -|services.savedObjectsClient|This is an instance of the saved objects client. This provides the ability to do CRUD on any saved objects within the same space the alert lives in.

The scope of the saved objects client is tied to the user who created the alert (only when security isenabled).| -|services.alertInstanceFactory(id)|This [alert instance factory](#alert-instance-factory) creates instances of alerts and must be used in order to execute actions. The id you give to the alert instance factory is a unique identifier to the alert instance.| +|services.savedObjectsClient|This is an instance of the saved objects client. This provides the ability to perform CRUD operations on any saved object that lives in the same space as the rule.

The scope of the saved objects client is tied to the user who created the rule (only when security is enabled).| +|services.alertInstanceFactory(id)|This [alert factory](#alert-factory) creates alerts and must be used in order to execute actions. The id you give to the alert factory is a unique identifier for the alert.| |services.log(tags, [data], [timestamp])|Use this to create server logs. (This is the same function as server.log)| -|startedAt|The date and time the alert type started execution.| -|previousStartedAt|The previous date and time the alert type started a successful execution.| -|params|Parameters for the execution. This is where the parameters you require will be passed in. (example threshold). Use alert type validation to ensure values are set before execution.| -|state|State returned from previous execution. This is the alert level state. What the executor returns will be serialized and provided here at the next execution.| -|alertId|The id of this alert.| -|spaceId|The id of the space of this alert.| -|namespace|The namespace of the space of this alert; same as spaceId, unless spaceId === 'default', then namespace = undefined.| -|name|The name of this alert.| -|tags|The tags associated with this alert.| -|createdBy|The userid that created this alert.| -|updatedBy|The userid that last updated this alert.| +|startedAt|The date and time the rule type started execution.| +|previousStartedAt|The previous date and time the rule type started a successful execution.| +|params|Parameters for the execution. This is where the parameters you require will be passed in. (e.g. threshold). Use rule type validation to ensure values are set before execution.| +|state|State returned from the previous execution. This is the rule level state. What the executor returns will be serialized and provided here at the next execution.| +|alertId|The id of this rule.| +|spaceId|The id of the space of this rule.| +|namespace|The namespace of the space of this rule. 
This is the same as `spaceId`, unless `spaceId === "default"`, in which case the namespace = `undefined`.| +|name|The name of this rule. This will eventually be removed in favor of `rule.name`.| +|tags|The tags associated with this rule. This will eventually be removed in favor of `rule.tags`.| +|createdBy|The user ID of the user that created this rule. This will eventually be removed in favor of `rule.createdBy`.| +|updatedBy|The user ID of the user that last updated this rule. This will eventually be removed in favor of `rule.updatedBy`.| +|rule.name|The name of this rule.| +|rule.tags|The tags associated with this rule.| +|rule.consumer|The consumer of this rule type.| +|rule.producer|The producer of this rule type.| +|rule.ruleTypeId|The ID of the rule type for this rule.| +|rule.ruleTypeName|The user-friendly name of the rule type for this rule.| +|rule.enabled|Whether this rule is currently enabled.| +|rule.schedule|The configured schedule interval of this rule.| +|rule.actions|The configured actions for this rule.| +|rule.createdBy|The user ID of the user that created this rule.| +|rule.updatedBy|The user ID of the user that last updated this rule.| +|rule.createdAt|The date and time this rule was created.| +|rule.updatedAt|The date and time this rule was last updated.| +|rule.throttle|The configured throttle interval for this rule.| +|rule.notifyWhen|The configured notification type for this rule.| ### Action Variables -The `actionVariables` property should contain the **flattened** names of the state and context variables available when an executor calls `alertInstance.scheduleActions(actionGroup, context)`. These names are meant to be used in prompters in the alerting user interface, are used as text values for display, and can be inserted into to an action parameter text entry field via UI gesture (eg, clicking a menu item from a menu built with these names). 
They should be flattened, so if a state or context variable is an object with properties, these should be listed with the "parent" property/properties in the name, separated by a `.` (period). +The `actionVariables` property should contain the **flattened** names of the state and context variables available when an executor calls `alertInstance.scheduleActions(actionGroup, context)`. These names are meant to be used in prompters in the Alerting UI, are used as text values for display, and can be inserted into an action parameter text entry field via a UI gesture (e.g., clicking a menu item from a menu built with these names). They should be flattened, so if a state or context variable is an object with properties, these should be listed with the "parent" property/properties in the name, separated by a `.` (period). For example, if the `context` has one variable `foo` which is an object that has one property `bar`, and there are no `state` variables, the `actionVariables` value would be in the following shape: @@ -148,65 +174,67 @@ For example, if the `context` has one variable `foo` which is an object that has ## Licensing -Currently most of the alerts are free features. But some alert types are subscription features, such as the tracking containment alert. +Currently, most rule types are free features. But some rule types are subscription features, such as the tracking containment rule. ## Documentation -You should create asciidoc for the new alert type. -* For stack alerts, add an entry to the alert type index - [`docs/user/alerting/alert-types.asciidoc`](../../../docs/user/alerting/alert-types.asciidoc) which points to a new document for the alert type that should be in the directory [`docs/user/alerting/stack-alerts`](../../../docs/user/alerting/stack-alerts). -* Solution specific alert documentation should live within the docs for the solution. 
+- For stack rules, add an entry to the rule type index - [`docs/user/alerting/stack-rules.asciidoc`](../../../docs/user/alerting/stack-rules.asciidoc) which points to a new document for the rule type that should live in the directory [`docs/user/alerting/stack-rules`](../../../docs/user/alerting/stack-rules). -We suggest following the template provided in `docs/alert-type-template.asciidoc`. The [Index Threshold alert type](https://www.elastic.co/guide/en/kibana/master/alert-type-index-threshold.html) is an example of documentation created following the template. +- Solution specific rule documentation should live within the docs for the solution. + +We suggest following the template provided in `docs/rule-type-template.asciidoc`. The [Index Threshold rule type](https://www.elastic.co/guide/en/kibana/master/rule-type-index-threshold.html) is an example of documentation created following the template. ## Tests -The alert type should have jest tests and optionaly functional tests. -In the the tests we recomend to test the expected alert execution result with a different input params, the structure of the created alert and the params validation. The rest will be guaranteed as a framework functionality. +The rule type should have jest tests and, optionally, functional tests. +In the tests, we recommend testing the expected rule execution result with different input params, testing the structure of the created rule and testing the parameter validation. The rest will be guaranteed as a framework functionality. ### Example -This example receives server and threshold as parameters. It will read the CPU usage of the server and schedule actions to be executed (asynchronously by the task manager) if the reading is greater than the threshold. +This example rule type receives server and threshold as parameters. It will read the CPU usage of the server and schedule actions to be executed (asynchronously by the task manager) if the usage is greater than the threshold. 
```typescript import { schema } from '@kbn/config-schema'; import { AlertType, AlertExecutorOptions } from '../../../alerting/server'; +// These type names will eventually be updated to reflect the new terminology import { - AlertTypeParams, - AlertTypeState, - AlertInstanceState, - AlertInstanceContext, + AlertTypeParams, + AlertTypeState, + AlertInstanceState, + AlertInstanceContext, } from '../../../alerting/common'; ... -interface MyAlertTypeParams extends AlertTypeParams { +interface MyRuleTypeParams extends AlertTypeParams { server: string; threshold: number; } -interface MyAlertTypeState extends AlertTypeState { +interface MyRuleTypeState extends AlertTypeState { lastChecked: Date; } -interface MyAlertTypeInstanceState extends AlertInstanceState { +interface MyRuleTypeAlertState extends AlertInstanceState { cpuUsage: number; } -interface MyAlertTypeInstanceContext extends AlertInstanceContext { +interface MyRuleTypeAlertContext extends AlertInstanceContext { server: string; hasCpuUsageIncreased: boolean; } -type MyAlertTypeActionGroups = 'default' | 'warning'; +type MyRuleTypeActionGroups = 'default' | 'warning'; -const myAlertType: AlertType< - MyAlertTypeParams, - MyAlertTypeState, - MyAlertTypeInstanceState, - MyAlertTypeInstanceContext, - MyAlertTypeActionGroups +const myRuleType: AlertType< + MyRuleTypeParams, + MyRuleTypeState, + MyRuleTypeAlertState, + MyRuleTypeAlertContext, + MyRuleTypeActionGroups > = { - id: 'my-alert-type', - name: 'My alert type', + id: 'my-rule-type', + name: 'My rule type', validate: { params: schema.object({ server: schema.string(), @@ -235,13 +263,20 @@ const myAlertType: AlertType< }, minimumLicenseRequired: 'basic', async executor({ - alertId, + alertId, startedAt, previousStartedAt, services, params, state, - }: AlertExecutorOptions) { + rule, + }: AlertExecutorOptions< + MyRuleTypeParams, + MyRuleTypeState, + MyRuleTypeAlertState, + MyRuleTypeAlertContext, + MyRuleTypeActionGroups + >) { // Let's assume params is { 
server: 'server_1', threshold: 0.8 } const { server, threshold } = params; @@ -250,47 +285,50 @@ const myAlertType: AlertType< // Only execute if CPU usage is greater than threshold if (currentCpuUsage > threshold) { - // The first argument is a unique identifier the alert instance is about. In this scenario - // the provided server will be used. Also, this id will be used to make `getState()` return - // previous state, if any, on matching identifiers. - const alertInstance = services.alertInstanceFactory(server); + // The first argument is a unique identifier for the alert. In this + // scenario the provided server will be used. Also, this ID will be + // used to make `getState()` return previous state, if any, on + // matching identifiers. + const alert = services.alertInstanceFactory(server); - // State from last execution. This will exist if an alert instance was created and executed - // in the previous execution - const { cpuUsage: previousCpuUsage } = alertInstance.getState(); + // State from the last execution. 
This will exist if an alert was + // created and executed in the previous execution + const { cpuUsage: previousCpuUsage } = alert.getState(); // Replace state entirely with new values - alertInstance.replaceState({ + alert.replaceState({ cpuUsage: currentCpuUsage, }); - // 'default' refers to the id of a group of actions to be scheduled for execution, see 'actions' in create alert section - alertInstance.scheduleActions('default', { + // 'default' refers to the id of a group of actions to be scheduled + // for execution, see 'actions' in create rule section + alert.scheduleActions('default', { server, hasCpuUsageIncreased: currentCpuUsage > previousCpuUsage, }); } - // Returning updated alert type level state, this will become available + // Returning updated rule type level state, this will become available // within the `state` function parameter at the next execution return { - // This is an example attribute you could set, it makes more sense to use this state when - // the alert type executes multiple instances but wants a single place to track certain values. + // This is an example attribute you could set, it makes more sense + // to use this state when the rule type executes multiple + // alerts but wants a single place to track certain values. lastChecked: new Date(), }; }, producer: 'alerting', }; -server.newPlatform.setup.plugins.alerting.registerType(myAlertType); +server.newPlatform.setup.plugins.alerting.registerType(myRuleType); ``` ## Role Based Access-Control -Once you have registered your AlertType, you need to grant your users privileges to use it. -When registering a feature in Kibana you can specify multiple types of privileges which are granted to users when they're assigned certain roles. +Once you have registered your AlertType, you need to grant your users privileges to use it. +When registering a feature in Kibana, you can specify multiple types of privileges which are granted to users when they're assigned certain roles. 
Assuming your feature introduces its own AlertTypes, you'll want to control which roles have all/read privileges for these AlertTypes when they're inside the feature. -In addition, when users are inside your feature you might want to grant them access to AlertTypes from other features, such as built-in AlertTypes or AlertTypes provided by other features. +In addition, when users are inside your feature, you might want to grant them access to AlertTypes from other features, such as built-in stack rules or rule types provided by other features. You can control all of these abilities by assigning privileges to the Alerting Framework from within your own feature, for example: @@ -304,11 +342,11 @@ features.registerKibanaFeature({ alerting: { all: [ // grant `all` over our own types - 'my-application-id.my-alert-type', - 'my-application-id.my-restricted-alert-type', + 'my-application-id.my-rule-type', + 'my-application-id.my-restricted-rule-type', // grant `all` over the built-in IndexThreshold '.index-threshold', - // grant `all` over Uptime's TLS AlertType + // grant `all` over Uptime's TLS rule type 'xpack.uptime.alerts.actionGroups.tls' ], }, @@ -317,10 +355,10 @@ features.registerKibanaFeature({ alerting: { read: [ // grant `read` over our own type - 'my-application-id.my-alert-type', + 'my-application-id.my-rule-type', // grant `read` over the built-in IndexThreshold '.index-threshold', - // grant `read` over Uptime's TLS AlertType + // grant `read` over Uptime's TLS rule type 'xpack.uptime.alerts.actionGroups.tls' ], }, @@ -330,11 +368,12 @@ features.registerKibanaFeature({ ``` In this example we can see the following: -- Our feature grants any user who's assigned the `all` role in our feature the `all` role in the Alerting framework over every alert of the `my-application-id.my-alert-type` type which is created _inside_ the feature. 
What that means is that this privilege will allow the user to execute any of the `all` operations (listed below) on these alerts as long as their `consumer` is `my-application-id`. Below that you'll notice we've done the same with the `read` role, which is grants the Alerting Framework's `read` role privileges over these very same alerts. -- In addition, our feature grants the same privileges over any alert of type `my-application-id.my-restricted-alert-type`, which is another hypothetical alertType registered by this feature. It's worth noting though that this type has been omitted from the `read` role. What this means is that only users with the `all` role will be able to interact with alerts of this type. -- Next, lets look at the `.index-threshold` and `xpack.uptime.alerts.actionGroups.tls` types. These have been specified in both `read` and `all`, which means that all the users in the feature will gain privileges over alerts of these types (as long as their `consumer` is `my-application-id`). The difference between these two and the previous two is that they are _produced_ by other features! `.index-threshold` is a built-in type, provided by the _Built-In Alerts_ feature, and `xpack.uptime.alerts.actionGroups.tls` is an AlertType provided by the _Uptime_ feature. Specifying these type here tells the Alerting Framework that as far as the `my-application-id` feature is concerned, the user is privileged to use them (with `all` and `read` applied), but that isn't enough. Using another feature's AlertType is only possible if both the producer of the AlertType, and the consumer of the AlertType, explicitly grant privileges to do so. In this case, the _Built-In Alerts_ & _Uptime_ features would have to explicitly add these privileges to a role and this role would have to be granted to this user. 
-It's important to note that any role can be granted a mix of `all` and `read` privileges accross multiple type, for example: +- Our feature grants any user who's assigned the `all` role in our feature the `all` role in the Alerting Framework over every rule of the `my-application-id.my-rule-type` type which is created _inside_ the feature. What that means is that this privilege will allow the user to execute any of the `all` operations (listed below) on these rules as long as their `consumer` is `my-application-id`. Below that you'll notice we've done the same with the `read` role, which grants the Alerting Framework's `read` role privileges over these very same rules. +- In addition, our feature grants the same privileges over any rule of type `my-application-id.my-restricted-rule-type`, which is another hypothetical rule type registered by this feature. It's worth noting that this type has been omitted from the `read` role. What this means is that only users with the `all` role will be able to interact with rules of this type. +- Next, let's look at the `.index-threshold` and `xpack.uptime.alerts.actionGroups.tls` types. These have been specified in both `read` and `all`, which means that all the users in the feature will gain privileges over rules of these types (as long as their `consumer` is `my-application-id`). The difference between these two and the previous two is that they are _produced_ by other features! `.index-threshold` is a built-in stack rule type, provided by the _Stack Rules_ feature, and `xpack.uptime.alerts.actionGroups.tls` is a rule type provided by the _Uptime_ feature. Specifying these types here tells the Alerting Framework that as far as the `my-application-id` feature is concerned, the user is privileged to use them (with `all` and `read` applied), but that isn't enough. Using another feature's rule type is only possible if both the producer of the rule type and the consumer of the rule type explicitly grant privileges to do so. 
In this case, the _Stack Rules_ & _Uptime_ features would have to explicitly add these privileges to a role and this role would have to be granted to this user. + +It's important to note that any role can be granted a mix of `all` and `read` privileges accross multiple types, for example: ```typescript features.registerKibanaFeature({ @@ -355,10 +394,10 @@ features.registerKibanaFeature({ app: ['lens', 'kibana'], alerting: { all: [ - 'my-application-id.my-alert-type' + 'my-application-id.my-rule-type' ], read: [ - 'my-application-id.my-restricted-alert-type' + 'my-application-id.my-restricted-rule-type' ], }, savedObject: { @@ -372,16 +411,19 @@ features.registerKibanaFeature({ }); ``` -In the above example, you note that instead of denying users with the `read` role any access to the `my-application-id.my-restricted-alert-type` type, we've decided that these users _should_ be granted `read` privileges over the _resitricted_ AlertType. -As part of that same change, we also decided that not only should they be allowed to `read` the _restricted_ AlertType, but actually, despite having `read` privileges to the feature as a whole, we do actually want to allow them to create our basic 'my-application-id.my-alert-type' AlertType, as we consider it an extension of _reading_ data in our feature, rather than _writing_ it. +In the above example, note that instead of denying users with the `read` role any access to the `my-application-id.my-restricted-rule-type` type, we've decided that these users _should_ be granted `read` privileges over the _restricted_ rule type. +As part of that same change, we also decided that not only should they be allowed to `read` the _restricted_ rule type, but actually, despite having `read` privileges to the feature as a whole, we do actually want to allow them to create our basic 'my-application-id.my-rule-type' rule type, as we consider it an extension of _reading_ data in our feature, rather than _writing_ it. ### `read` privileges vs. 
`all` privileges When a user is granted the `read` role in the Alerting Framework, they will be able to execute the following api calls: + - `get` -- `getAlertState` +- `getRuleState` +- `getAlertSummary` - `find` When a user is granted the `all` role in the Alerting Framework, they will be able to execute all of the `read` privileged api calls, but in addition they'll be granted the following calls: + - `create` - `delete` - `update` @@ -390,24 +432,26 @@ When a user is granted the `all` role in the Alerting Framework, they will be ab - `updateApiKey` - `muteAll` - `unmuteAll` -- `muteInstance` -- `unmuteInstance` +- `muteAlert` +- `unmuteAlert` Finally, all users, whether they're granted any role or not, are privileged to call the following: -- `listAlertTypes`, but the output is limited to displaying the AlertTypes the user is perivileged to `get` + +- `listAlertTypes`, but the output is limited to displaying the rule types the user is privileged to `get`. Attempting to execute any operation the user isn't privileged to execute will result in an Authorization error thrown by the AlertsClient. ## Alert Navigation -When registering an Alert Type, you'll likely want to provide a way of viewing alerts of that type within your own plugin, or perhaps you want to provide a view for all alerts created from within your solution within your own UI. -In order for the Alerting framework to know that your plugin has its own internal view for displaying an alert, you must resigter a navigation handler within the framework. +When registering a rule type, you'll likely want to provide a way of viewing rules of that type within your own plugin, or perhaps you want to provide a view for all rules created from within your solution within your own UI. + +In order for the Alerting Framework to know that your plugin has its own internal view for displaying a rule, you must register a navigation handler within the framework. 
-A navigation handler is nothing more than a function that receives an Alert and its corresponding AlertType, and is expected to then return the path *within your plugin* which knows how to display this alert.
+A navigation handler is nothing more than a function that receives a rule and its corresponding AlertType, and is expected to then return the path *within your plugin* which knows how to display this rule.

 The signature of such a handler is:

-```
+```typescript
 type AlertNavigationHandler = (
   alert: SanitizedAlert,
   alertType: AlertType
@@ -420,43 +464,43 @@ By specifying _alerting_ as a dependency of your *public* (client side) plugin,
 ### registerNavigation
 The _registerNavigation_ api allows you to register a handler for a specific alert type within your solution:

-```
+```typescript
 alerting.registerNavigation(
   'my-application-id',
-  'my-application-id.my-alert-type',
-  (alert: SanitizedAlert, alertType: AlertType) => `/my-unique-alert/${alert.id}`
+  'my-application-id.my-rule-type',
+  (alert: SanitizedAlert, alertType: AlertType) => `/my-unique-rule/${alert.id}`
 );
 ```

-This tells the Alerting framework that, given an alert of the AlertType whose ID is `my-application-id.my-unique-alert-type`, if that Alert's `consumer` value (which is set when the alert is created by your plugin) is your application (whose id is `my-application-id`), then it will navigate to your application using the path `/my-unique-alert/${the id of the alert}`.
+This tells the Alerting Framework that, given a rule of the AlertType whose ID is `my-application-id.my-rule-type`, if that rule's `consumer` value (which is set when the rule is created by your plugin) is your application (whose id is `my-application-id`), then it will navigate to your application using the path `/my-unique-rule/${the id of the rule}`.
-The navigation is handled using the `navigateToApp` api, meaning that the path will be automatically picked up by your `react-router-dom` **Route** component, so all you have top do is configure a Route that handles the path `/my-unique-alert/:id`.
+The navigation is handled using the `navigateToApp` API, meaning that the path will be automatically picked up by your `react-router-dom` **Route** component, so all you have to do is configure a Route that handles the path `/my-unique-rule/:id`.

 You can look at the `alerting-example` plugin to see an example of using this API, which is enabled using the `--run-examples` flag when you run `yarn start`.

 ### registerDefaultNavigation
-The _registerDefaultNavigation_ api allows you to register a handler for any alert type within your solution:
+The _registerDefaultNavigation_ API allows you to register a handler for any rule type within your solution:

 ```
 alerting.registerDefaultNavigation(
   'my-application-id',
-  (alert: SanitizedAlert, alertType: AlertType) => `/my-other-alerts/${alert.id}`
+  (alert: SanitizedAlert, alertType: AlertType) => `/my-other-rules/${alert.id}`
 );
 ```

-This tells the Alerting framework that, given any alert whose `consumer` value is your application, as long as then it will navigate to your application using the path `/my-other-alerts/${the id of the alert}`.
+This tells the Alerting Framework that any rule whose `consumer` value is your application can be navigated to in your application using the path `/my-other-rules/${the id of the rule}`.

-### balancing both APIs side by side
-As we mentioned, using `registerDefaultNavigation` will tell the Alerting Framework that your application can handle any type of Alert we throw at it, as long as your application created it, using the handler you provide it.
+### Balancing both APIs side by side
+As we mentioned, using `registerDefaultNavigation` will tell the Alerting Framework that your application can handle any type of rule we throw at it, as long as your application created it, using the handler you provided.

-The only case in which this handler will not be used to evaluate the navigation for an alert (assuming your application is the `consumer`) is if you have also used `registerNavigation` api, along side your `registerDefaultNavigation` usage, to handle that alert's specific AlertType.
+The only case in which this handler will not be used to evaluate the navigation for a rule (assuming your application is the `consumer`) is if you have also used the `registerNavigation` API, alongside your `registerDefaultNavigation` usage, to handle that rule's specific AlertType.

-You can use the `registerNavigation` api to specify as many AlertType specific handlers as you like, but you can only use it once per AlertType as we wouldn't know which handler to use if you specified two for the same AlertType. For the same reason, you can only use `registerDefaultNavigation` once per plugin, as it covers all cases for your specific plugin.
+You can use the `registerNavigation` API to specify as many AlertType-specific handlers as you like, but you can only use it once per AlertType as we wouldn't know which handler to use if you specified two for the same AlertType. For the same reason, you can only use `registerDefaultNavigation` once per plugin, as it covers all cases for your specific plugin.

 ## Internal HTTP APIs

-Using of the rule type requires you to create a rule that will contain parameters and actions for a given rule type. API description for CRUD operations is a part of the [user documentation](https://www.elastic.co/guide/en/kibana/master/alerting-apis.html).
-API listed below are internal and should not be consumed by plugin outside the alerting plugins.
+We provide public APIs for performing CRUD operations on rules. Descriptions for these APIs are available in the [user documentation](https://www.elastic.co/guide/en/kibana/master/alerting-apis.html).
+In addition to the public APIs, we provide the following internal APIs. Internal APIs should not be consumed by plugins outside of the alerting plugins.

 ### `GET /internal/alerting/rule/{id}/state`: Get rule state

@@ -489,56 +533,56 @@ Query:
 |---|---|---|
 |id|The id of the rule you're trying to update the API key for. System will use user in request context to generate an API key for.|string|

-## Alert instance factory
+## Alert Factory

 **alertInstanceFactory(id)**

-One service passed in to alert types is an alert instance factory. This factory creates instances of alerts and must be used in order to execute actions. The `id` you give to the alert instance factory is a unique identifier to the alert instance (ex: server identifier if the instance is about the server). The instance factory will use this identifier to retrieve the state of previous instances with the same `id`. These instances support state persisting between alert type execution, but will clear out once the alert instance stops executing.
+One service passed in to each rule type is the alert factory. This factory creates alerts and must be used in order to execute actions. The `id` you give to the alert factory is the unique identifier for the alert (e.g. the server identifier if the alert is about servers). The alert factory will use this identifier to retrieve the state of previous alerts with the same `id`. These alerts support persisting state between rule executions, but will clear out once the alert stops firing.

-Note that the `id` only needs to be unique **within the scope of a specific alert**, not unique across all alerts or alert types. For example, Alert 1 and Alert 2 can both create an alert instance with an `id` of `"a"` without conflicting with one another. But if Alert 1 creates 2 alert instances, then they must be differentiated with `id`s of `"a"` and `"b"`.
+Note that the `id` only needs to be unique **within the scope of a specific rule**, not unique across all rules or rule types. For example, Rule 1 and Rule 2 can both create an alert with an `id` of `"a"` without conflicting with one another. But if Rule 1 creates 2 alerts, then they must be differentiated with `id`s of `"a"` and `"b"`.

-This factory returns an instance of `AlertInstance`. The alert instance class has the following methods, note that we have removed the methods that you shouldn't touch.
+This factory returns an instance of `AlertInstance`. The `AlertInstance` class has the following methods. Note that we have removed the methods that you shouldn't touch.

 |Method|Description|
 |---|---|
-|getState()|Get the current state of the alert instance.|
-|scheduleActions(actionGroup, context)|Called to schedule the execution of actions. The actionGroup is a string `id` that relates to the group of alert `actions` to execute and the context will be used for templating purposes. `scheduleActions` or `scheduleActionsWithSubGroup` should only be called once per alert instance.|
-|scheduleActionsWithSubGroup(actionGroup, subgroup, context)|Called to schedule the execution of actions within a subgroup. The actionGroup is a string `id` that relates to the group of alert `actions` to execute, the `subgroup` is a dynamic string that denotes a subgroup within the actionGroup and the context will be used for templating purposes. `scheduleActions` or `scheduleActionsWithSubGroup` should only be called once per alert instance.|
-|replaceState(state)|Used to replace the current state of the alert instance. This doesn't work like react, the entire state must be provided. Use this feature as you see fit. The state that is set will persist between alert type executions whenever you re-create an alert instance with the same id. The instance state will be erased when `scheduleActions` or `scheduleActionsWithSubGroup` aren't called during an execution.|
+|getState()|Get the current state of the alert.|
+|scheduleActions(actionGroup, context)|Call this to schedule the execution of actions. The actionGroup is a string `id` that relates to the group of alert `actions` to execute and the context will be used for templating purposes. `scheduleActions` or `scheduleActionsWithSubGroup` should only be called once per alert.|
+|scheduleActionsWithSubGroup(actionGroup, subgroup, context)|Call this to schedule the execution of actions within a subgroup. The actionGroup is a string `id` that relates to the group of alert `actions` to execute, the `subgroup` is a dynamic string that denotes a subgroup within the actionGroup and the context will be used for templating purposes. `scheduleActions` or `scheduleActionsWithSubGroup` should only be called once per alert.|
+|replaceState(state)|Used to replace the current state of the alert. This doesn't work like React: the entire state must be provided. Use this feature as you see fit. The state that is set will persist between rule executions whenever you re-create an alert with the same id. The alert state will be erased when `scheduleActions` or `scheduleActionsWithSubGroup` aren't called during an execution.|

-### when should I use `scheduleActions` and `scheduleActionsWithSubGroup`?
+### When should I use `scheduleActions` and `scheduleActionsWithSubGroup`?
 The `scheduleActions` or `scheduleActionsWithSubGroup` methods are both used to achieve the same thing: schedule actions to be run under a specific action group.
-It's important to note though, that when an actions are scheduled for an instance, we check whether the instance was already active in this action group after the previous execution. If it was, then we might throttle the actions (adhering to the user's configuration), as we don't consider this a change in the instance.
+It's important to note that when actions are scheduled for an alert, we check whether the alert was already active in this action group after the previous execution. If it was, then we might throttle the actions (adhering to the user's configuration), as we don't consider this a change in the alert.

-What happens though, if the instance _has_ changed, but they just happen to be in the same action group after this change? This is where subgroups come in. By specifying a subgroup (using the `scheduleActionsWithSubGroup` method), the instance becomes active within the action group, but it will also keep track of the subgroup.
-If the subgroup changes, then the framework will treat the instance as if it had been placed in a new action group. It is important to note though, we only use the subgroup to denote a change if both the current execution and the previous one specified a subgroup.
+What happens though, if the alert _has_ changed, but it just happens to be in the same action group after this change? This is where subgroups come in. By specifying a subgroup (using the `scheduleActionsWithSubGroup` method), the alert becomes active within the action group, but it will also keep track of the subgroup.
+If the subgroup changes, then the framework will treat the alert as if it had been placed in a new action group. It is important to note that we only use the subgroup to denote a change if both the current execution and the previous one specified a subgroup.

 You might wonder, why bother using a subgroup if you can just add a new action group?
-Action Groups are static, and have to be define when the Alert Type is defined.
+Action Groups are static, and have to be defined when the rule type is defined.
 Action Subgroups are dynamic, and can be defined on the fly.

 This approach enables users to specify actions under specific action groups, but they can't specify actions that are specific to subgroups.
 As subgroups fall under action groups, we will schedule the actions specified for the action group, but the subgroup allows the AlertType implementer to reuse the same action group for multiple different active subgroups.

-## Templating actions
+## Templating Actions

-There needs to be a way to map alert context into action parameters. For this, we started off by adding template support. Any string within the `params` of an alert saved object's `actions` will be processed as a template and can inject context or state values.
+There needs to be a way to map rule context into action parameters. For this, we started off by adding template support. Any string within the `params` of a rule saved object's `actions` will be processed as a template and can inject context or state values.

-When an alert instance executes, the first argument is the `group` of actions to execute and the second is the context the alert exposes to templates. We iterate through each action params attributes recursively and render templates if they are a string. Templates have access to the following "variables":
+When an alert executes, the first argument is the `group` of actions to execute and the second is the context the rule exposes to templates. We iterate through each action's parameter attributes recursively and render templates if they are strings. Templates have access to the following "variables":

-- `context` - provided by context argument of `.scheduleActions(...)` and `.scheduleActionsWithSubGroup(...)` on an alert instance
-- `state` - the alert instance's `state` provided by the most recent `replaceState` call on an alert instance
-- `alertId` - the id of the alert
-- `alertInstanceId` - the alert instance id
-- `alertName` - the name of the alert
-- `spaceId` - the id of the space the alert exists in
-- `tags` - the tags set in the alert
+- `context` - provided by context argument of `.scheduleActions(...)` and `.scheduleActionsWithSubGroup(...)` on an alert.
+- `state` - the alert's `state` provided by the most recent `replaceState` call on an alert.
+- `alertId` - the id of the rule
+- `alertInstanceId` - the alert id
+- `alertName` - the name of the rule
+- `spaceId` - the id of the space the rule exists in
+- `tags` - the tags set in the rule

-The templating engine is [mustache]. General definition for the [mustache variable] is a double-brace {{}}. All variables are HTML-escaped by default and if there is a requirement to render unescaped HTML, it should be applied the triple mustache: `{{{name}}}`. Also, can be used `&` to unescape a variable.
+The templating engine is [mustache]. A [mustache variable] is written with double braces: `{{name}}`. All variables are HTML-escaped by default; to render unescaped HTML, use the triple mustache: `{{{name}}}`. `&` can also be used to unescape a variable.

 ### Examples

-The following code would be within an alert type. As you can see `cpuUsage ` will replace the state of the alert instance and `server` is the context for the alert instance to execute. The difference between the two is `cpuUsage ` will be accessible at the next execution.
+The following code would be within a rule type. As you can see `cpuUsage` will replace the state of the alert and `server` is the context for the alert to execute. The difference between the two is that `cpuUsage` will be accessible at the next execution.

 ```
 alertInstanceFactory('server_1')
@@ -550,13 +594,13 @@ alertInstanceFactory('server_1')
   });
 ```

-Below is an example of an alert that takes advantage of templating:
+Below is an example of a rule that takes advantage of templating:

 ```
 {
   ...
   "id": "123",
-  "name": "cpu alert",
+  "name": "cpu rule",
   "actions": [
     {
       "group": "default",
@@ -565,21 +609,21 @@ Below is an example of an alert that takes advantage of templating:
         "from": "example@elastic.co",
         "to": ["destination@elastic.co"],
         "subject": "A notification about {{context.server}}",
-        "body": "The server {{context.server}} has a CPU usage of {{state.cpuUsage}}%. This message for {{alertInstanceId}} was created by the alert {{alertId}} {{alertName}}."
+        "body": "The server {{context.server}} has a CPU usage of {{state.cpuUsage}}%. This message for {{alertInstanceId}} was created by the rule {{alertId}} {{alertName}}."
       }
     }
   ]
 }
 ```

-The templating system will take the alert and alert type as described above and convert the action parameters to the following:
+The templating system will take the rule and rule type as described above and convert the action parameters to the following:

 ```
 {
   "from": "example@elastic.co",
   "to": ["destination@elastic.co"],
   "subject": "A notification about server_1"
-  "body": "The server server_1 has a CPU usage of 80%. This message for server_1 was created by the alert 123 cpu alert"
+  "body": "The server server_1 has a CPU usage of 80%. This message for server_1 was created by the rule 123 cpu rule"
 }
 ```