Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ResponseOps][FE] Alert creation delay based on user definition #176346

Merged
merged 32 commits into from
Feb 15, 2024

Conversation

doakalexi
Copy link
Contributor

@doakalexi doakalexi commented Feb 6, 2024

Resolves #173009

Summary

Adds a new input for the user to define the alertDelay. This input is available for life-cycled alerts (stack and o11y) rule types.

Checklist

Delete any items that are not applicable to this PR.

To verify

  • Using the UI create a rule with the alertDelay field set.
  • Verify that the field is saved properly and that you can edit the alertDelay
  • Verify that you can add the alert delay to existing rules. Create a rule in a different branch and switch to this one. Edit the rule and set the alertDelay. Verify that the rule saves and works as expected.

@doakalexi
Copy link
Contributor Author

/ci

@doakalexi doakalexi changed the title Alerting/alert delay actual fe [ResponseOps][FE] Alert creation delay based on user definition Feb 6, 2024
@doakalexi doakalexi added Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) release_note:skip Skip the PR/issue when compiling release notes v8.13.0 labels Feb 6, 2024
@doakalexi
Copy link
Contributor Author

/ci

@doakalexi doakalexi marked this pull request as ready for review February 7, 2024 14:54
@doakalexi doakalexi requested a review from a team as a code owner February 7, 2024 14:54
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@lcawl
Copy link
Contributor

lcawl commented Feb 12, 2024

Related to #176346 (comment) ...

IMO it would look nice to have this new field align with the "check every" field. For example, "Alert every" (or "Alert after") xx "matches" (or "consecutive matches"). Then if I'm understanding correctly, you would have a default value of "1", since generally we create alerts after a single match.

@doakalexi
Copy link
Contributor Author

Related to #176346 (comment) ...

IMO it would look nice to have this new field align with the "check every" field. For example, "Alert every" (or "Alert after") xx "matches" (or "consecutive matches"). Then if I'm understanding correctly, you would have a default value of "1", since generally we create alerts after a single match.

Resolved in this commit 7faa680 , thanks for your help! 🙂

@jeramysoucy
Copy link
Contributor

jeramysoucy commented Feb 13, 2024

Per discussion with team this morning, I believe we need to add the new field as an "omitted" property for AAD usage. Since we aren't including this as AAD, we won't need any updates once we switch AAD attribute lists from "omit" to "include" in PR Replace Encrypted Saved Object AAD exclude list with include list #167705
...
cc: @jeramysoucy I know we aren't supposed to be changing AAD stuff while the work is underway, but this change seems like one we should make.

Thanks for the ping @pmuellr!

Sorry all, I deleted my last comment because I was wrong - even I get confused about encrypted objects stuff sometimes.

This is a little complicated, so bear with me...

  1. If possible, we should not make any changes to any encrypted saved object (ESO) until Replace Encrypted Saved Object AAD exclude list with include list #167705 lands AND we have clear guidelines documented for consuming teams.

  2. If we absolutely have to introduce a new attribute to an ESO, we need to consider serverless zero downtime upgrades (ZDT). Because the current default is to include attributes in AAD (they have to be explicitly excluded), any new attributes will have to be included in AAD for the previous version of Kibana to successfully decrypt them (see Potential challenges with Encrypted Saved Objects in Serverless #156023). The previous version of Kibana will not have access to the updated exclude list in the new version of Kibana, and will attempt to use any new attributes in AAD by default.

  3. The only exceptions to # 2, are:
    a. If the ESO will NEVER be decrypted by the older version of Kibana (cc: @kobelb I think you've brought this up in the past). In this case, you can exclude the attribute with no adverse effect in ZDT.
    b. If you stage your changes to the ESO over 2 serverless releases. Stage 1, add the attribute name to the exclude list of the ESO, but do not implement the attribute yet (no populating/usage of attribute). Stage 2, for the next serverless release, begin populating/using the attribute. In this case, the previous version of Kibana will know to exclude the attribute before it ever encounters it.

Clear as mud?

@doakalexi If you keep the default behavior of including the new attribute, there will not be a problem with ZDT, however, the attribute will always and forever be included in AAD. If the preference is to not include this attribute, I would recommend staging the changes to coincide with serverless releases (3b above).

Regarding #167705, I can update this PR if necessary to account for the new attribute.

I am happy to schedule a sync meeting on this to go over the details if necessary. Please let me know.

@pmuellr
Copy link
Member

pmuellr commented Feb 13, 2024

If you keep the default behavior of including the new attribute, there will not be a problem with ZDT, however, the attribute will always and forever be included in AAD. If the preference is to not include this attribute, I would recommend staging the changes to coincide with serverless releases (3b above).

Pretty sure we don't want this in the AAD, unless it ends up being the far easiestt way of doing this.

Which implies doing the "staging" of changes across releases. @kobelb is this our first one?

@kobelb
Copy link
Contributor

kobelb commented Feb 13, 2024

Which implies doing the "staging" of changes across releases. @kobelb is this our first one?

It is... I think we have two options here:

  1. Release an intermediate version of Kibana that only includes the change to omit the query delay from the AAD. This would allow us to roll-back to the intermediate version and have all alerting rules with a query delay specified continue to run. However, this would result in these alerting rules "running incorrectly" and ignoring the specified query delay.
  2. Release an intermediate version of Kibana that knows how to run alerting rules with the query delay specified, but not allow users to set the query delay. This would allow us to roll-back to this version and have all alerting rules "run correctly".

I'm guessing that 1) is going to be a lot less work than 2). Does that seem right @doakalexi and/or @pmuellr?

@pmuellr
Copy link
Member

pmuellr commented Feb 13, 2024

So, my read is that

  1. would mean blocking this PR to the "next release". Meanwhile, we'll merge a PR to add the (non-existant) field to the AAD omit list, to allow roll backs to work. That's all that would be in the PR. After the next release, this PR would be unblocked.

  2. would mean splitting the PR into one which adds the field - but not allow it to be set or gotten (UX + http), nor reference it (in the framework rule run processing). And then another which adds everything else in the PR (setters/getters, usage in rule run processing).

So, seems to me like 1) is easier. They're both weird, as the field isn't mapped, so literaly that first PR for 1) will be a one-liner (in theory) that adds it to the AAD omit list, where it's not actually even referenced anywhere.

I suppose we should actually try both, to see if we missed anything. Which will be a little difficult, seems like, testing a version upgrade. We're going to need an easy way to test these "upgrade" / "rollback", using the same or very similiar "versions" of things, for situations like this ...

@doakalexi
Copy link
Contributor Author

doakalexi commented Feb 13, 2024

I wanted to make sure everyone is aware that we already merged a PR with all the backend changes #175851 last week. This PR is for the front-end and only adds the alertDelay to the UI.

@ymao1
Copy link
Contributor

ymao1 commented Feb 13, 2024

As @doakalexi said, the backend changes to update the API to create a rule with alert_delay and use it in the task runner was already merged and that change has been deployed to serverless, as of today. I reviewed that PR and completely missed the AAD part :(

Since it's not available in the UI and the API change is undocumented and the field is optional, I believe there shouldn't be any rules that have this field. Given this, should we do this from Patrick's suggestion

we'll merge a PR to add the (non-existant) field to the AAD omit list, to allow roll backs to work. That's all that would be in the PR. After the next release, this PR would be unblocked.

Just merge a PR to add the field to the AAD omit list, to be included in the next release?

@mdefazio
Copy link
Contributor

A few thoughts on this. Haven't had a chance to dig too far into this, but at surface level, some reactions:

Screenshot--2024-02-14--CleanShot

@mdefazio
Copy link
Contributor

@doakalexi Thanks for pointing me to the right UI 🤦

I think the updated version is much better and solves many of my concerns. The only last (optional IMO) change is to consider hiding this initially if most users will not need this.

I'd prefer the icon only option on the right, to save vertical space. But without knowing how much this is requested, i'm fine waiting to do this part later.
Screenshot--2024-02-14--Untitled

@doakalexi
Copy link
Contributor Author

doakalexi commented Feb 14, 2024

Hi @jeramysoucy, after discussing internally the team has decided to keep the default behavior of including the new attribute. We should update #167705 to account for the new attribute.

Thanks for your input and help with this!

@jeramysoucy
Copy link
Contributor

Hi @jeramysoucy, after discussing internally the team has decided to keep the default behavior of including the new attribute. We should update #167705 to account for the new attribute.

Thanks for your input and help with this!

Thank you for the follow up @doakalexi! I know this is not ideal, and hopefully we can get #167705 completed soon to make life a little bit easier. If response ops needs to make other changes to ESOs in the meantime, please feel free to add me or the kibana security team for review. Always happy to help!

Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looked ok to me before, except the AAD / ZDT aspect, which we have now resolved. Still LGTM given the changes since my review ...

@doakalexi
Copy link
Contributor Author

@doakalexi Thanks for pointing me to the right UI 🤦

I think the updated version is much better and solves many of my concerns. The only last (optional IMO) change is to consider hiding this initially if most users will not need this.

I'd prefer the icon only option on the right, to save vertical space. But without knowing how much this is requested, i'm fine waiting to do this part later. Screenshot--2024-02-14--Untitled

Thanks so much for your review, I am going to go ahead and make this change in a separate PR.

@kibana-ci
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #73 / InfraOps App Metrics UI Hosts View "after all" hook for "should be accessible from the Inventory page"
  • [job] [logs] FTR Configs #73 / InfraOps App Metrics UI Hosts View "before all" hook for "should be accessible from the Inventory page"

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
triggersActionsUi 1.6MB 1.6MB +1.8KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
triggersActionsUi 108.2KB 108.2KB +37.0B
Unknown metric groups

References to deprecated APIs

id before after diff
triggersActionsUi 53 54 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@doakalexi doakalexi merged commit 68d6ab2 into elastic:main Feb 15, 2024
37 checks passed
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Feb 15, 2024
…tic#176346)

Resolves elastic#173009

## Summary

Adds a new input for the user to define the `alertDelay`. This input is
available for life-cycled alerts (stack and o11y) rule types.

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

### To verify

- Using the UI create a rule with the `alertDelay` field set.
- Verify that the field is saved properly and that you can edit the
`alertDelay`
- Verify that you can add the alert delay to existing rules. Create a
rule in a different branch and switch to this one. Edit the rule and set
the `alertDelay`. Verify that the rule saves and works as expected.

---------

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
(cherry picked from commit 68d6ab2)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.13

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Feb 15, 2024
#176346) (#177051)

# Backport

This will backport the following commits from `main` to `8.13`:
- [[ResponseOps][FE] Alert creation delay based on user definition
(#176346)](#176346)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Alexi
Doak","email":"109488926+doakalexi@users.noreply.github.com"},"sourceCommit":{"committedDate":"2024-02-15T17:13:06Z","message":"[ResponseOps][FE]
Alert creation delay based on user definition (#176346)\n\nResolves
https://github.com/elastic/kibana/issues/173009\r\n\r\n##
Summary\r\n\r\nAdds a new input for the user to define the `alertDelay`.
This input is\r\navailable for life-cycled alerts (stack and o11y) rule
types.\r\n\r\n### Checklist\r\n\r\nDelete any items that are not
applicable to this PR.\r\n\r\n- [x] Any text added follows [EUI's
writing\r\nguidelines](https://elastic.github.io/eui/#/guidelines/writing),
uses\r\nsentence case text and includes
[i18n\r\nsupport](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)\r\n-
[x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n\r\n\r\n### To
verify\r\n\r\n- Using the UI create a rule with the `alertDelay` field
set.\r\n- Verify that the field is saved properly and that you can edit
the\r\n`alertDelay`\r\n- Verify that you can add the alert delay to
existing rules. Create a\r\nrule in a different branch and switch to
this one. Edit the rule and set\r\nthe `alertDelay`. Verify that the
rule saves and works as
expected.\r\n\r\n---------\r\n\r\nCo-authored-by: Lisa Cawley
<lcawley@elastic.co>","sha":"68d6ab21354bcf0504dc3664b818ab07f94340bc","branchLabelMapping":{"^v8.14.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Team:ResponseOps","v8.13.0","v8.14.0"],"title":"[ResponseOps][FE]
Alert creation delay based on user
definition","number":176346,"url":"https://github.com/elastic/kibana/pull/176346","mergeCommit":{"message":"[ResponseOps][FE]
Alert creation delay based on user definition (#176346)\n\nResolves
https://github.com/elastic/kibana/issues/173009\r\n\r\n##
Summary\r\n\r\nAdds a new input for the user to define the `alertDelay`.
This input is\r\navailable for life-cycled alerts (stack and o11y) rule
types.\r\n\r\n### Checklist\r\n\r\nDelete any items that are not
applicable to this PR.\r\n\r\n- [x] Any text added follows [EUI's
writing\r\nguidelines](https://elastic.github.io/eui/#/guidelines/writing),
uses\r\nsentence case text and includes
[i18n\r\nsupport](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)\r\n-
[x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n\r\n\r\n### To
verify\r\n\r\n- Using the UI create a rule with the `alertDelay` field
set.\r\n- Verify that the field is saved properly and that you can edit
the\r\n`alertDelay`\r\n- Verify that you can add the alert delay to
existing rules. Create a\r\nrule in a different branch and switch to
this one. Edit the rule and set\r\nthe `alertDelay`. Verify that the
rule saves and works as
expected.\r\n\r\n---------\r\n\r\nCo-authored-by: Lisa Cawley
<lcawley@elastic.co>","sha":"68d6ab21354bcf0504dc3664b818ab07f94340bc"}},"sourceBranch":"main","suggestedTargetBranches":["8.13"],"targetPullRequestStates":[{"branch":"8.13","label":"v8.13.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.14.0","branchLabelMappingKey":"^v8.14.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/176346","number":176346,"mergeCommit":{"message":"[ResponseOps][FE]
Alert creation delay based on user definition (#176346)\n\nResolves
https://github.com/elastic/kibana/issues/173009\r\n\r\n##
Summary\r\n\r\nAdds a new input for the user to define the `alertDelay`.
This input is\r\navailable for life-cycled alerts (stack and o11y) rule
types.\r\n\r\n### Checklist\r\n\r\nDelete any items that are not
applicable to this PR.\r\n\r\n- [x] Any text added follows [EUI's
writing\r\nguidelines](https://elastic.github.io/eui/#/guidelines/writing),
uses\r\nsentence case text and includes
[i18n\r\nsupport](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)\r\n-
[x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n\r\n\r\n### To
verify\r\n\r\n- Using the UI create a rule with the `alertDelay` field
set.\r\n- Verify that the field is saved properly and that you can edit
the\r\n`alertDelay`\r\n- Verify that you can add the alert delay to
existing rules. Create a\r\nrule in a different branch and switch to
this one. Edit the rule and set\r\nthe `alertDelay`. Verify that the
rule saves and works as
expected.\r\n\r\n---------\r\n\r\nCo-authored-by: Lisa Cawley
<lcawley@elastic.co>","sha":"68d6ab21354bcf0504dc3664b818ab07f94340bc"}}]}]
BACKPORT-->

Co-authored-by: Alexi Doak <109488926+doakalexi@users.noreply.github.com>
fkanout pushed a commit to fkanout/kibana that referenced this pull request Mar 4, 2024
…tic#176346)

Resolves elastic#173009

## Summary

Adds a new input for the user to define the `alertDelay`. This input is
available for life-cycled alerts (stack and o11y) rule types.

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios


### To verify

- Using the UI create a rule with the `alertDelay` field set.
- Verify that the field is saved properly and that you can edit the
`alertDelay`
- Verify that you can add the alert delay to existing rules. Create a
rule in a different branch and switch to this one. Edit the rule and set
the `alertDelay`. Verify that the rule saves and works as expected.

---------

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.13.0 v8.14.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Alert creation delay based on user definition
10 participants