[Alerting] Explanation and approach to migrating and removing "siem-detection-engine-rule-actions" saved object #112327
Comments
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
If you go with 3, wouldn't you still have the possibility of losing notifications because you can't guarantee the migration is complete by the time the Kibana server is up and running? I think that option sounds risky because you'd be running migrations while users are allowed to interact with their alerts (if I am understanding correctly). Would there be a possibility of a race condition? For example, a user adds an action which gets put onto
The Kibana App team ran into the same requirement and managed to work around it by leveraging ad-hoc migrations when loading and saving the saved objects. I wonder if something similar could be achieved? (cc @flash1293 if you have any words of advice for how that went!).
|
The way we solved this for Graph is doing the reference lookup on read in the UI - this works for workspaces because they are only ever consumed this way, but it's not nice for a bunch of reasons:
I just skimmed the problem statement above but as far as I understood the reference ids are in the saved object, they are just not correctly placed in the |
++ without this, I don't think there's a way to guarantee |
Yea, I think that only solves half the problem though. The ids will be correct, but they still won't be able to migrate the content of
Or the other way:
As far as I understood from Core, the SO migration system doesn't support this kind of lookup during migrations. |
That is being changed in a small upcoming PR to eliminate any questions or issues with any approach we choose and to make it so we don't have to carry baggage of doing saved object resolves at all. Edited the ticket with these words to reflect this now:
|
Ah, got it, thanks for the clarification. We discussed lookups during migration with the core team and decided against it for performance reasons. Not super happy with the ad-hoc migration in there, but the plan with leveraging telemetry sounds good for phasing it out. |
More context on that here: #34996 (comment)
I agree that, assuming I understand everything correctly, this currently sounds like the most viable (and least risky) path forward. |
Following some offline meetings/discussion, we've come up with an option that should satisfy all parties for 7.16. Please review and let me know if you have questions/concerns: The core ideas are to support both versions of actions, to funnel users through our own custom import/export APIs, and to check/migrate rules' actions when the rule is touched (create/import/save). There are a few pieces of work involved with this:
Users can then update their rules ad hoc by touching them, or they can update all rules at once by exporting and reimporting (via our custom import/export APIs). In 8.x, once we have confidence (via telemetry), we can remove support for sidecar actions and remove our own import/export in favor of the SOM. Until then, we will discourage/prevent Detections users from managing rules there. |
We removed the legacy notification system during part 1 here:
#109722
Which included the code for the alerting side car saved object:
You can see this saved object type through the query:
The contents of this "side car" saved object are a near duplicate of the actions that should now live directly on the rules/alerts themselves, so it should no longer exist as a separate saved object:
As you can see, we have two other saved object ids within our object, which are:
These are both saved object ids that reference two different SO types, action and alert, which you can query like so:
GET .kibana/_doc/action:21f1c6e0-09d5-11ec-908c-57a10fa3ee91
And then:
GET .kibana/_doc/alert:fb1046a0-0452-11ec-9b15-d13d79d162f3
Note: These do not have proper reference ids within them. That doesn't prevent us from adding a saved object migration to move them before 7.16.0, or in conjunction with our plans, depending on how we decide to migrate these. However, we have decided to do this migration before 7.16.0 in a small PR to eliminate this question and any issues around the SO ids being regenerated.
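As a rough illustration of what "proper reference ids" could look like, here is a minimal sketch that copies an alert id held in the attributes into the saved object's references array. The { name, type, id } reference shape follows the usual saved object convention; the ruleAlertId attribute name, the alert_0 reference name, and the function name are assumptions made for this example, not the code from the PR.

```typescript
// Hypothetical shapes for illustration; field names are assumptions.
interface SavedObjectReference {
  name: string;
  type: string;
  id: string;
}

interface LegacySidecarDoc {
  id: string;
  type: string;
  attributes: Record<string, unknown>;
  references?: SavedObjectReference[];
}

// Copy the alert id stored in the attributes into `references`.
// The function is idempotent: running it twice adds the reference only once.
function addAlertReference(doc: LegacySidecarDoc): LegacySidecarDoc {
  const references = doc.references ?? [];
  const ruleAlertId = doc.attributes['ruleAlertId']; // assumed attribute name
  if (typeof ruleAlertId !== 'string') {
    return { ...doc, references };
  }
  const alreadyThere = references.some(
    (ref) => ref.type === 'alert' && ref.id === ruleAlertId
  );
  if (alreadyThere) {
    return { ...doc, references };
  }
  return {
    ...doc,
    references: [...references, { name: 'alert_0', type: 'alert', id: ruleAlertId }],
  };
}
```

Because this only looks at a single document, it fits the shape of a regular per-document saved object migration.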
A second note is that we have this duplicated field:
Looking through the history, it seems to have been there for over a year, and both copies are filled out.
Our goal is to migrate all of the siem-detection-engine-rule-actions by:
- Finding every siem-detection-engine-rule-actions saved object and moving its "actions" into the "alert" SO. Note that we have to transform some data, since the "siem-detection-engine-rule-actions" side car has mismatched labels, such as snake_case "action_type_id" where the alert uses camel case.
- Setting notifyWhen to be onThrottleInterval on the alert during migration.
- Setting throttle to be the value of ruleThrottle if available, otherwise alertThrottle; otherwise we can default to 1h to be safe.
- Setting muteAll to false.
- Removing the siem-detection-engine-rule-actions saved objects.
The result is a migration after which existing alerts/rules that had this "side car model" operate with the newer mechanisms, and all of the additional SOs are removed.
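The field mapping described in the goals can be sketched as a small pure function. This is only an illustration under assumed shapes: the side car and alert interfaces below are hypothetical, and only the names action_type_id, notifyWhen, onThrottleInterval, throttle, ruleThrottle, alertThrottle, muteAll, and the 1h default come from the description above.

```typescript
// Hypothetical shapes for illustration; not the actual Kibana types.
interface SidecarAction {
  action_type_id: string; // snake_case in the side car
  group: string;
  id: string;
  params: Record<string, unknown>;
}

interface SidecarAttributes {
  actions: SidecarAction[];
  ruleThrottle?: string | null;
  alertThrottle?: string | null;
}

interface AlertAction {
  actionTypeId: string; // camelCase on the alert
  group: string;
  id: string;
  params: Record<string, unknown>;
}

interface MigratedAlertFields {
  actions: AlertAction[];
  notifyWhen: 'onThrottleInterval';
  throttle: string;
  muteAll: boolean;
}

function sidecarToAlertFields(sidecar: SidecarAttributes): MigratedAlertFields {
  return {
    // Rename the snake_case key; the remaining keys carry over unchanged.
    actions: sidecar.actions.map(({ action_type_id, ...rest }) => ({
      actionTypeId: action_type_id,
      ...rest,
    })),
    notifyWhen: 'onThrottleInterval',
    // Prefer ruleThrottle, then alertThrottle, then the safe default.
    // "Available" is taken here to mean neither null nor undefined.
    throttle: sidecar.ruleThrottle ?? sidecar.alertThrottle ?? '1h',
    muteAll: false,
  };
}
```

Keeping the mapping free of I/O means the same function could be reused by whichever approach ends up loading and saving the objects.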
Approaches:
👎 1. If we do nothing, we are left with leftover saved objects and the user will have to manually check all their notifications and readjust them. Users will have N leftover saved objects, and we run the risk of migration issues down the road. We could optionally just delete these and do no migrations. Another risk is that users never notice they suddenly have no notifications if all we do is state in the release notes that they have to check for notifications that became unset during the upgrade.
🤷 2. We could write a REST interface which uses the alerting client API, put up something such as a banner, and require users to click a button on the banner to migrate these. Or we could try to auto-migrate these. Risks are that the API keys will be changed out for those of the new user, or, if users never visit the page, that they suddenly do not have notifications.
😁 3. We could run a migration process at startup, shortly after the regular Kibana migration, to do all the heavy lifting of manually migrating these with the same API key. Risks are that the code has to be maintained for a while and that the alerting team could change the format of their API keys, etc. If we could have an API on startup to use with this approach that wouldn't change the API key, that would be great and would mitigate this risk further. A draft PR for this is up as a proof of concept. Since this requires a "join", we cannot use the normal Kibana startup migration so far. Discussions are here, here, and here.
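To make the "join" in approach 3 concrete: a per-document migration only ever sees one saved object at a time, whereas moving side car actions onto an alert needs both documents in hand. A minimal in-memory sketch of that pairing step is below; the document shapes, the ruleAlertId field, and the function name are assumptions for illustration and are not taken from the draft PR.

```typescript
// Hypothetical, simplified document shapes for illustration only.
interface SidecarDoc {
  id: string;
  ruleAlertId: string; // assumed field holding the id of the alert it belongs to
}

interface AlertDoc {
  id: string;
}

interface JoinResult {
  matched: Array<{ alert: AlertDoc; sidecar: SidecarDoc }>;
  orphanedSidecarIds: string[];
}

// Pair every side car with the alert it points at.
// Side cars whose alert no longer exists are reported separately.
function joinSidecarsToAlerts(alerts: AlertDoc[], sidecars: SidecarDoc[]): JoinResult {
  const alertsById = new Map<string, AlertDoc>();
  for (const alert of alerts) {
    alertsById.set(alert.id, alert);
  }

  const result: JoinResult = { matched: [], orphanedSidecarIds: [] };
  for (const sidecar of sidecars) {
    const alert = alertsById.get(sidecar.ruleAlertId);
    if (alert) {
      result.matched.push({ alert, sidecar });
    } else {
      result.orphanedSidecarIds.push(sidecar.id);
    }
  }
  return result;
}
```

Each matched pair would then need its alert updated and its side car deleted, which is the part that has to run outside the normal startup migration.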
Conclusions and decisions:
So far, no one has expressed a preference for approach 1. We do not have an explicit "no" on approach 1 from stakeholders/business/product, but several people have voiced concerns over a do-nothing approach.