Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discuss] [Security Solution] [Alerting] HTTP route RFC for unified rule management #95060

Open
dhurley14 opened this issue Mar 22, 2021 · 10 comments
Assignees
Labels
dependencies Pull requests that update a dependency file discuss Feature:Alerting Feature:Detection Rules Security Solution rules and Detection Engine Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) technical debt Improvement of the software architecture and operational architecture Theme: rac label obsolete v8.0.0

Comments

@dhurley14
Copy link
Contributor

dhurley14 commented Mar 22, 2021

Introduction

RFC to define the needs of the security solution for a unified rule management API. Not sure if something like this exists elsewhere but figured I'd throw this together.

Security routes to be added to alerting framework

We make use of bulk actions #53144 such as bulk create, bulk update (bulk enable essentially), bulk delete.. so we need some new routes to handle these operations. These routes also require creating additional apis on the alerts client to enable bulk operations on the saved objects which currently (as of 7.12 I believe) have all of these operations available except for bulk delete.

We also need import export but it looks like @ymao1 has been working away at that which is great :) #94151 and #94152

Privileges and index bootstrap routes are used for establishing the .siem-signals index which will be the unified alerting index. I think Garrett is handling this though but if anyone has additional comments those would be welcome.

Prepackaged rules route - Not sure how this will look in 7.13 after this lands

Find rules status route is undocumented but maybe should be documented? Also do we want this in the security solution routes or in alerting? My feeling is it belongs in the security solution but I could see utility in having it be available for other solutions too. This could also exist in the get framework health route from alerting.

We also need a route for updating tags but I'm not sure if tags are security-specific so this route could remain in the security solution.

Lastly, new routes for adding actions and enabling / disabling our pre-built rules (tagged with __immutable__) Also add validation to prevent updating immutable rules too.

Route validation

Currently alerting is utilizing kbn config for route validation. If possible, it would be great if the rule management api and the solutions utilized the same validation library. My suggestion would be to introduce io-ts in the newer apis like import / export, bulk etc. and then gradually update the other routes to utilize it as well.

Open questions

We currently merge the rule status saved object with the rule data in our "find_rules" route. Hopefully this status saved object will be removed in 7.13 and be replaced with events in the event log plugin. Regardless, the _find route returns the rule with some current status information merged into the response. Either we copy this functionality over into the _find route of alerting or we tell the UI to query the rule health / rule status api. The easier option is to have the same functionality as currently exists in the security solution route.

Related issues from alerting: #90375, #90381, #90379

@dhurley14 dhurley14 added the Team:Detections and Resp Security Detection Response Team label Mar 22, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@ymao1
Copy link
Contributor

ymao1 commented Mar 22, 2021

The initial implementation of import/export for the alerting framework will be done via the saved object management UI, so I'm not sure it will exactly meet your needs in this iteration. We are addressing the need for importing/exporting rules and connectors in order to support the deprecation of legacy multi-tenancy, so the expected workflow will be that a user will use the saved object management UI or the saved object import/export routes/APIs. In the next step, we would expect to provide routes/APIs in the alerting framework for import/export that provide an RBAC layer on top of the saved objects API but there is no issue open for that right now.

@pmuellr pmuellr added the Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) label Mar 22, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@banderror
Copy link
Contributor

Just for reference, these are our current endpoints and route handlers:

I think one thing to note is that our routes are not simple wrappers around AlertsClient and data specific to only alerting. They contain pieces of code specific to the Detections domain, for example:

So the big question imho is how are we going to separate responsibilities and design the final framework (whatever it is). Is it going to be only Alerting, or Alerting + ETL on top, or Alerting + unified Detection Engine, or... ? That said, if/when this is more clarified, would be definitely good to consolidate HTTP APIs for rules, but at this point it seems like too early to make any decisions on routes.

Maybe it would be worth going through all our routes and figure out all the generic (alerting) and specific (detections) parts.

Another idea is let's start with adding some missing APIs to alerting (methods to AlertsClient etc) that we've been needing for a long time. This is a meta ticket.

I think bulk CRUD methods would be a good thing to add because of 2 reasons:

  • There's a lot of bulk CRUD code in our API routes, and sometimes it's implemented like await Promise.all() or something like that which doesn't scale with the number of items to process, and we already had issues with that. We would benefit from extracting this mechanics to Alerting and implementing it properly.
  • We will need bulk CRUD stuff after my efforts with Event Log are done -- in order to optimize our rules/_find and rules/_find_statuses endpoints.

Lastly, I was thinking about leveraging data plugin for creating an RPC infrastructure -- just like they do it in Threat Hunting. They can add a server-side function and call it from the client app -- without introducing new HTTP endpoints. HTTP endpoints as they are now are public (I'd say even "published"), meaning that when we add a new endpoint we agree on maintaining it and avoiding breaking changes untill the next major. This is hard, and would be good to have an option to bypass this in those situations where an app needs a new app-specific "endpoint", but it's not generic/useful enough to publish it for all users of Kibana. This is not super related to this RFC, but I'd still like to mention this idea here.

@ymao1 ymao1 self-assigned this Mar 25, 2021
@ymao1
Copy link
Contributor

ymao1 commented Mar 29, 2021

Addressing the immediate ask of bulk rule actions:

  • Bulk create/delete - Currently single create and delete functions will throw an error if any step of the create/delete process fails. Assume the bulk version of these functions should not throw an error if any one operation fails, rather it should return an array of responses, either success or errors with reason?
  • Bulk get - This was not mentioned in the issue description - is that because it's not needed or because it's a lower priority?
  • Bulk update - This was only mentioned in the context of bulk enabling/disabling in the issue description. Is this because we are considering rules to be immutable and therefore will never be updating them? Is the ask then that there's a bulkEnable and bulkDisable function on the alerts client? Note that currently, enabling/disabling a rule updates the updatedBy and updatedAt fields for a rule. Does that violate rule immutability?

Agree with @banderror that there is a lot of security specific functionality in the security routes outside of purely calling the alerts client so providing these bulk actions will not immediately remove the need for the security routes since there are additional. Out of the additional actions that are performed before and after calling the alerts client, do we know which ones might be needed by observability as well? That might help guide the alerting framework as to what might belong in the alert client.

@ymao1
Copy link
Contributor

ymao1 commented Mar 29, 2021

Find rules status route is undocumented but maybe should be documented? Also do we want this in the security solution routes or in alerting? My feeling is it belongs in the security solution but I could see utility in having it be available for other solutions too. This could also exist in the get framework health route from alerting.

Are you suggesting that the rules with error status should be returned as part of the /api/alerting/_health endpoint in addition to whether there are any rules in an error state?

Lastly, new routes for adding actions

Is this route requested because security creates a separate rule for notifications instead of including the actions as part of rule creation? What would be the requirements of this new route?

@gmmorris
Copy link
Contributor

Find rules status route is undocumented but maybe should be documented?

I believe this omission is intentional - this is an API we consider internal, open to change and hence not on the public docs.
While we do want other teams in Elastic to use it, as we can work closely with them when a change is needed, we don't want this on external docs.
@mikecote can confirm whether I'm right about that. 😬

The Platform leads are in the process of formalising clearer guidance around the distinction between internal/external apis and what manner of documentation we should provide for each. Hopefully this'll be clearer once we finish that work 😄 . (cc @kobelb )

@ymao1
Copy link
Contributor

ymao1 commented Mar 31, 2021

Find rules status route is undocumented but maybe should be documented?

I believe this omission is intentional - this is an API we consider internal, open to change and hence not on the public docs.

I thought this was in reference to the find rule status route inside security solutions and whether this should be lifted up into the alerting framework?

@gmmorris
Copy link
Contributor

Find rules status route is undocumented but maybe should be documented?

I believe this omission is intentional - this is an API we consider internal, open to change and hence not on the public docs.

I thought this was in reference to the find rule status route inside security solutions and whether this should be lifted up into the alerting framework?

Oh, sorry, probably my bad. 😬
Ignore me 😅

@ymao1 ymao1 removed their assignment Apr 1, 2021
@dontcallmesherryli dontcallmesherryli added the dependencies Pull requests that update a dependency file label Apr 1, 2021
@dontcallmesherryli dontcallmesherryli added Theme: rac label obsolete Feature:Detection Rules Security Solution rules and Detection Engine labels Apr 14, 2021
@dontcallmesherryli
Copy link

@oatkiller assigning this to you to start grooming/planning for this ticket.

@peluja1012 peluja1012 added the Team:Detection Rule Management Security Detection Rule Management Team label Sep 15, 2021
@peluja1012 peluja1012 added the technical debt Improvement of the software architecture and operational architecture label Oct 26, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file discuss Feature:Alerting Feature:Detection Rules Security Solution rules and Detection Engine Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) technical debt Improvement of the software architecture and operational architecture Theme: rac label obsolete v8.0.0
Projects
No open projects
Development

No branches or pull requests