Make alert params searchable #50213
Comments
Pinging @elastic/kibana-stack-services (Team:Stack Services) |
Relates to "Make rules sortable, filterable, and aggregatable" in #50222. |
This would be good. I've been discussing with users how they could report on their rules, e.g. in a dashboard or Canvas, which would need to aggregate on alert parameters. You can just about do something with scripted fields at the moment, but it's not ideal. |
Also see this issue: Request for alerting internal tags structure. The gist is to add a new "tags" property to alerts, but would be separate from the existing Would having this be good enough to solve the requirements? I think it depends on how/what you want to search on. I don't really want to go down the path of adding mappings for alertType-specific data, seems like the migration problem would be ... messy. And not sure what the other options are. |
For some other efforts outside of alerting, where we use both external and internal tags within a saved object structure, we decided on using a leading underscore to designate that the internal tags should remain internal: tags and _tags, fwiw. That Saved Object has nothing to do with alerting, but I figured I should mention it. |
I've only just begun looking into this, but I want to summarise what we currently know, as it doesn't look like this will be easy to support and I want to make sure we have a good understanding of the context.
Current State
Before we talk about the need, let's describe what we currently have.
The fields we want to make searchable as part of this issue are
On the face of it, querying by these internal objects is straightforward, but as this Saved Object needs to support all types of alerts and actions, we have a challenge: each alert type can give these fields a different shape. To support these multiple shapes we tell Elasticsearch not to create a mapping for these objects, which means that querying by their shape isn't actually possible.
The Need
That said, we'd like to be able to query for specific Alerts based on values that are stored in these fields, so that we can:
Possible Solutions
So, the reason we can't currently support querying against these fields is clear, but there are a few approaches we could take to make these requirements possible. None of them are straightforward, so some discussion is needed to understand the cost-value ratio. One assumption that I'm making for all of these is that we want to rely on ES for this and not do any of these operations in memory, as that would be inefficient and make pagination hard to support.
Create a Saved Object type for each AlertType
This approach would mean that whenever a new AlertType is created we generate a brand new type of SavedObject with its own mapping. There are a few challenges with this approach:
There's also a clear limitation to this approach: @mikecote has already told me that there is a danger here of a mapping explosion, which I need to investigate further.
Enable dynamic mapping + create a deep object
Instead of splitting the SavedObject types between the AlertTypes, we can take an approach similar to what SavedObjects itself does. For example, this would mean that given an alert of type "example.always-firing" with an action of type ".index" you would store the data like so:
Instead of what we currently do which is this:
You may note how we have the addition of the "example.always-firing" and ".index" keys as appropriate under the
But this too has challenges:
Static mapping + flattened object
Another option is to standardise the shape of params across all AlertTypes, such that each AlertType will specify the exact shape and types of its params, and we'll merge these shapes together into one static shape which will be used to define the mappings of these
This is the simplest solution in terms of the mapping, but it introduces a whole set of challenges in the framework:
Challenges across the board
All of the above options will require changes in the SavedObjectsClient, as you can only sort/filter by a root field at the moment, and supporting "deep" fields inside of "objects" isn't currently possible.
Next steps
As you can see, none of these options are straightforward and there isn't a clear winner. This all requires a lot more investigation. Having played around with the code locally, I think that the second option (Enable dynamic mapping + create a deep object) is the most maintainable, but it has potential issues that still need investigation, which is what I'll likely be looking into next. If anyone has thoughts or concerns on these options (or perhaps a 4th option we can investigate) I'm all ears. :) |
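The "deep object" option above comes down to wrapping each alert's params under its alert type id before saving and unwrapping them after reading. The following is a minimal sketch of that idea only. The id "example.always-firing" comes from the comment above; the StoredAlert shape, the function names and the threshold param are hypothetical and are not taken from the Kibana code base.

```typescript
// Hypothetical shapes: only the id "example.always-firing" comes from the
// discussion; every other name here is illustrative.
type Params = Record<string, unknown>;

interface StoredAlert {
  alertTypeId: string;
  // Params nested under the alert type id, so each type gets its own
  // sub-tree in the mapping and field names cannot collide across types.
  params: Record<string, Params>;
}

// Applied before saving: wrap the params under the alert type id.
function nestParams(alertTypeId: string, params: Params): StoredAlert {
  return { alertTypeId, params: { [alertTypeId]: params } };
}

// Applied after reading: hand the flat params back to the caller.
function unnestParams(stored: StoredAlert): Params {
  const own = stored.params[stored.alertTypeId];
  return own === undefined ? {} : own;
}

const stored = nestParams("example.always-firing", { threshold: 5 });
const roundTripped = unnestParams(stored);
```

Callers keep seeing flat params, while the stored document carries one sub-tree per alert type. That extra pre-save and post-read step is part of the maintenance cost weighed in this thread.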
I'm worried about any solution that introduces new mappings for alertType-specific data, due to all the challenges pointed out ^^^. It's not completely clear that we need an ES solution here; from the SIEM issue #50222:
I read "table scan" as "do as much of a query as you can with ES, then do the remaining filtering/mapping/aggs on the results in JS". That will work if the number of alerts is "reasonable", which I don't know if it is. I think we should also find out if having an "internal tags" via #58417 would be good enough for now. This would presumably be a parallel of the current tags structure, but not editable (or probably viewable) in the UI, only programmatically. Not nearly the same as ES fields, but may be useful enough for common needs. There was also mention of scripted fields in the discussion above, but I'm not sure how we might use them. |
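As a rough illustration of the "query what you can in ES, then finish in JS" reading of a table scan, the sketch below filters and pages alerts in memory. The Alert shape, the filterByParam name and the severity param are assumptions made for this example. The array stands in for the results of an Elasticsearch query on mapped fields.

```typescript
// Hypothetical alert shape; `params` is opaque to Elasticsearch today.
interface Alert {
  id: string;
  alertTypeId: string;
  params: Record<string, unknown>;
}

// Step 1 would be an Elasticsearch query on mapped fields (for example
// alertTypeId). The `alerts` argument stands in for those results.
// Step 2, below, filters on params in memory and pages the survivors.
function filterByParam(
  alerts: Alert[],
  key: string,
  value: unknown,
  page: number,
  perPage: number
): { total: number; data: Alert[] } {
  const matches = alerts.filter((alert) => alert.params[key] === value);
  const start = (page - 1) * perPage;
  return { total: matches.length, data: matches.slice(start, start + perPage) };
}

const fetched: Alert[] = [
  { id: "1", alertTypeId: "example.always-firing", params: { severity: "high" } },
  { id: "2", alertTypeId: "example.always-firing", params: { severity: "low" } },
  { id: "3", alertTypeId: "example.always-firing", params: { severity: "high" } },
];
const firstPage = filterByParam(fetched, "severity", "high", 1, 1);
```

Every page request has to read all candidate alerts again before slicing, which is why this only works while the number of alerts stays "reasonable".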
@FrankHassanabad Would being able to query/filter/sort by a set of internal tags be enough for you? |
FYI, scripted fields seem to work, and may at least be better than doing it
in JS.
On Wed, May 13, 2020 at 1:49 PM Gidi Meir Morris wrote:
@FrankHassanabad Would being able to query/filter/sort by a set of internal tags be enough for you?
It looks like making the params/config actually searchable would be a significant piece of work that we'd need to be cautious about before picking up.
Matthew Adams
|
Sorry but could you be more specific? |
I lifted that term from relational DBs, where the definition is:
[1] Ref: https://en.wikipedia.org/wiki/Full_table_scan
I don't know if there is an ES-adapted term, but basically any time I have to read all of the Saved Objects into memory, either through buffering or streaming, I count that as a table scan, and since it's done through network calls from ES -> Kibana it adds to the expense.
Yeah that's our ultimate goal. We are hoping to avoid any operation that causes us to iterate over all the alerts in memory from ES -> Kibana due to:
Can we get aggs as well with arrays in Elasticsearch? The use case is that sometimes we need to do unique counts of items and display them. If this PR is merged we might have it? If we were able to query/filter/sort/aggs, I think that covers all of our use cases for avoiding table-like scans. As an aside: if you are still going to allow mapping, I would suggest combining two parts of your approaches above. This part:
and this part:
So that it becomes this:
Then you have exact mappings and avoid mapping explosions and conflicts, and the Saved Objects migration system takes over when we update our mappings. Then we are responsible as a team for migrating our mappings altogether.
For what it's worth also ... On the community forums and community Slack, users are opening up their
I know that might not be exactly what we may want this quickly, but it is something we have to keep in mind: users are granting each other privileges to their saved objects index so they can write their own dashboards against static SIEM rules. Since things are "beta" I think they would be ok with updates, but they might become frustrated if their dashboards are no longer possible. On the flip side, if this brings features they couldn't have before, such as sorting/filtering/querying/aggs, then they will be delighted even if we have to advise them on how to update their existing dashboards. Users opening up saved object indexes might be unsupported or discouraged, but I want to point it out as it's already happened. |
If, for example, you create a filtered alias to just pick the rules from .kibana, then add an index pattern in Kibana with this scripted field using Painless: def mitre = []; You can then aggregate on the tactic ID for reporting. |
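The Painless script above is truncated in this thread, so the following is not a reconstruction of it. It is a separate sketch, in TypeScript, of the kind of per-tactic count such a report is after. It assumes rule params carry a threat array whose entries each have a tactic.id; that shape and the function name are assumptions for illustration.

```typescript
// Assumed shape: rule params carry a `threat` array whose entries each
// reference a tactic by id. This is an assumption for illustration.
interface ThreatEntry {
  tactic: { id: string };
}

interface Rule {
  params: { threat?: ThreatEntry[] };
}

// Counts how many rules reference each tactic id.
function countByTacticId(rules: Rule[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const rule of rules) {
    const threat = rule.params.threat === undefined ? [] : rule.params.threat;
    const seen: string[] = [];
    for (const entry of threat) {
      const id = entry.tactic.id;
      // Count each tactic once per rule, even if the rule lists it twice.
      if (seen.indexOf(id) !== -1) continue;
      seen.push(id);
      const previous = counts[id];
      counts[id] = (previous === undefined ? 0 : previous) + 1;
    }
  }
  return counts;
}

const tacticCounts = countByTacticId([
  { params: { threat: [{ tactic: { id: "TA0001" } }, { tactic: { id: "TA0002" } }] } },
  { params: { threat: [{ tactic: { id: "TA0001" } }] } },
  { params: {} },
]);
```

Doing this in application code needs every rule in memory, which is the cost that the scripted field and the mapping proposals in this thread try to avoid.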
Are sorting and aggregations absolutely necessary? If not, elastic/elasticsearch#33003 is a potential solution worth evaluating. |
New possible solution mentioned in option 4 of #67290. That issue explores options to solve Elasticsearch merging objects on update when the mapping has enabled: false. This may not be a good approach but worth mentioning. Probably doesn't solve sorting or aggregations which would make elastic/elasticsearch#33003 more a solution worth evaluating as @kobelb mentioned.
The step further that would be required to make the values searchable would be to split the values into different mapped fields. This would require orchestration between the field name and the value field. |
I've also worked on a POC that splits the Here are the good things I discovered during the POC:
Here are the bad things I discovered during the POC:
I want to explore creating a separate saved object for the params next and see how that goes. There are still other options outside of these, but it is good to know each's pros/cons to make sure we're making the right investment for the long term. |
I took a quick look at this approach. It would rely on using the join field type, and has_child queries. Since the requirements are to use the same field across both saved object types, it doesn't seem like a good approach since saved object attributes are stored in different fields in Elasticsearch by design. There may be alternatives in this approach, but they don't seem worth investigating to me at this time. |
After looking at all the options that are on the table, I want to propose going with the "deep object with static mappings" approach mentioned here.
Usage
This approach checks the boxes for filtering, searching, and sorting by doing something like the following on my POC:
Addressing the concerns
I have raised a few downsides, but most of them are not an issue if I think long-term. Below is how I see each being resolved:
This would be a good time to split the data access logic out of the alerts client and create a separate file. This is where the pre-save and post-read logic can live, and it would also keep the alerts client smaller as more code moves over.
However we decide to map these parameters, we will have to distinguish them by alert type id. Hence, I think long term, we may want to prevent using
This approach is not as bad as the saved object type per alert type as only the parameters are added to the mappings and only the alert types that opt-in for now. In the future, and maybe as we rename
This problem is short-term until we would make all the alert types require mappings for their parameters. I wouldn't make a design decision based on this. If we allow alert types to define migrations (#50216), a simple no-op migration would move their params from the default
This problem will exist for any approach that requires mapping. There is a high likelihood that errors here will happen, and we should work with the core team to see what we can do to ensure Kibana can still start when this migration fails. Alert types that use
This problem is short-term as well. I think long term, we should force all alert types to define their mappings. Here as well, I wouldn't make a design decision based on this.
Here we could do something like the saved objects API does where it requires
I have something worked out with the core team that would solve this problem and seems in line with what other teams ask for. This approach will also work for us to support migrations per alert type.
Next steps
I will set up a design meeting to ensure that @elastic/kibana-alerting-services stands by this proposal before starting implementation. Afterwards, I will circle back with the Security and Observability teams who requested this to confirm the approach solves what they're asking for. |
I spoke with @romseygeek from the ES Search team and we came to the same conclusion as my proposal: having a type associated with each param, so we can build mappings, is the best way forward for us. 👍 |
After a design discussion and further iterations on a POC, @ymao1 and I have something ready for approval. The underlying fundamentals have changed. If you're interested, feel free to read on. Otherwise, @spong @sqren, we'll be in touch soon to get your 👍 before making a PR to master. @elastic/kibana-alerting-services it was hard to find time to do a follow-up session, so I'm opting for async to keep momentum on this issue. The details are below. TL;DR
Approach
The approach we have settled on is
There are some decent upsides to going with this approach:
Comparatively, there are not as many downsides to going with this approach:
Answering the earlier concerns
What do we do when two AlertTypes use the same name for a field?
How do we handle migrations within a specific AlertType? And what about across all types?
What happens if the shape is wrong? Do we validate ourselves? Rely on plugins?
If this means dynamically merging mappings and types on the "way into the framework" and then exposing it as "portions" of this type on the "way out of the framework" back to the solution - are we introducing a lot of complexity that will be hard to maintain in the future?
Future thinking
In the future, it seems best to create an Elasticsearch index per alert type. This will allow cross-index queries, sorting, and filtering properly while letting Elasticsearch handle issues when mappings differ between indices at search time. The queries would be done the same way as the approach indicated above, and it feels like the approach we should take when writing alerts as data (index per alert type). This approach brings an implementation similar to where we want to be while also addressing the problem for alerts as data. |
I had a quick chat with @kobelb about the approach. One protection worth investigating is for the case where an alert type doesn't use searchable params but ends up being searchable because another alert type with similar fields defined the mappings. It could be worth adding some protections to the _find API. |
Are we catching this during a build step (CI) or is there a risk that these issues will go undiscovered until end users notify us?
+1 |
@sqren it would be done during a build step to notify ASAP when something is wrong. |
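One way such a build-time check could look is sketched below. It assumes each alert type declares its searchable params as a list of field names and types; the FieldType values, the interfaces and the mergeParamMappings name are hypothetical. The merge fails as soon as two alert types declare the same field with different types, so a conflict would surface in CI rather than at an end user's site.

```typescript
// Hypothetical declarations: each alert type lists the params it wants
// mapped. None of these names come from the Kibana code base.
type FieldType = "keyword" | "text" | "long" | "double" | "boolean" | "date";

interface ParamField {
  name: string;
  type: FieldType;
}

interface AlertTypeMappings {
  alertTypeId: string;
  fields: ParamField[];
}

// Merges all declared param mappings into one shape and throws when two
// alert types declare the same field name with different types.
function mergeParamMappings(all: AlertTypeMappings[]): Record<string, FieldType> {
  const merged: Record<string, FieldType> = {};
  const owner: Record<string, string> = {};
  for (const { alertTypeId, fields } of all) {
    for (const { name, type } of fields) {
      const existing = merged[name];
      if (existing !== undefined && existing !== type) {
        throw new Error(
          `Field "${name}" is "${existing}" in ${owner[name]} but "${type}" in ${alertTypeId}`
        );
      }
      merged[name] = type;
      if (owner[name] === undefined) owner[name] = alertTypeId;
    }
  }
  return merged;
}

const mergedMappings = mergeParamMappings([
  { alertTypeId: "type.a", fields: [{ name: "threshold", type: "long" }] },
  {
    alertTypeId: "type.b",
    fields: [
      { name: "threshold", type: "long" },
      { name: "query", type: "text" },
    ],
  },
]);
```

Two alert types that agree on a field's type share one mapping entry, which is also how an alert type could end up searchable through another type's declaration, as noted above.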
We have discovered a blocker going with the approach stated above where
We have made many attempts at solving this problem. All the options turned out to be poor choices for the team to implement and support in the long run. We have decided to abort this issue and revisit it if saved object types can ever be created in their own Elasticsearch index (see #70471 (comment)).
In the meantime, we will try to make the alert parameters filterable by using Elasticsearch’s “flattened” type (see #92010). We opened an issue (see #92011) to explore supporting free-text searching on alert information (metadata). We will prioritize this issue once we have some requests. For numbers, we won’t be able to do something at this time. Potentially elastic/elasticsearch#61550 could solve the problem.
Solutions that cannot wait will have to create their own sidecar objects with alerts and do filtering, sorting and searching within those instead.
The lessons learned here apply to the upcoming alert instance as data story to denormalize alert parameters and make them appropriately indexed in Elasticsearch. |
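The "flattened" field type indexes every leaf value of an object as a keyword. That makes exact-match filtering possible without per-alert-type mappings, and it is also why numbers are left out above. The sketch below simulates that behaviour in plain TypeScript and does not call Elasticsearch. The flattenLeaves helper and the sample params (severity, risk_score, threat) are assumptions for illustration.

```typescript
// Simulates how a "flattened" field treats an object: every leaf value is
// kept as a keyword string under its dotted path.
function flattenLeaves(
  value: unknown,
  prefix: string,
  out: Record<string, string[]>
): Record<string, string[]> {
  if (Array.isArray(value)) {
    for (const item of value) flattenLeaves(item, prefix, out);
  } else if (typeof value === "object" && value !== null) {
    const obj = value as Record<string, unknown>;
    for (const key of Object.keys(obj)) {
      flattenLeaves(obj[key], prefix === "" ? key : `${prefix}.${key}`, out);
    }
  } else if (value !== null && value !== undefined) {
    const list = out[prefix];
    if (list === undefined) out[prefix] = [String(value)];
    else list.push(String(value));
  }
  return out;
}

// Hypothetical alert params.
const leaves = flattenLeaves(
  { severity: "high", risk_score: 9, threat: [{ tactic: { id: "TA0001" } }] },
  "",
  {}
);

// Exact-match filtering works on any leaf path, e.g. "threat.tactic.id".
// Numbers are kept as strings, so ordering is lexicographic:
const sortedScores = ["9", "10", "100"].sort(); // ["10", "100", "9"]
```

Because "10" sorts before "9" as a string, range filters and sorting on numeric params do not behave numerically with this type.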
Fixed in #92036 |
I've decided to reopen this issue, as I know this is still a high priority request. I feel like we should still keep this issue open as a sort of open and unsolved problem statement. |
Yes, we absolutely need this based on what I know about our product backlog. Great news! |
From #50222:
The main focus of this issue is making alert params (more than action config) searchable, sortable and filterable. If there's extra work necessary to support this in actions, we can create a follow-up issue.