Support for wildcard fields #60933
Comments
Pinging @elastic/kibana-app (Team:KibanaApp) |
Pinging @elastic/kibana-app-arch (Team:AppArch) |
Thanks for creating this. In general, when you state something like "mostly behaves the same", it would be helpful if you could list the differences, since they might have a high impact on whether and how we can solve the issue. Especially useful are answers to the questions:
But in general every API/behavioral difference to the |
The wildcard field compares to
While @jimczi and @jpountz have thought of this as predominantly a keyword field with wildcard optimisations I think the last feature in this table is important. For large machine-generated content such as:
With values >32k we physically can't use
In these cases, the answer to the usual " |
Thanks for the detailed comparison. This is really helpful. While it looks nearly the same, there is one thing that will make a difference for Kibana:
I'll remove the KibanaApp label from this, since given the list above there is nothing outside App Arch area that would require additions (assuming that we would still mark this as |
This is a long comment, so the "TL;DR" is: I think it's worth Kibana giving wildcard fields some special treatment in log message analytics.

Wildcards in log message analytics

Whenever I'm helping support diagnose elasticsearch cluster failures we have to sift through large log files, and I use elasticsearch+kibana. The log messages can be big - here's the range of logged message sizes from a recent typical case:

These fall beyond what would be useful or possible to map as

Identifying the message type involves copying and pasting parts of the log as a query clause, which is where the problems come in. Let's take this example of using a mouse to select the part of a message about a particular failing node -

With the wildcard field these sorts of selections could be handled simply - the user selection is wrapped with asterisks and it matches in a predictable way, without the searcher or the elastic db admin having to consider tokenisation policies. It does make me wonder how KQL or filter bars may organise these selections (KQL may be clunky if the copy/pasted values contain special chars, and filter pills aren't easily ORed).

I see little or no use for sorting or aggregations on a log message field like this, so I wonder if we should have the option to disable that particular wildcard field behaviour either at the elasticsearch level or the kibana level. Maybe we need to think of "wildcard-on-big-log-messages" and "wildcard-on-shorter-keyword-like fields" as two distinct use cases in Kibana/elasticsearch? |
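As a rough illustration of the "wrap the selection with asterisks" idea in the comment above, here is a minimal Python sketch that turns a copy/pasted selection into an Elasticsearch wildcard query clause. The field name `message` and both helper names are hypothetical, and this is not how Kibana builds its queries; it only shows that the special characters `*`, `?` and `\` in the pasted text need a backslash escape so they match literally.

```python
import json


def escape_wildcard_literal(text: str) -> str:
    """Backslash-escape the characters a wildcard pattern treats specially."""
    escaped = []
    for ch in text:
        if ch in ("\\", "*", "?"):
            escaped.append("\\")
        escaped.append(ch)
    return "".join(escaped)


def contains_query(field: str, selection: str) -> dict:
    """Build a wildcard query clause matching values that contain `selection`."""
    pattern = "*" + escape_wildcard_literal(selection) + "*"
    return {"wildcard": {field: {"value": pattern}}}


if __name__ == "__main__":
    # "message" is an assumed field name, used only for illustration.
    print(json.dumps(contains_query("message", "node-7 failed?"), indent=2))
```

Several selections could be ORed by placing the resulting clauses in the `should` list of a `bool` query, which sidesteps the filter-pill limitation mentioned above at the query level.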
Related - a regex debugger would be very useful: #66735 |
@markharwood can this be closed now that |
I still have a suspicion large wildcard fields shouldn't be included in Kibana's drop-down lists for sorting or aggs along with the "proper" keyword fields. Admins and users alike will be frustrated by the circuit-breaker exceptions these would cause. We know wildcard will be useful on large fields and we removed any "ignore_above" limits for them. I just can't see large fields making sense for sorting or aggs. Not sure how Kibana adds protection for that. |
I was just now discussing how I expect we'll want to use |
@markharwood I'm seeing this as an orthogonal issue that shouldn't be Kibana's concern, but Elasticsearch: If a field shouldn't be aggregated via Kibana, then it shouldn't be reported as aggregatable in |
Good point. I'm not convinced there's nothing left to be thought about in Kibana-land. |
As wildcard fields can't be distinguished from keyword fields from Kibana, I think that this one should be a question for Elasticsearch too? |
That sounds like adding a different field-expansion list for wildcard/regex queries than the existing general-purpose one? As for the aggregatable Y/N question, there's 2 options
With 2) there's questions about how Kibana might pick up a change in elasticsearch field_caps too if we make that dynamic. Maybe that's just a manual index-pattern refresh in Kibana. |
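To make the aggregatable Y/N question concrete, the sketch below filters a field capabilities response down to the fields that every mapped type reports as aggregatable, which is the kind of list a sort or aggregation drop-down would be built from. The sample response is made up for illustration, and `aggregatable_fields` is a hypothetical helper, not Kibana code. It assumes, in line with the comments above, that a wildcard field is reported under the `keyword` type and so cannot be told apart by type alone.

```python
def aggregatable_fields(field_caps_response: dict) -> list:
    """Return names of fields whose every reported type is aggregatable."""
    names = []
    for name, caps_by_type in field_caps_response.get("fields", {}).items():
        # A field can be reported under several types when index mappings differ;
        # only offer it for aggregations if all of them allow it.
        if caps_by_type and all(
            caps.get("aggregatable", False) for caps in caps_by_type.values()
        ):
            names.append(name)
    return sorted(names)


# Hand-written sample in the shape of a field capabilities response (illustrative only).
SAMPLE_RESPONSE = {
    "indices": ["logs-1"],
    "fields": {
        "host.name": {
            "keyword": {"type": "keyword", "searchable": True, "aggregatable": True}
        },
        "message": {
            "keyword": {"type": "keyword", "searchable": True, "aggregatable": False}
        },
    },
}

if __name__ == "__main__":
    print(aggregatable_fields(SAMPLE_RESPONSE))
```

If Elasticsearch changed what it reports for a field, a client holding a cached copy of this response would only see the change after fetching it again, which corresponds to the manual index-pattern refresh mentioned above.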
No I don't. For the record, it might also be ok to not do anything and rely on circuit breakers to abort aggs on stack traces. |
I think that was Jim's working assumption - the question is whether users and admins are going to be happy with that. |
@mattkime @petrklapka is this closed by mistake or actually confirmed to be working? |
@rayafratkina Thanks for bringing this to my attention as I should leave some notes -
For more refined handling of these fields we'll need a method of identifying them as their true type - #120284 |
Elasticsearch has a new wildcard field that mostly behaves as a keyword field but runs wildcard queries more efficiently.

Relates to elastic/elasticsearch#53175 and #35481.
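For readers unfamiliar with the field type, here is a minimal sketch of an index body that maps a log message field as `wildcard`. The field name `message` and the helper name are assumptions made for illustration; the body would be sent as the request body when creating an index.

```python
import json


def log_index_body(message_field: str = "message") -> dict:
    """Build an index-creation body mapping one field as the wildcard type."""
    return {
        "mappings": {
            "properties": {
                # Assumed field name; "wildcard" is the field type this issue is about.
                message_field: {"type": "wildcard"}
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(log_index_body(), indent=2))
```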