-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: unused union type attributes can use a lot of memory #111964
Comments
Ideas that we can try:
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Unfortunately we cannot easily compute the error message afterwards, as we don't preserve the Probably the best solution is the first one you proposed, ie. reduce the size of the error message. Stripping the message completely could be an option as well; the data node does not need it. Anyway, for debug reasons and for consistency, I'd prefer to keep the deserialized versions as similar as possible to the original. |
When dealing with index patterns, eg. `FROM logs-*`, some fields can have the same name but different types in different indices. In this case we build an error message like ``` Cannot use field [multi_typed] due to ambiguities being mapped as [2] incompatible types: [ip] in [test1, test2], [keyword] in [test3]" ``` With this PR, in case of many indices involved, we avoid listing them all, but we only list three of them and provide information about how many other indices are affected, eg. ``` Cannot use field [multi_typed] due to ambiguities being mapped as [2] incompatible types: [ip] in [test1, test2, test3] and [2] other indices, [keyword] in [test6] ``` (see the `and [2] other indices`) Since these error messages are stored in `UnspportedAttributes` and serialized, this PR reduces significantly the size of a serialized execution plan with many type conflicts. Fixes elastic#111964 Related to elastic#111358
When dealing with index patterns, eg. `FROM logs-*`, some fields can have the same name but different types in different indices. In this case we build an error message like ``` Cannot use field [multi_typed] due to ambiguities being mapped as [2] incompatible types: [ip] in [test1, test2], [keyword] in [test3]" ``` With this PR, in case of many indices involved, we avoid listing them all, but we only list three of them and provide information about how many other indices are affected, eg. ``` Cannot use field [multi_typed] due to ambiguities being mapped as [2] incompatible types: [ip] in [test1, test2, test3] and [2] other indices, [keyword] in [test6] ``` (see the `and [2] other indices`) Since these error messages are stored in `UnspportedAttributes` and serialized, this PR reduces significantly the size of a serialized execution plan with many type conflicts. Fixes elastic#111964 Related to elastic#111358
When dealing with index patterns, eg. `FROM logs-*`, some fields can have the same name but different types in different indices. In this case we build an error message like ``` Cannot use field [multi_typed] due to ambiguities being mapped as [2] incompatible types: [ip] in [test1, test2], [keyword] in [test3]" ``` With this PR, in case of many indices involved, we avoid listing them all, but we only list three of them and provide information about how many other indices are affected, eg. ``` Cannot use field [multi_typed] due to ambiguities being mapped as [2] incompatible types: [ip] in [test1, test2, test3] and [2] other indices, [keyword] in [test6] ``` (see the `and [2] other indices`) Since these error messages are stored in `UnspportedAttributes` and serialized, this PR reduces significantly the size of a serialized execution plan with many type conflicts. Fixes elastic#111964 Related to elastic#111358
…12819) * ES|QL: shorten error messages for UnsupportedAttributes (#111973) When dealing with index patterns, eg. `FROM logs-*`, some fields can have the same name but different types in different indices. In this case we build an error message like ``` Cannot use field [multi_typed] due to ambiguities being mapped as [2] incompatible types: [ip] in [test1, test2], [keyword] in [test3]" ``` With this PR, in case of many indices involved, we avoid listing them all, but we only list three of them and provide information about how many other indices are affected, eg. ``` Cannot use field [multi_typed] due to ambiguities being mapped as [2] incompatible types: [ip] in [test1, test2, test3] and [2] other indices, [keyword] in [test6] ``` (see the `and [2] other indices`) Since these error messages are stored in `UnspportedAttributes` and serialized, this PR reduces significantly the size of a serialized execution plan with many type conflicts. Fixes #111964 Related to #111358 * Spotless --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Relates #107330
In simple queries like
FROM logs-* | LIMIT 10
, many indices and fields can be involved. In particular, fields mapped to different types in different indices are just passed through, even though we do nothing with them (their values are alwaysnull
).Multi-typed fields internally create attributes which contain an error message string with the different types they're mapped to in different indices, e.g.
line 1:32: Cannot use field [some_field] due to ambiguities being mapped as [2] incompatible types: [double] in [logs-1, logs-2, logs-3], [float] in [logs-4]
. (More specifically: When multi-typed fields are passed through, they get internally represented asUnsupportedAttribute
s, whose error message is constructed fromInvalidMappedField
s here.)These messages can get long; additionally, they get serialized as part of their respective attributes and are being sent to data nodes, although we likely do not need the error string in the data nodes.
Let's reduce that overhead during planning and when serializing the plan.
The text was updated successfully, but these errors were encountered: