[Lens] Expose shard size for top values (terms) aggregations #117909

Closed
Tracked by #57708
ghudgins opened this issue Nov 8, 2021 · 7 comments · Fixed by #129220
Labels
enhancement New value added to drive a business result Feature:Lens impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@ghudgins
Contributor

ghudgins commented Nov 8, 2021

Describe the feature:

  1. Expose the Elasticsearch "shard_size" setting as an advanced option in Lens' "Top values" panel.
  2. Once done, update the message from "[Lens] Expose Elasticsearch accuracy warnings to the user" (#94918) to:

Top values for this visualization may be approximate due to how the data is indexed. Try increasing the number of Top values, set {Index Shard Size to "Increased Precision" } or use Filters instead of Top values for precise results. To learn more about this limit, visit the documentation.

Design

My ideas on the day I made this: name this visualization-level advanced setting "Index Shard Size" and offer quick options like "Performance Balanced (Default)" and "Increased Precision (Slower reports)". Should this allow custom values for advanced users?

Describe a specific use case for the feature:
When my data is high cardinality
And my data is not evenly distributed in a way that matches my shard / ILM policy (i.e. my data fluctuates dramatically over time)
I need to increase my shard size at the expense of performance
So I can resolve inaccuracy warnings
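
To illustrate why this use case produces approximate results, here is a minimal TypeScript sketch of how a terms aggregation merges per-shard results. It is a toy simulation, not Elasticsearch code: `topN`, `simulateTerms`, and the sample counts are all hypothetical. It assumes each shard returns only its own top `shardSize` terms and that the coordinating step sums whatever it receives.

```typescript
// Toy model: per-term document counts held by one shard.
type ShardCounts = Record<string, number>;

// Top n terms by count (descending), ties broken by term (ascending).
function topN(counts: ShardCounts, n: number): Array<[string, number]> {
  return Object.keys(counts)
    .map((term) => [term, counts[term]] as [string, number])
    .sort((x, y) => y[1] - x[1] || x[0].localeCompare(y[0]))
    .slice(0, n);
}

// Each shard contributes only its top `shardSize` terms; the merged
// counts are then cut down to the final top `size`.
function simulateTerms(
  shards: ShardCounts[],
  size: number,
  shardSize: number
): Array<[string, number]> {
  const merged: ShardCounts = {};
  for (const shard of shards) {
    for (const [term, count] of topN(shard, shardSize)) {
      merged[term] = (merged[term] || 0) + count;
    }
  }
  return topN(merged, size);
}

// Hypothetical, unevenly distributed data: "b" is the true top value
// (18 documents overall) but is never first on any single shard.
const sampleShards: ShardCounts[] = [
  { a: 10, b: 9, c: 1 },
  { c: 10, b: 9, a: 1 },
];

const approximate = simulateTerms(sampleShards, 1, 1);
const precise = simulateTerms(sampleShards, 1, 2);
```

With this sample data, a shard size of 1 reports `a` with a count of 10, while a shard size of 2 recovers `b` with its true count of 18, at the cost of each shard returning more terms.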

@ghudgins ghudgins added enhancement New value added to drive a business result Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Lens labels Nov 8, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-vis-editors (Team:VisEditors)

@ghudgins
Contributor Author

ghudgins commented Nov 8, 2021

@MichaelMarcialis 👀 no urgency but I made some hip-fired designs on this issue idea. Your hip fire is better than mine 😄

@MichaelMarcialis
Contributor

> @MichaelMarcialis 👀 no urgency but I made some hip-fired designs on this issue idea. Your hip fire is better than mine 😄

Haha, no worries. Want to add it to our next working group agenda to discuss the issue and prioritize?

@ghudgins
Contributor Author

Ideas from the 17 November working group: can we come up with a range of values to bind to a slider, or a set of options? @flash1293 to follow up with the Elasticsearch team.

@flash1293
Contributor

Followed up with the Elasticsearch team on this. There is no obvious "right" way to do this, but I feel comfortable about basing a decision on the following points:

  • Right now, increasing "size" is the only way in Lens to fix accuracy errors
  • A high "shard_size" is better than a high "size" with respect to performance
  • "size" is allowed to go up to 1000

As a high "size" is allowed already, we are not making things worse by allowing a way to go up to this limit using "shard_size" as well.
Offering a switch between "performance mode" and "accuracy mode", with "accuracy mode" setting shard_size to Math.max(1000, size * 1.5 + 10) will help in a lot of cases while only moderately increasing the risk of Elasticsearch OOM errors.
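
A minimal sketch of that switch in TypeScript, under stated assumptions: `TermsMode`, `resolveShardSize`, and `buildTermsAgg` are hypothetical names and not Kibana's actual code, and the `Math.ceil` rounding is my addition so the result is always an integer. In performance mode the sketch leaves `shard_size` out of the request entirely, so Elasticsearch applies its own default.

```typescript
// Hypothetical names for illustration; not Kibana's actual API.
type TermsMode = "performance" | "accuracy";

// Accuracy mode follows the formula proposed above:
// Math.max(1000, size * 1.5 + 10). Rounding up is an assumption.
function resolveShardSize(size: number, mode: TermsMode): number | undefined {
  if (mode === "performance") {
    // Leave shard_size unset and let Elasticsearch pick its default.
    return undefined;
  }
  return Math.max(1000, Math.ceil(size * 1.5 + 10));
}

// Builds the body of a terms aggregation, adding shard_size only
// when accuracy mode resolved a value.
function buildTermsAgg(field: string, size: number, mode: TermsMode) {
  const shardSize = resolveShardSize(size, mode);
  return {
    terms: {
      field,
      size,
      ...(shardSize !== undefined ? { shard_size: shardSize } : {}),
    },
  };
}

const performanceAgg = buildTermsAgg("host.name", 5, "performance");
const accuracyAgg = buildTermsAgg("host.name", 5, "accuracy");
```

For small "size" values the 1000 floor dominates; the `size * 1.5 + 10` term only takes over once "size" exceeds 660.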

@drewdaemon
Contributor

drewdaemon commented Mar 31, 2022

@flash1293 do I understand the following correctly?

  • I will first need to add support for the shard_size parameter to the terms, rare-terms, and multi-terms es-aggs expression functions.
  • At the end of the day we are looking to add a "performance mode/accuracy mode" switch to the advanced settings for the "Top Values" function.

@flash1293
Contributor

> I will first need to add support for the shard_size parameter to the terms, rare-terms, and multi-terms es-aggs expression functions.

Right for terms and multi-terms. For rare terms this setting doesn't exist, so it should be disabled in the UI.

> At the end of the day we are looking to add a "performance mode/accuracy mode" switch to the advanced settings for the "Top Values" function.

Correct. Also, when the precision warning is rendered, we can add a call-to-action button to it that switches over to accuracy mode.
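
As a rough sketch of the UI rule discussed here (enabled for terms and multi-terms, disabled for rare terms), the switch state could be derived as follows. The type and function names are hypothetical and do not come from the Kibana codebase.

```typescript
// Hypothetical names for illustration; not taken from Kibana.
type EsAggType = "terms" | "multi_terms" | "rare_terms";

// Per this thread: shard_size exists for terms and multi-terms,
// but not for rare terms.
const SUPPORTS_SHARD_SIZE: Record<EsAggType, boolean> = {
  terms: true,
  multi_terms: true,
  rare_terms: false,
};

interface AccuracySwitchState {
  disabled: boolean; // grey out the switch when the setting does not exist
  checked: boolean; // never show accuracy mode as active when unsupported
}

function accuracySwitchState(
  aggType: EsAggType,
  accuracyMode: boolean
): AccuracySwitchState {
  const supported = SUPPORTS_SHARD_SIZE[aggType];
  return { disabled: !supported, checked: supported && accuracyMode };
}
```

Forcing `checked` to false for rare terms keeps a previously enabled accuracy mode from appearing active after the user switches to an aggregation that cannot honor it.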

@exalate-issue-sync exalate-issue-sync bot reopened this Apr 19, 2022
@exalate-issue-sync exalate-issue-sync bot added the impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. label Apr 19, 2022
5 participants