Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Add Data table bucket Split rows agg limit of 3 #1179

Closed

Conversation

RoyiSitbon
Copy link
Contributor

@RoyiSitbon RoyiSitbon commented Jan 25, 2022

Signed-off-by: sitbubu royi.sitbon@logz.io

Description

The default limit of 3 “Split rows” aggregated buckets isn't applied on this visualization which allows adding more buckets and can harm cluster performance.

In other types of visualizations, there is a default limit of up to 3 “Split rows” aggregated buckets, e.g. Horizontal-Bar visualizations.

We wish to have an option to configure the max,min limitation for the “Split rows” aggregated bucket under dataTable vis.
i.e. Using this configuration, we will limit the max amount to 3:
Screen Shot 2022-02-03 at 14 25 02

This way we will see the next result in our UI:
splitRowsLimited

If we won't set any limitation, i.e:
Screen Shot 2022-02-03 at 14 27 12

We will have same behavior as now:
splitRowsUnlimited

linked issue:
#1178

@kavilla
Copy link
Member

kavilla commented Jan 26, 2022

Hello @RoyiSitbon,

Thanks for opening this. Will check it out, right now it's blocked because the tests are failing due to this issue.

Initial nitpick feedback would be if you can some of the content you added to the PR to the commit description that provide insight on the commit without having to come to the PR. Also, I believe this could potential to add a quick test to verify it limits to 3.

@RoyiSitbon
Copy link
Contributor Author

Hello @RoyiSitbon,

Thanks for opening this. Will check it out, right now it's blocked because the tests are failing due to this issue.

Initial nitpick feedback would be if you can some of the content you added to the PR to the commit description that provide insight on the commit without having to come to the PR. Also, I believe this could potential to add a quick test to verify it limits to 3.

Hey, sure. Thanks for the feedback.

@jgough
Copy link
Contributor

jgough commented Jan 28, 2022

I often want to split on more than 3 fields and losing this ability would be a step backwards when data sets are small and performance is not an issue. Is there any scope of making this configurable setting with a default of 3?

@jgough
Copy link
Contributor

jgough commented Jan 28, 2022

The functionality of allowing more than 3 splits in data tables has been present since Kibana 4 and I use this ability regularly. Sadly I have many visualisations that would break and just not be possible unless it were possible to configure this limit.

@kavilla
Copy link
Member

kavilla commented Jan 29, 2022

The functionality of allowing more than 3 splits in data tables has been present since Kibana 4 and I use this ability regularly. Sadly I have many visualisations that would break and just not be possible unless it were possible to configure this limit.

I often want to split on more than 3 fields and losing this ability would be a step backwards when data sets are small and performance is not an issue. Is there any scope of making this configurable setting with a default of 3?

I see there is probably historical reason to have this not be capped at 3.

Perhaps, @ahopp or @kgcreative, what do y'all think about this change? Does it seems like a good improvement to make all of them similar in behavior and have an override config? Or can we gather more insight on the potential impact?

@kgcreative
Copy link
Member

kgcreative commented Jan 31, 2022

@kavilla, @RoyiSitbon, @jgough - I would prefer to handle this in a way where we don't introduce a breaking change. I would suggest a configuration option that's empty by default to introduce a split field limit (either in stack management, or kibana yml, or somewhere else). This would allow admins to configure this limit if needed for performance reasons, but wouldn't break existing functionality.

This setting should then be enforced by any visualization that uses split fields in agg buckets.

maybe something like aggBucketSplitFieldLimit: num;, where null or 0 allows for any number of split fields.

For additional corner case handling, existing visualizations with split fields beyond N would only display N split rows.

@RoyiSitbon
Copy link
Contributor Author

@kavilla, @RoyiSitbon, @jgough - I would prefer to handle this in a way where we don't introduce a breaking change. I would suggest a configuration option that's empty by default to introduce a split field limit (either in stack management, or kibana yml, or somewhere else). This would allow admins to configure this limit if needed for performance reasons, but wouldn't break existing functionality.

This setting should then be enforced by any visualization that uses split fields in agg buckets.

maybe something like aggBucketSplitFieldLimit: num;, where null or 0 allows for any number of split fields.

For additional corner case handling, existing visualizations with split fields beyond N would only display N split rows.

Sure, that sounds good to me.
I'll work on a configurable solution if agreed by all.
@kavilla @jgough

Thanks

@RoyiSitbon RoyiSitbon changed the title [2.x][BUG] Add Data table bucket Split rows agg limit of 3 [BUG] Add Data table bucket Split rows agg limit of 3 Feb 3, 2022
@RoyiSitbon
Copy link
Contributor Author

Hey, I changed it to be a configurable solution as talked.
Thanks

@kavilla
Copy link
Member

kavilla commented Feb 15, 2022

Hi @RoyiSitbon, if you rebase the tests should pass (granted if no failures by new implementation). Would you like for me to rebase for you on your branch? Let me know. Thank you!

Also, would there be an ability to add some tests to test this new functionality.

@RoyiSitbon
Copy link
Contributor Author

Hi @RoyiSitbon, if you rebase the tests should pass (granted if no failures by new implementation). Would you like for me to rebase for you on your branch? Let me know. Thank you!

Also, would there be an ability to add some tests to test this new functionality.

Hey, rebased
Also, I will try to think of a tailored test. Thank you

@RoyiSitbon
Copy link
Contributor Author

Hi @RoyiSitbon, if you rebase the tests should pass (granted if no failures by new implementation). Would you like for me to rebase for you on your branch? Let me know. Thank you!

Also, would there be an ability to add some tests to test this new functionality.

Added a new functional test with the right flow as well.

@RoyiSitbon RoyiSitbon force-pushed the addAggBucketSplitRowLimit branch 2 times, most recently from c3a8a60 to 734ccea Compare February 20, 2022 13:46
Signed-off-by: sitbubu <royi.sitbon@logz.io>
Signed-off-by: sitbubu <royi.sitbon@logz.io>
Signed-off-by: sitbubu <royi.sitbon@logz.io>
Signed-off-by: sitbubu <royi.sitbon@logz.io>
Signed-off-by: sitbubu <royi.sitbon@logz.io>
@ananzh
Copy link
Member

ananzh commented Mar 3, 2022

Sorry. I commented on the wrong place: #1178 (comment)

I will re-check it.

@ananzh
Copy link
Member

ananzh commented Mar 4, 2022

Hey @RoyiSitbon

After the latest change, advanced settings does not set a default split limit which is great. I have verified that. But I think the function is still not working right? For example, I set the advanced settings with/without max to limit the split to 3:

Screen Shot 2022-03-04 at 1 09 17 PM

Then go to visualization, create a data table using eCommerce sample data. However, I could still split even I have 3 buckets. First, the "Split rows" option has no changes and still clickable.
Screen Shot 2022-03-04 at 1 13 09 PM

Second, adding one more split is still working. We could see a descending based on product quality.
Screen Shot 2022-03-04 at 1 12 14 PM

So I am bit confused here. Did I mis-use or mis-understand here? pls help

@ananzh
Copy link
Member

ananzh commented Mar 5, 2022

Hey @RoyiSitbon

After the latest change, advanced settings does not set a default split limit which is great. I have verified that. But I think the function is still not working right? For example, I set the advanced settings with/without max to limit the split to 3:

Screen Shot 2022-03-04 at 1 09 17 PM

Then go to visualization, create a data table using eCommerce sample data. However, I could still split even I have 3 buckets. First, the "Split rows" option has no changes and still clickable. Screen Shot 2022-03-04 at 1 13 09 PM

Second, adding one more split is still working. We could see a descending based on product quality. Screen Shot 2022-03-04 at 1 12 14 PM

So I am bit confused here. Did I mis-use or mis-understand here? pls help

My bad. It turns a cache issue and need to refresh the page to reload the settings. It is better to use requiresPageReload: true to improve user experience.

@ananzh
Copy link
Member

ananzh commented Mar 5, 2022

Meanwhile, I have several questions

  1. do other graph types besides data table also have this issue? or do they have some bucket split default limit in the code?
  2. if other graph types have the same issue, do we or can we support them?
  3. if no other graph types have the same issue and data table is the only type, do we need to use {"table": {"bucket":{}}}? should we modify the settings to be just bucket-min and bucket-max?

Probably we should discuss more here.

@RoyiSitbon
Copy link
Contributor Author

requiresPageReload

Hey,
unlike this dynamic configuration setting, other agg types have constant settings at the moment.
The reason I implemented it like so, is because It will facilitate the implementation process afterward.
We think it will be a nice addition to integrate this feature to all vis and agg types in the system. but I guess it will be a gradual process.

@kgcreative
Copy link
Member

I think it would be worth documenting all of the different visualizations that can use bucket aggregations. I think one question for the community here is if we want to have to pass all the different visualization types in order to change the split limit on all of them, or if there is an approach that can cover all visualization types, and, if desired, control individual visualizations at a more granular level.

From reading the issue, this seems to come from a desire to limit potentially expensive queries on large data sets, so it feels like a single setting to control the split limit everywhere it's used would be friendlier than having to control these individually

@RoyiSitbon
Copy link
Contributor Author

I think it would be worth documenting all of the different visualizations that can use bucket aggregations. I think one question for the community here is if we want to have to pass all the different visualization types in order to change the split limit on all of them, or if there is an approach that can cover all visualization types, and, if desired, control individual visualizations at a more granular level.

From reading the issue, this seems to come from a desire to limit potentially expensive queries on large data sets, so it feels like a single setting to control the split limit everywhere it's used would be friendlier than having to control these individually

Hey @kgcreative, thanks for replying.
I understand what you are saying, who would be able to answer this type of question?
How will we understand if this solution is legit, and can be merged as-is or required to accommodate?

Thanks

Copy link
Member

@ashwin-pc ashwin-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on #1178 this does not seem to address the real problem here. If the goal is to limit expensive aggregations dynamically, this only does that for the table agg and that too for the Split Row agg type. I believe that issue needs a lil more discussion before we implement a solution. I'll call out my concerns and possible solutions in that issue.

@@ -45,6 +45,10 @@ export function getTableVisTypeDefinition(
core: CoreSetup,
context: PluginInitializerContext
): BaseVisTypeOptions {
const tableBucketAggAmounts = core.uiSettings.get('visualize:aggAmounts')?.table?.bucket;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a lil confused by this change. This UI setting only applies to the table viz since its used in the table_vis_type. But the setting is labeled as a way to dynamically set the min and max agg limits for the bucket viz. That is not the case. Am i missing something?

[VISUALIZE_AGG_AMOUNTS]: {
// currently only supports dataTable vis
name: i18n.translate('visualizations.advancedSettings.visualizeAggAmountsTitle', {
defaultMessage: `Set bucket's aggregation min, max level limitation`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to only limit the bucket agg and not the metric agg too?

@joshuarrrr
Copy link
Member

@RoyiSitbon I'm going to convert this to a draft until we can come to consensus on a more general approach as discussed in #1178.

@joshuarrrr joshuarrrr marked this pull request as draft June 22, 2022 19:02
@seanneumann
Copy link
Contributor

We are closing this PR because what it says and does is very different. Next steps to resolve the discussion on #1178. After that feel free to open a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Data Table Visualisation - The default limit of 3 "Split rows" aggregated buckets isn't applied
8 participants