What should I do if I do not need time series in the box plot chart? #10468

Closed
cyang52 opened this issue Jul 29, 2020 · 6 comments · Fixed by #11199
Labels
!deprecated-label:bug Deprecated label - Use #bug instead

Comments

@cyang52
cyang52 commented Jul 29, 2020

I know that Superset is closely tied to time series, but if I do not need that attribute and set is_timeseries = false, the box plot becomes a flat line.
Instead, I would like to split the data into series by other attributes, such as customer names or countries (columns in the dataset). How can I replace the time series with other attributes? For example, I plan to create a new control panel, called something like Entity, and use it to replace the time panel (time range and time column). This may be very similar to Tableau's detail panel.

SELECT region AS region,
       year AS __timestamp,
       sum(SP_POP_TOTL) AS sum__SP_POP_TOTL
FROM wb_health_population
INNER JOIN
  (SELECT region AS region__,
          sum(SP_POP_TOTL) AS mme_inner__
   FROM wb_health_population
   WHERE year >= STR_TO_DATE('1960-01-01 00:00:00.000000', '%Y-%m-%d %H:%i:%s.%f')
     AND year <= STR_TO_DATE('2020-07-29 08:20:11.000000', '%Y-%m-%d %H:%i:%s.%f')
   GROUP BY region
   ORDER BY mme_inner__ DESC
   LIMIT 25) AS anon_1 ON region = region__
WHERE year >= STR_TO_DATE('1960-01-01 00:00:00.000000', '%Y-%m-%d %H:%i:%s.%f')
  AND year <= STR_TO_DATE('2020-07-29 08:20:11.000000', '%Y-%m-%d %H:%i:%s.%f')
GROUP BY region, year
ORDER BY sum__SP_POP_TOTL DESC
LIMIT 50000
As you can see, the SQL query groups by year; what I want is for it to group by other attributes instead.
I think I need to modify viz.py and the box plot control panel. But in viz.py I could not find where to replace the time series, and in the control panel I could not find where to remove the time panel.

Thanks!

@cyang52 cyang52 added the !deprecated-label:bug Deprecated label - Use #bug instead label Jul 29, 2020
@issue-label-bot

Issue-Label Bot is automatically applying the label #question to this issue, with a confidence of 0.94. Please mark this comment with 👍 or 👎 to give our bot feedback!


@villebro
Member

villebro commented Jul 29, 2020

@cyang52 I recently thought about precisely what you're describing here. Adding an additional multi-select control (entity is as good a name as any), and then grouping by groupby + entity would make it possible to calculate the box plot statistics across all entity columns. In addition, #10344 that was recently merged makes it possible to apply the time grain on any column that is used as a grouping variable, making it possible to do the exact same temporal box plot as before, if one so wishes.
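To make the grouping idea concrete, here is a minimal sketch in plain Python (standard library only, not Superset code). It assumes the query returns one row per groupby + entity combination, and that each box is then computed per groupby value across its entity combinations. The function name `box_plot_stats`, the column names, and the choice of a min/quartile/max summary are illustrative assumptions, not the actual viz.py logic.

```python
from collections import defaultdict
from statistics import median, quantiles


def box_plot_stats(rows, groupby, entity, metric):
    """Hypothetical helper: one box per groupby key, one sample per entity combination."""
    # Step 1: sum the metric per (groupby + entity) combination, mimicking
    # GROUP BY groupby, entity with a SUM(metric) aggregate.
    totals = defaultdict(float)
    for row in rows:
        box_key = tuple(row[col] for col in groupby)
        sample_key = tuple(row[col] for col in entity)
        totals[(box_key, sample_key)] += row[metric]

    # Step 2: gather the per-entity totals into one list of samples per box.
    samples = defaultdict(list)
    for (box_key, _), value in totals.items():
        samples[box_key].append(value)

    # Step 3: compute a five-number summary for each box.
    stats = {}
    for box_key, values in samples.items():
        values.sort()
        if len(values) >= 2:
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
        else:
            # A single sample has no spread; use it for both quartiles.
            q1 = q3 = values[0]
        stats[box_key] = {
            "min": values[0],
            "q1": q1,
            "median": median(values),
            "q3": q3,
            "max": values[-1],
        }
    return stats
```

With `groupby=["region"]` and `entity=["country"]`, each region gets one box whose spread comes from its countries rather than from years. Putting the year column in `entity` instead would reproduce the temporal box plot.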

As box plot is using the legacy plugin framework, you'd need to change the control panel in the superset-ui repo, more specifically here: https://github.com/apache-superset/superset-ui/blob/master/plugins/preset-chart-xy/src/BoxPlot/controlPanel.ts. After that, you'd need to add logic in viz.py that makes sure the entity columns are added to the groupby columns and then handled in the pivot operation. It is worth mentioning that we're moving away from the legacy plugin architecture, so if I were to make major changes to the box plot viz, I'd probably:

  1. port it to ECharts
  2. make the data request using the new chart data endpoint on /api/v1/chart/data

Doing it this way, you wouldn't need to add anything to viz.py; however, you'd need to do all the necessary pandas transformation work using the post_processing operations in buildQuery.

If you haven't already, check out the blog post by @rusackas on how to create a viz plugin using the new architecture: https://preset.io/blog/2020-07-02-hello-world/ . If you need help, feel free to reach out on Slack.

@cyang52
Author

cyang52 commented Jul 30, 2020

I checked #10344. If my dataset has no ds column or any other time-related column (time, year, or month), and instead has columns such as region and country, is there any difference in behavior before and after that change?

If I print out chart_data, it shows:
{'datasource': '17__table', 'viz_type': 'box_plot', 'slice_id': 58, 'url_params': {}, 'time_range_endpoints': (<TimeRangeEndpoint.UNKNOWN: 'unknown'>, <TimeRangeEndpoint.INCLUSIVE: 'inclusive'>), 'granularity_sqla': 'year', 'time_range': '1960-01-01 : now', 'metrics': ['sum__SP_POP_TOTL'], 'adhoc_filters': [], 'groupby': ['region'], 'limit': '25', 'color_scheme': 'bnbColors', 'label_colors': {}, 'whisker_options': '2/98 percentiles', 'x_ticks_layout': 'staggered', 'where': '', 'having': '', 'having_filters': [], 'filters': []}

I do not need the time-related fields ('time_range_endpoints': (<TimeRangeEndpoint.UNKNOWN: 'unknown'>, <TimeRangeEndpoint.INCLUSIVE: 'inclusive'>), 'granularity_sqla': 'year', 'time_range': '1960-01-01 : now'), and they do not correspond to any columns in my dataset.

@villebro
Member

@cyang52 what I'm proposing is leaving the time fields there. If the user doesn't add the timestamp column to either entity or groupby, it will not be included in the query. However, if it is, then it will affect the chart. If we do make this change, we will need to migrate the chart metadata for existing boxplot charts to add the timestamp field to the new entity field to make it work like before.
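The metadata migration mentioned above could look roughly like the sketch below. It assumes the chart's form data is available as a JSON string and that the new control is stored under a key named `entity`; both the key name and the function `migrate_box_plot_params` are hypothetical, and a real migration would also need to load and save the charts through the metadata database.

```python
import json


def migrate_box_plot_params(params_json, entity_key="entity"):
    """Hypothetical sketch: copy the time column into the new entity field."""
    params = json.loads(params_json)

    # Leave every other chart type untouched.
    if params.get("viz_type") != "box_plot":
        return params_json

    # Existing box plots implicitly grouped by the time column, so adding it
    # to the entity field keeps them rendering as before.
    time_column = params.get("granularity_sqla")
    entity = list(params.get(entity_key) or [])
    if time_column and time_column not in entity:
        entity.append(time_column)
    params[entity_key] = entity

    return json.dumps(params, sort_keys=True)
```

The membership check on the time column makes the function safe to run more than once on the same chart.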

@junlincc
Member

🙏@villebro can we close this issue?

@villebro
Member

villebro commented Oct 8, 2020

@cyang52 @junlincc please see PR #11199 which should resolve this issue.
