This issue was moved to a discussion.

Time-series Line / Bar chart: Add option to plot 0 if no data. #15036

Closed
EBoisseauSierra opened this issue Jun 8, 2021 · 10 comments
Labels
enhancement:request Enhancement request submitted by anyone from the community viz:charts:timeseries Related to Timeseries

Comments

@EBoisseauSierra
Contributor

Issue

I have a simple metric (say: SUM(orders_count)), and I would like to track its evolution over time. My objective is to visualize the total at different time granularities (group by hour, day, week, etc.).

However, I don't always have records for each time bucket (e.g. I didn't have any orders between 2 and 3am, or on Sunday, etc.). In this situation, I want to plot 0 for each time bucket I have no data for.

Currently, the line chart simply omits the empty time bucket and interpolates the line between the previous and next data points:

Screenshot from 2021-06-08 09-26-46_shadow

I am aware that I can use pandas resampling methods to actually plot a 0 data point on each given hour I had no order:

Screenshot from 2021-06-08 09-27-10_shadow

However, this workaround doesn't “follow” the time granularity I aggregate data on:

Screenshot from 2021-06-08 09-27-37_shadow

This means that each time I want to update the granularity, I have to modify it at two different places to get the correct graph.

Screenshot from 2021-06-08 09-27-55_shadow

Moreover, this means that I cannot pass the time granularity as a parameter from a native filter.

Requested feature

I would like a simple option to replace missing values (i.e. time buckets with no record to aggregate) with either:

  • linear interpolation (current behaviour),
  • nothing (can be emulated via pandas.resample(<granularity>, mean)),
  • 0 (can be emulated via pandas.resample(<granularity>, sum)).
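The three options above can be compared in plain pandas, outside Superset. This is a minimal sketch, assuming hypothetical order counts and a "60min" rule; the values and variable names are invented for illustration, and it is not a description of how Superset applies resampling internally.

```python
import pandas as pd

# Hypothetical order counts; there is no record in the 02:00 bucket.
orders = pd.Series(
    [5, 3, 4],
    index=pd.to_datetime(
        ["2021-06-08 00:15", "2021-06-08 01:30", "2021-06-08 03:45"]
    ),
    name="orders_count",
)

rule = "60min"  # assumed granularity: one bucket per hour

# Option "0": summing an empty bucket yields 0.
zero_filled = orders.resample(rule).sum()

# Option "nothing": the mean of an empty bucket is NaN, which plots as a gap.
gapped = orders.resample(rule).mean()

# Option "linear interpolation": fill the NaN from its neighbours.
interpolated = gapped.interpolate(method="linear")

print(pd.DataFrame({"zero": zero_filled, "gap": gapped, "interp": interpolated}))
```

Note that `rule` still has to be kept in sync with the chart's time grain by hand, which is the limitation this issue is about.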

Metabase uses a simple dropdown menu for that:

Screenshot from 2021-06-08 11-02-27_shadow.png

Alternatives

As seen above, pandas.resample only partially solves the issue, as it doesn't dynamically adjust to the selected time granularity.

One could of course write a custom query that joins the list of every single hour between min(timestamp) and max(timestamp) to force-generate records for these time buckets, but it's a lot of work, and again it wouldn't adapt dynamically to different time grains.
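For comparison, the same "join against every bucket" idea can be sketched in pandas rather than SQL. The helper name `fill_empty_buckets` and the sample data are hypothetical, and the sketch assumes the totals have already been aggregated per bucket; it only illustrates the technique and has the same drawback of needing the grain passed in explicitly.

```python
import pandas as pd


def fill_empty_buckets(totals: pd.Series, grain: str, fill_value=0) -> pd.Series:
    """Reindex aggregated totals onto every bucket between the first and last one.

    `grain` is a pandas frequency string (e.g. "60min", "1D"); buckets that
    were missing from `totals` receive `fill_value`.
    """
    spine = pd.date_range(totals.index.min(), totals.index.max(), freq=grain)
    return totals.reindex(spine, fill_value=fill_value)


# Hypothetical hourly totals with no row for 02:00.
hourly = pd.Series(
    [5, 3, 4],
    index=pd.to_datetime(
        ["2021-06-08 00:00", "2021-06-08 01:00", "2021-06-08 03:00"]
    ),
)

filled = fill_empty_buckets(hourly, "60min")
print(filled)
```

Passing `fill_value=float("nan")` instead of 0 would leave gaps rather than zeros.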

Context

Examples generated on Superset 1.1.0.

@EBoisseauSierra
Contributor Author

EBoisseauSierra commented Jun 8, 2021

Note that the charts above were generated using the Line Chart chart. The behaviour is the same with the Time-series Bar chart.

However, the more correct Time-series (line) chart (Echarts) doesn't feature pandas.resample, so the workarounds aren't applicable there.

@EBoisseauSierra
Contributor Author

Very much related to following PRs:

@zhaoyongjie zhaoyongjie added the enhancement:request Enhancement request submitted by anyone from the community label Jun 8, 2021
@zhaoyongjie
Member

zhaoyongjie commented Jun 8, 2021

related issue:

@zhaoyongjie zhaoyongjie added the viz:charts:timeseries Related to Timeseries label Jun 8, 2021
@NickChoiRBX

For my purposes, it'd be much better if there were a linear interpolation option. Is there a version it's available on? I've seen a lot of people requesting this be made available as a feature, given that the app's UI used to behave this way.

@EBoisseauSierra
Contributor Author

@NickChoiRBX Isn't linear interpolation the current behaviour already (see first screenshot)?

@NickChoiRBX

image
I've mostly seen people mentioning that this functionality is no longer available for them. I'm on version 1.0.1 and this is how line charts look for me (notice the point in the upper left corner for the purple line).

@rusackas
Member

I know that @zhaoyongjie is working on advanced analytics (including resampling, I believe) for the newer Timeseries Chart. I hope that this helps resolve things for you soon.

I'm not sure, offhand, if it would be easy to add an option to the resample controls that sets grain to "Inherit Time Grain" or something to that effect. Then you wouldn't have to double-configure it.

It may also be possible to add a control to the chart which works a bit like the PR you found to fill in the data (based on the granularity) with zeros. Or, better yet, make it a choice to "infill data" and a secondary choice to fill with 0 points or null for anyone who doesn't like the connecting line. My suspicion is that most people want interpolation between known datapoints, but the second-most-common request would be nulls inserted to create gaps. 0 points don't seem like a terribly common need, but could be supported by these controls, I think.

Curious what @villebro thinks here too... about the general approach, as well as the dynamic/progressive-reveal controls that might be needed for my half-baked proposal ;)

@zhaoyongjie
Member

Thanks for the proposal, I will finish it as part of the resample feature (advanced analytics).

@EBoisseauSierra
Contributor Author

To provide some context: We believe “0-filling” (in addition to interpolation and gaps) can be relevant when, e.g., you want to plot the count of records or sum of a given metric per <time_grain>. In such cases, not having data (for a given time bucket) means something — i.e., that the count/sum is actually 0.

@rusackas
Member

rusackas commented Jul 2, 2021

Yep, I've definitely had needs for 0-filling and null-filling (gaps) in the past, so I'd love to find a good design for controls to allow any of these options.

4 participants