Stacked Area Charts #2960

alexcjohnson · 2018-09-01T05:50:18Z

We held out a long time on this one, but stacked area charts are finally coming to plotly.js.

The API is as discussed in #1217:

- Provide matching stackgroup attributes to some scatter traces and they become a stack.
- There are no plot-wide stacking attributes; stack-wide attributes are in the trace definitions, and we'll take a value for each attribute from the first trace in the stack that contains that attribute, visible or not (so the stack doesn't fall apart if you hide the first trace). This is different from, and more powerful than, how we describe bar stacking/grouping - and should be reviewed with an eye toward eventually using a similar framework for bars.
- The data for all traces in the stack are sorted by position, and gaps in each trace are filled in either with zeros or interpolations.
The one item from #1217 I did not include here is stackgaps: 'interrupt'. That's going to require some finicky drawing code, particularly if we want to support arbitrary line.shape, so I'll leave it for later. But 'infer zero' (default) and 'interpolate' are included here.
Another open item is to improve hover info. What I did here matches stacked bars, but both of them, particularly if you normalize the results, would benefit from more options - normalized vs raw data, (sub)totals.
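To make the two included gap modes concrete, here is a rough sketch of what "sorted by position, gaps filled with zeros or interpolations" could look like for y-stacking. It is not the code in this PR: `fillGap` and `stackTraces` are hypothetical helper names, traces are plain `{name, x, y}` objects, and holding the nearest value past either end of a trace in 'interpolate' mode is an assumption of the sketch.

```javascript
// Illustrative sketch only, not the plotly.js implementation.
// Each trace is {name, x, y}, non-empty, with x sorted ascending and no duplicates.
function fillGap(trace, x, stackgaps) {
    var i = trace.x.indexOf(x);
    if(i !== -1) return trace.y[i];

    // 'infer zero': a position missing from this trace contributes nothing
    if(stackgaps !== 'interpolate') return 0;

    // 'interpolate': linear between the neighbors on either side.
    // Assumption of this sketch: past either end, hold the nearest value.
    var next = trace.x.findIndex(function(xi) { return xi > x; });
    if(next === -1) return trace.y[trace.y.length - 1];
    if(next === 0) return trace.y[0];
    var x0 = trace.x[next - 1], x1 = trace.x[next];
    var y0 = trace.y[next - 1], y1 = trace.y[next];
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0);
}

function stackTraces(traces, stackgaps) {
    // union of every position used by any trace in the stack, sorted
    var seen = {};
    var xs = [];
    traces.forEach(function(trace) {
        trace.x.forEach(function(x) {
            if(!seen[x]) { seen[x] = true; xs.push(x); }
        });
    });
    xs.sort(function(a, b) { return a - b; });

    // accumulate trace by trace so each one sits on top of the previous
    var running = xs.map(function() { return 0; });
    return traces.map(function(trace) {
        running = running.map(function(base, i) {
            return base + fillGap(trace, xs[i], stackgaps);
        });
        return {name: trace.name, x: xs.slice(), y: running.slice()};
    });
}
```

In the real API none of this is called by hand: you would give each scatter trace the same stackgroup value (and optionally stackgaps) and let the library do the alignment.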

In order to make it work well in various edge cases I made a number of preparatory changes:

- Lib.sort c87ccb3 wraps the built-in Array.sort with a check for whether the array is already perfectly sorted (or perfectly reversed). For arrays of length 1e5+ this can be a 10x or better speedup when the data is already sorted, and it should carry very little penalty for unsorted arrays. For stacked area I expect the data will already be sorted the vast majority of the time, which is why I implemented this now. This is the only place I used it, but I bet there are other places where it would be useful as well.
- Some edge case improvements in autorange 1f4898c - I changed a few baseline images (and one mock); I hope you'll agree these were actually incorrect before.
- Better ordering of hover labels when traces have matching data (such as in a bar or area stack when one trace is zero) - try to preserve the stacking order 2fde3dc
- Continue lines off the edge toward invalid log values 68b489d - I think I hadn't done this before (for scatter) out of caution: lest we draw something misleading, I opted to just not draw the line at all. But particularly with fills, and even more so with stacked fills, this gets confusing and misleading, as the fills would just connect across the missing point(s). I opted to draw these lines straight toward the edge if one dimension goes invalid (since in principle they're going infinitely far away), or at a slope of 1 on a log/log plot if both dimensions go invalid simultaneously. Note that there are cases here where a separate point will move across these lines if you flip between linear and log axes, but that was already possible with finite data; this is just an extreme case of the same. (Note: the axes_range_type baseline change belongs in this commit, but I put it in the autorange commit instead.)
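For readers who want a feel for the Lib.sort idea without opening the commit, a minimal sketch of the same approach follows. It is written from the description above rather than copied from c87ccb3, and `sortWithOrderCheck` is a made-up name; the comparator has the usual Array.prototype.sort contract.

```javascript
// A sketch of the "check before sorting" idea, not the actual Lib.sort source.
// sortFn is a standard comparator, as passed to Array.prototype.sort.
function sortWithOrderCheck(array, sortFn) {
    var sawDescendingPair = false;
    var sawAscendingPair = false;

    for(var i = 1; i < array.length; i++) {
        var order = sortFn(array[i], array[i - 1]);
        if(order < 0) sawDescendingPair = true;
        else if(order > 0) sawAscendingPair = true;

        // mixed order: stop scanning and fall back to the built-in sort
        if(sawDescendingPair && sawAscendingPair) return array.sort(sortFn);
    }

    // no descending pair: already sorted, return as-is.
    // only descending pairs: perfectly reversed, so an O(n) reverse suffices
    // (note this flips the relative order of elements that compare equal).
    return sawDescendingPair ? array.reverse() : array;
}
```

The scan costs one pass of comparisons and bails out at the first evidence of mixed order, which is why the penalty for unsorted input should stay small.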

cc @etpinard @antoinerg @nicolaskruchten


alexcjohnson · 2018-09-01T06:02:22Z

test/image/mocks/scatter_fill_corner_cases.json

@@ -75,7 +75,6 @@
    {
      "x": [1.5],
      "y": [1.25],
-      "fill": "tonexty",


Oh this change I think is actually required due to a change in the stacked area commit be38e93#diff-33c02cd37e7a4c951059a3c93221ac4eR175 - we were accidentally treating a length-1 trace as filling to itself (since its start and end points are the same!) but we shouldn't do that... therefore this trace, since it's the first on its subplot, should interpret 'tonexty' as 'tozeroy'.

alexcjohnson · 2018-09-01T06:08:41Z

src/traces/scatter/calc.js

+    var subplotAndType = trace.xaxis + trace.yaxis + trace.type;
+    var firstScatter = fullLayout._firstScatter;
+    if(!firstScatter[subplotAndType]) firstScatter[subplotAndType] = trace.uid;
+}


In fact, scatter_fill_corner_cases top subplots were also prevented from filling to zero with 'tonexty' because only one subplot could have the "first scatter" trace on it. This commit 🔪 gd.firstscatter and replaces it with one trace (uid) per subplot, attached to fullLayout. The stacked area mocks 🔒 this.



alexcjohnson · 2018-09-01T06:25:03Z

src/traces/scatter/cross_trace_calc.js

+                if(cd.length !== serieslen) {
+                    // TODO: verify this never happens and remove
+                    throw new Error('length mismatch!');
+                }


Oops, missed this one... well, when I test ^^ I'll have plenty of confidence to remove it 😄

🔪 in 8547cf8

alexcjohnson · 2018-09-01T06:27:46Z

src/traces/scatter/plot.js

+            // if we're stacking, "infer zero" gap mode gets markers in the
+            // gap points - because we've inferred a zero there - but other
+            // modes (currently "interpolate", later "interrupt" hopefully)
+            // we don't draw generated markers


@etpinard @nicolaskruchten do you agree with this choice? It only applies to points we generate in one trace to match the positions from another trace - those are the "gap points"



etpinard

Great PR!

I hope implementing those per-trace stack* attributes wasn't too much of a headache. The two stackgaps modes are looking great. 📈

Most of my comments are simply comments, with the exception of:

- I don't think we need that alwaysSupplyDefaults trace module category.
- Mutating gd.calcdata[i][j].i isn't great.
- Is that hacky fill default logic really necessary?
- What do you think about adding a 'stack' flag to scatter mode?


etpinard · 2018-09-04T18:34:34Z

src/traces/scatter/calc.js

+    var subplotAndType = trace.xaxis + trace.yaxis + trace.type;
+    var firstScatter = fullLayout._firstScatter;
+    if(!firstScatter[subplotAndType]) firstScatter[subplotAndType] = trace.uid;
+}


Nice. This is probably the most underrated piece in this PR. I always found that gd.firstscatter less-than-ideal. This is a welcome improvement.


etpinard · 2018-09-04T18:44:39Z

src/traces/scatter/plot.js

+            // if we're stacking, "infer zero" gap mode gets markers in the
+            // gap points - because we've inferred a zero there - but other
+            // modes (currently "interpolate", later "interrupt" hopefully)
+            // we don't draw generated markers


do you agree with this choice?

+1 for me.


and remove error check added just for debugging during dev

nicolaskruchten · 2018-09-05T00:57:55Z

So from a high-level API standpoint, why do we want stackgroup again? Is it just so as to match a potential future equivalent for bar? Because as a standalone API it's kind of ungainly, and I can't imagine a use-case for having some areas stacked and some not in the same plot... ?

alexcjohnson · 2018-09-05T03:06:20Z

> So from a high-level API standpoint, why do we want stackgroup again? Is it just so as to match a potential future equivalent for bar? Because as a standalone API it's kind of ungainly, and I can't imagine a use-case for having some areas stacked and some not in the same plot... ?

It's unusual for sure, but I wouldn't want to rule it out. What if you have one stack series for data and another for fits? Then one stack would need its fill removed, since they're overlapping. Or prediction/extrapolation - these might not overlap but still you might want different styling for corresponding items in each stack. Or two back-to-back stacks, like those plots that have male on one side and female on the other, with the axis in the middle (we could manage that one with an analog of barmode: 'relative', or perhaps even better two axes with a constraint so you don't need to flip your data... but you see the point)

> What do you think about adding a 'stack' flag to scatter mode to make it easier to toggle stacked areas on and off?

mode is otherwise all about how to draw the series, not where to draw it... and the one bit of how that stacking impacts (fill) isn't even part of mode.

But, perhaps both of these concerns could be assuaged by making a boolean stack attribute, then giving stackgroup a default value but only coercing it when stack is true? (and for completeness, if you provide only a stackgroup let stack default to true). That way the usual behavior would be to just use the boolean but the full flexibility would still be available (if perhaps buried in the UI)

nicolaskruchten · 2018-09-05T12:57:33Z

OK, I'll buy the "back to back stacks" argument :)

Could we make sure the documentation clearly explains whether or not stack normalization applies across or within subplots please? I don't know the answer but I'd like to and I think we should canonicalize it in the docs!

nicolaskruchten · 2018-09-05T14:00:53Z

I think we can live without an extra stack attribute personally :)

alexcjohnson · 2018-09-05T19:22:13Z

> Could we make sure the documentation clearly explains whether or not stack normalization applies across or within subplots please? I don't know the answer but I'd like to and I think we should canonicalize it in the docs!

Within subplots (and within stack groups, if there are multiple on one subplot) -> 00d7d22
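As a rough illustration of that scoping, the sketch below normalizes each trace against the total of its own stack only, where a stack is identified by subplot plus stackgroup. The function names, the key format, and the assumption that traces in a stack are already aligned on the same positions are all mine, not taken from the PR.

```javascript
// Illustrative sketch only: percent normalization scoped to one stack,
// i.e. one (xaxis, yaxis, stackgroup) combination. Hypothetical helper names.
// Assumes the traces in a stack are already aligned point-for-point.
function stackKey(trace) {
    return [trace.xaxis, trace.yaxis, trace.stackgroup].join('|');
}

function normalizeWithinStacks(traces) {
    // first pass: per-position totals for each stack
    var totals = {};
    traces.forEach(function(trace) {
        var key = stackKey(trace);
        if(!totals[key]) totals[key] = trace.y.map(function() { return 0; });
        trace.y.forEach(function(v, i) { totals[key][i] += v; });
    });

    // second pass: each trace's share of its own stack total, in percent
    return traces.map(function(trace) {
        var sums = totals[stackKey(trace)];
        return trace.y.map(function(v, i) {
            return sums[i] ? 100 * v / sums[i] : 0;
        });
    });
}
```

Two traces sharing a stackgroup name on different subplots end up in different stacks here, so neither affects the other's percentages.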

etpinard · 2018-09-06T18:36:11Z

Down to 2️⃣ unresolved comments:

…o stack config

etpinard · 2018-09-07T14:01:05Z

Nicely done 💃

alexcjohnson added 7 commits August 31, 2018 12:33

c87ccb3  Lib.sort: faster sort of already-sorted arrays with minimal penalty for unsorted arrays
1f4898c  fix a bunch of edge cases in cartesian autorange
0541cd1  rangemode only applies to linear axes
4e71dfa  lint fx/hover
2fde3dc  better ordering of trace hoverlabels for matching positions
68b489d  continue lines off the edge toward invalid log values
be38e93  stacked area charts!