Violin plots? #5676
Comments
Statsmodels has them. I think the idea is to keep the more statistical things there/maybe move some pandas stats functionality over. |
These look fun! I think that dep is fine (similar to elsewhere in codebase: https://github.com/pydata/pandas/blob/0327addef295fef292f2fdf8c95546c4cc039abb/pandas/tools/rplot.py#L528) Edit: I posted this before seeing @TomAugspurger's comment... |
except that the rplot module isn't very well maintained and is quite slow. |
Ah yes, statsmodels. I looked at theirs but didn't end up using it because you couldn't specify the kernel bandwidth. Theirs is quite nice because you can do the left/right distributions for direct comparisons. Maybe I'll just add a PR there that adds a bandwidth specification to the args. |
If they already have some of this, probably better to move it to there |
Note that rplot, in retrospect, was a controversial merge precisely for the reason @TomAugspurger mentioned. @jtratner, yep, it was in PR limbo for the longest while and I wasn't aware of the statsmodels issue. |
Okay great! I'll stick to statsmodels for the violinplots. Should other stuff like the clustergrams/clustered heatmaps also be there? |
It's a fuzzy boundary. The plots are beautiful but are far more bespoke than pandas' current meat. |
We might call them "2nd order plots" maybe? :) |
maybe it's worthwhile to refactor out these secondary plots from both pandas and statsmodels and make a common import? (so in effect available in both, but no code dupe) |
Where's the all-singing, all-dancing pydata exploratory data viz library?? |
I'm hoping https://github.com/vispy/vispy will be the beginning of such a plotting library. Plotting 100M points a second with a GPU beats the 25 seconds plus it takes for matplotlib right now. 😄 |
@olgabot, I see ggplot2 supports these plots as a distinct geom, would be nice to have these cc @JanSchulz |
#783 still needs a champion, that plot should fit right in to pandas. |
Violin plots I'm working on in statsmodels so I'll close this one. |
@jtratner, no decrees here - just opinion. I think #783 is a reasonable addition to include in pandas. Please feel free to disagree loudly. |
Noting that these seem to have landed in seaborn, and are part of ggplot and so Relieved I'm not just shooting down functionality. These actually find their way to |
Besides boxplots, another way to look at the distribution of data is by violin plots:
http://nbviewer.ipython.org/gist/olgabot/7902901
I'd like to add this to pandas. Is there interest in these plots? It currently depends on my prettyplotlib, but that can be removed. My concern is that it depends on scipy.stats.gaussian_kde - is that fine?
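For anyone wondering what the KDE dependency buys here: a violin is roughly a kernel density estimate of each group, mirrored around a center line. Below is a minimal sketch of that idea in plain Python, with an explicit bandwidth argument since that came up in this thread. It does not call scipy.stats.gaussian_kde and is not the notebook's code; the function names `gaussian_kde_1d` and `violin_outline`, the `pad` parameter, and the sample values are all made up for illustration.

```python
import math


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each point in grid.

    Hypothetical helper: a hand-rolled stand-in for a KDE routine, with a
    fixed, user-chosen bandwidth (the kernel's standard deviation).
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def violin_outline(samples, bandwidth, num_points=50, pad=3.0):
    """Return (grid, left, right) edges of one violin.

    The grid extends `pad` bandwidths past the data so the tails taper off.
    `right` is the density itself and `left` is its mirror image.
    """
    lo = min(samples) - pad * bandwidth
    hi = max(samples) + pad * bandwidth
    step = (hi - lo) / (num_points - 1)
    grid = [lo + i * step for i in range(num_points)]
    density = gaussian_kde_1d(samples, grid, bandwidth)
    left = [-d for d in density]
    return grid, left, density


# Made-up data, purely for illustration.
samples = [1.0, 1.2, 1.9, 2.1, 2.4, 3.0, 3.3, 4.8]
grid, left, right = violin_outline(samples, bandwidth=0.5)
```

To draw it, you would fill between `left` and `right` along `grid` (for example with a horizontal fill in matplotlib). A smaller bandwidth gives a bumpier violin, a larger one a smoother violin, which is why being able to set it matters.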