distplot: only compute traces that will be shown #2730
Merged
In this pull request I make a small change to ff.create_distplot, which produces a figure showing the distribution of some data. It can create three traces (histogram, kernel density estimate (KDE) and rug), and it has switches that let users select which of these traces to include in the resulting figure.
Regardless of the user's choice (boolean parameters show_hist, show_curve and show_rug), the function currently computes all three traces, whether or not they end up in the resulting figure. I changed the code so that only the traces that will be included in the resulting figure get computed.
This will make the function more efficient in situations where a user doesn't need all three traces.
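The idea can be sketched without plotly at all. The snippet below is a minimal illustration, not the actual implementation in ff.create_distplot: the helpers make_hist, make_kde, make_rug and build_traces are hypothetical stand-ins, and the counter exists only to show which computations ran.

```python
from collections import Counter

# Records which (hypothetical) trace builders actually ran.
calls = Counter()


def make_hist(data):
    """Stand-in for the histogram computation."""
    calls["hist"] += 1
    return {"type": "histogram", "x": list(data)}


def make_kde(data):
    """Stand-in for the (comparatively expensive) KDE computation."""
    calls["kde"] += 1
    return {"type": "scatter", "mode": "lines", "x": sorted(data)}


def make_rug(data):
    """Stand-in for the rug computation."""
    calls["rug"] += 1
    return {"type": "scatter", "mode": "markers", "x": list(data)}


def build_traces(data, show_hist=True, show_curve=True, show_rug=True):
    """Compute only the traces whose switch is on."""
    traces = []
    if show_hist:
        traces.append(make_hist(data))
    if show_curve:
        traces.append(make_kde(data))
    if show_rug:
        traces.append(make_rug(data))
    return traces


traces = build_traces([1.0, 2.0, 3.0], show_curve=False)
```

With show_curve=False the KDE stand-in is never called, so any cost (or exception) it would incur is avoided.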
This change also fixes a corner case that currently doesn't work (this is my use case): the function cannot be called with a dataset (hist_data parameter) that contains only a single entry, because scipy throws an exception when trying to compute the KDE. However, some use cases don't need a KDE (show_curve=False), and it should be possible to create a distplot for a single-entry dataset without one.
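The single-entry case can be illustrated with a toy model. This is a sketch under stated assumptions: gaussian_kde_curve is a hypothetical stdlib-only KDE using Scott's rule, not scipy's code, and distplot_traces is a hypothetical stand-in for the function. The assumption is that a KDE needs a bandwidth derived from the sample spread, which is undefined for a single observation.

```python
import math
import statistics


def gaussian_kde_curve(data, grid):
    """Toy Gaussian KDE with Scott's rule bandwidth (NOT scipy's implementation)."""
    n = len(data)
    sigma = statistics.stdev(data)  # raises StatisticsError when n < 2
    h = sigma * n ** (-1 / 5)       # Scott's rule for one-dimensional data
    norm = n * h * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in data) / norm
        for g in grid
    ]


def distplot_traces(data, show_curve=True):
    """Hypothetical stand-in: the KDE is computed only when it is requested."""
    traces = [{"type": "histogram", "x": list(data)}]
    if show_curve:
        traces.append({"type": "scatter", "y": gaussian_kde_curve(data, data)})
    return traces


single = [4.2]

# Requesting the curve fails, because no spread can be estimated from one point.
try:
    distplot_traces(single, show_curve=True)
    kde_failed = False
except statistics.StatisticsError:
    kde_failed = True

# Skipping the curve works, since the KDE is never computed.
hist_only = distplot_traces(single, show_curve=False)
```

With the change in this PR, the analogous real call would presumably look something like ff.create_distplot([[4.2]], ["a"], show_curve=False), where the group label "a" is an arbitrary placeholder.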
Code PR
plotly.graph_objects, my modifications concern the codegen files and not generated files.
modified existing tests.
new tutorial notebook (please see the doc checklist as well).