KeyError when running the create_dendrogram example #2627
So with scipy 1.4.1, if I made a plotly dendrogram with 10 colors, which colours would they be?
(small request: could you please name your branches and your PRs a little more descriptively? a good branch name here would be
This same code fails on this branch with
Final note: let's slightly improve the docstring and doc page for this thing so that it's clearer that this is a thin thin wrapper around https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.dendrogram.html :) I might actually say this just visualizes the output of:

    from scipy.cluster.hierarchy import dendrogram
    dendrogram(
        Z=linkagefun(distfun(X)),
        orientation=orientation,
        labels=labels,
        color_threshold=color_threshold,
    )
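To make the "thin wrapper" point concrete, here is a minimal sketch of inspecting what scipy hands back for the call above. The random data, the choice of `pdist` and complete linkage as stand-ins for `distfun` and `linkagefun`, and the `no_plot=True` flag (used so matplotlib is not needed) are assumptions for illustration, not necessarily what `create_dendrogram` does internally.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist

# Illustrative data: 10 observations with 4 features each (an assumption).
rng = np.random.default_rng(0)
X = rng.random((10, 4))

# Stand-ins for distfun / linkagefun from the comment above.
Z = linkage(pdist(X), method="complete")

result = dendrogram(
    Z=Z,
    orientation="top",
    labels=[str(i) for i in range(10)],
    color_threshold=None,
    no_plot=True,  # return the layout data only, draw nothing
)

# One color key per link. Older scipy returns single-letter keys such as
# 'g' or 'r'; scipy >= 1.5.0 returns cycle names such as 'C1' or 'C2'.
print(sorted(set(result["color_list"])))
```

Running this under different scipy versions is a quick way to see which color keys a wrapper has to translate.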
That is indeed pretty close! I'm surprised that it's that close without being identical :)
Oh, I see, you're repeating "red, green, cyan" at the end of the cycle?
I don't think so, I just tried to find a mapping that made the change subtle. It was really hard to figure out the exact relationship between the 1.4.1 colors and the 1.5.1 colors, but if you compare the sequences for a large dendrogram (a list of 'r', 'g', etc. in the old version, a list of 'C1', 'C2', etc. in the new version), some of the Cn names map to multiple old colors. But I think the mapping I chose keeps the color change between versions to a minimum. I'll submit a PR in a few minutes and you can see exactly what I did.
I also improved the documentation for the colorscale argument. You might think the documentation is weird, but this is literally what the argument does :O !
@@ -32,7 +32,29 @@ def create_dendrogram(
     :param (ndarray) X: Matrix of observations as array of arrays
     :param (str) orientation: 'top', 'right', 'bottom', or 'left'
     :param (list) labels: List of axis category labels(observation labels)
-    :param (list) colorscale: Optional colorscale for dendrogram tree
+    :param (list) colorscale: Optional colorscale for dendrogram tree. To
this is probably too detailed an explanation... most people will likely not want to follow the indirection and go read up on what scipy does here. I think just saying "should be 8 colors" is sufficient :)
And we should mention that no matter the version of scipy, the 7th color in the scale is ignored (note: this is because the 'w' key isn't output by scipy), and in scipy 1.5+, colors 2, 3 and 6 are used twice as often as the others. 🤦
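The "used twice as often" remark can be checked with a little counting. The sketch below assumes the eight colorscale slots correspond to the old scipy keys in alphabetical order, and uses one 'Cn'-to-old-key mapping that is consistent with this thread (red, green and cyan repeated at the end of the cycle). Both `OLD_KEYS` and `NEW_TO_OLD` are illustrative assumptions, not a quote of the merged code.

```python
from collections import Counter

# Assumed slot order: the old single-letter scipy keys, sorted alphabetically.
OLD_KEYS = sorted("bcgkmrwy")  # ['b', 'c', 'g', 'k', 'm', 'r', 'w', 'y']

# Hypothetical mapping from the scipy >= 1.5.0 cycle names to the old keys.
NEW_TO_OLD = {
    "C0": "b", "C1": "g", "C2": "r", "C3": "c", "C4": "m",
    "C5": "y", "C6": "k", "C7": "g", "C8": "r", "C9": "c",
}


def slot_usage():
    """Count how many 'Cn' names land on each of the eight colorscale slots."""
    counts = Counter(NEW_TO_OLD.values())
    return [counts.get(key, 0) for key in OLD_KEYS]


print(slot_usage())  # [1, 2, 2, 1, 1, 2, 0, 1]
```

Under these assumptions, slots 2, 3 and 6 (cyan, green, red) are each hit twice, and slot 7 (white) is never hit, which matches the comment above.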
OK, let's just update the docstring with my terser version and merge this.
The default colorscale for _dendrogram contains color names compatible with the default colors given by scipy 1.5.0. It is still backwards compatible with older scipy versions.
This is done for ff.create_dendrogram. The old color sequence is preserved as far as possible, although an exact match was not achievable. Also improves the documentation of the colorscale argument.
62405de to 90e417d
:param (list) colorscale: Optional colorscale for the dendrogram tree. With
    scipy<=1.4.1 requires 8 colors to be specified,
    the 7th of which is ignored. With scipy>=1.5.0,
    requires 10 colors. In this case the 8th color is
so actually we only ever want 8 here, and the 7th is always ignored.
right?
Right, what I wrote this time is actually wrong. Yes, we always want 8, and the 7th is ignored. How about this:
:param (list) colorscale: Optional colorscale for the dendrogram tree.
    Requires 8 colors to be specified, the 7th of which is ignored.
    With scipy>=1.5.0, the 2nd, 3rd and 6th are used twice as often
    as the others. Given a shorter list, the missing values are
    replaced with defaults, and with a longer list the extra values
    are ignored.
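The last sentence of that docstring describes a pad-or-truncate rule, which might look like the sketch below. The helper name `normalize_colorscale` and the placeholder default names are hypothetical and only illustrate the described behaviour; they are not plotly's actual defaults or implementation.

```python
# Placeholder defaults for the eight slots (hypothetical values).
DEFAULT_COLORS = [
    "blue", "cyan", "green", "black", "magenta", "red", "white", "yellow",
]


def normalize_colorscale(colorscale=None, defaults=DEFAULT_COLORS):
    """Return exactly len(defaults) colors.

    A shorter list is padded with the defaults for the missing slots;
    a longer list is truncated and the extra values are dropped.
    """
    given = list(colorscale or [])
    return given[: len(defaults)] + defaults[len(given):]


print(normalize_colorscale(["#111", "#222"]))
# ['#111', '#222', 'green', 'black', 'magenta', 'red', 'white', 'yellow']
```

Padding by position rather than by name keeps the rule simple: the n-th user color always replaces the n-th default.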
Looks great! 💃 once done
closes #2618
With the new color names ('C0', 'C1', 'C2', ...) I only added 5 colours, whereas with the old names 8 colours were added. Looking at matplotlib.rcParams['axes.prop_cycle'] it seems up to 'C9' is possible, so do we add mappings for colours 'C5' to 'C9'?