-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Color text based on background color when using _background_gradient()
#21263
Color text based on background color when using _background_gradient()
#21263
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Here are some initial comments
pandas/io/formats/style.py
Outdated
text_color: float or int | ||
luminance threshold for determining text color. Facilitates text | ||
visibility across varying background colors. From 0 to 1. | ||
0 = all text is dark colored, 1 = all text is light colored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can add a version added for 0.24.0 for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the common name for this field in mpl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO text_color=0.2
looks very counterintuitive to me. It almost looks like the exact opposite of what this feature is about (not having a constant text color).
Shouldn't the name at least contain "threshold", e.g. text_color_threshold
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that text_color_threshold
is suitable, is it ok with multiple underscores in the parameter name? I don't think there is a common name for this in mpl (for coloring text in general, they tend to use just color
, but I believe it would be easy to mistake that for the backgruond color here due to the method's name)
pandas/io/formats/style.py
Outdated
@@ -863,7 +863,7 @@ def highlight_null(self, null_color='red'): | |||
return self | |||
|
|||
def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0, | |||
subset=None): | |||
subset=None, text_color=0.2): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To your question of what value to use as a default I don't have a preference visually, but if Seaborn is using 0.4 I'd rather just fall inline with that. Would certainly make the look and feel more consistent for users using both in say a Jupyter notebook
pandas/io/formats/style.py
Outdated
if (not isinstance(text_color, (float, int)) or | ||
not 0 <= text_color <= 1): | ||
msg = "`text_color` must be a value from 0 to 1." | ||
raise ValueError(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test to ensure this raises?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am having troubles getting the test correct for this. When I try the function manually, it raises ValueError
when called with any one of the parameters in the test, but pytest keeps failing saying that it doesn't raise a ValueError
. Would you have time to check my latest commits and advise?
pandas/io/formats/style.py
Outdated
raise ValueError(msg) | ||
|
||
def relative_luminance(color): | ||
"""Calculate the relative luminance of a color according to W3C |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a standard for pandas docstrings you'll want to follow:
https://python-sprints.github.io/pandas/guide/pandas_docstring.html
Off the top of my head:
- The first row should be only one line
- The type and description of parameter(s) should be on separate lines
- You'll want a space before the Returns section
- Could add a Raises section for the bad luminance
@@ -1031,7 +1031,9 @@ def test_background_gradient(self): | |||
|
|||
result = df.style.background_gradient( | |||
subset=pd.IndexSlice[1, 'A'])._compute().ctx | |||
assert result[(1, 0)] == ['background-color: #fff7fb'] | |||
|
|||
assert result[(1, 0)] == ['background-color: #fff7fb', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a more comprehensive / dedicated test for this? Something that encompasses the full range of expected values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a suggestion
Codecov Report
@@ Coverage Diff @@
## master #21263 +/- ##
==========================================
+ Coverage 91.84% 91.85% +<.01%
==========================================
Files 153 153
Lines 49538 49555 +17
==========================================
+ Hits 45499 45518 +19
+ Misses 4039 4037 -2
Continue to review full report at Codecov.
|
@@ -1028,10 +1028,25 @@ def test_background_gradient(self): | |||
assert all("#" in x[0] for x in result.values()) | |||
assert result[(0, 0)] == result[(0, 1)] | |||
assert result[(1, 0)] == result[(1, 1)] | |||
for res in result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this a separate test called test_text_color_threshold
to distinguish it from the gradient testing.
Also I didn't really understand the point of the conditional - can we not be more assertive about the exact values we'd expect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking is that we want to test if the text color is light or dark conditional on the background color. If for some unexpected reason the background color changes, this test should not fail.
I added a section to assert if the background color takes on one of its 4 expected values to prevent that test_text_color_threshold
doesn't test anything (I could move this section to test_background_gradient
if you think that is more suitable).
'color: #000000'] | ||
|
||
@td.skip_if_no_mpl | ||
def test_text_color_threshold(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename this to test_text_color_threshold_raises
@td.skip_if_no_mpl | ||
def test_text_color_threshold(self): | ||
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B']) | ||
for text_color_threshold in [1.1, '1', -1, [2, 2]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can parametrize these (check test_to_html.py
for examples if you don't know what I mean)
def test_text_color_threshold(self): | ||
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B']) | ||
for text_color_threshold in [1.1, '1', -1, [2, 2]]: | ||
with pytest.raises(ValueError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tm.assert_raises_regex
would be preferable as you can use it to assert on the message as well. Can still be used as a context manager (can also see usage in test_to_html.py
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pointers, I updated accordingly.
This test still complains that a ValueError
is not raised for any of the parameters. When I call the the function outside the test, it does raise ValueError: ('`text_color_threshold` must be a value from 0 to 1.', 'occurred at index A')
. The test works fine if I raise ValueError(msg)
manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This didn't raise for me locally, though it did when running the _compute
method
In [12]: df.style.background_gradient(text_color_threshold='1')
Out[12]: <pandas.io.formats.style.Styler at 0x110163630>
In [13]: df.style.background_gradient(text_color_threshold='1')._compute()
ValueError: ('`text_color_threshold` must be a value from 0 to 1.', 'occurred at index A')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I missed that _compute()
is called automatically in Notebooks via _repr_html_
. I believe all comments are addressed now.
@@ -863,7 +863,7 @@ def highlight_null(self, null_color='red'): | |||
return self | |||
|
|||
def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0, | |||
subset=None): | |||
subset=None, text_color_threshold=0.408): | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a Raises section for the docstring?
@td.skip_if_no_mpl | ||
def test_text_color_threshold(self): | ||
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B']) | ||
for c_map in [None, 'YlOrRd']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parametrize the cmap
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B']) | ||
for c_map in [None, 'YlOrRd']: | ||
result = df.style.background_gradient(cmap=c_map)._compute().ctx | ||
for res in result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of the loop just be explicit about the dict that you expect and compare it to the result
elif result[res][0].split(' ')[1] in ['#800026', '#440154']: | ||
assert result[(res)][1].split(' ')[1] == '#f1f1f1' | ||
result = df.style.background_gradient(cmap=c_map)._compute().ctx | ||
test_colors = {None: {(0, 0): ('#440154', '#f1f1f1'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the variable here to expected
(standard in pandas tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using a nested dict you should also send this in via parametrization, so you would have "c_map,expected"
and then send in a tuple of the c_map
and it's expected result
'YlOrRd': {(0, 0): ('#ffffcc', '#000000'), | ||
(1, 0): ('#800026', '#f1f1f1')}} | ||
# Light text on dark background | ||
assert result[0, 0][0].split(' ')[1] == test_colors[c_map][0, 0][0], ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you splitting this? Just make your expected
variable account for the exact string required. Unless I'm missing something you should just have one assertion here for result == expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was doing this to be able to raise different assertion messages for the background and foreground color. But since pytest shows the expected and current value anyways, this should be easy to trace done without separate assertion messages.
def test_text_color_threshold(self, c_map, expected): | ||
df = pd.DataFrame([[1, 2], [2, 4]], columns=['A', 'B']) | ||
result = df.style.background_gradient(cmap=c_map)._compute().ctx | ||
assert result[0, 0] == expected[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much closer but why not set up expected so it looks something like:
{(0, 0): [...],
(0, 1): [...]}
And simplify at the end to result == expected
?
assert result[(1, 0)] == ['background-color: #fff7fb', | ||
'color: #000000'] | ||
|
||
@td.skip_if_no_mpl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these tests need to be in the MatplotlibDep test class? I didn't think this required mpl but could be wrong.
If that is the case then move the @td.skip_if_no_mpl
decorator to the class instead of on each function individually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they both call background_gradient()
which depends on mpl.
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -181,7 +181,7 @@ Reshaping | |||
Other | |||
^^^^^ | |||
|
|||
- | |||
- :meth: `~pandas.io.formats.style.Styler.background_gradient` now takes a ``text_color_threshold`` parameter to automatically lighten the text color based on the luminance of the background color. This improves readability with dark background colors without the need to limit the background colormap range. (:issue:`21258`, :issue:`21269`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can just reference the first issue here, since that is what you are closing anyway (second was a duplicate)
Thanks for the changes. Assuming tests pass I'll approve on my end |
@WillAyd Thank you for all the help, much appreciated! |
@TomAugspurger : Could you take a look? |
@@ -879,26 +879,39 @@ def background_gradient(self, cmap='PuBu', low=0, high=0, axis=0, | |||
1 or 'columns' for columnwise, 0 or 'index' for rowwise | |||
subset: IndexSlice | |||
a valid slice for ``data`` to limit the style application to | |||
text_color_threshold: float or int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add , default 0.408
.
And may a note as to why that's the default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We chose 0.408
to stay consistent with the Seaborn implementation. Should I ask the Seaborn author for the underlying reason, just reference seaborn, or leave the value without explanation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, was just curious.
Raises | ||
------ | ||
ValueError | ||
If ``text_color_threshold`` is not a value from 0 to 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single backtick for parameter names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I did double tick because I saw it in other places, for example for high
and low
in the notes section. Do you want me to update them to single tick as well? Does the same go for data
in the subset section and the expressions in the note section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I noticed that after posting. Probably fine to be consistent w/ the rest of the docstring here.
Thanks @joelostblom! |
thanks for this improvement. FYI: under PR #21259 I add support for |
* Color text based on background gradient Closes pandas-dev#21258
The purpose of this PR is to automatically color text dark or light based on the background color of the HTML table:
================
old behavior
===========================new behavior
========As described in #21258, I use the luminance-based approach from
seaborn
's annotated heatmaps. Tagging @WillAyd who commented on that issue. A few comments on this PRInitially, I was not sure if defining the
relative_luminance()
method within_background_gradient()
was the right way to go, but I opted for this since I saw the same approach elsewhere in the file. Let me know if you prefer a different approach.I am not sure how intuitive it is that a parameter named
text_color
takes a numeric argument and not a color, but I think it is a good name for discoverability. Naming itluminance_threshold
or similar might be confusing for users looking for a way to change the text color.I opted to make the light text not completely white. The focus should be the background color so the text should not pop out too much. Thoughts?
================
#ffffff
============================#f1f1f1
(current choice)===text_color
based on my own qualitative assessment, feel free to disagree. The seaborn default of 0.4 makes too much text white in my opinion. Since colors and contrast can be quite subjective, I thought I would include some comparisons.============
text_color=0.4
======================text_color=0.2
(current choice)===This is my first PR, apologies if I have misunderstood something.
style.background_gradient()
#21258git diff upstream/master -u -- "*.py" | flake8 --diff