BUG: 1.3.0 styler.render() throws exception when using highlight_max/highlight_min (code worked with pandas 1.2.5) #42466

rendner · 2021-07-09T16:56:27Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample

These are minimal examples which throw an exception. The same problem exists when you use highlight_min instead of highlight_max.

import pandas as pd

# throws a TypeError("'>=' not supported between instances of 'int' and 'str'")
# pd.DataFrame({'col_0': [0, 'a']}).style.highlight_max().render()

# throws a TypeError("'>=' not supported between instances of 'Timestamp' and 'int'")
# pd.DataFrame({'col_0': [pd.to_datetime('1/1/2000'), 0]}).style.highlight_max().render()

# throws a ValueError('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()')
pd.DataFrame({'col_0': [pd.period_range(start='2000-01-01', end='2000-01-02')]}).style.highlight_max().render()

Problem description

All examples worked in pandas 1.2.5 without throwing an exception.

Expected Output

The code doesn't throw an exception.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : f00ed8f
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-77-generic
Version : #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.3.0
numpy : 1.21.0
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.2
setuptools : 57.0.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None


attack68 · 2021-07-09T21:57:36Z

Do you mean this:

# throws a ValueError('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()')
pd.DataFrame({'col_0': [pd.period_range(start='2000-01-01', end='2000-01-02')]}).style.highlight_max().render()

or do you mean this (notice no list):

pd.DataFrame({'col_0': pd.period_range(start='2000-01-01', end='2000-01-02')}).style.highlight_max().render()

which works just fine?

attack68 · 2021-07-09T22:04:37Z

Additionally, it is not Styler throwing the error; it is numpy.nanmax:

import numpy
import pandas as pd

df = pd.DataFrame({'col_0': [0, 'a']})
numpy.nanmax(df["col_0"])
# TypeError: '>=' not supported between instances of 'int' and 'str'

rendner · 2021-07-10T04:20:15Z

> Additionally, it is not Styler throwing the error; it is numpy.nanmax:

That's correct.

> or do you mean this (notice no list):
> pd.DataFrame({'col_0': pd.period_range(start='2000-01-01', end='2000-01-02')}).style.highlight_max().render()
> which works just fine?

I tried to simplify the problem. My original example, which runs into the same error, was:

import pandas as pd

# throws a ValueError('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()')
pd.DataFrame({'col_0': [pd.to_datetime('1/1/2000'), pd.to_timedelta(0, unit='d'), pd.period_range(start='2000-01-01', end='2000-01-02')]}).style.highlight_max().render()

I did some research. Version 1.2.5 also uses numpy.nanmax:
https://github.com/pandas-dev/pandas/blob/v1.2.5/pandas/io/formats/style.py#L1547
But there was an additional step before:
https://github.com/pandas-dev/pandas/blob/v1.2.5/pandas/io/formats/style.py#L1531

In 1.3.0 the maybe_numeric_slice call is no longer present:
https://github.com/pandas-dev/pandas/blob/v1.3.0/pandas/io/formats/style.py#L2277-L2286
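
To illustrate what that removed step appears to have amounted to, here is a rough sketch. It assumes the step simply defaulted the subset to the numeric columns when no subset was given. The helper name `numeric_subset_or` is hypothetical and is not the pandas internal; see the linked 1.2.5 source for the real code.

```python
import numpy as np
import pandas as pd


def numeric_subset_or(df, subset=None):
    """Hypothetical stand-in for the pre-1.3.0 default: numeric columns only."""
    if subset is None:
        # No explicit subset given: restrict to columns with a numeric dtype.
        subset = pd.IndexSlice[:, df.select_dtypes(include=[np.number]).columns]
    return subset


df = pd.DataFrame({"col_0": [0, "a"], "col_1": [1, 2]})

# With subset=None the mixed-type col_0 (object dtype) is dropped from the selection.
rows, cols = numeric_subset_or(df)
print(list(cols))

# An explicit subset is passed through untouched.
print(numeric_subset_or(df, subset=["col_0"]))
```

Under this assumption, a column like col_0 would never have reached numpy.nanmax in 1.2.5 unless it was named explicitly in subset.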

attack68 · 2021-07-10T07:42:22Z

I don't believe this is a bug or your expected output is warranted.

Your original code works in 1.2.5 simply because maybe_numeric_slice detects that the dtype is not in [np.number], so highlight_max is not applied to col_0. That pre-selection was never documented. You cannot highlight anything in col_0 because it makes no sense to compare the different object types.

Your request comes down to either:

pandas handling errors silently when a column with incompatible dtypes is present for a less-than comparison,
or pandas raising this error to highlight that the method fails and why.

I prefer the second for two reasons.

There is a subset argument which allows users to select the columns (or rows) upon which the function should work, reacting to an error message if necessary.
In 1.2.5 the following did nothing:

from pandas import DataFrame

df = DataFrame([['a', 'b'], ['c', 'd']])
df.style.highlight_max().render()  # pre-selects only numeric columns

In 1.3.0 there is now a correct comparison made and the maximum of strings is highlighted.

df.style.highlight_max().render()  # highlights 'c' and 'd'

I think it is more valuable for the method to work where it should, than avoid errors where it can't.

rendner · 2021-07-10T13:25:34Z

Thank you. I understand the reasons and can live with the changed and now more consistent behavior, since maybe_numeric_slice only adjusted the subset if it was None. I was just a bit irritated because I couldn't find any hint about the changed behavior in the release notes.

attack68 · 2021-07-10T17:10:09Z

> I was just a bit irritated because I couldn't find any hint about the changed behavior in the release notes.

Fair point, my bad. I must have missed it when I made the changes.

rendner added the Bug and Needs Triage labels Jul 9, 2021

rendner changed the title ~~BUG: 1.3.0 styler.highlight_max/styler.highlight_min throw exceptions for code which worked with pandas 1.2.5~~ BUG: 1.3.0 styler.render() throws exception when using highlight_max/styler.highlight_min (code worked with pandas 1.2.5) Jul 9, 2021

rendner changed the title ~~BUG: 1.3.0 styler.render() throws exception when using highlight_max/styler.highlight_min (code worked with pandas 1.2.5)~~ BUG: 1.3.0 styler.render() throws exception when using highlight_max/highlight_min (code worked with pandas 1.2.5) Jul 9, 2021

rendner closed this as completed Jul 10, 2021
