API: make hide_columns and hide_index have a consistent signature and function in Styler #41266

Merged: 33 commits into pandas-dev:master on Jun 16, 2021

attack68 (Contributor) commented May 2, 2021

This closes #41158 (which is an alternative PR for similar functionality)

Currently hide_index() and hide_columns(subset) have similar signatures but different operations. One hides the index whilst showing the data-rows, and the other hides data-columns whilst showing the column-headers row.

This PR

  • adds the subset keyword to give: hide_index(subset=None).
  • sets the default to hide_columns(subset=None).

When subset is None the function now operates to hide the entire index, or entire column headers row respectively.
When subset is not None it operates to selectively hide the given rows or columns respectively, keeping the index or column headers row.

We also add the show keyword to allow the method to operate inversely.

@jreback I think you will prefer this over the hide_values and hide_headers methods suggested in #41158 since this re-uses the existing methods with minimal changes but tries to align their consistency. It is also fully backwards compatible.
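The semantics described above can be illustrated with a small pure-Python toy model. This is only a sketch: `ToyStyler`, its `render` method and the helper `make` are hypothetical names invented for illustration and are not the pandas implementation. It assumes flat (single-level) labels and models a table as a list of rows.

```python
class ToyStyler:
    """Hypothetical stand-in for Styler that models only the hiding rules."""

    def __init__(self, index, columns, data):
        self.index = list(index)
        self.columns = list(columns)
        self.data = [list(row) for row in data]
        self.hide_index_ = False    # True: drop the whole index column
        self.hide_columns_ = False  # True: drop the whole column-headers row
        self.hidden_rows = []       # positions of hidden data rows
        self.hidden_columns = []    # positions of hidden data columns

    def hide_index(self, subset=None):
        if subset is None:
            # No subset: hide the index itself, keep every data row.
            self.hide_index_ = True
        else:
            # Subset given: hide those rows, keep the index visible.
            self.hidden_rows = [
                i for i, label in enumerate(self.index) if label in subset
            ]
        return self

    def hide_columns(self, subset=None):
        if subset is None:
            # No subset: hide the headers row, keep every data column.
            self.hide_columns_ = True
        else:
            # Subset given: hide those columns, keep the headers row.
            self.hidden_columns = [
                j for j, label in enumerate(self.columns) if label in subset
            ]
        return self

    def render(self):
        cols = [j for j in range(len(self.columns)) if j not in self.hidden_columns]
        rows = [i for i in range(len(self.index)) if i not in self.hidden_rows]
        out = []
        if not self.hide_columns_:
            head = [self.columns[j] for j in cols]
            out.append(head if self.hide_index_ else [""] + head)
        for i in rows:
            body = [self.data[i][j] for j in cols]
            out.append(body if self.hide_index_ else [self.index[i]] + body)
        return out


def make():
    return ToyStyler(["r1", "r2"], ["a", "b"], [[1, 2], [3, 4]])


print(make().hide_index().render())         # index hidden, all rows kept
print(make().hide_columns().render())       # headers row hidden, all columns kept
print(make().hide_columns(["a"]).render())  # column "a" hidden, headers row kept
```

The two methods deliberately share one shape: `subset=None` toggles a boolean for the whole header axis, while a given `subset` records positions to skip along that axis.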


if clabels:
column_headers = [
if not self.hidden_colheads:
Contributor Author

all that was changed here was to add if not self.hidden_colheads:, the other additions and removals in comparison window are misleading.

@attack68 changed the title from "ENH: make hide_columns and hide_index have a consistent signature and function in Styler" to "API: make hide_columns and hide_index have a consistent signature and function in Styler" May 3, 2021
@attack68 added the "API - Consistency" (Internal Consistency of API/Behavior) and "Styler" (conditional formatting using DataFrame.style) labels May 3, 2021
jreback (Contributor) commented May 3, 2021

ok i understand what you are doing, but this is still a very confusing api. we want to support

  • hide index values themselves
  • hide column values themselves
  • hide particular columns

can you show me what you would write for these?

attack68 (Contributor Author) commented May 3, 2021

ok i understand what you are doing, but this is still a very confusing api. we want to support

  • hide index values themselves
  • hide column values themselves
  • hide particular columns

can you show me what you would write for these?

sure..

[Screenshot: Screen Shot 2021-05-03 at 18 53 22]

The mechanics seem to be in place to do this quite easily; it just needs steering on the best API implementation.

jreback (Contributor) commented May 3, 2021

ok your example looks fine. I am confused by the show keyword then.

attack68 (Contributor Author) commented May 5, 2021

ok your example looks fine. I am confused by the show keyword then.

I removed the show option. It can always be added back later, or done in a different way.


Parameters
----------
subset : IndexSlice
Contributor

this is pretty confusing, why is this not just a list-like (of row labels)? e.g. similar to the argument for .dropna() for example

Contributor Author

changed. pls review

- if ``subset`` is ``None`` then the entire column headers row will be hidden
whilst the data-values remain visible.
- if a ``subset`` is given then those specific columns, including the
data-values will be hidden, whilst the column headers row remains visible.

Parameters
----------
subset : IndexSlice
Contributor

same comment as above

Contributor

now I wouldn't object to also allowing an IndexSlice here (in addition to a list-like of column labels)

Contributor Author

changed. pls review
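The review above asks for `subset` to accept a list-like of labels, with a slice-style selection as an option. A rough sketch of how such an argument might be normalized into hidden positions follows. `hidden_positions` is a hypothetical helper, not pandas code; it assumes flat labels and uses a plain Python `slice` of labels (inclusive at both ends, as with `.loc`) as a stand-in for `IndexSlice`.

```python
from collections.abc import Iterable


def hidden_positions(labels, subset):
    """Map a label, a list-like of labels, or a label slice to positions."""
    labels = list(labels)
    if isinstance(subset, slice):
        # Label-based slice; both endpoints are included.
        start = 0 if subset.start is None else labels.index(subset.start)
        stop = len(labels) - 1 if subset.stop is None else labels.index(subset.stop)
        return list(range(start, stop + 1))
    if isinstance(subset, (str, bytes)) or not isinstance(subset, Iterable):
        # A single scalar label is treated as a one-element list.
        subset = [subset]
    wanted = set(subset)
    missing = wanted - set(labels)
    if missing:
        raise KeyError(f"labels not found: {missing!r}")
    return [i for i, label in enumerate(labels) if label in wanted]


columns = ["a", "b", "c", "d"]
print(hidden_positions(columns, "b"))              # single label
print(hidden_positions(columns, ["a", "d"]))       # list-like of labels
print(hidden_positions(columns, slice("b", "c")))  # label slice
```

Storing positions rather than labels matches the `Sequence[int]` type used for the hidden-row state in the diff under review.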

--------
Hide column headers and retain the data values:

>>> midx = pd.MultiIndex.from_product([["x", "y"], ["a", "b", "c"]])
Contributor

can you show an example first that has a single-level index?

Contributor Author

added

@@ -98,6 +98,8 @@ def __init__(

# add rendering variables
self.hidden_index: bool = False
self.hidden_rows: Sequence[int] = []
Contributor

can you change the impl to use the same nomenclature, e.g. hidden_index, hidden_columns (or similar)

Contributor Author

Can you give your input to the suggestion below:

self.hide_index_: bool
self.hide_columns_: bool
self.hidden_indexes: Sequence[int]
self.hidden_columns: Sequence[int]

Contributor Author

updated (used rows instead of indexes)
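The naming agreed in this thread can be pictured as a small state container. The sketch below is illustrative only: `HidingState` and its two helper methods are hypothetical, and only the four attribute names (with `rows` in place of `indexes`) come from the discussion. The trailing underscore on the booleans presumably keeps them from colliding with the `hide_index` and `hide_columns` method names.

```python
from dataclasses import dataclass, field
from typing import Sequence


@dataclass
class HidingState:
    """Hypothetical container for the four rendering variables."""

    hide_index_: bool = False    # hide the entire index
    hide_columns_: bool = False  # hide the entire column-headers row
    hidden_rows: Sequence[int] = field(default_factory=list)     # row positions
    hidden_columns: Sequence[int] = field(default_factory=list)  # column positions

    def row_visible(self, position: int) -> bool:
        return position not in self.hidden_rows

    def column_visible(self, position: int) -> bool:
        return position not in self.hidden_columns


state = HidingState(hide_index_=True, hidden_columns=[0])
print(state.row_visible(1), state.column_visible(0))
```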

@jreback added this to the 1.3 milestone Jun 16, 2021
@jreback merged commit 34fb225 into pandas-dev:master Jun 16, 2021
meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Jun 16, 2021
…x` have a consistent signature and function in `Styler`
jreback (Contributor) commented Jun 16, 2021

this is good, thanks @attack68. As a followup, can you make sure that these methods are in the api ref, and that we have sufficient docs in the notebook? The doc-strings are great, obviously, but generally give an update in the notebook (it's of course not necessary to add everything, but major things like this are good).

jreback (Contributor) commented Jun 16, 2021

@meeseeksdev backport 1.3.x

lumberbot-app (bot) commented Jun 16, 2021

Something went wrong ... Please have a look at my logs.

simonjayhawkins pushed a commit that referenced this pull request Jun 16, 2021
…consistent signature and function in `Styler` (#42041)

Co-authored-by: attack68 <24256554+attack68@users.noreply.github.com>
@attack68 deleted the hiding_data_columns_and_index branch June 16, 2021 13:23
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Leonardofreua (Contributor) commented Jul 27, 2021

Hi @attack68. I'm working on issue #42674 and I found some doctests with expected results different from the real results (at least in my tests). See the examples below:

  • hide_columns() (style.py)
>>> midx = pd.MultiIndex.from_product([["x", "y"], ["a", "b", "c"]])
>>> df = pd.DataFrame(np.random.randn(6,6), index=midx, columns=midx)
>>> df.style.format("{:.1f}").hide_columns()
x   d    0.1    0.0    0.4    1.3    0.6   -1.4
    e    0.7    1.0    1.3    1.5   -0.0   -0.2
    f    1.4   -0.8    1.6   -0.2   -0.4   -0.3
y   d    0.4    1.0   -0.2   -0.8   -1.2    1.1
    e   -0.6    1.2    1.8    1.9    0.3    0.3
    f    0.8    0.5   -0.3    1.2    2.2   -0.8

As you can see below, the values d, e, and f don't appear in the result, because they were actually defined as a, b, and c. We can also see that x and y are aligned with b (second and fifth lines) and not with d (first and fourth lines):
[image]

Is there any reason these results are divergent?

Note: There are some other doctests in the same method after the one mentioned above, which present the same problem.
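The label mismatch reported here can be checked without pandas. The sketch below uses `itertools.product` as a stand-in for `pd.MultiIndex.from_product`, on the assumption that both yield the Cartesian product in the same order. `sparsify` is a hypothetical helper that mimics a text listing in which each outer label is printed only on the first row of its group; it does not model the HTML layout shown in the image.

```python
from itertools import product

levels = [["x", "y"], ["a", "b", "c"]]
labels = list(product(*levels))  # [("x", "a"), ("x", "b"), ..., ("y", "c")]

# Inner-level labels that the construction can actually produce.
inner_labels = {inner for _, inner in labels}
documented = {"d", "e", "f"}  # inner labels shown in the doctest output
print(sorted(inner_labels))
print(documented & inner_labels)  # empty: the documented labels cannot occur


def sparsify(pairs):
    """Blank out an outer label when it repeats the previous row's."""
    rows, previous = [], None
    for outer, inner in pairs:
        rows.append(("" if outer == previous else outer, inner))
        previous = outer
    return rows


for row in sparsify(labels):
    print(row)
```

Under these assumptions the outer labels land on the first and fourth rows, and the inner labels are a, b, c, so a doctest built from this construction cannot print d, e, or f.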
