ENH: Conditional HTML Formatting #10250

Merged
merged 1 commit into from
Nov 16, 2015


TomAugspurger
Contributor

closes #3190
closes #4315

Not close to being done, but I wanted to put this here before the meeting. Maybe someone will have a chance to check it out.

http://nbviewer.ipython.org/github/TomAugspurger/pandas/blob/638bd3e361633a4c446ee02534e07b8a9332258a/style.ipynb

https://github.com/TomAugspurger/pandas/blob/stylely/style.ipynb

latest example notebook

@jorisvandenbossche jorisvandenbossche added the Output-Formatting __repr__ of pandas objects, to_string label Jun 2, 2015
@jreback
Contributor

jreback commented Oct 11, 2015

@TomAugspurger what's the status on this (looks interesting). do you have some issue references?

how generic is this

@jreback jreback added the IO HTML read_html, to_html, Styler.apply, Styler.applymap label Oct 11, 2015
@TomAugspurger
Contributor Author

Haven't put any time into it.

This is something that I could start in another library and have pulled into pandas if people think it's useful.


@@ -1228,6 +1229,218 @@ def _write_hierarchical_rows(self, fmt_values, indent):
nindex_levels=frame.index.nlevels)



class StyleFormatter(object):
Contributor

let's make a separate module (.style)

@TomAugspurger
Contributor Author

I've updated the notebook here: http://nbviewer.ipython.org/github/pydata/pandas/blob/0f0ba429180c27b4ae3cb21f6b343db46af69397/style.ipynb

Would be curious to hear people's thoughts before I put too much more time into it. Specifically on

  1. The API: I'm using method chaining, as suggested by Jeff. That helps with the keyword args.
  2. The scope: Does it belong in pandas, or as a separate repo?

No need to read into the messy implementation yet.

@jreback
Contributor

jreback commented Oct 24, 2015

@TomAugspurger I think also adding some text formatting things would be nice (e.g. float formatting) in the example

@TomAugspurger
Contributor Author

I think also adding some text formatting things would be nice

Sure. That could be a keyword argument to the Styler constructor. Something like

s = Styler(df, formatter=lambda x: "{:>.2f}".format(x)).color_bg_range()

gives

[screenshot, 2015-10-24 11:33:56 AM]

Doing that at the constructor level separates the two types of styling being done: Python's string formatting vs. the CSS property: value pairs.

@jreback
Contributor

jreback commented Oct 24, 2015

actually I realize that should just use .round(2).style

but was thinking that an alignment filter might be nice in that case

eg

.round(2).style.align_right()

@TomAugspurger
Contributor Author

We can actually support non-conditional formattings (like align in this case) a bit better than having a method per CSS property. Instead we'll just have a .set_properties method so you can do

(df.round(2)
    .style
    .color_bg_range(cmap='viridis')
    .highlight_null(null_color='red')
    .set_properties(**{"color": "#f8f8fa", "text-align": 'center', "width": '10em'})
    .render())

[screenshot, 2015-10-25 6:33:59 PM]

We might want to support a selection argument to that (which would be an IndexSlice), in case you want to style a subset.

@jreback
Contributor

jreback commented Oct 25, 2015

I don't think u need the .render() any longer
as the _repr_html_ does this

@jreback
Contributor

jreback commented Oct 25, 2015

course this would be straightforward to hook up to some ipython widgets .....

eg maybe a row and column slider for colors
a couple of radio buttons ....

@TomAugspurger
Contributor Author

Just pushed a quick update. Notebook: http://nbviewer.ipython.org/github/pydata/pandas/blob/a7d730f30d87027464cc1747b38a844ba8d09fe3/style.ipynb
Newest additions are .set_properties for setting static (non data-dependent) properties and hooking into our pd.options.display.precision for controlling rounding. Both of those are pretty nice to use.

Currently working on

  1. supporting slices somehow / applying certain styles to certain columns / rows
  2. Table styling.

Supporting slices is tricky because our state is a dict mapping (row_position, col_position) -> [styles]. Whenever you do a selection you lose that position information. We can go from label to position with something like get_indexer, but that will only work reliably for Indexes with no duplicates (I think). I'll keep thinking on this.
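The label-to-position problem can be sketched as follows. This is not the PR's implementation: `labels_to_positions` and the `ctx` dict are hypothetical names, and the explicit uniqueness check stands in for whatever the real code does with duplicates.

```python
import pandas as pd


def labels_to_positions(index, labels):
    """Translate labels to integer positions; refuse duplicate labels."""
    if not index.is_unique:
        raise ValueError("label -> position lookup needs a unique Index")
    positions = index.get_indexer(labels)
    if (positions == -1).any():
        raise KeyError("some labels are not in the Index")
    return positions.tolist()


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])

# Hypothetical state: (row_position, col_position) -> list of CSS declarations.
ctx = {}
rows = labels_to_positions(df.index, ["y", "z"])
cols = labels_to_positions(df.columns, ["b"])
for r in rows:
    for c in cols:
        ctx.setdefault((r, c), []).append("color: red")
```

With duplicate labels a single label maps to several positions, which is why the sketch raises instead of guessing.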

Table styling is just additional styling applied to the table as a whole, instead of individual cells. Pretty easy to do, it'll probably just be a dictionary you pass in when making the Style object, or when you .render.

Should have some time on Saturday.

@max-sixty
Contributor

This is awesome. It'll be a step change for us in using IPython Notebooks to show data, rather than writing out to Excel and doing formatting there.

To put some weight behind the slices - there are a couple of concepts there:

  • Which set of data to use as comparisons for a certain value. For example, in the image above, should the '1.0' value in the top left be at the bottom end of the range or the top, depending on whether it's compared to its table or its column?
  • Having different styles for different slices. I think this would be really useful. Re the label -> position issue, just my $0.02, but this working only on unique indexes may be fine - generally if you're looking at output you want unique labels anyway.

@TomAugspurger
Contributor Author

@MaximilianR thanks for the comments.

Re the first one, that coloring is done relative to the column. That's how color_bg_range (which will be renamed) is written. But we or the user could pretty easily write their own function

def color_table(df):
    # normalized df
    ...
    # fill colors
    colors = [ ]
    return DataFrame(colors, index=df.index, columns=df.columns)

and use that with df.style.applymap(color_table). Or we could include that, maybe as an option on color_bg_range, maybe as a separate method.

Having different styles for different slices. I think this would be really useful.

Agreed, I'd say it's necessary. At the very least, I have to implement it so that errors aren't thrown if you have a DataFrame with a non-numeric column :) I'm also leaning towards just punting on non-unique indexes for now.

I'll try to get some documentation relatively soon. I think things are to the point where the overall interaction is stable.

@mrocklin
Contributor

This is really cool. My first impression is that the colors play too prominent a role in the styles that I see here. They tend to dominate over the numerical information. I wonder if more subtle shading / using less of the dynamic range would provide the same visual hints while not dominating the overall effect.

However, as a disclaimer, IANAD (I am not a designer)

@TomAugspurger
Contributor Author

Thanks. Agreed entirely about the colors. I'm 99% sure that there's a parameter you can pass to the matplotlib color converter or color map to control what percent of the range it uses (vmin and vmax). My first guess didn't work, and I just haven't researched further yet.
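One way to use less of the dynamic range, sketched in plain Python: pad the normalization range so the data never reaches the extreme ends of the colormap. The function name and the `low`/`high` parameters are hypothetical; in matplotlib terms, `lo` and `hi` below would be the `vmin` and `vmax` given to `matplotlib.colors.Normalize`.

```python
def compress_range(values, low=0.0, high=0.0):
    """Map values to colormap positions in [0, 1], padding the range.

    Padding by `low` and `high` (fractions of the data range) keeps the
    data away from the darkest and lightest ends of the colormap.
    """
    vmin, vmax = min(values), max(values)
    rng = (vmax - vmin) or 1.0  # guard against a constant column
    lo = vmin - rng * low
    hi = vmax + rng * high
    return [(v - lo) / (hi - lo) for v in values]


positions = compress_range([0.0, 5.0, 10.0], low=0.5, high=0.5)
```

With `low=high=0.5` the data occupies only the middle half of the colormap, which gives the more subtle shading suggested above.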

@mrocklin
Contributor

It was a minor nit relative to the technical merits here. Please do ignore for now.

@TomAugspurger
Contributor Author

I think you bring up a good point though. The One Thing I want to get right here is providing a flexible framework for people to build on. I'm hopeful that another library will be built that does more / better actual styling. Our defaults need to just be not awful :)

@tonyfast

This is really great to see. I have been doing a lot of this by hand.

Some thoughts:

  • It would be good to color the headings so that categorical information can be identified.

  • The Styler and DataFrame can export to a dataframe that can be used for data visualization. The visual attributes of the table and visualization can align with less code.

  • Create the css from python objects where the key is a css property.

  • Use data attributes instead of classes in the table. This will make the table easier to reuse with client side tools.

     <td id="id" class="data row6 col1">0.121668</td>
     <td id="id" class="data" data-row="6" data-col="1">0.121668</td>
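To make the data-attribute suggestion concrete, a tiny pure-Python sketch of rendering one cell that way. `render_cell` is a hypothetical helper, not part of pandas or of this PR.

```python
from html import escape


def render_cell(value, row, col):
    """Render one <td>, exposing its position as data attributes."""
    return '<td class="data" data-row="{}" data-col="{}">{}</td>'.format(
        row, col, escape(str(value))
    )


cell = render_cell(0.121668, 6, 1)
```

Client-side code can then select cells with attribute selectors such as `td[data-col="1"]` instead of parsing class names.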
    

@blink1073

The DataFrame editor in Spyder IDE may be of interest as far as styling: https://github.com/spyder-ide/spyder/blob/master/spyderlib/widgets/variableexplorer/dataframeeditor.py

[image: arrayeditor]

@TomAugspurger
Contributor Author

Ok, now I'm really going to get documentation written soon, now that there's broader interest.

It would be good to color the headings

I want this too. From a technical standpoint it will be easy to add. I haven't found a nice way to do this from a user's point of view yet. It's on my TODO list, but will be a little later.

The Styler and DataFrame can export to a dataframe that can be used for data visualization. The visual attributes of the table and visualization can align with less code

Can you expand on this? I'm not sure what you mean.

Use data attributes instead of classes in the table

Good idea, I'll check out how that works.

@TomAugspurger
Contributor Author

Got slicing to work, but I've had to give up support for duplicates in the Index. I think that's a fair trade for now. ¯_(ツ)_/¯

notebook

And hooking up to widgets is easy and fun!

[image: out]

@tonyfast

Nice one! It's cool to think about this eventually playing nicely with pivottable (https://github.com/nicolaskruchten/pivottable).

@TomAugspurger
Contributor Author

Potentially, I’m not familiar with that. This will at least give you classes like the row or column number that you can select and style.

@TomAugspurger
Contributor Author

Hi everyone, I've uploaded a first pass of the docs as a notebook here: http://nbviewer.ipython.org/gist/TomAugspurger/841a9dc2816853824ce1

I'll be iterating on that quite a bit, but it gives a good overview of what's possible. You should at least skim to the end :)


Speaking of docs, I think we'll need to do some tinkering with sphinx to get this working.

And my next biggish item to fix is styling of Indexes / column headers. My current best idea is dedicated methods .apply_columns and .apply_index to indicate that the result should be directed at the index / column. I'm very open to alternative suggestions though.

@jreback
Contributor

jreback commented Oct 31, 2015

@TomAugspurger I think it would be intuitive to simply do

df.style.index.apply(...) or whatever (might need a slight modification to make this work, maybe a subclass)

"""
return self._todo

def set(self, styles, table_styles=None):
Contributor

I don't think you should accept any other args, except styles here. makes it confusing otherwise.

Contributor

maybe call this .import?

Contributor Author

keyword :) I used .set, I think because that's what seaborn uses. There's also matplotlib's style.use. I might like that better.

Contributor

ahh, I like .use better. Actually these should be named I think, e.g. what I really want to do is:

pd.options.display.style = 'cool'

(and as a context manager, etc).

Contributor

or maybe export as a tuple/dict? e.g. Styler.export('cool') -> ('cool',.........)

Contributor Author

And the desire to support something like pd.options.display.style is why I accept the table_styles kwarg in .set / .use. That way you can bundle things like :hover etc. Otherwise you have to .use('cool').set_table_styles([hover, ...]). We could package the applied styles + table_styles into some data structure with a name and stuff; just kept it simple for now..

Contributor Author

Shoot, I’ve built up an asymmetry between .export and .use with this. Going to push on this issue for now and just accept styles in .use like you suggested I think. Then we’ll figure out the best API when we hook things up to pd.options (0.18 I think).

On Nov 15, 2015, at 10:58 AM, Jeff Reback notifications@github.com wrote:

In pandas/core/style.py #10250 (comment):

    +        return self
    +
    +    def export(self):
    +        """
    +        Export the styles to applied to the current Styler.
    +        Can be applied to a second style with `.set`.
    +
    +        .. versionadded:: 0.17.1
    +
    +        Returns
    +        -------
    +        styles: list
    +        """
    +        return self._todo
    +
    +    def set(self, styles, table_styles=None):

or maybe export as a tuple? e.g. Styler.export('cool') -> ('cool',.........)

@jreback
Contributor

jreback commented Nov 15, 2015

I think we simply include the rendered html (from nbconvert) in the docs themselves

eg u render at doc build time in make.py then include as a inline ref

u can even split the notebook examples up a bit

then style.rst can simply show these rendered examples

depending on the data within, by using the ``.style`` property.
This is a property on every object that returns a ``Styler`` object, which has
useful methods for formatting and displaying DataFrames.

Contributor

I realized that if you are going to render the notebook, then putting it in doc/source where you had it is prob correct. sorry. :)

Contributor Author

Was just looking into how git handled symlinks :) Probably best to just keep it here, perhaps with a link to download it. It actually integrates well with sphinx.

[screenshot, 2015-11-15 12:12:17 PM]

I'll tweak a few things in our website CSS as a followup PR.

Contributor

ok sure

@TomAugspurger TomAugspurger changed the title WIP: Conditional HTML Formatting ENH: Conditional HTML Formatting Nov 15, 2015
@TomAugspurger
Contributor Author

@jreback added a small entry to whatsnew. Going to go through your comments again (thanks for reviewing btw!) and take another glance over everything. But are you cool merging this sometime today and following up with remaining issues?

@jreback
Contributor

jreback commented Nov 15, 2015

I'm cool with merging today
go ahead when green / ready

I'll look at comments as well

@TomAugspurger
Contributor Author

OK then, merging. I'll start a followup issue as well.

TomAugspurger pushed a commit that referenced this pull request Nov 16, 2015
ENH: Conditional HTML Formatting
@TomAugspurger TomAugspurger merged commit 18ae314 into pandas-dev:master Nov 16, 2015
@jreback
Contributor

jreback commented Nov 16, 2015

whoo hoo!

thanks Tom!

@TomAugspurger
Contributor Author

Thanks for reviewing & feedback. I'm excited to see what people who actually know CSS can do with this.

@shoyer
Member

shoyer commented Nov 16, 2015

thank you Tom!

@lodagro
Contributor

lodagro commented Nov 16, 2015

👍

@kynan
Contributor

kynan commented Nov 16, 2015

+1 Thanks for putting this together @TomAugspurger!

@dov

dov commented Sep 14, 2016

Is there any way to apply a class to a cell instead of setting its properties? I.e. I would like to transform a table like:

import pandas as pd

df = pd.DataFrame([[1, 2],
                   [-5, 1]],
                  columns=['A', 'B'])

to:


<html><table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>A</th>
      <th>B</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1</td>
      <td>2</td>
    </tr>
    <tr>
      <th>1</th>
      <td class="negative">-5</td>
      <td>1</td>
    </tr>
  </tbody>
</table></html>

and define the css of the class negative in an external css file.

@TomAugspurger
Contributor Author

@dov at the moment, no. It wouldn't be hard to add a .set_class(classes, subset=None) method though. Mind opening a new issue for that?

@TomAugspurger TomAugspurger deleted the stylely branch November 3, 2016 12:38
@jihwans
Contributor

jihwans commented Jun 24, 2020

@dov at the moment, no. It wouldn't be hard to add a .set_class(classes, subset=None) method though. Mind opening a new issue for that?

Has this ever been done by someone?

@dov

dov commented Jun 24, 2020

@dov at the moment, no. It wouldn't be hard to add a .set_class(classes, subset=None) method though. Mind opening a new issue for that?

Has this ever been done by someone?

I must have missed that. No, I never opened such an issue. I'm not even sure what @TomAugspurger meant. What is .set_class a method of, and what does subset mean?

@jihwans
Contributor

jihwans commented Jun 24, 2020

I guess what @TomAugspurger meant was that, just as we use Styler functions to add styles, we may be able to easily add function(s) that allow adding classes using similar methods. .set_class(classes, subset=None) is a hypothetical API method that could be given that name to provide such functionality. I really needed it but could not find one in current pandas, though I may not know enough to be sure.

@jihwans
Contributor

jihwans commented Jun 24, 2020

@TomAugspurger unless someone is working on it, I'd like to work on adding the function - please let me know.

@TomAugspurger
Contributor Author

TomAugspurger commented Jun 24, 2020 via email

Labels
IO HTML read_html, to_html, Styler.apply, Styler.applymap Output-Formatting __repr__ of pandas objects, to_string