Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: fix max_columns=0, close #2856 #2881

Merged
merged 5 commits into from
Apr 12, 2013
Merged

Conversation

lodagro
Copy link
Contributor

@lodagro lodagro commented Feb 16, 2013

@jreback
Copy link
Contributor

jreback commented Mar 22, 2013

@lodagro can you add a test for this?

(I agree with your change...in fact tried to use it the other day!)

@lodagro
Copy link
Contributor Author

lodagro commented Mar 25, 2013

@jreback test has been addded (both for max_rows and max_columns)

@jreback
Copy link
Contributor

jreback commented Mar 25, 2013

@y-p if u would have a look

@ghost
Copy link

ghost commented Mar 25, 2013

terminal width detection only works in a proper shell/terminal situation, not IPython
frontends, the option description should be fixed to reflect that (we should have fixed that a while ago).
If I read it correctly, eith this change when expand_repr = False, you'll always get
a summary view. expand_repr was the R like multi page view, not a global setting
for info/not_info.

@lodagro
Copy link
Contributor Author

lodagro commented Mar 26, 2013

With expand_repr=False, the repr will not expand and you always get an info view, that is how i understood exand_repr, this seems to be incorrect and expand_repr should control if a multi page view or info view should be shown when the width of the dataframe repr exceeds pd.options.display.line_width. There is a test needed for this, since my code change did not break a test.

Now, how are the options expand_frame=True, max_columns=0 and pd.options.display.line_width supposed to interact in the various scenario`s of line_width / terminal-width and repr-width?
For example line_width > terminal-width and terminal-width < repr-width < line_width. Ok i know, it may seem silly to set line_width > terminal-width but it is possible to do so. Or do we disable multipage-view when max_columns=0.

What about line_width=0, and use terminal width detection to set its value?

@ghost
Copy link

ghost commented Mar 26, 2013

I double-checked in case I was the one who got it wrong (it happens):

display.expand_frame_repr: [default: True] [currently: True]
: boolean
        Whether to print out the full DataFrame repr for wide DataFrames
        across multiple lines.
        If False, the summary representation is shown.

So I think expand_ should only affect wide frames, not all frames.

It really has become a bit tangled. IMO, since terminal-width has proven
unreliable in the common case, it's been taken out of the loop (in code if not docstrings).

Your PR suggests you want it back, so you'll have to handle the case maze .... :)

I would suggest leaving everything as is (since there have been no cries from the balcony),
and tailor the behaviour in the textual termina; case to defaults you would consider sane.

@lodagro
Copy link
Contributor Author

lodagro commented Mar 26, 2013

Leaving everything as is, maybe not since expand_frame_repr does not work on master anyway. Setting it to False will always show an info view - independent of the frame repr width. And this is not how it was intended to be.

Terminal size detection works fine, given that a terminal is used. Auto detect does e.g not work for the ipython notebook, qtconsole or IDLE. Way back auto detection on the width was the default, i think it was a good idea no longer to do this given the confusion it brought, but for someone how know the limitations of auto size detection it should still be possible to use it -- does this count as a cry from the balcony? :-)

What about?:

There are two ways to detect a wide frame (potentially both with auto width detection):
* max_columns
* line_width

and two ways to handle a wide frame:
* info view
* wide view (R style)

The way how to handle a wide frame is controlled by expand_frame_repr, irrespective how the detection occurred. If one of the two detection methods says it is a wide frame, than it will be treated as such. (This is different from the current situation, since max_columns overflow can only result in an info view.)

@ghost
Copy link

ghost commented Mar 26, 2013

(**1)you're right, expand_repr is anded with the check on max_columns in df.repr,
that's wrong.

"Unreliable" terminal width, in that it reports the width of the terminal in which the
piython frontend was launched. which has nothing to do with available width,
and the reason why it was disabled.

two ways to detect wide frame, yes, but

  • max_columns
  • terminal_width (if in interactive session)

That happens in _need_info_repr_
if you look, line_width is only used by the expand_repr machinery to determine
where to wrap, that happens after the call to _need_info_repr_ in df.unicode,
and the wrapping is optional. because...

if (**1) is removed, there's also a third case for the output
which is displaying without wrapping (line_width= None).

@lodagro
Copy link
Contributor Author

lodagro commented Mar 27, 2013

@y-p both max_columns=0 and expand_frame_repr should be ok now.

I have left the behaviour that if a frame repr exceeds bounds set by max_columns/max_rows, an info view will be shown, indepent of expand_frame_repr

@ghost
Copy link

ghost commented Mar 27, 2013

I saw this a while back but forgot to deal with it, in_qtconsole doesn't cover ipnb, which is basically
the same case, and the detection code based on ip.config doesn't work across ipython versions ,
see here

@lodagro
Copy link
Contributor Author

lodagro commented Mar 28, 2013

We are clearly not in line on this, no problem we will converge :-)

The place were you disagree on me using line_width is the scenario when the frame repr fits in the boundaries set by max_columns/max_rows. Even than it is still possible than an info repr is needed when expand_frame_repr=False and the frame repr exceeds line_width. I find i strange that you want to check on terminal_width in this scenario and limit the repr width to line_width. It could also be i completely misunderstand line_width / expand_frame_repr.

IMO terminal_width and terminal_height should only be used when requested by the user (either by setting max_columns, max_rows to zero, maybe also for line_width=0). I see no need to overrule the result returned from get_terminal_size. Now this is done for the qtconsole, it seems you also want to add a detection/overruling if the notebook is used. Even with those two in place there are other scenarios when get_terminal_size will not give the expected result. Why not leave it up to the user if he/she wants to use it. In this branch terminal_width/terminal_size is only used when either max_columns or max_rows is set to zero (yes it can be cleaned up). This is user controlled and we probably need to document the limitations of doing so, but i would leave it up to the user to make an informed decision. For me there is no need for qtconsole/notebook detection/overruling.

We do need to handle interactive or not, since users complained about the repr info fallback when using pandas in non interactive mode, being it un-unix like. On master, not all repr info fallbacks are prevented, is that ok?

Yes, it was spaghetti and i made it even worse, that is for later let us first converge on the desired behavior.

By the way, df.to_string() has a line_width issue (also on master), when it falls back to a multi-page view, the line_width is not always respected, in the two examples below i expect width to be maxed on line_width=80.

In [1]: import pandas as pd

In [2]: len(repr(pd.DataFrame(123456, range(5), range(15))).split('\n')[1])
Out[2]: 92

In [3]: pd.options.display.max_columns = 1000

In [4]: len(repr(pd.DataFrame(0, range(5), range(35))).split('\n')[1])
Out[4]: 104

@ghost
Copy link

ghost commented Mar 28, 2013

  • The ipython frontend behaviour I wish to extened to ipnb is not a personal attack on users' freedom
    to choose. It stems from the fact that qtconsole + _repr_html_ bug #2275 was fixed for qtconsole, but afaict the issue lingers for ipnb users.
  • Even than it is still possible than an info repr is needed when expand_frame_repr=False 
    and the frame repr exceeds line_width.
    
 I don't think so. Here's the description again:

In [14]: pd.describe_option("line_width")
display.line_width: [default: 80] [currently: 120]
: int
When printing wide DataFrames, this is the width of each line.


- Perhaps open another issue for the last example? if it's not directly related.

@lodagro
Copy link
Contributor Author

lodagro commented Mar 28, 2013

2013/3/28 y-p notifications@github.com

The ipython frontend behaviour I wish to extened to ipnb is not a
personal attack on users' freedom
to choose. It stems from the fact that #2275https://github.com/pydata/pandas/issues/2275was fixed for qtconsole, but afaict the issue lingers for ipnb users.

ok, i will look into this.

Even than it is still possible than an info repr is needed when expand_frame_repr=False
and the frame repr exceeds line_width.

I don't think so. Here's the description again:

In [14]: pd.describe_option("line_width")
display.line_width: [default: 80] [currently: 120]
: int
When printing wide DataFrames, this is the width of each line.

line_width ... When printing wide frames, this is the (max) width of each
line .... hurting my brain here, trying to understand our mismatch...
Supose max_columns=20, max_rows=100, expand_frame_repr=False,
line_width=80, terminal_size=(200, 200 ) for a frame with 3 columns and 3
rows a full view can only be shown if it is smaller than 80chars otherwise
the info view is needed (and this is the check i added in _need_info_view)
On master an info view will always be shown for such a frame/option
settings. It is only needed when a full view would exceed line_width. For
me the check _need_info_repr should take into account 'line_width' since i
don`t want a full view to exceed line_width, nor an R-like view in the case
where expand_frame_repr=False.

  • Perhaps open another issue for the last example? if it's not
    directly related.

ok


Reply to this email directly or view it on GitHubhttps://github.com//pull/2881#issuecomment-15588503
.

@ghost
Copy link

ghost commented Mar 29, 2013

currently, max_columns is what determines info/or not based on number of frame columns.
line_width only affects the width of the displayed pages, it's a formatting option, not
a control for full/info display.

you're saying line_width should serve as the value of terminal_width, when in qtconsole and ipnb.
That's probably a good idea, just need to update the description.

max_columns is an unfortunate naming choice, since columns is ambiguous
(frame cols vs display cols, ie line width).

@lodagro
Copy link
Contributor Author

lodagro commented Mar 29, 2013

Yes i would include line_width also in the decision process of full/info/wide display. Not so much to handle qtconsole, or notebook.

Another point i am trying to make is that the output of get_terminal_size should only be used when either one of max_columns, max_rows, line_width is set to zero. And document the shortcoming of doing this. Such that it may result in unexpected results when a terminal is not used. I would not try to handle the imperfection of get_terminal_size (qt_console, notebook detection). There will be scenarios anyway we can not cover, like a gui with embedded python shell. BTW python3.3 standard library has its own get_terminal_size, with the same shortcomings we are facing. If none of max_columns, max_rows, line_width is set to zero get_terminal_size is not part of the game! And checks to decide if we have a wide frame are only done against max_columns, max_rows and line_width. How we handle a wide frame is determined by expand_frame_repr.

For me max_columns/max_rows is clear, but i can see your point that it may lead to confusion. We can maybe rework the option description to avoid this.

@ghost
Copy link

ghost commented Mar 29, 2013

I'm +1 for:

  • exposing terminal_height, terminal_width / terminal_size as options
  • deprecating line_width and redirecting it to terminal_width (the config options
    machinery let's you do that)
  • have expand_repr_width use terminal_width rather then line_width to determine formatting

This exposes an extra option but simpilifies the code a little, gives the user control to override
terminal autodection. and presents a couple of symmetric pairs of options: height/width bounds on frames (max_rows,max_cols), and height/width of terminal (t_width,t_height).

@ghost
Copy link

ghost commented Mar 29, 2013

I wrote the last before reading yours. Did I read correctly in that we agree?
It's ok to disable get_terminal_size if an explicit value has been set, but you
your suggestion means that if a single value is set to 0, then get_terminal_size
will fill in the blanks. too much?

@lodagro
Copy link
Contributor Author

lodagro commented Mar 29, 2013

Yes we do agree, you are taking it one step further by exposing extra option - it makes perfect sense.
fyi currently packing, i will not be near a computer for one week.

@wesm
Copy link
Member

wesm commented Apr 8, 2013

Test suite fails. Can this make it into 0.11 with fixing?

@ghost
Copy link

ghost commented Apr 8, 2013

I think re this discussion it's supposed to change quite a bit from where it's at.
dunno if @lodagro is back near a computer.

@lodagro
Copy link
Contributor Author

lodagro commented Apr 8, 2013

Back around a computer. I will implement the discussed changes in one of the next days.

@lodagro
Copy link
Contributor Author

lodagro commented Apr 10, 2013

Ok, i gave this a quick look - things should work as discussed now. Not yet finished, need to clean up and there is a failing test (test_large_frame_repr in pandas.tests.test_format.TestDataFrameFormatting), need to figure out what it is testing.

wesm added a commit that referenced this pull request Apr 12, 2013
@wesm wesm merged commit 7db1af4 into pandas-dev:master Apr 12, 2013
@wesm
Copy link
Member

wesm commented Apr 12, 2013

Fixed the unit test. Thanks

@lodagro
Copy link
Contributor Author

lodagro commented Apr 12, 2013

glups ... that was quick to merge, i wanted to shake it a bit, squash, doc strings are not updated yet. I`ll try to do it later on today.

@ghost
Copy link

ghost commented Apr 15, 2013

Would be good to have some mention of the new options and deprecations in
RELEASE.rst and v0.11.0.txt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Output-Formatting __repr__ of pandas objects, to_string
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants