Option to set large_repr to info(verbose=False) missing #6568

Closed
bjonen opened this issue Mar 7, 2014 · 13 comments
Labels
API Design Output-Formatting __repr__ of pandas objects, to_string

@bjonen
Contributor

bjonen commented Mar 7, 2014

In v0.13.1 the default representation for pd.DataFrame changed. I would like to be able to restore the previous default (see http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#dataframe-repr-changes)

The option display.large_repr only allows setting info(verbose=True) as the default. In previous versions of pandas, however, the default was info(verbose=False). As a result, displaying a DataFrame with a large number of columns produces a lot of output.

Am I missing an easy way to change this behavior? Otherwise I suggest adding an option, e.g. truncate, info_short, info_long.

display.large_repr: [default: truncate] [currently: info]
: 'truncate'/'info'

    For DataFrames exceeding max_rows/max_cols, the repr (and HTML repr) can
    show a truncated table (the default from 0.13), or switch to the view from
    df.info() (the behaviour in earlier versions of pandas).

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Windows
OS-release: 8
machine: AMD64
processor: Intel64 Family 6 Model 69 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.13.1
Cython: None
numpy: 1.8.0
scipy: 0.13.3
statsmodels: 0.5.0
IPython: 1.2.0
sphinx: 1.2.1
patsy: 0.2.1
scikits.timeseries: None
dateutil: 2.2
pytz: 2013.9
bottleneck: 0.8.0
tables: 3.1.0
numexpr: 2.3
matplotlib: 1.3.1
openpyxl: 1.8.3
xlrd: 0.9.2
xlwt: 0.7.5
xlsxwriter: 0.5.2
sqlalchemy: None
lxml: 3.3.1
bs4: 4.3.2
html5lib: 0.999
bq: None
apiclient: None

@jreback
Contributor

jreback commented Mar 7, 2014

can you give an example (and picture) of the 3 scenarios?

@bjonen
Contributor Author

bjonen commented Mar 7, 2014

Sure, here comes an example for a large DataFrame.

df = pd.DataFrame(columns=range(100),index=range(1000))
  1. My preferred way of displaying such a DataFrame.
df.info(verbose=False)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 0 to 999
Columns: 100 entries, 0 to 99
dtypes: object(100)

  2. The current version when the option is set to info (pd.options.display.large_repr = 'info')
df.info(verbose=True)

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 0 to 999
Data columns (total 100 columns):
0 0 non-null object
1 0 non-null object
2 0 non-null object
3 0 non-null object
4 0 non-null object
5 0 non-null object

  3. The current default, that is, when the option is set to truncate
df

Out[10]:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
5 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
7 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
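
The three scenarios above can also be reproduced as strings, which makes them easier to compare. This is a minimal sketch, assuming a reasonably recent pandas where info() accepts verbose and buf and where the display.large_repr option exists as described in the option text quoted earlier. The helper name info_text is made up for illustration, and display.max_rows is pinned to 60 only so that the 1000-row frame counts as "large".

```python
import io

import pandas as pd

# Same frame as in the example above: 1000 rows, 100 all-NaN object columns.
df = pd.DataFrame(columns=range(100), index=range(1000))


def info_text(frame, verbose):
    """Return the text that frame.info() would print (hypothetical helper)."""
    buf = io.StringIO()
    frame.info(verbose=verbose, buf=buf)
    return buf.getvalue()


short_summary = info_text(df, verbose=False)  # scenario 1
long_summary = info_text(df, verbose=True)    # scenario 2

# The repr of a large frame, switched via the existing display.large_repr option.
with pd.option_context("display.large_repr", "info", "display.max_rows", 60):
    info_repr = repr(df)       # info view
with pd.option_context("display.large_repr", "truncate", "display.max_rows", 60):
    truncated_repr = repr(df)  # scenario 3, truncated table

print(short_summary)
```

The short summary is what this issue asks to make available as the repr; at the time of writing it can only be obtained by calling info(verbose=False) explicitly.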

@jreback
Contributor

jreback commented Mar 7, 2014

See this thread from the mailing list (related, not identical): https://groups.google.com/forum/#!topic/pydata/wyhbADpsUss

@bjonen
Contributor Author

bjonen commented Mar 19, 2014

To be clear, I do not want to change the default. I understand it is difficult to satisfy everyone's preferences.

For people working with large DataFrames, the current options do not allow for a useful representation, which means typing df.info(0) every time one wants to look at the summary. So I suggest adding an option by splitting info into info_short and info_long (or any other names).

Then one could edit (among others):

https://github.com/pydata/pandas/blob/master/pandas/core/frame.py#L448

buf = StringIO(u(""))
if self._info_repr():
    self.info(buf=buf)
    return buf.getvalue()

to something like

buf = StringIO(u(""))
if self._info_long_repr():
    # long form: per-column details
    self.info(buf=buf)
    return buf.getvalue()
elif self._info_short_repr():
    # short form: summary only
    self.info(verbose=False, buf=buf)
    return buf.getvalue()

@jreback
Contributor

jreback commented Mar 19, 2014

I agree with this, but instead of splitting that option, maybe add info_verbose, defaulting to True.
I know that means adding another option, but I actually think this is less confusing!

pls submit a PR; the docs need an update on this as well (I think the FAQ and/or a section in basics that mostly describes the options for display).
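
To make the suggestion concrete, here is a toy sketch of how a single boolean option could steer the repr. It assumes nothing about the actual pandas internals: TinyFrame, the OPTIONS dict and its keys are all hypothetical stand-ins, and info_verbose is only the name floated in this thread, not a confirmed pandas option.

```python
import io

# Hypothetical stand-in for an options registry; keys are illustrative only.
OPTIONS = {"large_repr": "info", "info_verbose": True, "max_rows": 60}


class TinyFrame:
    """Toy table used only to illustrate the repr dispatch."""

    def __init__(self, n_rows, columns):
        self.n_rows = n_rows
        self.columns = list(columns)

    def info(self, verbose=True, buf=None):
        """Write a long (per-column) or short (count-only) summary to buf."""
        buf = buf if buf is not None else io.StringIO()
        buf.write(f"{self.n_rows} rows\n")
        if verbose:
            buf.write(f"Data columns (total {len(self.columns)} columns):\n")
            for name in self.columns:
                buf.write(f"{name}\n")
        else:
            buf.write(f"Columns: {len(self.columns)} entries\n")
        return buf

    def __repr__(self):
        too_large = self.n_rows > OPTIONS["max_rows"]
        if OPTIONS["large_repr"] == "info" and too_large:
            buf = io.StringIO()
            # One option picks between the long and the short summary.
            self.info(verbose=OPTIONS["info_verbose"], buf=buf)
            return buf.getvalue()
        return f"TinyFrame({self.n_rows} rows x {len(self.columns)} columns)"


frame = TinyFrame(1000, range(100))
OPTIONS["info_verbose"] = False
print(repr(frame))
```

Compared with splitting large_repr into info_short and info_long, this keeps the choice of view and the level of detail as two independent settings.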

@jreback jreback added this to the 0.14.0 milestone Mar 19, 2014
@jreback
Contributor

jreback commented Mar 19, 2014

@jseabold @jorisvandenbossche

not really an API change, but allowing one to specify an option here.

@bjonen
Contributor Author

bjonen commented Mar 19, 2014

Ok sounds good.

@jreback
Contributor

jreback commented Apr 9, 2014

@bjonen want to submit a PR for this?

@bjonen
Contributor Author

bjonen commented Apr 9, 2014

Yes, will do.

@jreback
Contributor

jreback commented May 5, 2014

@bjonen ping!

@bjonen
Contributor Author

bjonen commented May 5, 2014

I didn't have much free time the last week, but I have a draft for #5603 (comment)

@jreback
Contributor

jreback commented May 5, 2014

ok gr8

@jreback
Contributor

jreback commented May 14, 2014

closed by #7130

@jreback jreback closed this as completed May 14, 2014