Rethink when HTML repr of DataFrame is displayed #4886
Comments
Instead of specializing just on HTML, why don't we just change the default max row config option if you detect you're in a notebook? |
Looks like you can't easily detect whether in a notebook, I guess we could just up the max_rows in |
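For reference, a minimal sketch of raising the row limit through the pandas options API. It assumes the option under discussion is `display.max_rows` (its name in current pandas); the values 200 and 500 are arbitrary illustrations, not proposed defaults.

```python
import pandas as pd

# Assumption: "display.max_rows" is the option meant above; 200 is an arbitrary value.
pd.set_option("display.max_rows", 200)
current = pd.get_option("display.max_rows")
print(current)

# The limit can also be raised for a single block only and restored afterwards.
with pd.option_context("display.max_rows", 500):
    inside = pd.get_option("display.max_rows")
after = pd.get_option("display.max_rows")
```

A call like the first one could go in an IPython startup script, so it applies to every session without detecting the frontend.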
This isn't free though; it gets slow when you get up to 50,000 cells, for example:
df = DataFrame([range(1000) for _ in range(50)])
In [21]: %timeit df.to_string() # method used to print object
1 loops, best of 3: 3.26 s per loop |
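The measurement above can be repeated outside IPython with the standard `timeit` module. This is only a sketch: the rows are built with `list(range(...))` so that it behaves the same on Python 3, and the elapsed time depends on the machine and the pandas version, so no figure is claimed here.

```python
import timeit

import pandas as pd

# Same shape as in the comment above: 50 rows x 1000 columns = 50,000 cells.
df = pd.DataFrame([list(range(1000)) for _ in range(50)])

# Time a single call to the method used to print the object.
elapsed = timeit.timeit(df.to_string, number=1)
print(f"to_string on {df.size} cells took {elapsed:.2f} s")
```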
Yep, by design, the kernel (where code is executed) doesn't know about the frontend. Looking at the code, it mentions:
So perhaps this is already improved, and I saw it in an older version. Linking those issues: #3541, #3573, and PR #3663 claiming to fix them. Oh, and there's code which attempts to detect whether it's running in the Qt console or the notebook...sorry, that won't work all the time (the process which started a kernel isn't necessarily the same as the process making this execution request). I'll bring that up to try to work out a better way to handle the difference. |
yeah, I had a sense. Does Qt console also use |
It does. I'm proposing that we (IPython) define a rich HTML repr and a separate 'poor HTML', suitable for use in the Qt console. I'd still like to leave this issue open, because it looks like when you hit 60 rows (or whatever max_rows is configured to), it still switches abruptly to the short 'info' view, whereas I think it should show a truncated table. |
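A rough sketch of what "show a truncated table" could mean, as opposed to switching to the info view. `truncated_html` is a hypothetical helper written for illustration only; it is not the pandas implementation and not what PR #5550 does. The defaults of 60 rows and 20 columns are taken from the limits mentioned in this thread.

```python
import pandas as pd


def truncated_html(df, max_rows=60, max_cols=20):
    """Hypothetical helper: render the top-left corner of df as an HTML table."""
    shown = df.iloc[:max_rows, :max_cols]
    html = shown.to_html()
    if shown.shape != df.shape:
        # Tell the reader that the table was cut, and how large it really is.
        html += (
            f"<p>{df.shape[0]} rows x {df.shape[1]} columns "
            f"(showing first {shown.shape[0]} x {shown.shape[1]})</p>"
        )
    return html


big = pd.DataFrame([list(range(30)) for _ in range(100)])
small = pd.DataFrame({"a": [1, 2, 3]})
big_out = truncated_html(big)
small_out = truncated_html(small)
```

The point of the sketch is that the output stays a table of the same kind at every size; only a size note is appended once a limit is exceeded.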
that'd be helpful :) - but yes, it seems like it would make sense to change html's repr, instead of just defaulting to info() |
@takluyver if you can set up how you'd like the repr to look, we can add a config option that can be set either in a .pandasrc or in an ipython startup script/in a notebook. |
I've found time to take a look at this. I reused the Here's the current display when you go beyond 60 rows/20 columns: And here's the new: |
I personally like how your proposed version looks. |
Thanks Jeff. I've now covered the cases with MultiIndex-es, added tests, and made PR #5550. |
I don't object to making this controllable via an option, but I'm -1 on making it the default. The way I see it, the default view of a dataframe is the info view. It always An alternative solution in 2 parts is:
I strongly urge conducting a small usability study (have a few users adopt it for a week |
The Series representation gives the first and last elements. That could maybe also be an interesting approach to something similar for DataFrames, instead of first rows/cols in the proposal. Example of Series (there is also, apart from the data, some extra information on the total length):
|
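One way to picture the first-and-last idea for a DataFrame is sketched below. `head_tail` is a hypothetical helper, not an existing pandas function, and the trailing size line only imitates the kind of extra length information mentioned above.

```python
import pandas as pd


def head_tail(df, n=5):
    """Hypothetical helper: keep the first n and last n rows of df."""
    if len(df) <= 2 * n:
        # Small frames are returned unchanged so no row appears twice.
        return df
    return pd.concat([df.head(n), df.tail(n)])


df = pd.DataFrame({"a": range(100), "b": range(100, 200)})
preview = head_tail(df, n=3)
print(preview)
print(f"[{len(df)} rows x {df.shape[1]} columns]")
```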
Conversely, there's the Not making it the default would defeat the entire point. New users are not going to hunt around in config settings to set this to behave intuitively. I don't even know what configuration file pandas uses. And I don't think another config setting is necessary: if you want to see the info view, use the I would love some people to do user testing - any volunteers? However, in uncontrolled user testing of the current behaviour, I have observed the sudden switch to a completely different repr confusing new users and annoying more experienced users. |
after playing with this some more I think it is an improvement - objections withdrawn. |
Cheers @y-p. :-) |
merged #5550. |
Helping out with a moderately beginner class recently, I noticed several people having problems, because they could easily display a table view of a small DataFrame, but the representation looked completely different when it exceeded a certain size. People thought that they had a different type of object, or that the detailed information was some kind of an error message. There's no obvious way to get the HTML repr for larger DataFrames.
Suggestions:
I'll try to work on this soon if no-one objects or beats me to it.