Deprecation of Panel ? #13563

Closed
jorisvandenbossche opened this issue Jul 5, 2016 · 10 comments
Labels
API Design Deprecate Functionality to remove in pandas Needs Discussion Requires discussion from core team before further action

Comments

@jorisvandenbossche
Member

jorisvandenbossche commented Jul 5, 2016

This is a topic that has come up recently (#10000, #8906, pandas-dev mailing list discussion), let's make this an issue to track the discussion about it.

Deprecating Panels would be a rather large change, so:

  • Do we need to further discuss if we actually want to do this?
  • Are there people who make intensive use of Panels whom we can ask for feedback?
  • How do we go about such a deprecation? First making a note in the whatsnew / pinging mailing list or other fora before actually deprecating?

cc @pydata/pandas @MaximilianR

@jorisvandenbossche jorisvandenbossche added API Design Deprecate Functionality to remove in pandas Needs Discussion Requires discussion from core team before further action Multi Dimensional labels Jul 5, 2016
@sinhrks
Member

sinhrks commented Jul 5, 2016

I'm +1 on moving to xarray, but a GitHub search shows the deprecation will not be easy... As far as I know, among popular packages, pydata/data-reader and quantopian/zipline use Panel.

CC @davidastephens @ehebert

@max-sixty
Contributor

No change this end - we are still using xarray heavily, and it's working beautifully. We've also improved the integration of xarray & pandas, so that should ease the path to deprecation.

@wesm
Member

wesm commented Jul 12, 2016

I'm +1 on deprecating Panels; @jreback moved mountains to create a consistent internal object model from 1 to N dimensions, but there is still a feeling of second-class citizenry when it comes to working with data over 2 dimensions. I think we would be better served in the long run by really optimizing for the 1 and 2-dimensional use cases (similar to what the R community has done, though the API surface area of dplyr, data.table, and built-in data frames is quite a bit smaller than pandas -- primarily lacking in the level of indexing complexity).

I maintain that we should plan for a pandas 0.X.Y long-term support (LTS) release branch that becomes bugfix-only so that we can start investing in renovations. I'm interested in feedback from the other core devs on how realistic you feel this is.

I've long worried about the amount of baggage we are carrying forward -- there are many organizations with large codebases that have made their peace with pandas's rough edges (data type issues, view / copying semantics, etc.), and it doesn't make sense to abandon them. On the flip side, it would be a shame to be held back from undertaking a more aggressive cleanup and retool of the internals to introduce better performance, extensibility, missing data / data type issues, etc. I regret that 6 months have passed since I brought up this grand scheme and I haven't been able to carve out the time to make a dent, beyond demo'ing a proof-of-concept of integer NAs. Also, I would feel much better about working on this on a long-lived branch (similar to what happened with IPython) under some kind of feature freeze.

Anyway, some of these comments are beyond the scope of this issue. I don't think we should deprecate Panel unless we're collectively on board with the idea of cleaning up pandas internals over the next 12-24 months (which is as much of a code organization problem as anything -- particularly quarantining unit tests that we are contemplating "breaking").

jreback added 23 commits to jreback/pandas that referenced this issue between Mar 7, 2017 and Apr 7, 2017
@jreback jreback closed this as completed in c25fbde Apr 7, 2017
@den-run-ai

There are plenty of examples using Panel on SO:

https://stackoverflow.com/questions/tagged/panel+pandas

One in particular that I'm not sure how to port, and for which I do not want to depend on xarray, is this one:

https://stackoverflow.com/a/23088780/2230844

@jaypeedevlin

jaypeedevlin commented Mar 14, 2018

I noticed today that none of the docs for the panel class/methods seem to have notification around the fact that it's deprecated.

There's the 'deprecate panel' in the 0.22.0 'what's new', but it seems likely that people may not see that if they're searching for panel or following direct links to the docs.

I can see this example of a deprecation note in a docstring, which subjectively doesn't seem to draw a lot of attention to itself. Is there a convention for these that's a little bit more 'attention-grabbing'? Once I know of the best way, I'm happy to submit a PR.

Edit: Actually, just found 1d32264 which seems to indicate exactly what to do in this instance.

@jorisvandenbossche
Member Author

There is a deprecation note in the user guide, and a warning when you actually use it, but you are certainly correct that we could add a notice in all docstrings as well to give this more visibility.

Typically a `.. deprecated::` Sphinx directive is the way to go to add such deprecations.

PR very welcome!
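
For readers following along, here is a minimal sketch of the two mechanisms mentioned above: a `.. deprecated::` directive in the docstring, and a warning raised when the object is actually used. The class `LegacyPanel`, the version number, and the message text are all hypothetical and chosen only for illustration; this is not the actual pandas code.

```python
import warnings


class LegacyPanel:
    """Hypothetical stand-in for a deprecated 3-D container.

    .. deprecated:: 1.2.0
        (Illustrative version number.) Use a MultiIndex DataFrame or
        xarray instead.
    """

    def __init__(self, data=None):
        # Emit the warning at construction time so users see it on use,
        # not only when they read the docs. stacklevel=2 attributes the
        # warning to the caller's line rather than to this constructor.
        warnings.warn(
            "LegacyPanel is deprecated and will be removed in a future "
            "version; use a MultiIndex DataFrame or xarray instead.",
            FutureWarning,
            stacklevel=2,
        )
        self.data = data
```

Sphinx renders the directive as a highlighted note in the generated API page, which is what gives the deprecation visibility for people who land directly on the class docs.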

@joseortiz3
Contributor

joseortiz3 commented Dec 1, 2018

I'll be the first to protest the deprecation of panels, specifically the need to rewrite legacy code. I have plenty of legacy finance code for which conversion to multi-index is very painful, code which now spews panel warnings despite working flawlessly. Of course, I write any new code using only multi-index dataframes (which have a significantly steeper learning curve, one I am happy to have overcome).

Regarding the feeling that "3 or more dimensions feels like second-class usage", I would note that there is a deep asymmetry even between the dimensions of a 2D pandas object: columns and rows are explicitly treated differently in pandas, with rows being second-class to columns in a highly non-intuitive way, disobeying the mathematical symmetries of matrices. Food for thought. Then again, the dimensions of real-life data are often inherently asymmetric, since time is a very special type of dimension.

@wesm
Member

wesm commented Dec 1, 2018

@joseortiz3 the problem has less to do with whether there are users of the code and more to do with whether there is sufficient bandwidth to maintain the code. If there isn't a motivated developer base to support a component of an open source software project, it doesn't seem reasonable that maintainers of the rest of the project should be burdened by it.

The general thinking (and @jreback and others can comment) is that having > 2 dimensional data structures has made many parts of the codebase significantly more difficult to develop and maintain. This has a high long-term cost. Given pandas's funding situation (or lack thereof), I don't see how it is tenable.

@jreback
Contributor

jreback commented Dec 2, 2018

> The general thinking (and @jreback and others can comment) is that having > 2 dimensional data structures has made many parts of the codebase significantly more difficult to develop and maintain. This has a high long term cost. Given pandas's funding situation (or lack thereof) I don't see how it is tenable

This is exactly right. Furthermore, pandas has quite a number of pull requests coming in daily and many open issues (2600+). We have a limited number of core devs (12), so there is a natural limit to how large the (already huge) scope of pandas can be. Panel is not nearly as mature as other aspects of pandas and would be better served by separate, motivated maintainers. Note that there is already quite an overlap with the xarray package in use cases.

@joseortiz3
Contributor

joseortiz3 commented Dec 2, 2018

Totally reasonable, of course. Would it be so difficult to write a "panel wrapper" that offers a panel-like interface to what is actually a multi-index dataframe? It wouldn't need to implement all of the methods of Panel; it would just allow 90% of legacy code to be ported via a simple `from ____ import PanelMultiIndex as Panel`. If I had time and/or money! Some day.
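
A rough sketch of what such a wrapper might look like, assuming the 3-D data arrives as a dict of DataFrames keyed by item. `PanelMultiIndex` is the hypothetical name from the comment above, not an existing pandas class; the level names `item` and `major` are assumptions, and only a handful of the old Panel accessors (`items`, `[]`, `major_xs`, `minor_xs`) are imitated.

```python
import pandas as pd


class PanelMultiIndex:
    """Hypothetical Panel-like facade over a MultiIndex DataFrame.

    Rows are indexed by (item, major); columns are the minor axis.
    """

    def __init__(self, frames):
        # frames: dict mapping item label -> DataFrame (major x minor).
        # Concatenating a dict uses its keys as the outer index level.
        df = pd.concat(frames)
        df.index.names = ["item", "major"]
        self._df = df

    @property
    def items(self):
        # Distinct item labels, in order of appearance.
        return self._df.index.get_level_values("item").unique()

    def __getitem__(self, item):
        # panel[item] -> 2-D DataFrame (major x minor).
        return self._df.xs(item, level="item")

    def major_xs(self, key):
        # One major-axis label -> DataFrame (minor x items).
        return self._df.xs(key, level="major").T

    def minor_xs(self, key):
        # One minor-axis label -> DataFrame (major x items).
        return self._df[key].unstack("item")

    def to_frame(self):
        # Expose the underlying MultiIndex DataFrame for new code.
        return self._df
```

With this in place, legacy calls such as `panel["close"]` or `panel.minor_xs("MSFT")` would keep their shape, while `to_frame()` gives new code direct access to the multi-index representation. Anything beyond these accessors (arithmetic, reindexing, I/O) is left out here and would need separate work.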

8 participants