This repository has been archived by the owner on Apr 10, 2024. It is now read-only.

Supporting current panel use cases, interactions with xarray #62

Open
wesm opened this issue Nov 29, 2016 · 3 comments


@wesm
Owner

wesm commented Nov 29, 2016

The most common use case for panels I've seen has been as an aligning container for data frames -- you can insert a DataFrame "item" as you would a column normally. This can alleviate some awkwardness when working with multi-indexed data.
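A minimal sketch of that "aligning container" behaviour, written without Panel so it also runs on pandas versions where Panel is no longer available. It assumes `pd.concat` over a dict of frames is an acceptable stand-in for inserting items; the frame names `first` and `second` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Two hypothetical frames whose row labels and columns only partly overlap.
first = pd.DataFrame(np.arange(6.0).reshape(3, 2), index=[0, 1, 2], columns=["a", "b"])
second = pd.DataFrame(np.arange(4.0).reshape(2, 2), index=[1, 2], columns=["a", "c"])

# Concatenating a dict along the columns aligns every frame on the union of
# the row labels and tags each frame with its key, much like Panel items.
items = pd.concat({"first": first, "second": second}, axis=1)

# Each "item" can be pulled back out as an ordinary DataFrame.
first_aligned = items["first"]
second_aligned = items["second"]  # row 0 is NaN because `second` had no label 0
```

The result carries a two-level column index (item, column), which is exactly the multi-indexed layout the comment calls awkward; the sketch only shows that the alignment itself does not depend on Panel.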

A couple of questions around panels:

  • If we drop Panel as an analytical data structure (i.e. what is currently offered by the NDFrame construct), we should consider the API that will replace the current to_panel and to_frame workflows.

  • It may be worthwhile to consider keeping around Panel as a simple container data structure for maintaining a related collection of DataFrames and supporting rudimentary reshaping / axis-swapping functionality. For example, if you have a dict of DataFrame objects in some orientation, you could create a panel, swap axes, then convert to some other data structure (e.g. xarray, MultiIndex-ed DataFrame). If you want to do deeper analysis, you should convert to xarray.

In either case, we'd be eliminating a bunch of thinly supported code.
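The container workflow in the second bullet (dict of DataFrames, swap axes, convert) could look roughly like the following using only DataFrame operations. This is a sketch under assumptions, not a proposed API: the keys `x` and `y`, the level names `item` and `column`, and the choice of `pd.concat` plus `swaplevel` as the "panel" step are all illustrative.

```python
import pandas as pd

# A hypothetical dict of DataFrames sharing one orientation.
frames = {
    "x": pd.DataFrame([[1.0, 2.0], [3.0, 4.0]], index=["r0", "r1"], columns=["a", "b"]),
    "y": pd.DataFrame([[5.0, 6.0], [7.0, 8.0]], index=["r0", "r1"], columns=["a", "b"]),
}

# Container step: the dict keys become the outer column level.
wide = pd.concat(frames, axis=1)
wide.columns.names = ["item", "column"]

# Axis swap: make the original column labels the outer level instead.
swapped = wide.swaplevel("item", "column", axis=1).sort_index(axis=1)

# Conversion step: move "item" into the row index, which yields a
# MultiIndex-ed DataFrame with one row per (row label, item) pair.
long = wide.stack("item")
```

Here `swapped["a"]` gives one column per item for the original column `a`, and `long` is the kind of MultiIndex-ed frame that `to_frame` produced from a Panel.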

@shoyer

shoyer commented Nov 29, 2016

I agree, keeping around Panel as a simple data container could make sense. I have also found it to be useful as an intermediate data structure for easier data alignment, though I can't think of particular use cases off the top of my head.

CC @MaximilianR

@max-sixty

I don't have a strong view.

xarray is pretty good for aligning! So I predominantly use that:

In [5]: df = pd.DataFrame(np.random.rand(3,4), columns=list('abcd'))

In [6]: df
Out[6]:
          a         b         c         d
0  0.164063  0.014835  0.529693  0.268561
1  0.076066  0.598840  0.887823  0.566114
2  0.599438  0.021646  0.775174  0.959695

In [7]: xr.Dataset({'first': df, 'second': df[list('ab')]})
Out[7]:
<xarray.Dataset>
Dimensions:  (dim_0: 3, dim_1: 4)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2
  * dim_1    (dim_1) object 'a' 'b' 'c' 'd'
Data variables:
    second   (dim_0, dim_1) float64 0.1641 0.01483 nan nan 0.07607 0.5988 ...
    first    (dim_0, dim_1) float64 0.1641 0.01483 0.5297 0.2686 0.07607 ...

And pandas' stack / unstack is pretty good for swapping axes.
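For instance, a stack followed by an unstack can trade the "item" axis with the column axis. This is only a sketch with made-up labels (`x`, `y`, `r0`, `r1`) and level names, assuming the data is already in a MultiIndex-ed frame:

```python
import pandas as pd

# A hypothetical long frame indexed by (item, row).
index = pd.MultiIndex.from_product([["x", "y"], ["r0", "r1"]], names=["item", "row"])
frame = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [5.0, 6.0, 7.0, 8.0]}, index=index)
frame.columns.name = "column"

# Stack the columns into the row index, then unstack "item" so the items
# become the columns: the "item" and "column" axes have traded places.
swapped = frame.stack().unstack("item")
```

After this, `swapped` is indexed by (row, column) and has one column per item.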

What's the use case where you'd need functionality in pandas?

> we should consider the API that will replace the current to_panel and to_frame workflows

@jreback has built a good .to_xarray method, and we've built some decent (not perfect yet) coercion by passing xarray & pandas objects into each other's constructors.
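A rough sketch of the round trip that could stand in for `to_panel` / `to_frame`, assuming xarray is installed. It uses `DataFrame.to_xarray` and `Dataset.to_dataframe`; the labels and level names are invented for illustration.

```python
import pandas as pd
import xarray as xr  # DataFrame.to_xarray needs xarray to be installed

# A hypothetical MultiIndex-ed frame, i.e. what to_frame used to return.
index = pd.MultiIndex.from_product([["x", "y"], ["r0", "r1"]], names=["item", "row"])
frame = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [5.0, 6.0, 7.0, 8.0]}, index=index)

# Each index level becomes a dimension and each column a data variable.
ds = frame.to_xarray()

# Label-based selection on the n-dimensional object.
value = ds["a"].sel(item="y", row="r0").item()

# Back to a MultiIndex-ed DataFrame when the analysis is done.
back = ds.to_dataframe()
```

The deeper analysis then happens on `ds` in xarray, and pandas only handles the conversion at either end.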

@jreback

jreback commented Apr 7, 2017

This is merged: pandas-dev/pandas#15601

So we can think about this (at some point).
