This repository has been archived by the owner on Apr 10, 2024. It is now read-only.

API: .to_numpy() #69

Open
jreback opened this issue Mar 16, 2017 · 2 comments

Comments


jreback commented Mar 16, 2017

xref pandas-dev/pandas#14052

Currently we have an (implicit) numpy conversion when we access .values of a 1D object (Series). This mostly returns a numpy array, though we return numpy-like objects for several dtypes:

  • categorical: we simply return a Categorical object
  • tz-aware datetime: we return a datetime64[ns] array in UTC (losing the tz)
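The two cases above can be checked with a short sketch. It assumes a reasonably recent pandas and numpy are installed; the exact return types of .values have shifted between pandas versions, so treat the comments as describing the behaviour reported in this issue rather than a guarantee. The fixed UTC-05:00 offset and the variable names are illustrative choices only.

```python
from datetime import timedelta, timezone

import numpy as np
import pandas as pd

# Categorical Series: .values hands back a Categorical, not a numpy array.
cat_series = pd.Series(["a", "b", "a"], dtype="category")
cat_values = cat_series.values

# tz-aware Series: .values hands back a plain datetime64 array.
# The timestamps are converted to UTC and the timezone is dropped.
minus_five = timezone(timedelta(hours=-5))  # illustrative fixed offset
tz_series = pd.Series(pd.date_range("2017-01-01", periods=2, tz=minus_five))
tz_values = tz_series.values

print(type(cat_values).__name__)
print(type(tz_values).__name__, tz_values.dtype)
```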

This also has implications when we have a 2D object (DataFrame). There we use a single dtype that can safely hold all of the data:

  • int & floats -> floats
  • datetime w/tz -> object array
  • object & anything -> object array
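A minimal sketch of the 2D upcasting rules listed above, assuming a reasonably recent pandas. The column names and the fixed UTC-05:00 offset are made up for illustration, and only the resulting dtype is inspected.

```python
from datetime import timedelta, timezone

import pandas as pd

# int & floats -> floats
num_frame = pd.DataFrame({"i": [1, 2], "f": [1.5, 2.5]})
num_dtype = num_frame.values.dtype

# datetime w/tz mixed with another column -> object array
minus_five = timezone(timedelta(hours=-5))  # illustrative fixed offset
tz_frame = pd.DataFrame(
    {"t": pd.date_range("2017-01-01", periods=2, tz=minus_five), "i": [1, 2]}
)
tz_dtype = tz_frame.values.dtype

# object & anything -> object array
obj_frame = pd.DataFrame({"i": [1, 2], "s": ["x", "y"]})
obj_dtype = obj_frame.values.dtype

print(num_dtype, tz_dtype, obj_dtype)
```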

So generally this is OK for 2D, in that you preserve as much as possible (though of course you must copy / return a heavyweight object array at times).

So we need some thought on how this API should look, and we need to validate the cases.

I would propose .to_numpy() (a function, so we can potentially pass options). It won't break the current API (which I think we can preserve / provide back-compat for), and it avoids making libpandas jump through hoops to support the 'old' stuff.
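To make the "function, so we can pass options" point concrete, here is a rough sketch of an explicit conversion helper. The name to_numpy_sketch and its dtype and copy parameters are hypothetical, chosen only for illustration; this is not the design proposed in the issue, and it simply defers to np.asarray.

```python
import numpy as np
import pandas as pd


def to_numpy_sketch(obj, dtype=None, copy=False):
    """Hypothetical explicit conversion: always returns a numpy.ndarray."""
    arr = np.asarray(obj, dtype=dtype)
    if copy:
        # Force a fresh buffer so the caller can mutate the result safely.
        arr = arr.copy()
    return arr


# A categorical Series comes back as a real ndarray of its values,
# rather than as a Categorical object.
cat_arr = to_numpy_sketch(pd.Series(["a", "b", "a"], dtype="category"))

# Options make the conversion explicit instead of implicit.
ints_as_float = to_numpy_sketch(pd.Series([1, 2, 3]), dtype="float64", copy=True)
```

Because the conversion is a call rather than an attribute, options such as the target dtype can be added later without changing what the existing .values attribute returns.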

Owner

wesm commented Mar 16, 2017

I agree with this -- it would be helpful to start migrating away from the .values API toward something more explicit, to ease the burden. We might even want to introduce a logging layer into pandas 1.0 to alert users when they use "non-future-proof" APIs.

Author

jreback commented Mar 16, 2017

Right, I suppose we could instrument things, maybe via an option:

pandas.options.future.logging='warn'|'raise'|'ignore'

It could be namespaced under .future.* in case we need other options there.
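One way such a switch might behave, sketched with only the standard library. Everything here is hypothetical: FUTURE_LOGGING stands in for the proposed pandas.options.future.logging option, and note_legacy_use and LegacyAPIWarning are invented names, not pandas APIs.

```python
import warnings

# Hypothetical stand-in for the proposed pandas.options.future.logging option.
FUTURE_LOGGING = "warn"  # one of 'warn', 'raise', 'ignore'


class LegacyAPIWarning(FutureWarning):
    """Hypothetical warning category for non-future-proof API use."""


def note_legacy_use(name, mode=None):
    """Report use of a legacy API according to the configured mode."""
    mode = FUTURE_LOGGING if mode is None else mode
    message = f"{name} is not future-proof"
    if mode == "ignore":
        return
    if mode == "warn":
        warnings.warn(message, LegacyAPIWarning, stacklevel=2)
        return
    if mode == "raise":
        raise RuntimeError(message)
    raise ValueError(f"unknown mode: {mode!r}")
```

A legacy accessor would call note_legacy_use(".values") before doing its work, so users could opt into 'raise' while testing and stay on 'warn' or 'ignore' otherwise.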
