BUG: DataFrame(ndarray, dtype=..) does unsafe conversion #26919

Closed
jorisvandenbossche opened this issue Jun 18, 2019 · 0 comments · Fixed by #41578
Labels: Bug, Dtype Conversions (unexpected or buggy dtype conversions)

From #26848. When you pass an ndarray to the DataFrame constructor and specify a dtype, the constructor does a "plain" numpy astype. This can have unwanted side effects that we avoid in other parts of pandas, such as np.nan -> integer conversion and out-of-bounds timestamps:

In [26]: pd.DataFrame(np.array([[1, np.nan], [2, 3]]), dtype='int64')
Out[26]: 
   0                    1
0  1 -9223372036854775808
1  2                    3

In [27]: pd.DataFrame(np.array([['2300-01-01']], dtype='datetime64[D]'), dtype='datetime64[ns]')
Out[27]: 
                              0
0 1715-06-13 00:25:26.290448384

Both cases are guarded in DataFrame.astype:

In [29]: pd.DataFrame(np.array([[1, np.nan], [2, 3]])).astype(dtype='int64')
...
~/scipy/pandas/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
    678 
    679         if not np.isfinite(arr).all():
--> 680             raise ValueError('Cannot convert non-finite values (NA or inf) to '
    681                              'integer')
    682 

ValueError: Cannot convert non-finite values (NA or inf) to integer

In [30]: pd.DataFrame(np.array([['2300-01-01']], dtype='datetime64[D]')).astype(dtype='datetime64[ns]') 
...
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2300-01-01 00:00:00
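
For reference, the integer guard visible in the traceback above boils down to a finiteness check before the cast. The following is a minimal numpy-only sketch of that idea, not the pandas implementation; the helper name `checked_int_cast` is hypothetical.

```python
import numpy as np


def checked_int_cast(arr, dtype="int64"):
    """Hypothetical helper: refuse to cast NaN or inf to an integer dtype."""
    arr = np.asarray(arr)
    # Same check as the one shown in the astype_nansafe traceback.
    if not np.isfinite(arr).all():
        raise ValueError(
            "Cannot convert non-finite values (NA or inf) to integer"
        )
    return arr.astype(dtype)


# The array from the report contains NaN, so the checked cast refuses it.
try:
    checked_int_cast(np.array([[1, np.nan], [2, 3]]))
    nan_rejected = False
except ValueError:
    nan_rejected = True

# An all-finite float array converts normally.
clean = checked_int_cast(np.array([[1.0, 4.0], [2.0, 3.0]]))
```

A plain `arr.astype("int64")` skips this check, which is how the constructor ends up with the garbage value shown in `Out[26]`.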

I suppose we want the DataFrame constructor itself to do such a safe astype as well?
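
For the timestamp case, a safe conversion would need a bounds check before casting to nanoseconds. The sketch below shows one way to do that for day-resolution input using numpy only. It is an illustration under stated assumptions, not what pandas does internally: the helper name `checked_days_to_ns` and the two bound constants are hypothetical, and the bounds are the first and last whole days that fit in a signed 64-bit nanosecond count.

```python
import numpy as np

# Assumed bounds: first and last whole days representable as datetime64[ns].
NS_MIN_DAY = np.datetime64("1677-09-22", "D")
NS_MAX_DAY = np.datetime64("2262-04-11", "D")


def checked_days_to_ns(arr):
    """Hypothetical helper: cast datetime64[D] to datetime64[ns] with a bounds check."""
    arr = np.asarray(arr, dtype="datetime64[D]")
    # NaT compares False on both sides, so missing values pass through.
    out_of_bounds = (arr < NS_MIN_DAY) | (arr > NS_MAX_DAY)
    if out_of_bounds.any():
        raise OverflowError("Out of bounds nanosecond timestamp")
    return arr.astype("datetime64[ns]")


# The date from the report lies past the upper bound and is rejected.
try:
    checked_days_to_ns(np.array([["2300-01-01"]], dtype="datetime64[D]"))
    date_rejected = False
except OverflowError:
    date_rejected = True

# An in-range date converts without change in value.
in_range = checked_days_to_ns(np.array(["2019-06-18"], dtype="datetime64[D]"))
```

Checking in the original unit avoids relying on the result of an overflowing cast, which is what produces the 1715 date in `Out[27]`.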
