BUG: Regression on DataFrame.from_records #42456

felixdivo · 2021-07-09T06:54:18Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

from numpy import empty, array
from pandas import DataFrame

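# Structured dtype with a single integer field
structured_dtype = [('prop', int)]

# Worked on pandas 1.2.5; raises ValueError on 1.3.0
result = empty((0, len(structured_dtype)))  # empty 2-D float array, shape (0, 1)
structured_array = array(result, dtype=structured_dtype)
DataFrame.from_records(structured_array)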

Triggers: ValueError: If using all scalar values, you must pass an index.
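
To see what the sample actually hands to `from_records`, it can help to inspect the array it builds. The sketch below uses only NumPy and mirrors the snippet above; it is a diagnostic aid, not a fix.

```python
import numpy as np

structured_dtype = [("prop", int)]

# Same construction as in the report: an empty 2-D float array
# converted to a single-field structured dtype.
result = np.empty((0, len(structured_dtype)))
structured_array = np.array(result, dtype=structured_dtype)

# The conversion keeps the shape of `result`, so the structured array
# and every field extracted from it are 2-D, with zero rows.
print(structured_array.shape)
print(structured_array["prop"].ndim)
print(len(structured_array))
```

The array has no rows but still has two dimensions, which is the combination the rest of this thread revolves around.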

Problem description

This worked on pandas 1.2.5 and fails on pandas 1.3.0.

Expected Output

No exception.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : f00ed8f
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-77-generic
Version : #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.3.0
numpy : 1.21.0
pytz : 2021.1
dateutil : 2.8.1
pip : 20.0.2
setuptools : 57.0.0
Cython : None
pytest : 6.2.4
hypothesis : 6.14.0
sphinx : 4.0.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : 2.7.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.0
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

mzeitlin11 · 2021-07-11T15:01:07Z

Thanks for reporting this change, @felixdivo! Bisection points to #40121 as the cause, and at a brief glance the change doesn't look intended. Further investigation welcome!

felixdivo · 2021-07-11T16:35:56Z

@jbrockmendel Do you have an idea (as the author of #40121)?

jbrockmendel · 2021-07-11T17:16:28Z

Tentatively, it looks like in `_extract_index` we might want the `is_list_like(val) and getattr(val, "ndim", 1) == 1` condition to allow `ndim >= 1`.
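
As a rough illustration of the suggestion above, the two variants of the condition can be compared on a 2-D empty array like the one in the report. The helper names `passes_current_check` and `passes_relaxed_check` are hypothetical and exist only for this sketch; they are not pandas functions, and this is not the code in `_extract_index` itself.

```python
import numpy as np
from pandas.api.types import is_list_like


def passes_current_check(val) -> bool:
    # Hypothetical helper: the condition as quoted in the comment above.
    return is_list_like(val) and getattr(val, "ndim", 1) == 1


def passes_relaxed_check(val) -> bool:
    # Hypothetical helper: the same condition relaxed to ndim >= 1.
    return is_list_like(val) and getattr(val, "ndim", 1) >= 1


# Stands in for structured_array["prop"] from the report: zero rows, 2-D.
field = np.empty((0, 1), dtype=int)

print(passes_current_check(field), passes_relaxed_check(field))
```

Only the relaxed variant accepts the 2-D field, which is consistent with the original error no longer being raised once the condition is loosened.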

phofl · 2021-07-20T21:41:21Z

This suppresses the error currently raised but causes another one later on:

Traceback (most recent call last):
  File "/home/developer/.config/JetBrains/PyCharm2021.1/scratches/scratch.py", line 13, in <module>
    pd.DataFrame.from_records(structured_array)
  File "/home/developer/PycharmProjects/pandas/pandas/core/frame.py", line 2118, in from_records
    mgr = arrays_to_mgr(arrays, arr_columns, result_index, columns, typ=manager)
  File "/home/developer/PycharmProjects/pandas/pandas/core/internals/construction.py", line 124, in arrays_to_mgr
    arrays = _homogenize(arrays, index, dtype)
  File "/home/developer/PycharmProjects/pandas/pandas/core/internals/construction.py", line 584, in _homogenize
    val = sanitize_array(
  File "/home/developer/PycharmProjects/pandas/pandas/core/construction.py", line 577, in sanitize_array
    subarr = _sanitize_ndim(subarr, data, dtype, index, allow_2d=allow_2d)
  File "/home/developer/PycharmProjects/pandas/pandas/core/construction.py", line 628, in _sanitize_ndim
    raise ValueError("Data must be 1-dimensional")
ValueError: Data must be 1-dimensional

Process finished with exit code 1
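
Since the second error complains that the data must be 1-dimensional, one caller-side workaround is to flatten the structured array before passing it in. This is a minimal sketch, assuming `from_records` accepts an empty 1-D structured array; it is not the change made in #42735.

```python
import numpy as np
import pandas as pd

structured_dtype = [("prop", int)]
structured_array = np.array(np.empty((0, 1)), dtype=structured_dtype)

# Collapse the array to one dimension so that every field is 1-D
# before handing it to pandas.
flat = structured_array.reshape(-1)

df = pd.DataFrame.from_records(flat)
print(df.columns.tolist(), len(df))
```

Alternatively, building the array as 1-D from the start, for example with `np.empty(0, dtype=structured_dtype)`, avoids the extra dimension entirely.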

felixdivo added the Bug and Needs Triage labels Jul 9, 2021

mzeitlin11 added the Constructors and Regression labels and removed the Bug and Needs Triage labels Jul 11, 2021

mzeitlin11 added this to the 1.3.1 milestone Jul 11, 2021

simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue Jul 16, 2021

code sample for pandas-dev#42456

0071de3

simonjayhawkins modified the milestones: 1.3.1, 1.3.2 Jul 24, 2021

jbrockmendel added a commit to jbrockmendel/pandas that referenced this issue Jul 26, 2021

REGR: DataFrame.from_records with empty records pandas-dev#42456

25b0f42

jbrockmendel mentioned this issue Jul 26, 2021

REGR: DataFrame.from_records with empty records #42456 #42735

Merged

4 tasks

jreback closed this as completed in #42735 Jul 26, 2021

jreback pushed a commit that referenced this issue Jul 26, 2021

REGR: DataFrame.from_records with empty records #42456 (#42735)

422e37e

meeseeksmachine mentioned this issue Jul 26, 2021

Backport PR #42735 on branch 1.3.x (REGR: DataFrame.from_records with empty records #42456) #42740

Merged

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this issue Jul 26, 2021

Backport PR pandas-dev#42735: REGR: DataFrame.from_records with empty…

67165fa

… records pandas-dev#42456

simonjayhawkins pushed a commit that referenced this issue Jul 27, 2021

Backport PR #42735: REGR: DataFrame.from_records with empty records #…

e0c5767

…42456 (#42740) Co-authored-by: jbrockmendel <jbrockmendel@gmail.com>

CGe0516 pushed a commit to CGe0516/pandas that referenced this issue Jul 29, 2021

REGR: DataFrame.from_records with empty records pandas-dev#42456 (pan…

1d734c9

…das-dev#42735)

feefladder pushed a commit to feefladder/pandas that referenced this issue Sep 7, 2021

REGR: DataFrame.from_records with empty records pandas-dev#42456 (pan…

b910e59

…das-dev#42735)
