BUG: pd.DataFrame.from_records() raises a KeyError if passed a string index and an empty iterable
#47285
Comments
Relates to #2633 (comment). |
Hi, thanks for your report. I'm not sure this qualifies as a bug; it looks to me like it behaves as documented.
Since your field does not exist, it does not work. |
In the docs, three compatible data structures are included in the examples. For the list-of-dicts case, I can see that for consistency the empty list should work:

import pandas as pd

data = [
    {"col_1": 3, "col_2": "a"},
    {"col_1": 2, "col_2": "b"},
    {"col_1": 1, "col_2": "c"},
    {"col_1": 0, "col_2": "d"},
]
result = pd.DataFrame.from_records(data, index="col_1")
print(result)

# same records, but the third dict has no "col_1" key
data = [
    {"col_1": 3, "col_2": "a"},
    {"col_1": 2, "col_2": "b"},
    {"col_2": "c"},
    {"col_1": 0, "col_2": "d"},
]
result = pd.DataFrame.from_records(data, index="col_1")
print(result)

Maybe it should raise KeyError for this last case. But if the list is empty, as in the OP example, I would expect to get an empty DataFrame with the same Index name and columns (no rows).

Now, if the data structure is an np.array, it works as intended:

import numpy as np

# empty data, includes "foo" field
data = np.array([], dtype=[("foo", "i4"), ("col_2", "U1")])
print(repr(data))
print(pd.DataFrame.from_records(data, index="foo"))

If the record data structure is a list of tuples, i.e. it does not have field names, it correctly raises:

data = [(3, "a"), (2, "b"), (1, "c"), (0, "d")]
pd.DataFrame.from_records(data, index="foo")

Since for the empty-list case we don't know whether it is a list of dicts or a list of tuples, we should probably not raise. |
I think that we should assume that the user is passing an empty list of dicts; otherwise they would not be passing a field label to |
This wasn't clear. Without a schema, we do not know the expected columns, so the result would be an empty DataFrame (no columns and no rows). |
I guess you can catch the KeyError exception and return an empty dataframe with a named index explicitly:
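One way that workaround could look, as a minimal sketch: it assumes the records are passed as a list and the index is a single string label, and the helper name from_records_or_empty is hypothetical (it is not part of pandas).

```python
import pandas as pd


def from_records_or_empty(data, index):
    """Hypothetical wrapper around DataFrame.from_records (not a pandas API).

    Assumes `data` is a list and `index` is a single string label.
    """
    try:
        return pd.DataFrame.from_records(data, index=index)
    except KeyError:
        if len(data) > 0:
            # Non-empty records that lack the field: keep the original error.
            raise
        # Empty input: return a frame with no rows and a named index.
        return pd.DataFrame(index=pd.Index([], name=index))


empty = from_records_or_empty([], index="foo")
print(empty.index.name, len(empty))
```

The KeyError is only swallowed when the input is empty, so a list of tuples without field names still raises, as discussed earlier in the thread.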
|
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
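A minimal sketch of the call described in this report, assuming an empty list as the iterable and "foo" as the index label (both chosen here for illustration; the function name is hypothetical). It only reports what happens, since the outcome may depend on the pandas version:

```python
import pandas as pd


def try_empty_from_records(index_label="foo"):
    """Call from_records with an empty list and describe the outcome."""
    try:
        frame = pd.DataFrame.from_records([], index=index_label)
    except KeyError as exc:
        # This is the behavior the report describes.
        return f"KeyError: {exc}"
    return f"returned a frame with index name {frame.index.name!r}"


print(try_empty_from_records())
```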
Issue Description
Building a dataframe from records using an empty iterable for data and a string for index raises a KeyError exception instead of returning an empty dataframe with a named index.
Here is the stack trace:
Expected Behavior
I would expect .from_records() not to raise and instead return an empty dataframe with a named index.
Put into code:
At the moment only the first test passes.
Installed Versions