to_pandas() breaks Python objects in object column #13021
Comments
It seems to be happening during the pyarrow conversion: polars/py-polars/src/dataframe.rs Line 879 in 0e34695
Which does some low-level stuff:
The Path ends up as a fixed-size binary value:
pl.DataFrame([Path("index.html")]).to_arrow()
# pyarrow.Table
# column_0: fixed_size_binary[8]
# ----
# column_0: [[60BA073201000000]]
Does anybody know what this binary value represents, and whether the original object can be reconstructed from it?
Same.
I had to change
The 8-byte value is presumably the memory address of the Python object, interpreted as bytes instead of as a
@stinodego I will try to fix this.
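The memory-address reading can be checked with a few lines of standard-library Python. This is only a sketch: it assumes the bytes were written in little-endian order on a 64-bit machine, and it relies on the CPython detail that id() returns an object's address. The round trip at the end works only inside the same process and only while the object is still alive, so it is not a way to recover objects from an already exported Arrow table.

```python
import ctypes
import sys
from pathlib import Path

# The 8 bytes PyArrow printed for the object column.
raw = bytes.fromhex("60BA073201000000")

# Assumption: little-endian byte order, as on x86-64 and Apple silicon.
address = int.from_bytes(raw, "little")
print(hex(address))  # 0x13207ba60, which looks like a heap address

# Round trip in the current process (CPython only, where id() is the address).
obj = Path("index.html")
pointer_size = ctypes.sizeof(ctypes.c_void_p)
as_bytes = id(obj).to_bytes(pointer_size, sys.byteorder)

# Turn the bytes back into an address, then into a reference to the object.
recovered_address = int.from_bytes(as_bytes, sys.byteorder)
recovered = ctypes.cast(recovered_address, ctypes.py_object).value
print(recovered is obj)  # True while obj is still alive
```

Dereferencing an address whose object has been garbage-collected is undefined behavior, so this is useful for diagnosis rather than as a workaround.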
What I've learned so far:
Option 1: Casting
Option 2: Bypass PyArrow
I got Option 2 working pretty easily, so I just need to write tests.
Checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Output:
Log output
No response
Issue description
Both Polars and pandas can hold Python objects such as pathlib.Path in a dtype=object column. Converting such a DataFrame with to_pandas() doesn't throw an error, but the values appear to be broken.
Expected behavior
Installed versions