Cannot create Pandas Dataframe from Julia Dataframe #322

Closed
schlichtanders opened this issue Jun 5, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@schlichtanders
Contributor

schlichtanders commented Jun 5, 2023

Affects: JuliaCall

Describe the bug

I get the error JuliaError: MethodError: no method matching iterate(::Symbol). The same error occurs when simply iterating over the pairs iterator from Python. Here is the example in Python:

from juliacall import Main as jl
import pandas as pd

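# Load DataFrames on the Julia side (assumed to be installed), then define df2.
jl.seval("using DataFrames")
jl.seval("""
df2 = DataFrame(grp=repeat(1:2, 3), x=6:-1:1, y=4:9, z=[3:7; missing], id='a':'f')
""")
# Passing the Julia pairs iterator straight to pandas triggers the error.
pd.DataFrame(jl.pairs(jl.eachcol(jl.df2)))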

raises

---------------------------------------------------------------------------
JuliaError                                Traceback (most recent call last)
Cell In[78], line 1
----> 1 pd.DataFrame(jl.pairs(jl.eachcol(jl.df2)))

File ~/Projects/Jolin.io/workshop-accelerate-Python-with-Julia/.venv/lib/python3.10/site-packages/pandas/core/frame.py:781, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    779     if columns is not None:
    780         columns = ensure_index(columns)
--> 781     arrays, columns, index = nested_data_to_arrays(
    782         # error: Argument 3 to "nested_data_to_arrays" has incompatible
    783         # type "Optional[Collection[Any]]"; expected "Optional[Index]"
    784         data,
    785         columns,
    786         index,  # type: ignore[arg-type]
    787         dtype,
    788     )
    789     mgr = arrays_to_mgr(
    790         arrays,
    791         columns,
   (...)
    794         typ=manager,
    795     )
    796 else:

File ~/Projects/Jolin.io/workshop-accelerate-Python-with-Julia/.venv/lib/python3.10/site-packages/pandas/core/internals/construction.py:498, in nested_data_to_arrays(data, columns, index, dtype)
    495 if is_named_tuple(data[0]) and columns is None:
    496     columns = ensure_index(data[0]._fields)
--> 498 arrays, columns = to_arrays(data, columns, dtype=dtype)
    499 columns = ensure_index(columns)
    501 if index is None:

File ~/Projects/Jolin.io/workshop-accelerate-Python-with-Julia/.venv/lib/python3.10/site-packages/pandas/core/internals/construction.py:837, in to_arrays(data, columns, dtype)
    834     arr, columns = _list_of_series_to_arrays(data, columns)
    835 else:
    836     # last ditch effort
--> 837     data = [tuple(x) for x in data]
    838     arr = _list_to_arrays(data)
    840 content, columns = _finalize_columns_and_data(arr, columns, dtype)

File ~/Projects/Jolin.io/workshop-accelerate-Python-with-Julia/.venv/lib/python3.10/site-packages/pandas/core/internals/construction.py:837, in <listcomp>(.0)
    834     arr, columns = _list_of_series_to_arrays(data, columns)
    835 else:
    836     # last ditch effort
--> 837     data = [tuple(x) for x in data]
    838     arr = _list_to_arrays(data)
    840 content, columns = _finalize_columns_and_data(arr, columns, dtype)

File ~/.julia/packages/PythonCall/dsECZ/src/jlwrap/iter.jl:37, in __next__(self)
     35         return self
     36     def __next__(self):
---> 37         return self._jl_callmethod($(pyjl_methodnum(pyjliter_next)))
     38 """, @__FILE__(), "exec"), jl.__dict__)
     39 pycopy!(pyjlitertype, jl.IteratorValue)

JuliaError: MethodError: no method matching iterate(::Symbol)

Closest candidates are:
  iterate(!Matched::Union{LinRange, StepRangeLen})
   @ Base range.jl:880
  iterate(!Matched::Union{LinRange, StepRangeLen}, !Matched::Integer)
   @ Base range.jl:880
  iterate(!Matched::Union{LinearAlgebra.Eigen, LinearAlgebra.GeneralizedEigen})
   @ LinearAlgebra /nix/store/i6jayqiqfw6h8inkhqigkarv2gjar02a-julia-bin-1.9.0/share/julia/stdlib/v1.9/LinearAlgebra/src/eigen.jl:122
  ...

Stacktrace:
 [1] pyjliter_next(self::PythonCall.Iterator)
   @ PythonCall ~/.julia/packages/PythonCall/dsECZ/src/jlwrap/iter.jl:14
 [2] _pyjl_callmethod(f::Any, self_::Ptr{PythonCall.C.PyObject}, args_::Ptr{PythonCall.C.PyObject}, nargs::Int64)
   @ PythonCall ~/.julia/packages/PythonCall/dsECZ/src/jlwrap/base.jl:57
 [3] _pyjl_callmethod(o::Ptr{PythonCall.C.PyObject}, args::Ptr{PythonCall.C.PyObject})
   @ PythonCall.C ~/.julia/packages/PythonCall/dsECZ/src/cpython/jlwrap.jl:47

Your system

  • Operating system: NixOS
  • Versions: Julia 1.9.0, Python 3.10.9, PythonCall, JuliaCall 0.9.13
schlichtanders added the bug (Something isn't working) label on Jun 5, 2023
@schlichtanders
Contributor Author

I found the following workaround: convert to a dict twice, first to a Julia Dict and then to a Python dict.

pd.DataFrame(dict(jl.Dict(jl.pairs(jl.eachcol(jl.df2)))))

at least it is still kind of a one liner 😄

@cjdoris
Collaborator

cjdoris commented Jun 5, 2023

PythonCall has a function for this: jl.pytable.
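
A minimal sketch of how this could look from the Python side. It assumes that pytable is reachable as jl.PythonCall.pytable (i.e. that the PythonCall module is loaded in Main) and that it returns a pandas DataFrame by default; check both against your installed version. The helper name to_pandas is hypothetical.

```python
def to_pandas(jl, table):
    """Convert a Julia table (e.g. a DataFrame) to a pandas DataFrame.

    `jl` is expected to be `juliacall.Main`. It is passed in as an argument
    so that importing this helper does not start Julia.
    """
    # Assumption: PythonCall is loaded in Main and exposes `pytable`,
    # which accepts a Tables.jl-compatible table.
    return jl.PythonCall.pytable(table)


# Usage (needs juliacall, pandas and the Julia package DataFrames installed):
#   from juliacall import Main as jl
#   jl.seval("using DataFrames")
#   df = jl.seval("DataFrame(x=1:3, y=[1.0, 2.0, 3.0])")
#   pdf = to_pandas(jl, df)
```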

@schlichtanders
Contributor Author

Impressive, I thought this worked exactly the opposite way. I looked through many Discourse threads and rescanned the docs, but I didn't come across this. Thank you for the link! I see now that I was looking in the wrong part of the documentation: I only inspected the Python side and the Julia side, and didn't look for a separate compatibility page.

I think PyTable is listed on the PythonCall subpage, but pytable is not yet listed on the JuliaCall subpage. It would be great to add it there so that people can find it in the reference by searching for pandas or DataFrame.

@schlichtanders
Contributor Author

I am running into the very same problem when I want to send a Julia Dict to Python.

pandas behaves oddly with the JuliaDict wrapper, so I would really like to understand the recommended way to convert a Julia Dict to a Python dict. (I couldn't find it so far.)

@cjdoris
Collaborator

cjdoris commented Jun 15, 2023

pydict 🙂
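
A hedged sketch of using pydict from Python, under the same assumption as for pytable (that it is reachable as jl.PythonCall.pydict). The helper name to_python_dict and the type check are illustrative additions, not part of the library.

```python
def to_python_dict(jl, julia_dict):
    """Convert a Julia Dict into a plain Python dict.

    `jl` is expected to be `juliacall.Main`.
    """
    # Assumption: PythonCall is loaded in Main and exposes `pydict`.
    converted = jl.PythonCall.pydict(julia_dict)
    # Fail early if we still hold a wrapper instead of a real dict.
    if not isinstance(converted, dict):
        raise TypeError(f"expected dict, got {type(converted).__name__}")
    # Note: keys or values without a direct Python equivalent may still
    # arrive as wrapped Julia objects inside the dict.
    return converted


# Usage (needs juliacall installed):
#   from juliacall import Main as jl
#   d = jl.seval('Dict("a" => 1, "b" => 2)')
#   py_d = to_python_dict(jl, d)
```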

@schlichtanders
Contributor Author

That makes a lot of sense, thank you.

Let me summarize how I understand PythonCall/JuliaCall:

  1. No implicit conversions happen.
  2. However, standard types are wrapped in equivalent Python/Julia wrappers, which in many cases behave like normal Python/Julia objects.
  3. Sometimes this fails, in which case you need explicit conversions (e.g. using pytable or pydict).
    • This kind of failure is systematic rather than easily preventable. As in the cases above, it comes from the types being different: if no implicit conversion happens and only wrappers are used, the wrapper necessarily has a different type than the native Python/Julia object, and code that depends on the type can break.
    • I think it would help people if the docs had a highlighted warning about this expected class of failures and the standard ways of dealing with them.
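
The type mismatch described in point 3 can be illustrated in pure Python, without Julia. This is only an analogy: WrappedDict and describe are hypothetical stand-ins, not PythonCall's actual wrapper class or pandas' actual dispatch code.

```python
from collections.abc import Mapping


class WrappedDict(Mapping):
    """Stand-in for a foreign-language dict wrapper: behaves like a mapping,
    but is not a subclass of the built-in dict."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def describe(obj):
    """Mimic library code that branches on the concrete type."""
    if isinstance(obj, dict):
        return "dict"
    return "something else"


wrapped = WrappedDict({"x": [1, 2], "y": [3, 4]})
print(describe(wrapped))        # prints "something else": the wrapper is not a dict
print(describe(dict(wrapped)))  # prints "dict": explicit conversion gives a real dict
```

Item access such as wrapped["x"] works fine on the wrapper; only code that checks for the concrete dict type takes a different branch, which is why the explicit conversion helps.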

@cjdoris
Collaborator

cjdoris commented Jun 16, 2023

That sounds about right. I'm always happy to take PRs for improvements to the docs if you think something can be clearer.

@cjdoris cjdoris closed this as completed Jun 16, 2023