Cloudpickle cannot pickle cythonized methods #259

Open
marcbllv opened this issue Mar 28, 2019 · 9 comments

@marcbllv

I'm using Dask to parallelize computation, and my code is compiled with Cython.

When computing, Dask actually asks cloudpickle to dump the compute method, but this operation fails.

It looks like cloudpickle cannot dump the methods of a class that has been compiled with Cython, although the same methods can be dumped with the standard pickle module.

How to reproduce:

Here is the class. I'd like to pickle the method my_method of an instance of it:

class MyClass:

    def __init__(self, a):
        self.a = a

    def my_method(self, b):
        return self.a + b

The following code works properly when MyClass is imported from the pure-Python module:

import cloudpickle
from module import MyClass

my_obj = MyClass(a=10)
cloudpickle.dumps(my_obj.my_method)

But this fails if the module is cythonized: when MyClass is imported from the compiled module, the call to cloudpickle.dumps(my_obj.my_method) raises the following error:

>>> cloudpickle.dumps(my_obj.my_method)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 961, in dumps
    cp.dump(obj)
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 267, in dump
    return Pickler.dump(self, obj)
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 727, in save_instancemethod
    self.save_reduce(types.MethodType, (obj.__func__, obj.__self__), obj=obj)
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/pickle.py", line 638, in save_reduce
    save(args)
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/pickle.py", line 771, in save_tuple
    save(element)
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/pickle.py", line 535, in save
    self.save_global(obj, rv)
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/site-packages/cloudpickle/cloudpickle.py", line 705, in save_global
    return Pickler.save_global(self, obj, name=name)
  File "/home/marc.beillevaire/conda/envs/test_cythn/lib/python3.7/pickle.py", line 957, in save_global
    (obj, module_name, name)) from None
_pickle.PicklingError: Can't pickle <cyfunction MyClass.my_method at 0x7f78f662e608>: it's not found as module.my_method
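
Reading the last frames: pickle.py line 535 is the branch taken when an object's reduction returns a plain string, which pickle then resolves as a global name inside the object's module. The message suggests the bare name my_method was used rather than the qualified MyClass.my_method. A minimal sketch of that lookup follows; the in-memory `module` object and the `lookup` helper are stand-ins for illustration, not part of pickle or cloudpickle.

```python
import types


class MyClass:
    def __init__(self, a):
        self.a = a

    def my_method(self, b):
        return self.a + b


# Hypothetical stand-in for the compiled module, built in memory.
module = types.ModuleType("module")
module.MyClass = MyClass


def lookup(mod, dotted_name):
    """Resolve a dotted name one attribute at a time (illustrative helper)."""
    obj = mod
    for part in dotted_name.split("."):
        obj = getattr(obj, part)
    return obj


# The qualified name reaches the function through the class.
found = lookup(module, "MyClass.my_method")

# The bare name is not a module-level attribute, so the lookup fails,
# mirroring "it's not found as module.my_method".
try:
    lookup(module, "my_method")
except AttributeError:
    bare_name_found = False
else:
    bare_name_found = True
```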

Here is the simple setup.py file I'm using to generate the cythonized module:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("src/module.py")
)

Then run: python setup.py build_ext --inplace

Note that the standard pickle module works with both the Python and the cythonized code, i.e. import pickle; pickle.dumps(my_obj.my_method) works properly.
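
One plausible reason the standard pickle module is unaffected: on CPython 3, a bound method reduces to getattr(instance, name), so only the instance and the method name are serialized and the function object itself is never looked up. The traceback above shows cloudpickle instead serializing obj.__func__ directly. A small sketch with the pure-Python class (Cython is not needed to see the reduction):

```python
class MyClass:
    def __init__(self, a):
        self.a = a

    def my_method(self, b):
        return self.a + b


my_obj = MyClass(a=10)

# A bound method's __reduce__ returns (getattr, (instance, method_name)).
rebuild, args = my_obj.my_method.__reduce__()
instance, method_name = args

# Applying the reduction by hand mirrors what unpickling would do:
# fetch the method again from the (restored) instance by name.
restored = rebuild(instance, method_name)
result = restored(5)
```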

Did I miss something?

I tried adding __getstate__/__setstate__ that return self.__dict__ but this does not solve the problem.
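
For context, the attempt described here presumably looked like the sketch below (the actual code is not shown, so the method bodies are an assumption). These hooks only control how the instance's attributes are saved and restored; they have no influence on how the function behind a bound method is located, which is consistent with the error persisting.

```python
import copy


class MyClass:
    def __init__(self, a):
        self.a = a

    def my_method(self, b):
        return self.a + b

    def __getstate__(self):
        # Assumed implementation: expose the instance attributes as the state.
        return self.__dict__

    def __setstate__(self, state):
        # Assumed implementation: restore the attributes from the saved state.
        self.__dict__.update(state)


my_obj = MyClass(a=10)

# copy.copy goes through the same __reduce_ex__ protocol as pickle, so it
# exercises both hooks without requiring an importable module.
clone = copy.copy(my_obj)
state = my_obj.__getstate__()
```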

Does anyone have a clue about this? Thanks for the help :)

@pierreglaser
Member

We do have our own way of pickling methods, which fails in this case. Is module expected to be available on the nodes of your cluster?

@ogrisel
Contributor

ogrisel commented Apr 10, 2019

> Is module expected to be available on the nodes of your cluster?

I guess we should assume so. In this case, the module can be imported on the client node and the error is raised at pickling time, so this is a bug.

@marcbllv
Author

marcbllv commented Apr 10, 2019

Thank you for the replies, ogrisel and pierreglaser!

I eventually solved my initial problem by modifying __getstate__/__setstate__; it was a pickling issue on my side.

If the failure to pickle Cython methods is not considered a bug, then I guess the issue can be closed.

@ogrisel
Contributor

ogrisel commented Apr 10, 2019

No: if it works with the standard pickle.dumps and the module is expected to be installed on all the nodes of the cluster, then it should work with cloudpickle, so I believe this is a bug. I have not looked into the details yet, though.

@ogrisel
Contributor

ogrisel commented May 21, 2019

I have not checked but this issue might have been fixed when merging #262.

@marcbllv can you please install cloudpickle master and check whether this still fails (ideally under both Python 2.7 and Python 3.7)?

@ogrisel
Contributor

ogrisel commented May 21, 2019

If this is fixed, it would be great to write a non-regression test and a changelog entry.

@pierreglaser
Member

I just checked: it is not fixed. But I'm working on it, as this issue also appears in #273.

@pierreglaser
Member

FYI, this was primarily an upstream Cython bug, and it just got fixed.

@jakirkham
Member

Should be included in Cython 0.29.24 (once released)
