Dill breaks cloudpickle > 1.3.0 on inner functions with closure #383

Closed
davidsmf opened this issue Aug 26, 2020 · 3 comments

@davidsmf

Cloudpickle 1.5.0, dill 0.3.2 or 0.3.1.1:

import dill
import cloudpickle


class Foo:

    def bar(self, param: int):
        hello = "Hello"
        def baz() -> None:
            print(hello, param)

        cloudpickle.loads(cloudpickle.dumps(baz))()


if __name__ == '__main__':
    Foo().bar(param=2)

This fails with ValueError: Cell is empty.

Without the dill import, or with the dill import placed after the cloudpickle import, it all works fine.

If the inner function is just

        def baz() -> None:
            print("Hello world")

then it's also fine.

The problem seems to be if the inner function captures anything in its closure.

According to the cloudpickle dev (cloudpipe/cloudpickle#393):

In this precise case, importing dill triggers a global side effect which adds a high-priority cell reducer that is not compatible with cloudpickle's way of reducing cells.

Is there a way to get around this, and have both dill and cloudpickle work?

@mmckerns
Member

The issue is that cloudpickle and dill inject different functions into the pickle registry... and thus cloudpickle also breaks dill at times (see #217), because each package can make different choices on how to serialize an object. In some cases, they are incompatible.
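As a rough illustration of why a shared registry causes this kind of clash, here is a minimal sketch using only the standard library's copyreg module. It is an analogy, not dill's or cloudpickle's actual code: the Point class and both reducers are hypothetical stand-ins for two packages that register different serialization choices for the same type.

```python
import copyreg
import pickle


class Point:
    """Hypothetical type that two packages both want to serialize."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def reduce_as_tuple(p):
    # "Package A" (hypothetical) chooses to rebuild a Point as a plain tuple.
    return tuple, ((p.x, p.y),)


def reduce_as_dict(p):
    # "Package B" (hypothetical) chooses to rebuild a Point as a dict.
    return dict, ([("x", p.x), ("y", p.y)],)


# Package A registers its reducer in the process-wide copyreg table.
copyreg.pickle(Point, reduce_as_tuple)
first = pickle.loads(pickle.dumps(Point(1, 2)))

# Package B registers later and silently replaces A's choice for every caller.
copyreg.pickle(Point, reduce_as_dict)
second = pickle.loads(pickle.dumps(Point(1, 2)))

print(first)
print(second)
```

The same pickle.dumps call gives a different result depending on which registration ran last, which is the general shape of an import-order-dependent failure like the one reported here.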

However, rather than only giving one option, dill provides several settings for modifying how the serialization happens, which can hopefully work around the times when the packages are by default incompatible.

If you want a serialization that is similar to the choices that cloudpickle uses, you can use: dill.settings['recurse'] = True to set this choice globally, or you can use it on a per call basis. However, this workaround looks like it won't work in your case, where it's just the dill import that is giving you issues. You can, however, use dill.extend(False) to remove dill's injection into the pickle registry.

Then, in case you want cloudpickle and dill working together, you can use dill.extend(True) to turn the injection back on.
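A minimal sketch of the two options mentioned above, assuming dill is installed. The make_greeter function and the dill_injection_disabled helper are hypothetical names introduced for illustration, and the helper assumes the injection was enabled beforehand, since it always turns it back on when the block exits.

```python
from contextlib import contextmanager

import dill


def make_greeter(name):
    """Return an inner function that captures two variables in its closure."""
    greeting = "Hello"

    def greet():
        return f"{greeting} {name}"

    return greet


# Per-call setting: pass recurse=True for this dumps call only.
# The global equivalent would be dill.settings['recurse'] = True.
payload = dill.dumps(make_greeter("world"), recurse=True)
restored = dill.loads(payload)


@contextmanager
def dill_injection_disabled():
    """Hypothetical helper: remove dill's pickle-registry injection for a block."""
    dill.extend(False)
    try:
        yield
    finally:
        # Assumes the injection was on before entering the block.
        dill.extend(True)
```

With this helper, code that must not see dill's injected reducers (for example the cloudpickle import) could run inside `with dill_injection_disabled():`. Whether that is enough depends on the versions involved, as the follow-up comment shows.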

@mmckerns
Member

Please close if this resolves the issue for you.

@davidsmf
Author

davidsmf commented Sep 1, 2020

I've changed the code to

import dill
dill.extend(False)
import cloudpickle
dill.extend(True)

which seems to fix it in this case, and dill will pickle the classes we need. Unfortunately, we use dask-distributed, which means we have to use cloudpickle, and in our actual case it's broken. I'll go back to them.

@davidsmf davidsmf closed this as completed Sep 1, 2020
@mmckerns mmckerns added this to the dill-0.3.3 milestone Oct 28, 2020
michaelosthege added a commit to JuBiotech/calibr8 that referenced this issue Jan 8, 2021
tt.TensorVariable was probably the wrong way to begin with.
PyGMO uses cloudpickle, PyMC3 uses dill - they currently break each other.
See uqfoundation/dill#383
michaelosthege added a commit to JuBiotech/bletl that referenced this issue Jul 7, 2021
…ng in case of pickling problems

The pickling fails if dill and cloudpickle are installed at the same time (see uqfoundation/dill#383).
This is, for example, the case when PyMC3 is installed.