wrapt functions are not dill-able #34

Open
jabooth opened this issue Nov 18, 2014 · 14 comments

@jabooth

jabooth commented Nov 18, 2014

It would be extremely useful if wrapt functions were serializable in some way. Although pickle support is unlikely given that pickle doesn't even handle undecorated functions, I hoped the more capable dill may have been up to the task, as dill is capable of handling the decorator pattern in general:

def basic_pass_through(f):

    def g(*args, **kwargs):
        return f(*args, **kwargs)

    return g

@basic_pass_through
def basic_function(a):
    return 2 * a

import dill
f_str = dill.dumps(basic_function)
function_back = dill.loads(f_str)

assert function_back(2) == 4

Unfortunately, with wrapt we get:

import wrapt

@wrapt.decorator
def pass_through(wrapped, instance, args, kwargs):
    return wrapped(*args, **kwargs)

@pass_through
def function():
    pass

import dill
f_str = dill.dumps(function)
function_back = dill.loads(f_str)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-f472b826a7de> in <module>()
     13 
     14 f_str = dill.dumps(function)
---> 15 function_back = dill.loads(f_str)

.../lib/python2.7/site-packages/dill/dill.pyc in loads(str)
    158     """unpickle an object from a string"""
    159     file = StringIO(str)
--> 160     return load(file)
    161 
    162 # def dumpzs(obj, protocol=None):

.../lib/python2.7/site-packages/dill/dill.pyc in load(file)
    148     pik = Unpickler(file)
    149     pik._main_module = _main_module
--> 150     obj = pik.load()
    151     if type(obj).__module__ == _main_module.__name__: # point obj class to main
    152         try: obj.__class__ == getattr(pik._main_module, type(obj).__name__)

.../lib/python2.7/pickle.pyc in load(self)
    856             while 1:
    857                 key = read(1)
--> 858                 dispatch[key](self)
    859         except _Stop, stopinst:
    860             return stopinst.value

...lib/python2.7/pickle.pyc in load_newobj(self)
   1081         args = self.stack.pop()
   1082         cls = self.stack[-1]
-> 1083         obj = cls.__new__(cls, *args)
   1084         self.stack[-1] = obj
   1085     dispatch[NEWOBJ] = load_newobj

TypeError: Required argument 'code' (pos 1) not found

I would happily prepare a PR for wrapt which enabled support for dill serialization but I'm not too sure where to start. Certainly by inspection we can see that dill has failed to serialize the wrapt function in the first place:

>>> print(f_str)
'\x80\x02cdill.dill\n_load_type\nq\x00U\x0cFunctionTypeq\x01\x85q\x02Rq\x03)\x81q\x04}q\x05b.'

I don't know if I would be better off bringing this to the dill author @mmckerns first or yourself, but if anyone can shed any light I would love to get this working.
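The serialized string printed above can be inspected with the standard-library pickletools module, which disassembles a pickle without executing it and so does not need dill to be importable. The sketch below is only an illustration: it assumes the Python 2 string maps byte-for-byte onto the Python 3 bytes literal shown. The opcode listing suggests the stream merely looks up FunctionType through dill's _load_type and then issues NEWOBJ with an empty argument tuple, which is consistent with the TypeError about the missing 'code' argument on load.

```python
import pickletools

# Bytes copied from the print(f_str) output above, written as a Python 3
# bytes literal (assumption: the Python 2 str maps byte-for-byte).
f_str = (b'\x80\x02cdill.dill\n_load_type\nq\x00U\x0cFunctionTypeq\x01'
         b'\x85q\x02Rq\x03)\x81q\x04}q\x05b.')

# genops yields (opcode, argument, position) without running the pickle.
ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(f_str)]

for name, arg in ops:
    print(name, "" if arg is None else arg)

opcode_names = [name for name, _arg in ops]
```

Nothing in the stream describes the wrapped function or the decorator: there is no code object, no closure cell and no reference to pass_through.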

@mmckerns

This would probably be something to work out with the wrapt developers. I'd like to not stuff too many 3rd-party specific objects in dill; however, if this ends up being a new reusable object pattern, then by all means it should be handled by dill. I expect it should likely be a PR from you to wrapt, but I'll be happy to help out.

So a good place to start is to work with the dill.detect module and inspect the serialization process and the errors. Start with dill.detect.trace(True) to increase the diagnostic prints as you are pickling. Or you can dive into the object with dill.detect.badobjects or the other functions in dill.detect. With some more information, it might be easier to point out what the issue is with wrapt objects. If you'd be happy to explore, I'll make sure to have a look at what you generate.

I've found that dill can serialize some very complicated decorators… but you have to be very careful about things like dynamically added attributes on object instances defined inside decorators. dill can handle that, for example, but I don't know whether there are edge cases it doesn't work on.

@jabooth
Author

jabooth commented Nov 19, 2014

Thanks so much for your input @mmckerns.

So to get started I've set dill.detect.trace(True) and then re-run the above example. I think it's fair to say that my two examples are doing roughly equivalent jobs, in the sense that @wrapt.decorator is a convenient way to prepare something similar to the more verbose closure I provided.

import wrapt

@wrapt.decorator
def pass_through(wrapped, instance, args, kwargs):
    return wrapped(*args, **kwargs)

@pass_through
def function():
    pass


import dill
dill.detect.trace(True)
f_str = dill.dumps(function)
T1: <type 'function'>
INFO:dill:T1: <type 'function'>
F2: <function _load_type at 0x7f4d20b94b90>
INFO:dill:F2: <function _load_type at 0x7f4d20b94b90>
D2: <dict object at 0x7f4d3041cb40>
INFO:dill:D2: <dict object at 0x7f4d3041cb40>
assert function == dill.detect.badobjects(function)  # True

for comparison:

def basic_pass_through(f):

    def g(*args, **kwargs):
        return f(*args, **kwargs)

    return g

@basic_pass_through
def basic_function(a):
    return 2 * a

import dill
dill.detect.trace(True)
f_str = dill.dumps(basic_function)
F1: <function g at 0x7f4d20b226e0>
INFO:dill:F1: <function g at 0x7f4d20b226e0>
F2: <function _create_function at 0x7f4d20b94c80>
INFO:dill:F2: <function _create_function at 0x7f4d20b94c80>
Co: <code object g at 0x7f4d20b337b0, file "<ipython-input-15-3e3fa08cad52>", line 3>
INFO:dill:Co: <code object g at 0x7f4d20b337b0, file "<ipython-input-15-3e3fa08cad52>", line 3>
F2: <function _unmarshal at 0x7f4d20b94b18>
INFO:dill:F2: <function _unmarshal at 0x7f4d20b94b18>
D1: <dict object at 0x7f4d334f8b40>
INFO:dill:D1: <dict object at 0x7f4d334f8b40>
Ce: <cell at 0x7f4d20b323d0: function object at 0x7f4d20b22b18>
INFO:dill:Ce: <cell at 0x7f4d20b323d0: function object at 0x7f4d20b22b18>
F2: <function _create_cell at 0x7f4d20b94f50>
INFO:dill:F2: <function _create_cell at 0x7f4d20b94f50>
F1: <function basic_function at 0x7f4d20b22b18>
INFO:dill:F1: <function basic_function at 0x7f4d20b22b18>
Co: <code object basic_function at 0x7f4d20b332b0, file "<ipython-input-15-3e3fa08cad52>", line 8>
INFO:dill:Co: <code object basic_function at 0x7f4d20b332b0, file "<ipython-input-15-3e3fa08cad52>", line 8>
D1: <dict object at 0x7f4d334f8b40>
INFO:dill:D1: <dict object at 0x7f4d334f8b40>
D2: <dict object at 0x7f4d20b38050>
INFO:dill:D2: <dict object at 0x7f4d20b38050>
D2: <dict object at 0x7f4d20b276e0>
INFO:dill:D2: <dict object at 0x7f4d20b276e0>

The first thing of note is that in the basic case dill notices that basic_function is actually the function originally defined as g in our closure. In the case of wrapt, dill immediately encounters something it considers a type, not a function. I guess it then proceeds to 'normally' pickle a type (i.e. the pickle module will go through the usual protocol of checking for a __reduce__ method, __getstate__ etc). This special type that wrapt provides is not pickleable, hence our quick failure.

@mmckerns does this analysis seem correct? If so, it seems that the answer to this problem would be to add pickle support to the special function-like types that wrapt uses, probably by implementing __getstate__ and __setstate__.

@mmckerns

@jabooth: I agree with your analysis. I see your options are to either add pickle support (which should be more robust), or to diagnose why the wrapt functions are detected as a type and not a function.

@GrahamDumpleton
Owner

For some background reading, you may want to read:

The wrapt package doesn't use function closures to implement decorators as that approach is broken in various ways and can't be used to build a really robust decorator.

The wrappers are therefore instances of a class, where the class is actually a transparent object proxy with additional function like behaviour built on top to ensure the correct binding behaviour for functions when used in conjunction with methods of a class.

There are actually two implementations of the object proxy within wrapt: a C version, which is the default, and a pure Python fallback version. Right now your tests are probably operating on the C version, which is why many evil things are happening. That said, even if you installed wrapt without the C extension and allowed it to use the Python version, I am not sure how much further you will get, as the object proxy implementation is quite nasty and overrides a great many magic Python methods, as well as doing special setup using meta classes and in the constructor. Not sure that will map well to pickling.
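To illustrate what "transparent" can mean here, the sketch below uses a hypothetical, heavily simplified proxy class (TransparentProxy is an invented name, not wrapt's implementation). It assumes the proxy reports the wrapped object's class through a __class__ property, which is one plausible reason a serializer could see "a function type" where the real object is an instance of a wrapper class.

```python
import types


class TransparentProxy:
    """Hypothetical, simplified proxy; not wrapt's actual implementation."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    @property
    def __class__(self):
        # Report the wrapped object's class instead of TransparentProxy.
        return self._wrapped.__class__

    def __call__(self, *args, **kwargs):
        return self._wrapped(*args, **kwargs)


def double(a):
    return 2 * a


proxy = TransparentProxy(double)

print(type(proxy).__name__)                    # TransparentProxy
print(proxy.__class__ is types.FunctionType)   # True
print(isinstance(proxy, types.FunctionType))   # True
print(proxy(2))                                # 4
```

Code that trusts obj.__class__ or isinstance() sees a plain function, while type(obj) still returns the proxy class. A pickler that mixes the two views can end up recording the wrong class.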

@mmckerns

@jabooth: While dill can handle most closures, I build almost all of my complex decorators that are intended to retain certain behaviors (much like wrapt) as classes. Most class variants and patterns, including dynamic attribute addition (which is nice for building decorators), and the associated class instances, can be handled by dill. However, using C is generally a show-stopper. For the Python version, dill has a chance of working… as it can handle most meta classes and overrides of internal Python methods. There are some known instances where overriding __new__ causes pickling issues. If you encounter any of these pickling issues, they can often still be overcome with an appropriate __reduce__ method, or similar.
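As a rough sketch of the __reduce__ approach mentioned above, the class below is a hypothetical class-based pass-through decorator (PassThrough is an invented name and has nothing to do with wrapt's internals). It uses the standard pickle module and a standard-library function as the wrapped callable, so that the example does not depend on dill. A class this simple would pickle without __reduce__; the method is spelled out only to show the hook that tells the pickler how to rebuild the wrapper.

```python
import math
import pickle


class PassThrough:
    """Hypothetical class-based pass-through decorator (not wrapt)."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __call__(self, *args, **kwargs):
        return self.wrapped(*args, **kwargs)

    def __reduce__(self):
        # Ask pickle to rebuild the wrapper by calling
        # PassThrough(self.wrapped) when loading.
        return (PassThrough, (self.wrapped,))


wrapped_sqrt = PassThrough(math.sqrt)

# Round-trip through pickle; math.sqrt itself is pickled by reference.
restored = pickle.loads(pickle.dumps(wrapped_sqrt))
print(restored(9.0))
```

The wrapped callable still has to be picklable on its own. With plain pickle that means it must be importable by name, which is the restriction dill tries to lift for ordinary functions.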

@GrahamDumpleton
Owner

If you want to experiment to see if the pure Python version of wrapt can be made to work, do:

pip uninstall wrapt
WRAPT_EXTENSIONS=false pip install wrapt

@mskoenz

mskoenz commented Nov 10, 2017

Awesome package, thanks! Implementation and blog are really nice!

But the dill / serialization problem is quite nasty, especially if one needs to scale a computation with multiprocessing_on_dill or concurrent.futures (e.g. to work on clusters). From a user standpoint, I fall back to decorator.decorator, which I find less nice but which allows for these large computations. This is certainly not a need of the typical Python user, but any library that uses wrapt is not viable for these reasons.

@mskoenz

mskoenz commented Nov 10, 2017

Tested (python3.6, OSX):

import wrapt

@wrapt.decorator
def pass_through(fct, instance, args, kwgs):
    print("doing nothing")
    return fct(*args, **kwgs)

@pass_through
def echo(x):
    return x

print(echo("Hello"))

import dill
dill.loads(dill.dumps(echo))

(in a new empty venv): pip install wrapt

doing nothing
Hello
Traceback (most recent call last):
  File "./test.py", line 21, in <module>
    dill.loads(dill.dumps(echo))
  File ".../dill/dill.py", line 281, in dumps
    dump(obj, file, protocol, byref, fmode, recurse)#, strictio)
  File ".../dill/dill.py", line 274, in dump
    pik.dump(obj)
  File ".../python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File ".../python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File ".../lib/python3.6/pickle.py", line 603, in save_reduce
    "args[0] from __newobj__ args has the wrong class")
_pickle.PicklingError: args[0] from __newobj__ args has the wrong class

(in a new empty venv): WRAPT_EXTENSIONS=false pip install wrapt

doing nothing
Hello
Traceback (most recent call last):
  File "./wp.py", line 21, in <module>
    dill.loads(dill.dumps(echo))
  File ".../dill/dill.py", line 281, in dumps
    dump(obj, file, protocol, byref, fmode, recurse)#, strictio)
  File ".../dill/dill.py", line 274, in dump
    pik.dump(obj)
  File ".../python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File ".../python3.6/pickle.py", line 496, in save
    rv = reduce(self.proto)
TypeError: can't pickle FunctionWrapper objects

If I use decorator.decorator:

import decorator

@decorator.decorator
def pass_through(fct, *args, **kwgs):
   ...

Output:

doing nothing
Hello

@garciparedes

Hi! I want to know if there has been any update related to this issue since the last comment. I think it would be really interesting to fix this "bug".

@GrahamDumpleton
Owner

Nothing has changed. I don't know how dill/pickle really works under the covers and no one else has offered their knowledge to investigate.

@marwan116

Following up on this thread so as not to open another issue about pickling/serializing wrapt decorators. For such a neatly implemented library, it is quite the bummer that code using multiprocessing/concurrent.futures/joblib with any wrapt-decorated function fails with serialization errors. Given how common a use case it is to parallelize code using processes (especially in data-intensive domains), I am wondering if:

  1. A temporary fix to this problem could be a way to revert to the undecorated function, if you know of one; setting enabled=False doesn't cut it ...

  2. I see on other issue threads that perhaps an ObjectProxy is picklable/serializable - would this be the workaround needed here - i.e. to implement the decorator as a wrapper instead?

  3. Surely over the last 5 years of wrapt you must have had to use multiprocessing - how did you deal with it ?

@GrahamDumpleton
Owner

@marwan116 Since pickle is different from dill, with pickle being a part of Python, can you create a separate issue and include in it a small reproducible example that I could use to test the particular scenario you want to work?

@marwan116

Hey @GrahamDumpleton thank you for taking the time to respond. I will follow up with a small reproducible example shortly.

@marwan116

@GrahamDumpleton - please refer to the following issue - (I don't see any guidelines on issue posting so please let me know if I missed anything) - #158
