Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pickle class instances whose class definitions have changed #5

Closed
mmckerns opened this issue Oct 14, 2013 · 13 comments
Closed

pickle class instances whose class definitions have changed #5

mmckerns opened this issue Oct 14, 2013 · 13 comments

Comments

@mmckerns
Copy link
Member

This is an interesting use case, detailed on stackoverflow.

@asmeurer
Copy link

A very similar situation at sympy/sympy#2538. I can't pickle Q.positive with that branch with dill.

@mmckerns
Copy link
Member Author

mmckerns commented Nov 8, 2013

With the fix in Issue #12, this issue should also be working in general -- however, one might run into issues if there things defined out of scope that have changed... but, in general it works as below for instances as well as class definitions themselves.

>>> class Foo(object):
...   def baz(self):
...     return 1
... 
>>> import dill
>>> 
>>> f = Foo()  
>>> _f = dill.dumps(f)
>>> 
>>> class Foo(object):
...   def bar(self):
...     return 0
... 
>>> o = dill.loads(_f)
>>> o.baz()
1

@mmckerns mmckerns closed this as completed Nov 8, 2013
@asmeurer
Copy link

asmeurer commented Nov 9, 2013

My one still doesn't work. The issue there is not that the class has changed, but that it is created dynamically in __new__. See https://github.com/sympy/sympy/pull/2538/files#diff-b23e62aa73719754451dd6b06ab6d83bR142

@mmckerns
Copy link
Member Author

mmckerns commented Nov 9, 2013

@asmeurer: can you send me a minimal example? The reason my above example works is that I now serialize the class definitions. Your class definitions haven't changed, but something is used to dynamically generate the instances with __new__, right? So, I put some sort of rng to generate a string that would be a method name, or some crazy thing like that... and that would match your case, correct? If that's so, you'd need to serialize the class definition and the info that seeded the dynamic part of __new__... or you'd need a different code path, possibly generating a static class def, then using that to generate an instance -- I think.

@asmeurer
Copy link

asmeurer commented Nov 9, 2013

I tried copying over the relevant logic, but now I get what looks like a bug in dill

class _sentinel:
    pass

class Test:
    def __new__(cls, name):
        if name == _sentinel:
            obj = object.__new__(cls)
        else:
            obj = type(name.capitalize() + cls.__name__, (cls,),
                {})(_sentinel)
            obj.name = name

        return obj

import dill
t = Test("Aaron")
dill.dumps(t)
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    dill.dumps(t)
  File "/Users/aaronmeurer/anaconda3/envs/dill-dev/lib/python3.3/site-packages/dill/dill.py", line 121, in dumps
    dump(obj, file, protocol)
  File "/Users/aaronmeurer/anaconda3/envs/dill-dev/lib/python3.3/site-packages/dill/dill.py", line 115, in dump
    pik.dump(obj)
  File "/Users/aaronmeurer/anaconda3/envs/dill-dev/lib/python3.3/pickle.py", line 235, in dump
    self.save(obj)
  File "/Users/aaronmeurer/anaconda3/envs/dill-dev/lib/python3.3/pickle.py", line 342, in save
    self.save_reduce(obj=obj, *rv)
  File "/Users/aaronmeurer/anaconda3/envs/dill-dev/lib/python3.3/pickle.py", line 407, in save_reduce
    save(cls)
  File "/Users/aaronmeurer/anaconda3/envs/dill-dev/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/aaronmeurer/anaconda3/envs/dill-dev/lib/python3.3/site-packages/dill/dill.py", line 702, in save_type
    _dict = _dict_from_dictproxy(obj.__dict__)
  File "/Users/aaronmeurer/anaconda3/envs/dill-dev/lib/python3.3/site-packages/dill/dill.py", line 379, in _dict_from_dictproxy
    _dict.pop('__dict__')
KeyError: '__dict__'

@asmeurer
Copy link

asmeurer commented Nov 9, 2013

Would it make a difference if I removed the part where I set __doc__ on the generated class?

@mmckerns
Copy link
Member Author

mmckerns commented Nov 9, 2013

Hmmm... the pop('__dict__') I put into the code, originally, to avoid infinite recursion in pickling classes's proxy objects. I think that some python objects in 3x. don't expose __dict__, I believe. Apparently, this issue hadn't percolated in dill, until the most recent changeset changed the course that classes run in their serialization. I need to play with it a little bit... but if there's no infinite recursion in the relevant 3.x object, the fix will just be to not try to execute the pop when in 3.x.

This should probably be a new ticket. I've got to catch a plane, but will dig into this further... as this is a bug I just added for 3.x.

@asmeurer
Copy link

asmeurer commented Nov 9, 2013

I guess, but the SymPy code raises PickleError.

@mmckerns
Copy link
Member Author

mmckerns commented Nov 9, 2013

@asmeurer: I just updated to get past the KeyError above. Now your test case pickles, but doesn't unpickle. I believe I should be able to get around this with a little more thought.

>>> t = Test("Aaron")
>>> dill.detect.errors(t)
RuntimeError('uninitialized staticmethod object',)
>>> _t = dill.dumps(t)
>>> _t
b'\x80\x03cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01X\x04\x00\x00\x00typeq\x02\x85q\x03Rq\x04X\t\x00\x00\x00AaronTestq\x05h\x00(h\x04X\x04\x00\x00\x00Testq\x06h\x01X\x06\x00\x00\x00objectq\x07\x85q\x08Rq\t\x85q\n}q\x0b(X\x07\x00\x00\x00__new__q\x0ch\x01X\x0c\x00\x00\x00staticmethodq\r\x85q\x0eRq\x0f)\x81q\x10X\n\x00\x00\x00__module__q\x11X\x08\x00\x00\x00__main__q\x12X\x07\x00\x00\x00__doc__q\x13Nutq\x14Rq\x15\x85q\x16}q\x17(h\x13NX\r\x00\x00\x00__slotnames__q\x18]q\x19h\x11h\x12utq\x1aRq\x1b)\x81q\x1c}q\x1dX\x04\x00\x00\x00nameq\x1eX\x05\x00\x00\x00Aaronq\x1fsb.'
>>> dill.loads(_t)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mmckerns/lib/python3.3/site-packages/dill/dill.py", line 135, in loads
    return load(file)
  File "/Users/mmckerns/lib/python3.3/site-packages/dill/dill.py", line 128, in load
    obj = pik.load()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 840, in load
    dispatch[key[0]](self)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 1085, in load_newobj
    obj = cls.__new__(cls, *args)
RuntimeError: uninitialized staticmethod object

@mmckerns
Copy link
Member Author

mmckerns commented Nov 9, 2013

Let's move this to it's own issue -- #13.

@asmeurer
Copy link

asmeurer commented Nov 9, 2013

It seems it still can't pickle the actual object I am interested in in my SymPy branch, though. I get PicklingError: Can't pickle <class 'sympy.assumptions.assume.PositivePredicate'>: it's not found as sympy.assumptions.assume.PositivePredicate. Does the module structure matter here? I don't create the object in the same file as the class.

@mmckerns
Copy link
Member Author

Right, so one of the things that python does that's a bit of a 'mistake' is that it associates a function or instance created in a factory pattern (like a curry) with the top level namespace where it was created and not the associated place in the namespace where the code actually lives. It's a common problem for pickling and some other similar things... and the root of a number of objects that fail to serialize with dill. So, I am looking into how to solve that problem... and it sounds like it could be related to the issue you are having.

Creating the instance in another file shouldn't matter. Module structure shouldn't matter too much, and an import can often make it easier as opposed to a reference from inner to outer scope within the same file. Encapsulation is the key to making pickling easier. The less you rely on python's namespace management magic, the easier you'll have a time in pickling something.

For me, the sympy code would be a back-burner item... but I'm happy to look into it when I get some free space (in ~2-3 weeks) if it's not cleared up by then.

@asmeurer
Copy link

No problem. It's not high priority for me at the moment. For now, that code just lives in a pull request, and it probably will for some time actually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants