Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Regression on pickling classes from the __main__ module #132

Merged
merged 11 commits into from
Nov 16, 2017

Conversation

ogrisel
Copy link
Contributor

@ogrisel ogrisel commented Nov 13, 2017

This PR aims to fix the regression reported in #131 and that affects 0.4.2 and later. Right now there is is just a non-regression test to reproduce the problem.

I believe the regression was silently introduced by @pitrou's cleanups in #122 although I am not 100% sure yet because this code is quite complex.

from subprocess import PIPE
from subprocess import STDOUT
from subprocess import CalledProcessError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you import all those names at once? :-)

f.write(source_code.encode('utf-8'))

cmd = [sys.executable, source_file]
pythonpath = "{cwd}/tests:{cwd}".format(cwd=os.getcwd())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you use __file__ or something? getcwd sounds fragile.

@ogrisel ogrisel changed the title [WIP] Regression on pickling functions from __main__ module [MRG] Regression on pickling functions from __main__ module Nov 13, 2017
@codecov-io
Copy link

codecov-io commented Nov 13, 2017

Codecov Report

Merging #132 into master will decrease coverage by 0.31%.
The diff coverage is 33.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #132      +/-   ##
==========================================
- Coverage   84.08%   83.76%   -0.32%     
==========================================
  Files           2        2              
  Lines         534      536       +2     
  Branches       97       98       +1     
==========================================
  Hits          449      449              
- Misses         63       64       +1     
- Partials       22       23       +1
Impacted Files Coverage Δ
cloudpickle/cloudpickle.py 83.67% <33.33%> (-0.32%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 07cf3a8...748e7d2. Read the comment docs.

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 13, 2017

@pitrou I have tried to make a minimal change to save_global to fix the tests but it does not feel right to me. Do you have a better suggestion?

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 13, 2017

The coverage decrease is caused by the new branch that can only be executed in subprocess calls and therefore not accounted in the report.

@pitrou
Copy link
Member

pitrou commented Nov 13, 2017

I have tried to make a minimal change to save_global to fix the tests but it does not feel right to me.

I don't understand the fix. How does it work?

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 14, 2017

Apparently instances of (type, types.ClassType) from the __main__ module need to go through self.save_dynamic_class(obj) instead of Pickler.save_global(...).

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 14, 2017

I have pushed a simpler fix.

@@ -628,12 +628,16 @@ def save_global(self, obj, name=None, pack=struct.pack):
The name of this method is somewhat misleading: all types get
dispatched here.
"""
if obj.__module__ == "__main__":
return self.save_dynamic_class(obj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean save_dynamic_class is also used for functions? If so, could you rename that method and update its docstring?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think functions get dispatched here. The dispatch entries for save_global are:

     dispatch[type] = save_global
     dispatch[types.ClassType] = save_global

whereas function-likes are dispatched via:

dispatch[types.FunctionType] = save_function
...
dispatch[types.MethodType] = save_instancemethod
...
dispatch[types.BuiltinFunctionType] = save_builtin_function
...
dispatch[classmethod] = save_classmethod
dispatch[staticmethod] = save_classmethod

I believe the name save_global is a holdover from the base Pickler class, which calls self.save_global in some base class methods that we call into via super, so renaming this isn't straightforward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have pushed a new commit to simplify that function even further and all tests still pass. However, I am not sure we are not breaking edge cases in third party libraries and applications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have run the test suite of both loky and joblib against this branch all tests pass as well so I am pretty confident that the changes are fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check whether save_dynamic_class really gets a function as input and, if so, change the name or docstring?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitrou I'm pretty confident save_dynamic_class can never be called with a function as input. It accesses obj.__bases__ unconditionally, which doesn't exist on function objects.

@@ -628,12 +628,16 @@ def save_global(self, obj, name=None, pack=struct.pack):
The name of this method is somewhat misleading: all types get
dispatched here.
"""
if obj.__module__ == "__main__":
return self.save_dynamic_class(obj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check whether save_dynamic_class really gets a function as input and, if so, change the name or docstring?

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 14, 2017

No it does not. It's only for classes that cannot be looked up as attributes from global modules. In our tests those are:

  • Classes that are defined in the local namespace of another function, method or class,
  • Classes that are defined in the __main__ module.

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 14, 2017

The name of the PR is misleading: the new non-regression tests call interactively defined functions from the __main__ module but the actual problem was caused by the fact that interactively defined classes used by those functions could not be unpickled by attribute from module lookup.

@ogrisel ogrisel changed the title [MRG] Regression on pickling functions from __main__ module [MRG] Regression on pickling classes from the __main__ module Nov 14, 2017
@pitrou
Copy link
Member

pitrou commented Nov 15, 2017

Ok, thank you for clarifying :-)


raise
return self.save_dynamic_class(obj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does save_dynamic_class raise a readable exception or do we want to keep the if above?

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 15, 2017

Maybe it's good idea to keep the protection:

>>> import cloudpickle
>>> import io
>>> p = cloudpickle.CloudPickler(io.BytesIO)
>>> p.save_dynamic_class(lambda x: x)
Fatal Python error: Cannot recover from stack overflow.

Thread 0x00007fc32d281700 (most recent call first):
  File "/home/ogrisel/.virtualenvs/py36/lib/python3.6/site-packages/IPython/core/history.py", line 764 in _writeout_input_cache
  File "/home/ogrisel/.virtualenvs/py36/lib/python3.6/site-packages/IPython/core/history.py", line 780 in writeout_cache
  File "/home/ogrisel/.virtualenvs/py36/lib/python3.6/site-packages/IPython/core/history.py", line 58 in needs_sqlite
  File "<decorator-gen-23>", line 2 in writeout_cache
  File "/home/ogrisel/.virtualenvs/py36/lib/python3.6/site-packages/IPython/core/history.py", line 834 in run
  File "/home/ogrisel/.virtualenvs/py36/lib/python3.6/site-packages/IPython/core/history.py", line 58 in needs_sqlite
  File "<decorator-gen-24>", line 2 in run
  File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
  File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap

Current thread 0x00007fc33466b700 (most recent call first):
  File "/usr/lib/python3.6/pickle.py", line 264 in _getattribute
  File "/usr/lib/python3.6/pickle.py", line 918 in save_global
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 635 in save_global
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 375 in save_function
  File "/usr/lib/python3.6/pickle.py", line 476 in save
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 480 in save_dynamic_class
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 642 in save_global
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 375 in save_function
  File "/usr/lib/python3.6/pickle.py", line 476 in save
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 480 in save_dynamic_class
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 642 in save_global
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 375 in save_function
  ...
Aborted (core dumped)

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 15, 2017

Actually the snippet from the previous comment is misleading, I passed the BytesIO class instead of an instance to build the pickler. With a correct pickler, the error message is fine I think:

>>> import cloudpickle
>>> import io
>>> p = cloudpickle.CloudPickler(io.BytesIO())
>>> p.save_dynamic_class(lambda x: x)
Traceback (most recent call last):
  File "<ipython-input-8-4356e554f31b>", line 1, in <module>
    p.save_dynamic_class(lambda x: x)
  File "/home/ogrisel/code/cloudpickle/cloudpickle/cloudpickle.py", line 490, in save_dynamic_class
    # encountered while saving will point to the skeleton class.
AttributeError: 'function' object has no attribute '__bases__'

@pitrou
Copy link
Member

pitrou commented Nov 15, 2017

But what was the previous message? It seems to me that Pickler generally raises PicklingError.

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 15, 2017

  • p.save_dynamic_classes(lambda x: x) would raise the same Attribute error in 0.4.1 (as this part of the code has not changed).

  • p.save_global(lambda x: x) would raise a PicklingError: Can't pickle <function <lambda> at 0x7ff4d2ff97b8> while it now raises the AttributeError.

However, I am not sure we should try to preserve the behavior of the private API on wrong inputs. What really matters is that it should raise a meaningful exception when calling the public API with wrong inputs and calling loads(dumps(arbitrary_func)) should work on 0.5.2 as it used to work in 0.4.1 and not yield an exception.

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 15, 2017

@robertnishihara @iaroslav-ai could you please try this branch with your code to confirm that it fixes all your problems?

@ogrisel
Copy link
Contributor Author

ogrisel commented Nov 15, 2017

@pitrou it's true that the upstream save_global raises PicklingError when it cannot lookup such an object definition. I will restore this behavior in this branch.

However this case should never happen in cloudpickle as far as I know because cloudpickle has been written precisely to be able to pickle dynamic class and function definitions. I am not sure how to write a test for this.

@robertnishihara
Copy link

@ogrisel Thanks, I just tried it out and this seems to fix my problem.

@iaroslav-ai
Copy link

Yup that seems to solve it. Thanks @ogrisel !

@rgbkrk rgbkrk merged commit fa4b1ce into cloudpipe:master Nov 16, 2017
@ogrisel ogrisel deleted the fix-interactive-func branch November 16, 2017 10:03
BryanCutler added a commit to BryanCutler/spark that referenced this pull request Jan 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants