cloudpickle breaks dill deserialization across servers. #217
Comments
@wmarshall484: Thanks for the detailed follow-up. I've never looked into using This may be another one of the things that |
Just to be sure we're on the same page, I'm using |
@mmckerns FYI the latest version of dill
It worked in |
There should be a simple fix for the above:
Essentially the module |
Hi, I have the same issue. I've narrowed it down to something in the spaCy package: after importing spaCy, the key "ClassType" appears in `types`. The recommended approach does not work in my case. Applying this fix to _dill.py seems to address the issue:
This thread on Stack Overflow seems to support the conclusion that spaCy (or one of its dependencies) is the cause of the issue. Specifically:
Hi, I am also facing the same issue while trying to build an Apache Beam pipeline with serialization. It works fine until I introduce spaCy. Issue here: https://stackoverflow.com/questions/69649645/spacy-breaks-serialization-in-pardo-apache-beam
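For anyone trying to pin down which import is responsible, one option is to diff the contents of `types` before and after importing the suspect package. This is only a sketch: `names_added_to_types` is a hypothetical helper, not part of dill, cloudpickle, or spaCy.

```python
import importlib
import types


def names_added_to_types(module_name):
    """Return the attribute names that appear in `types` after importing a module.

    Hypothetical diagnostic helper. If the module was already imported in this
    process, the import is a no-op and the result will be empty.
    """
    before = set(vars(types))
    importlib.import_module(module_name)
    return set(vars(types)) - before
```

Run it in a fresh interpreter, for example `names_added_to_types("cloudpickle")` or `names_added_to_types("spacy")`. Going by the reports in this thread, `ClassType` would be expected in the result on affected versions, but that depends on the installed versions.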
Following up on this issue @mmckerns responded to on Stackoverflow:
http://stackoverflow.com/questions/42960637/python-3-5-dill-pickling-unpickling-on-different-servers-keyerror-classtype/43006034#43006034
In a nutshell, with Python 3.5:

- Server A imports `cloudpickle`; this causes `types.ClassType` to become defined.
- Server B does not import `cloudpickle`, so `types.ClassType` is left undefined.
- Objects which are serialized on server A also seem to serialize a reference to `ClassType`. Then, when they are deserialized on server B, we hit a `KeyError` for `ClassType`.

This is because `_reverse_typemap` is populated partly by the contents of `types`, which doesn't define the `ClassType` type by default.

The workaround on server B is to define `ClassType` in `_reverse_typemap` after dill is imported, and before an object is first deserialized.

As a longer-term fix, maybe create a whitelist of valid 3.5 types found in the `types` module? A whitelist would eliminate this kind of error and prevent any pollution/side effects from other modules like `cloudpickle`.
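To make the mechanism and both proposals concrete, here is a minimal stdlib-only model. It does not use dill's real code: `build_reverse_typemap`, `server_a_types`, `server_b_types`, and `WHITELIST` are hypothetical names, and it assumes `ClassType` is simply an alias for `type` on the polluted server.

```python
import types


def build_reverse_typemap(namespace, allowed=None):
    """Map names to the type objects found in `namespace`.

    Simplified stand-in for how a name -> type table might be built from
    `types`. If `allowed` is given, only those names are picked up
    (the whitelist idea).
    """
    return {
        name: obj
        for name, obj in vars(namespace).items()
        if isinstance(obj, type) and (allowed is None or name in allowed)
    }


# Hypothetical stand-ins for the `types` module as seen on each server.
# Assumption: the extra import on server A aliases ClassType to `type`.
server_a_types = types.SimpleNamespace(FunctionType=types.FunctionType, ClassType=type)
server_b_types = types.SimpleNamespace(FunctionType=types.FunctionType)

typemap_a = build_reverse_typemap(server_a_types)  # knows "ClassType"
typemap_b = build_reverse_typemap(server_b_types)  # does not

# Workaround on server B: register the missing name before deserializing.
typemap_b["ClassType"] = type

# Whitelist on server A: names added by other modules are never picked up.
WHITELIST = {"FunctionType"}
clean_a = build_reverse_typemap(server_a_types, allowed=WHITELIST)
```

With dill itself, the server B workaround would amount to something like `dill._dill._reverse_typemap['ClassType'] = type` (the module was `dill.dill` in older releases). Treat the exact attribute path as an assumption and check it against the installed version.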