Equality doesn't work after deserialization of dataclass #500
Comments
I don't think this has anything to do with the class being nested...
What versions of |
I think this is expected behavior right? The default behavior is to serialize all classes in |
@anivegesana: Yes, this is behavior I'd expect, and consistent with:
@Dibakarroy1997: I'd close this, except that you noted that |
|
Well, there's the problem:
>>> c.a.__dataclass_fields__['x']._field_type is dataclasses._FIELD
False
We should probably either monkey-patch this class with a |
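To make the failing identity check above concrete, here is a minimal sketch that imitates the symptom without dill. It assumes CPython's private dataclasses internals (`_FIELD_BASE`, `_field_type`), which are implementation details and may change; the `lookalike` name is hypothetical. It relies on `dataclasses.fields()` keeping only fields whose `_field_type` is the very same object as `dataclasses._FIELD`.

```python
import dataclasses


@dataclasses.dataclass
class Test:
    x: int


# A look-alike sentinel: same type and repr as dataclasses._FIELD, but a
# different object, similar to what a by-value round trip would produce.
# (_FIELD_BASE is a private CPython detail; this is an assumption.)
lookalike = dataclasses._FIELD_BASE("_FIELD")

fields_before = dataclasses.fields(Test)

# Swap the sentinel on the field to imitate the deserialized class.
Test.__dataclass_fields__["x"]._field_type = lookalike

fields_after = dataclasses.fields(Test)
as_dict_after = dataclasses.asdict(Test(1))

print(len(fields_before))  # 1
print(fields_after)        # ()
print(as_dict_after)       # {}
```

The field still prints as `_field_type=_FIELD`, yet it is filtered out, which would also explain why `asdict` comes back empty for such a class.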
This might be similar to cloudpipe/cloudpickle#386.
>>> import dill
>>> dill.__version__
'0.3.5.1'
>>> from dataclasses import dataclass
>>> import dataclasses
>>> @dataclass
... class Test:
...     x: int
...
>>> print(dataclasses.fields(Test))
(Field(name='x',type=<class 'int'>,default=<dataclasses._MISSING_TYPE object at 0x000001815533CCD0>,default_factory=<dataclasses._MISSING_TYPE object at 0x000001815533CCD0>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD),)
>>> deserialized_class = dill.loads(dill.dumps(Test))
>>> print(dataclasses.fields(deserialized_class))
() |
Yep. Same issue. Here is a sloppy prototype that will be refined and might be added to dill. If not, a more elegant solution will.
import dataclasses
import dill
from pickle import GLOBAL

# Pickle each dataclasses sentinel as a reference to the module-level
# global, so unpickling returns the original object instead of a copy.
if hasattr(dataclasses, "_HAS_DEFAULT_FACTORY_CLASS"):
    @dill.register(dataclasses._HAS_DEFAULT_FACTORY_CLASS)
    def save_dataclasses_HAS_DEFAULT_FACTORY_CLASS(pickler, obj):
        pickler.write(GLOBAL + b"dataclasses\n_HAS_DEFAULT_FACTORY\n")

if hasattr(dataclasses, "MISSING"):
    @dill.register(type(dataclasses.MISSING))
    def save_dataclasses_MISSING_TYPE(pickler, obj):
        pickler.write(GLOBAL + b"dataclasses\nMISSING\n")

if hasattr(dataclasses, "KW_ONLY"):
    @dill.register(type(dataclasses.KW_ONLY))
    def save_dataclasses_KW_ONLY_TYPE(pickler, obj):
        pickler.write(GLOBAL + b"dataclasses\nKW_ONLY\n")

if hasattr(dataclasses, "_FIELD_BASE"):
    @dill.register(dataclasses._FIELD_BASE)
    def save_dataclasses_FIELD_BASE(pickler, obj):
        # obj.name is the sentinel's own global name, e.g. "_FIELD".
        pickler.write(GLOBAL + b"dataclasses\n" + obj.name.encode() + b"\n")

>>> @dataclasses.dataclass
... class Test:
...     x: int
...
>>> deserialized_class = dill.loads(dill.dumps(Test))
>>> print(dataclasses.fields(deserialized_class))
(Field(name='x',type=<class 'int'>,default=<dataclasses._MISSING_TYPE object at 0x103af0f70>,default_factory=<dataclasses._MISSING_TYPE object at 0x103af0f70>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD),)
I definitely believe that this should also be made as a pull request in python/cpython, because it doesn't make sense for the sentinel values to be duplicated during serialization. |
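The idea behind the prototype, pickling a sentinel by reference instead of by value, can be sketched with the standard library alone. This is not dill's implementation; it is a hedged illustration that assumes Python 3.8+ (for `Pickler.reducer_override`), and the names `SentinelAwarePickler` and `dumps_by_reference` are hypothetical.

```python
import dataclasses
import io
import pickle


class SentinelAwarePickler(pickle.Pickler):
    """Hypothetical pickler that stores dataclasses.MISSING by reference."""

    def reducer_override(self, obj):
        if obj is dataclasses.MISSING:
            # Returning a string asks pickle to store a global reference;
            # the module is taken from obj.__module__ ("dataclasses").
            return "MISSING"
        # Fall back to the default behavior for everything else.
        return NotImplemented


def dumps_by_reference(obj):
    buffer = io.BytesIO()
    SentinelAwarePickler(buffer).dump(obj)
    return buffer.getvalue()


# Plain pickle re-creates the sentinel, so identity is lost.
by_value = pickle.loads(pickle.dumps(dataclasses.MISSING))
# The by-reference pickle resolves to the original module-level object.
by_reference = pickle.loads(dumps_by_reference(dataclasses.MISSING))

print(by_value is dataclasses.MISSING)      # False
print(by_reference is dataclasses.MISSING)  # True
```

Writing the `GLOBAL` opcode by hand, as the prototype does, has the same effect: the pickle stream names the object, and unpickling looks it up rather than rebuilding it.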
Thanks @anivegesana. Yeah, I think we should add code to handle dataclasses in |
The most worthwhile solution would be to include this temporary fix for dill 0.3.6 and come back and create the correct pickling function later. Updating the This solution will allow people to create mostly correct pickles of dataclasses that will remain compatible with future versions of dill while allowing me to consider corner cases that I haven't thought of yet. |
This quick fix will be removed when proper dataclass serialization support is added to dill. It is here only to allow for better support for now. Dataclasses pickled with this PR can still be unpickled by future versions of dill, and those future versions will also be able to automatically use the newer features in dataclasses.py that were not available in older versions of Python. That forward-compatibility feature is not present in this PR.
At least for this simple case:
from dataclasses import dataclass, field, fields
import dill
import pickle

@dataclass
class Inner:
    member: int = 0

@dataclass
class Outer:
    inner: Inner = field(default_factory=Inner)

od = dill.loads(dill.dumps(Outer()))
# fields of `inner` are lost because "_FIELD is not _FIELD"
assert len(fields(od.inner)) == 0  # should be 1

op = pickle.loads(pickle.dumps(Outer()))
assert len(fields(op.inner)) == 1  # works correctly
|
* A temporary quick fix for dataclass serialization (#500)
  This quick fix will be removed when proper dataclass serialization support is added to dill. It is here only to allow for better support for now. Dataclasses pickled with this PR can still be unpickled by future versions of dill, and those future versions will also be able to automatically use the newer features in dataclasses.py that were not available in older versions of Python. That forward-compatibility feature is not present in this PR.
* Fix bug in pickling MappingProxyType in PyPy 3.7+
This should be fixed by: #503. Closing.
|
Equality fails. Even asdict does not return the same dictionary.
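One possible contributor to the equality failure, in line with the comment that classes are serialized by value, is that a dataclass's generated `__eq__` only compares instances whose classes are the same object. The sketch below is an assumption-laden illustration, not a reproduction with dill: it uses `dataclasses.make_dataclass` to stand in for a class that was re-created during deserialization, and `make_point_class` is a hypothetical helper.

```python
import dataclasses


def make_point_class():
    # Each call builds a brand-new class object with identical fields.
    return dataclasses.make_dataclass("Point", [("x", int)])


PointA = make_point_class()
PointB = make_point_class()  # stands in for a class re-created by value

a = PointA(1)
b = PointB(1)

# Same class object: the generated __eq__ compares the field values.
same_class_equal = a == PointA(1)
# Different class objects: __eq__ returns NotImplemented on both sides,
# so Python falls back to identity and the comparison is False.
cross_class_equal = a == b

print(same_class_equal)   # True
print(cross_class_equal)  # False
```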