-
Notifications
You must be signed in to change notification settings - Fork 1.8k
General serialization infrastructure #4172
General serialization infrastructure #4172
Conversation
) | ||
|
||
def __json_encode__(self): | ||
ret = {'__symbol__': _get_hybrid_cls_or_func_name(self._get_nni_attr('symbol'))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if falling back to cloudpickle, args and kwargs are useless and cannot be mutated, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does cloudpickle work across different OSes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
self.__dict__['_nni_args'] = args | ||
self.__dict__['_nni_kwargs'] = kwargs | ||
|
||
self.__dict__['_nni_self_contained'] = _self_contained |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what _self_contained
mainly used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Serializable class instances are self-contained, meaning that we don't have to use .get()
to retrieve the original instance. It behaves naturally like an original one.
return wrap(cls_or_func) | ||
|
||
|
||
def dump(obj: Any, fp: Optional[Any] = None, use_trace: bool = True, pickle_size_limit: int = 4096, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are dump
and load
used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same semantics as dump and load in json-tricks.
return 'path:' + name | ||
except ImportError: | ||
b = cloudpickle.dumps(cls_or_func) | ||
if len(b) > pickle_size_limit: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to avoid serializing data in dataloader?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can serialize a factory of dataloader.
@@ -9,6 +9,7 @@ | |||
from .runtime.log import init_logger | |||
init_logger() | |||
|
|||
from .common.serializer import * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it proper to import this at top level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intend to make nni.trace
, nni.dump
and nni.load
quickly and conveniently available to users. I didn't anticipate any potential dependency issues either.
def __init__(self, symbol: T, args: List[Any], kwargs: Dict[str, Any], | ||
_self_contained: bool = False): | ||
# use dict to avoid conflicts with user's getattr and setattr | ||
self.__dict__['_nni_symbol'] = symbol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why __dict__
? Seems setattr
is not overrided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class will be inherited by users' classes with multiple inherits, For example, when nni.trace
is used, users' class will be
class MyModule(SerializableObject, nn.Module):
pass
It looks like in this case, setattr
in nn.Module
will be used, which is not expected.
This is generally ready for review.
Will migrate all the serialization used in NNI to the new infrastructure, and add more tests in a follow-up PR. Otherwise the PR will be too large for review.