Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

General serialization infrastructure #4172

Merged
merged 15 commits into from
Oct 12, 2021

Conversation

ultmaster
Copy link
Contributor

@ultmaster ultmaster commented Sep 10, 2021

This is generally ready for review.

Will migrate all the serialization used in NNI to the new infrastructure, and add more tests in a follow-up PR. Otherwise the PR will be too large for review.

@ultmaster ultmaster marked this pull request as ready for review September 17, 2021 07:13
@ultmaster ultmaster closed this Sep 23, 2021
@ultmaster ultmaster reopened this Sep 23, 2021
)

def __json_encode__(self):
ret = {'__symbol__': _get_hybrid_cls_or_func_name(self._get_nni_attr('symbol'))}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if falling back to cloudpickle, args and kwargs are useless and cannot be mutated, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does cloudpickle work across different OSes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

self.__dict__['_nni_args'] = args
self.__dict__['_nni_kwargs'] = kwargs

self.__dict__['_nni_self_contained'] = _self_contained
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what _self_contained mainly used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Serializable class instances are self-contained, meaning that we don't have to use .get() to retrieve the original instance. It behaves naturally like an original one.

return wrap(cls_or_func)


def dump(obj: Any, fp: Optional[Any] = None, use_trace: bool = True, pickle_size_limit: int = 4096,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are dump and load used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same semantics as dump and load in json-tricks.

return 'path:' + name
except ImportError:
b = cloudpickle.dumps(cls_or_func)
if len(b) > pickle_size_limit:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how to avoid serializing data in dataloader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can serialize a factory of dataloader.

@@ -9,6 +9,7 @@
from .runtime.log import init_logger
init_logger()

from .common.serializer import *
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it proper to import this at top level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intend to make nni.trace, nni.dump and nni.load quickly and conveniently available to users. I didn't anticipate any potential dependency issues either.

def __init__(self, symbol: T, args: List[Any], kwargs: Dict[str, Any],
_self_contained: bool = False):
# use dict to avoid conflicts with user's getattr and setattr
self.__dict__['_nni_symbol'] = symbol
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why __dict__? Seems setattr is not overrided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class will be inherited by users' classes with multiple inherits, For example, when nni.trace is used, users' class will be

class MyModule(SerializableObject, nn.Module):
    pass

It looks like in this case, setattr in nn.Module will be used, which is not expected.

@QuanluZhang QuanluZhang merged commit 000de04 into microsoft:master Oct 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants