`PyTorchShim`: Add serde callbacks to facilitate lazy loading models #796
When lazily initializing or loading PyTorch models, special care is needed when deserializing trained models from disk. Since the `init` method of a `Model` is not called during deserialization, shape inference cannot be performed before the model parameters are restored. Therefore, the information required to initialize the model must additionally be serialized to disk for later use during deserialization.
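The idea can be illustrated with a minimal, framework-free sketch. Everything here is hypothetical and chosen for illustration only: `LazyLinear`, `ToyShim`, `serialize_fn` and `deserialize_fn` are not the names or signatures used in this PR, and plain Python lists stand in for PyTorch tensors. The point is only that the serialize callback stores the initialization config next to the weights, so the deserialize callback can rebuild the model before restoring parameters.

```python
import json
from typing import Callable, List, Optional


class LazyLinear:
    """Toy lazily initialized model (hypothetical; not thinc or PyTorch)."""

    def __init__(self) -> None:
        self.n_in: Optional[int] = None
        self.n_out: Optional[int] = None
        self.weights: Optional[List[List[float]]] = None

    def initialize(self, n_in: int, n_out: int) -> None:
        # Shapes only become known here, e.g. after shape inference.
        self.n_in, self.n_out = n_in, n_out
        self.weights = [[0.0] * n_in for _ in range(n_out)]

    def load_weights(self, weights: List[List[float]]) -> None:
        if self.weights is None:
            raise RuntimeError("model must be initialized before loading weights")
        if len(weights) != self.n_out or any(len(row) != self.n_in for row in weights):
            raise ValueError("weight shape does not match the initialized model")
        self.weights = [list(row) for row in weights]


def serialize_fn(model: LazyLinear) -> bytes:
    # Store the init config alongside the parameters.
    payload = {
        "config": {"n_in": model.n_in, "n_out": model.n_out},
        "weights": model.weights,
    }
    return json.dumps(payload).encode("utf8")


def deserialize_fn(data: bytes) -> LazyLinear:
    payload = json.loads(data.decode("utf8"))
    model = LazyLinear()
    # Re-create the model from the stored config, then restore parameters.
    model.initialize(**payload["config"])
    model.load_weights(payload["weights"])
    return model


class ToyShim:
    """Toy wrapper that delegates (de)serialization to user-supplied callbacks."""

    def __init__(
        self,
        model: LazyLinear,
        serialize_model: Callable[[LazyLinear], bytes],
        deserialize_model: Callable[[bytes], LazyLinear],
    ) -> None:
        self.model = model
        self._serialize_model = serialize_model
        self._deserialize_model = deserialize_model

    def to_bytes(self) -> bytes:
        return self._serialize_model(self.model)

    def from_bytes(self, data: bytes) -> "ToyShim":
        self.model = self._deserialize_model(data)
        return self


# Usage: a trained model round-trips into a shim whose model was never initialized.
trained = LazyLinear()
trained.initialize(n_in=2, n_out=1)
trained.load_weights([[0.5, -1.0]])
data = ToyShim(trained, serialize_fn, deserialize_fn).to_bytes()

restored = ToyShim(LazyLinear(), serialize_fn, deserialize_fn).from_bytes(data)
```

Without the stored config, `load_weights` on the fresh `LazyLinear` would fail, because nothing has told the model its shapes yet.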
Looks good to me, but since this is a fairly large API extension, someone else should also look at it.
Looks reasonable from what I can tell with my limited exposure to `thinc`. Only minor comments.
One minor comment, otherwise LGTM. Approving this to not block things unnecessarily.
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>