-
Notifications
You must be signed in to change notification settings - Fork 393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update the Neptune integration #906
Conversation
Minor language fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the update. I played around with the new neptune logger and experiment tracking site and was impressed. It's very useful and there are nice additions since last time I checked.
I have a few code comments below, please take a look. On top of those, I have some more general comments:
Versions
I saw that you have some code that works with neptune-client>=1.0
, which I didn't find released on PyPI. So I assume it's for forward compatibility for a yet to be released version. Big 👍.
That said, I ran into some issues with certain Neptune versions (see one of my code comments). So I think that there could still be an issue with supporting all versions >=0.9.0
. I would thus like to ask you to check again if it works with all intended versions. Maybe some older ones can be dropped? At the end of the day, you know best which versions should still be supported and how.
Pickling
A minor inconvenience I found is that the NeptuneLogger
apparently can't be pickled:
AttributeError: Can't pickle local object
'NeptuneAuthenticator.__init__.<locals>.session_factory'
Therefore, a user would not be able to save the whole skorch net on Neptune. If there is an easy way to fix this by doing something with NeptuneLogger.get_state
, that would be nice, but I'm not sure. On the neptune-client side, these lines seem to be the offender:
Since a reference to a local function is stored, pickle fails. So if you ever want to fix this, you would have to move this definition to the module level.
Tests
While running the tests, I got this message multiple times:
Error in atexit._run_exitfuncs:
Not sure what it means and if it can cause any trouble.
@twolodzko Let me know when you are finished and need another review round. |
@BenjaminBossan I changed the requirements to a higher version. |
Would this be blocking? It seems that this is something we should rather fix in the Neptune client library that is unpickable than this logger. Would it be ok if we merged the logger like this, but in the meantime we would figure out on our side how to make the client pickle-friendly?
This is not a problem, they can be ignored. The errors raise because we are using async calls and pytest does not handle closing them well. If I understand correctly, the tests have passed but you just saw this in the logs? |
@BenjaminBossan I think I addressed all of your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you a lot for making the changes. Everything looks very good now. I only have a minor change request.
Would this be blocking? It seems that this is something we should rather fix in the Neptune client library that is unpickable than this logger. Would it be ok if we merged the logger like this, but in the meantime we would figure out on our side how to make the client pickle-friendly?
No, not an issue. As a skorch user, I can still use the Neptune logger and then purge it from the callbacks to be able to pickle the model. Most often, the logger would not be required after loading anyway, since at that point, most users are only interested in predicting with the model.
If you do make a change in Neptune that ends up enabling pickle, please let us know and we can add a unit test to check this.
@twolodzko Great, the "atexit" messages disappeared, thanks for fixing that. I have a suggestion to make regarding the def _log_metric(self, name, logs, batch):
# split e.g. train_loss into train and loss if possible
kind, _, key = name.partition('_')
if not key:
key = 'epoch_duration' if kind == 'dur' else kind
self._metric_logger[key].log(logs[name])
else:
... Would that achieve the same goal? If so, I would prefer it, it seems easier to understand. |
self.experiment.stop() | ||
self.run.stop() | ||
|
||
def _log_metric(self, name, logs, batch): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little puzzled about this function.
Are those shortcuts like dur
or valid
some internal skorch
strings which we're replacing to increase the readability of our experiment?
Could we add some comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shnela yes. Skorch logs dur
dor duration of the epoch and several metrics like valid_loss
or train_loss
, where we extend valid
to validation
for readability.
@BenjaminBossan Nice idea. Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @twolodzko @normandy7 @shnela. Very nice additions and well thought out changes. I made some minor edits to the CHANGES.md entry, I hope you don't mind.
@BenjaminBossan sure, thanks. |
@BenjaminBossan is there any rough ETA on when a release including these changes may happen? :) |
No, not really, we typically just release when enough changes have accumulated and the last release is very recent, so it'll probably be a couple of months. If it's important for you to have an earlier release, we could consider a patch-level release. |
Thanks for the information.
@BenjaminBossan that would be awesome, if there's such a possibility! :) |
@Herudaio Yes, I think we can do that, hopefully not later than end of next week. |
@Herudaio We just released v0.12.1 of skorch with the updated logger. |
event_lr
, and epoch duration.