Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the Neptune integration #906

Merged
merged 18 commits into from
Oct 19, 2022
Merged

Update the Neptune integration #906

merged 18 commits into from
Oct 19, 2022

Conversation

twolodzko
Copy link
Contributor

@twolodzko twolodzko commented Oct 11, 2022

  • Neptune integration was updated to work with a recent version of the Neptune client.
  • Additional logging was added: model summary, configuration (learning rate, optimizer, etc), event_lr, and epoch duration.
  • Tests were altered to match the current API of the client library and improved when possible.

Copy link
Collaborator

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the update. I played around with the new neptune logger and experiment tracking site and was impressed. It's very useful and there are nice additions since last time I checked.

I have a few code comments below, please take a look. On top of those, I have some more general comments:

Versions

I saw that you have some code that works with neptune-client>=1.0, which I didn't find released on PyPI. So I assume it's for forward compatibility for a yet to be released version. Big 👍.

That said, I ran into some issues with certain Neptune versions (see one of my code comments). So I think that there could still be an issue with supporting all versions >=0.9.0. I would thus like to ask you to check again if it works with all intended versions. Maybe some older ones can be dropped? At the end of the day, you know best which versions should still be supported and how.

Pickling

A minor inconvenience I found is that the NeptuneLogger apparently can't be pickled:

AttributeError: Can't pickle local object 'NeptuneAuthenticator.__init__.<locals>.session_factory'

Therefore, a user would not be able to save the whole skorch net on Neptune. If there is an easy way to fix this by doing something with NeptuneLogger.get_state, that would be nice, but I'm not sure. On the neptune-client side, these lines seem to be the offender:

https://github.com/neptune-ai/neptune-client/blob/81fcf25b204331ae85b67e19afc78be69ddf3f84/src/neptune/oauth.py#L98-L126

Since a reference to a local function is stored, pickle fails. So if you ever want to fix this, you would have to move this definition to the module level.

Tests

While running the tests, I got this message multiple times:

Error in atexit._run_exitfuncs:

Not sure what it means and if it can cause any trouble.

skorch/callbacks/logging.py Show resolved Hide resolved
skorch/callbacks/logging.py Outdated Show resolved Hide resolved
skorch/callbacks/logging.py Outdated Show resolved Hide resolved
skorch/callbacks/logging.py Show resolved Hide resolved
skorch/callbacks/logging.py Outdated Show resolved Hide resolved
skorch/callbacks/logging.py Outdated Show resolved Hide resolved
skorch/callbacks/logging.py Outdated Show resolved Hide resolved
skorch/callbacks/logging.py Outdated Show resolved Hide resolved
skorch/callbacks/logging.py Outdated Show resolved Hide resolved
skorch/callbacks/logging.py Outdated Show resolved Hide resolved
@BenjaminBossan
Copy link
Collaborator

@twolodzko Let me know when you are finished and need another review round.

@twolodzko
Copy link
Contributor Author

That said, I ran into some issues with certain Neptune versions (see one of my code comments).

@BenjaminBossan I changed the requirements to a higher version.

@twolodzko
Copy link
Contributor Author

Pickling

A minor inconvenience I found is that the NeptuneLogger apparently can't be pickled:

AttributeError: Can't pickle local object 'NeptuneAuthenticator.__init__.<locals>.session_factory'
[...]

Would this be blocking? It seems that this is something we should rather fix in the Neptune client library that is unpickable than this logger. Would it be ok if we merged the logger like this, but in the meantime we would figure out on our side how to make the client pickle-friendly?

Tests

While running the tests, I got this message multiple times:

Error in atexit._run_exitfuncs:

Not sure what it means and if it can cause any trouble.

This is not a problem, they can be ignored. The errors raise because we are using async calls and pytest does not handle closing them well. If I understand correctly, the tests have passed but you just saw this in the logs?

@twolodzko
Copy link
Contributor Author

@BenjaminBossan I think I addressed all of your comments.

Copy link
Collaborator

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you a lot for making the changes. Everything looks very good now. I only have a minor change request.

Would this be blocking? It seems that this is something we should rather fix in the Neptune client library that is unpickable than this logger. Would it be ok if we merged the logger like this, but in the meantime we would figure out on our side how to make the client pickle-friendly?

No, not an issue. As a skorch user, I can still use the Neptune logger and then purge it from the callbacks to be able to pickle the model. Most often, the logger would not be required after loading anyway, since at that point, most users are only interested in predicting with the model.

If you do make a change in Neptune that ends up enabling pickle, please let us know and we can add a unit test to check this.

skorch/tests/callbacks/test_logging.py Show resolved Hide resolved
@BenjaminBossan
Copy link
Collaborator

@twolodzko Great, the "atexit" messages disappeared, thanks for fixing that.

I have a suggestion to make regarding the _extract_name_parts function. How about removing it and then inside of _log_metric, change the first lines to:

    def _log_metric(self, name, logs, batch):
        # split e.g. train_loss into train and loss if possible
        kind, _, key = name.partition('_')

        if not key:
            key = 'epoch_duration' if kind == 'dur' else kind
            self._metric_logger[key].log(logs[name])
        else:
            ...

Would that achieve the same goal? If so, I would prefer it, it seems easier to understand.

self.experiment.stop()
self.run.stop()

def _log_metric(self, name, logs, batch):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little puzzled about this function.
Are those shortcuts like dur or valid some internal skorch strings which we're replacing to increase the readability of our experiment?
Could we add some comment here?

Copy link
Contributor Author

@twolodzko twolodzko Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shnela yes. Skorch logs dur dor duration of the epoch and several metrics like valid_loss or train_loss, where we extend valid to validation for readability.

@twolodzko
Copy link
Contributor Author

twolodzko commented Oct 19, 2022

I have a suggestion to make regarding the _extract_name_parts function. [...]

@BenjaminBossan Nice idea. Done.

Copy link
Collaborator

@BenjaminBossan BenjaminBossan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @twolodzko @normandy7 @shnela. Very nice additions and well thought out changes. I made some minor edits to the CHANGES.md entry, I hope you don't mind.

@twolodzko
Copy link
Contributor Author

@BenjaminBossan sure, thanks.

@BenjaminBossan BenjaminBossan merged commit 647702b into skorch-dev:master Oct 19, 2022
@Herudaio
Copy link

Herudaio commented Nov 4, 2022

@BenjaminBossan is there any rough ETA on when a release including these changes may happen? :)

@BenjaminBossan
Copy link
Collaborator

No, not really, we typically just release when enough changes have accumulated and the last release is very recent, so it'll probably be a couple of months. If it's important for you to have an earlier release, we could consider a patch-level release.

@Herudaio
Copy link

Herudaio commented Nov 9, 2022

Thanks for the information.

If it's important for you to have an earlier release, we could consider a patch-level release.

@BenjaminBossan that would be awesome, if there's such a possibility! :)

@BenjaminBossan
Copy link
Collaborator

@Herudaio Yes, I think we can do that, hopefully not later than end of next week.

@BenjaminBossan
Copy link
Collaborator

@Herudaio We just released v0.12.1 of skorch with the updated logger.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants