🔥 💥 Remove kedro_mlflow_line magic and mlflow_client global variable in notebooks (#349)
Galileo-Galilei committed Oct 9, 2022
1 parent 4c90a98 commit ec73e7e
Showing 9 changed files with 18 additions and 73 deletions.
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -4,7 +4,11 @@

### Fixed

- :bug: `MlflowArtifactDataSet.load()` now correctly loads the artifact when both `artifact_path` and `run_id` arguments are specified. Previous fix in ``0.11.4`` did not work because when the file already exists locally, mlflow did not download it again so tests were incorrectly passing. ([#362](https://github.com/Galileo-Galilei/kedro-mlflow/issues/362))
- :bug: `MlflowArtifactDataSet.load()` now correctly loads the artifact when both `artifact_path` and `run_id` arguments are specified. Previous fix in ``0.11.4`` did not work because when the file already exists locally, mlflow did not download it again so tests were incorrectly passing ([#362](https://github.com/Galileo-Galilei/kedro-mlflow/issues/362))

### Removed

- :fire: :boom: Remove the ``reload_kedro_mlflow`` line magic for notebooks because kedro will deprecate the entrypoint in 0.18.3. It is still possible to access the mlflow client associated with the configuration in a notebook with ``context.mlflow.server._mlflow_client`` ([#349](https://github.com/Galileo-Galilei/kedro-mlflow/issues/349)). This is not considered a breaking change since, according to a [discussion with kedro's team](https://github.com/kedro-org/kedro/issues/878#issuecomment-1226545251), apparently no one uses it.

## [0.11.4] - 2022-10-04

23 changes: 13 additions & 10 deletions docs/source/06_interactive_use/01_notebook_use.md
@@ -1,7 +1,7 @@
# How to use `kedro-mlflow` in a notebook

```{important}
You need to call ``pip install kedro_mlflow[extras]`` to access notebook functionalities.
You need to install ``ipython`` to access notebook functionalities.
```

## Reminder on mlflow's limitations with interactive use
@@ -22,24 +22,26 @@ Open your notebook / ipython session with the Kedro CLI:
kedro jupyter notebook
```

Kedro [creates a bunch of global variables](https://kedro.readthedocs.io/en/latest/tools_integration/ipython.html#kedro-and-jupyter), including a `session` which are automatically accessible. It also registers some line_magic from plugins, including `%=reload_kedro_mlflow` from `kedro-mlflow`.

In your first cell, launch:
```
%reload_kedro_mlflow
Or if you are on JupyterLab,

```notebook
%load_ext kedro.ipython
```

This automatically:
- load and setup (create the tracking uri, export credentials...) the mlflow configuration of your `mlflow.yml`
Kedro [creates a bunch of global variables](https://kedro.readthedocs.io/en/latest/tools_integration/ipython.html#kedro-and-jupyter), including a `session`, a ``context`` and a ``catalog`` which are automatically accessible.

When the context was created, ``kedro-mlflow`` automatically:
- loaded and set up the mlflow configuration of your `mlflow.yml` (created the tracking URI, exported credentials...)
- imported ``mlflow``, which is now accessible in your notebook
- Create a `mlflow_client` object with your mlflow server settings, which is now accessible in your notebook

If you change your ``mlflow.yml``, re-execute this cell for the changes to take effect.
If you change your ``mlflow.yml``, reload the kedro extension for the changes to take effect.

## Difference with running through the CLI

- The DataSets' `load` and `save` methods work as usual. You can call `catalog.save("my_artifact_dataset", data)` inside a cell, and your data will be logged in mlflow properly (assuming "my_artifact_dataset" is a `kedro_mlflow.io.MlflowArtifactDataSet`).
- The `hooks` which setup configuration are only accessible if you run the session interactive, e.g.:
- The `hooks` which automatically save all parameters/metrics/artifacts in mlflow will work if you run the session interactively, e.g.:

```python
session.run(
pipeline_name="my_ml_pipeline",
@@ -49,6 +51,7 @@ session.run(
)
```
although running a full pipeline this way is uncommon in a notebook.
- if you need to interact manually with the mlflow server, you can use ``context.mlflow.server._mlflow_client``.
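
The last point can be sketched as a small helper. This is a minimal illustration, not part of `kedro-mlflow`: the function name `get_mlflow_client` and the stand-in `fake_context` are hypothetical, and only the attribute path ``context.mlflow.server._mlflow_client`` comes from the text above. Since `_mlflow_client` has a leading underscore, it is presumably a private attribute and may change between releases.

```python
from types import SimpleNamespace


def get_mlflow_client(context):
    """Return the client stored at context.mlflow.server._mlflow_client.

    Hypothetical helper: raises a clearer error when the attribute chain
    is missing (for instance if the context was not created by Kedro).
    """
    try:
        return context.mlflow.server._mlflow_client
    except AttributeError as exc:
        raise RuntimeError(
            "No mlflow client found on the context; check that kedro-mlflow "
            "is installed and that the context was created by Kedro."
        ) from exc


# Stand-in objects so the sketch runs outside a Kedro project (assumption:
# in a real notebook, `context` is the global variable provided by Kedro).
fake_client = object()
fake_context = SimpleNamespace(
    mlflow=SimpleNamespace(server=SimpleNamespace(_mlflow_client=fake_client))
)
client = get_mlflow_client(fake_context)
```

In a notebook opened through Kedro, you would call `get_mlflow_client(context)` with the real `context` and then use the returned client as usual, for example `client.get_run(run_id)`.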

## Guidelines and best practices suggestions

Empty file removed kedro_mlflow/extras/__init__.py
Empty file.
Empty file.
25 changes: 0 additions & 25 deletions kedro_mlflow/extras/extensions/ipython.py

This file was deleted.

7 changes: 0 additions & 7 deletions setup.py
@@ -59,22 +59,15 @@ def _parse_requirements(path, encoding="utf-8"):
"pre-commit>=2.0.0,<3.0.0",
"jupyter>=1.0.0,<2.0.0",
],
"extras": ["notebook>=6.0.0"],
},
author="Yolan Honoré-Rougé",
entry_points={
"kedro.project_commands": [
"kedro_mlflow = kedro_mlflow.framework.cli.cli:commands"
],
# "kedro.global_commands": [
# "kedro_mlflow = kedro_mlflow.framework.cli.cli:commands"
# ],
"kedro.hooks": [
"mlflow_hook = kedro_mlflow.framework.hooks.mlflow_hook:mlflow_hook",
],
"kedro.line_magic": [
"line_magic = kedro_mlflow.extras.extensions.ipython:reload_kedro_mlflow"
],
},
zip_safe=False,
keywords="kedro-plugin, mlflow, model versioning, model packaging, pipelines, machine learning, data pipelines, data science, data engineering",
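
One way to check the effect of this change in an installed environment is to list the entry points registered under each group. The sketch below uses only the standard library; the helper name `list_entry_points` is hypothetical, and the group names are taken from the `setup.py` hunk above. After upgrading, the `kedro.line_magic` group should no longer contain the entry contributed by `kedro-mlflow`, though other packages may still register entries there.

```python
from importlib.metadata import entry_points


def list_entry_points(group):
    """Return the sorted names of installed entry points in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the returned object supports .select(group=...)
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: entry_points() returns a dict keyed by group
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


# Group names from setup.py; the output depends on what is installed.
print(list_entry_points("kedro.line_magic"))
print(list_entry_points("kedro.hooks"))
```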
Empty file removed tests/extras/__init__.py
Empty file.
Empty file.
30 changes: 0 additions & 30 deletions tests/extras/extensions/test_ipython.py

This file was deleted.
