Commit

Merge pull request #173 from Galileo-Galilei/release-0.6.0
Release 0.6.0
Galileo-Galilei authored Mar 14, 2021
2 parents 6d7023d + 419e6f2 commit 477147f
Showing 32 changed files with 933 additions and 1,173 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/test.yml
@@ -30,6 +30,7 @@ jobs:
python -m pip install --upgrade pip
pip install -e .[tests]
- name: Lint with flake8
if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.7' # linting should occur only once in the loop
run: |
# stop the build if there are Python syntax errors or undefined names
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics --exclude kedro_mlflow/template/project/run.py
@@ -40,6 +41,7 @@ jobs:
pytest --cov=./ --cov-report=xml
- name: Upload coverage report to Codecov
uses: codecov/codecov-action@v1
if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.7' # upload should occur only once in the loop
with:
token: ${{ secrets.CODECOV_TOKEN }} # token is not mandatory but make access more stable
file: ./coverage.xml
2 changes: 2 additions & 0 deletions .gitignore
@@ -137,3 +137,5 @@ mlruns/
debug/
*.xlsx
*.pptx

docs/source/05_framework_ml/05_example_project_step_by_step.md
19 changes: 18 additions & 1 deletion CHANGELOG.md
@@ -2,6 +2,21 @@

## [Unreleased]

## [0.6.0] - 2021-03-14

### Added

- `kedro-mlflow` now supports `kedro==0.17.0` ([#144](https://github.com/Galileo-Galilei/kedro-mlflow/issues/144)). Because the kedro core team made a breaking change in the patch release `0.17.1`, that release is not supported yet. They also [recommend downgrading to 0.17.0 for stability](https://github.com/quantumblacklabs/kedro/issues/716#issuecomment-793983298).
- Updated documentation

### Fixed

- The support of `kedro==0.17.0` automatically makes the CLI commands available when the configuration is declared in a `pyproject.toml` instead of a `.kedro.yml`, which was not the case in previous versions despite our claim that it was ([#157](https://github.com/Galileo-Galilei/kedro-mlflow/issues/157)).

### Changed

- Drop support for `kedro==0.16.x`. All future plugin updates will be only compatible with `kedro>=0.17.0`.

## [0.5.0] - 2021-02-21

### Added
@@ -163,7 +178,9 @@ Many documentation improvements:
- Add `MlflowDataSet` for artifacts autologging
- Add `PipelineMl` class and its `pipeline_ml` factory for pipeline packaging and service

[unreleased]: https://github.com/Galileo-Galilei/kedro-mlflow/compare/0.2.1...HEAD
[Unreleased]: https://github.com/Galileo-Galilei/kedro-mlflow/compare/0.6.0...HEAD

[0.6.0]: https://github.com/Galileo-Galilei/kedro-mlflow/compare/0.5.0...0.6.0

[0.2.1]: https://github.com/Galileo-Galilei/kedro-mlflow/compare/0.2.0...0.2.1

4 changes: 2 additions & 2 deletions README.md
@@ -13,7 +13,7 @@
----------------------------------------------------------
| Branch | Tests | Coverage | Documentation | Deployment | Activity |
|--------|-------|----------|---------------|------------|------------|
| `develop`| [![test](https://github.com/Galileo-Galilei/kedro-mlflow/workflows/test/badge.svg?branch=develop)](https://github.com/Galileo-Galilei/kedro-mlflow/actions?query=workflow%3Atest+branch%3Adevelop)| [![codecov](https://codecov.io/gh/Galileo-Galilei/kedro-mlflow/branch/develop/graph/badge.svg)](https://codecov.io/gh/Galileo-Galilei/kedro-mlflow/branch/develop)|[![Documentation](https://readthedocs.org/projects/kedro-mlflow/badge/?version=latest)](https://kedro-mlflow.readthedocs.io/en/latest/)| [![create-release-candidate](https://github.com/Galileo-Galilei/kedro-mlflow/workflows/create-release-candidate/badge.svg?branch=develop)](https://github.com/Galileo-Galilei/kedro-mlflow/actions?query=branch%3Adevelop+workflow%3Acreate-release-candidate)|[![commit](https://img.shields.io/github/commits-since/Galileo-Galilei/kedro-mlflow/0.5.0)](https://github.com/Galileo-Galilei/kedro-mlflow/compare/0.5.0...develop)|
| `develop`| [![test](https://github.com/Galileo-Galilei/kedro-mlflow/workflows/test/badge.svg?branch=develop)](https://github.com/Galileo-Galilei/kedro-mlflow/actions?query=workflow%3Atest+branch%3Adevelop)| [![codecov](https://codecov.io/gh/Galileo-Galilei/kedro-mlflow/branch/develop/graph/badge.svg)](https://codecov.io/gh/Galileo-Galilei/kedro-mlflow/branch/develop)|[![Documentation](https://readthedocs.org/projects/kedro-mlflow/badge/?version=latest)](https://kedro-mlflow.readthedocs.io/en/latest/)| [![create-release-candidate](https://github.com/Galileo-Galilei/kedro-mlflow/workflows/create-release-candidate/badge.svg?branch=develop)](https://github.com/Galileo-Galilei/kedro-mlflow/actions?query=branch%3Adevelop+workflow%3Acreate-release-candidate)|[![commit](https://img.shields.io/github/commits-since/Galileo-Galilei/kedro-mlflow/0.6.0)](https://github.com/Galileo-Galilei/kedro-mlflow/compare/0.6.0...develop)|
| `master` | [![test](https://github.com/Galileo-Galilei/kedro-mlflow/workflows/test/badge.svg?branch=master)](https://github.com/Galileo-Galilei/kedro-mlflow/actions?query=workflow%3Atest+branch%3Amaster) | [![codecov](https://codecov.io/gh/Galileo-Galilei/kedro-mlflow/branch/master/graph/badge.svg)](https://codecov.io/gh/Galileo-Galilei/kedro-mlflow/branch/master)|[![Documentation](https://readthedocs.org/projects/kedro-mlflow/badge/?version=stable)](https://kedro-mlflow.readthedocs.io/en/stable/)|[![publish](https://github.com/Galileo-Galilei/kedro-mlflow/workflows/publish/badge.svg?branch=master)](https://github.com/Galileo-Galilei/kedro-mlflow/actions?query=branch%3Amaster+workflow%3Apublish)||

# What is kedro-mlflow?
@@ -25,7 +25,7 @@

# How do I install kedro-mlflow?

**Important: ``kedro-mlflow`` is only compatible with ``kedro>=0.16.0``. If you have a project created with an older version of ``Kedro``, see this [migration guide](https://github.com/quantumblacklabs/kedro/blob/master/RELEASE.md#migration-guide-from-kedro-015-to-016).**
**Important: ``kedro-mlflow`` is only compatible with ``kedro>=0.16.0`` and ``mlflow>=1.0.0``. If you have a project created with an older version of ``Kedro``, see this [migration guide](https://github.com/quantumblacklabs/kedro/blob/master/RELEASE.md#migration-guide-from-kedro-015-to-016).**

``kedro-mlflow`` is available on PyPI, so you can install it with ``pip``:

4 changes: 2 additions & 2 deletions docs/source/01_introduction/02_motivation.md
@@ -44,6 +44,6 @@ Above implementations have the advantage of being very straightforward and *mlfl
|Logging metrics |``catalog.yml`` |``MlflowMetricsDataSet`` |
|Logging Pipeline as model |``hooks.py`` |``KedroPipelineModel`` and ``pipeline_ml_factory``|

In the current version (``kedro_mlflow=0.5.0``), `kedro-mlflow` does not provide an interface to set tags outside a Kedro ``Pipeline``. Some of the above decisions are subject to debate (for instance, metrics are often updated in a loop during each epoch / training iteration and it does not always make sense to register the metric between computation steps, e.g. as an I/O operation after a node run).
In the current version (``kedro_mlflow=0.6.0``), `kedro-mlflow` does not provide an interface to set tags outside a Kedro ``Pipeline``. Some of the above decisions are subject to debate (for instance, metrics are often updated in a loop during each epoch / training iteration and it does not always make sense to register the metric between computation steps, e.g. as an I/O operation after a node run).

_**Note:** the version ``0.5.0`` does not need any ``MLProject`` file to use mlflow inside your Kedro project. As seen in the [introduction](./01_introduction.md), this file overlaps with Kedro configuration files._
_**Note:** the version ``0.6.0`` does not need any ``MLProject`` file to use mlflow inside your Kedro project. As seen in the [introduction](./01_introduction.md), this file overlaps with Kedro configuration files._
4 changes: 2 additions & 2 deletions docs/source/02_installation/01_installation.md
@@ -77,10 +77,10 @@ projects. It is developed as part of
the Kedro initiative at QuantumBlack.

Installed plugins:
kedro_mlflow: 0.5.0 (hooks:global,project)
kedro_mlflow: 0.6.0 (hooks:global,project)
```

The version ``0.5.0`` of the plugin is installed and has both global and project commands.
The version ``0.6.0`` of the plugin is installed and has both global and project commands.

That's it! You are now ready to go!

21 changes: 21 additions & 0 deletions docs/source/02_installation/03_migration_guide.md
@@ -2,6 +2,27 @@

This page explains how to migrate an existing kedro project to a more recent `kedro-mlflow` version with breaking changes.

## Migration from 0.5.0 to 0.6.0

``kedro==0.16.x`` is no longer supported. You need to update your project to the ``kedro==0.17.0`` template.
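
Before migrating, it can help to check whether a given ``kedro`` version string falls in the range this release supports. The snippet below is only a minimal sketch and is not part of ``kedro-mlflow``: the helper names ``parse_version`` and ``is_supported_kedro`` are hypothetical, and it assumes plain ``major.minor.patch`` version strings. It encodes the constraints stated for ``0.6.0``: ``0.16.x`` is dropped, ``0.17.0`` is supported, and ``0.17.1`` is not supported yet.

```python
def parse_version(version: str) -> tuple:
    """Turn a 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_supported_kedro(version: str) -> bool:
    """Hypothetical helper: True only for the kedro release that
    kedro-mlflow 0.6.0 declares as supported (0.17.0)."""
    return parse_version(version) == (0, 17, 0)


# Show the verdict for a few candidate versions.
for candidate in ("0.16.5", "0.17.0", "0.17.1"):
    print(candidate, is_supported_kedro(candidate))
```

The exact-match comparison is deliberate: it reflects that the patch release ``0.17.1`` is explicitly excluded for now, so a simple ``>=`` test would be too permissive.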

## Migration from 0.4.1 to 0.5.0

The only breaking change with the previous release is the format of ``KedroPipelineMLModel`` class. Hence, if you saved a pipeline as a Mlflow Model with `pipeline_ml_factory` in ``kedro-mlflow==0.4.x``, loading it (either with ``MlflowModelLoggerDataSet`` or ``mlflow.pyfunc.load_model``) with ``kedro-mlflow==0.5.0`` installed will raise an error. You will need either to retrain the model or to load it with ``kedro-mlflow==0.4.x``.

## Migration from 0.4.0 to 0.4.1

There is no breaking change in this patch release unless you retrieve the mlflow configuration manually (e.g. in a script or a Jupyter notebook). In that case, you must add an extra call to the ``setup()`` method:

```python
from kedro.framework.context import load_context
from kedro_mlflow.framework.context import get_mlflow_config

context = load_context(".")
mlflow_config = get_mlflow_config(context)
mlflow_config.setup()  # <-- add this line, which did not exist in 0.4.0
```

## Migration from 0.3.0 to 0.4.0

### Catalog entries
2 changes: 1 addition & 1 deletion docs/source/03_getting_started/01_example_project.md
@@ -7,7 +7,7 @@ Create a conda environment and install ``kedro-mlflow`` (this will automatically
```console
conda create -n km_example python=3.6.8 --yes
conda activate km_example
pip install kedro-mlflow==0.5.0
pip install kedro-mlflow==0.6.0
```

## Install the toy project
@@ -2,7 +2,7 @@

## Automatic parameters versioning

Parameters versioning is automatic when the ``MlflowNodeHook`` is added to [the hook list of the ``ProjectContext``](../02_installation/02_setup.md#declaring-kedro-mlflow-hooks). In ``kedro-mlflow==0.5.0``, the `mlflow.yml` configuration file has a parameter called ``flatten_dict_params`` which makes it possible to [log as distinct parameters the (key, value) pairs of a ``Dict`` parameter](../07_python_objects/02_Hooks.md).
Parameters versioning is automatic when the ``MlflowNodeHook`` is added to [the hook list of the ``ProjectContext``](../02_installation/02_setup.md#declaring-kedro-mlflow-hooks). In ``kedro-mlflow==0.6.0``, the `mlflow.yml` configuration file has a parameter called ``flatten_dict_params`` which makes it possible to [log as distinct parameters the (key, value) pairs of a ``Dict`` parameter](../07_python_objects/02_Hooks.md).

You **do not need any additional configuration** to benefit from parameters versioning.
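
To illustrate what logging the (key, value) pairs of a ``Dict`` parameter as distinct parameters means, here is a minimal sketch of dictionary flattening. It is not the plugin's actual code: the function name ``flatten_dict``, the ``"."`` separator and the recursive handling of nested dictionaries are all assumptions made for this example, and the real behaviour of ``flatten_dict_params`` may differ.

```python
from typing import Any, Dict


def flatten_dict(params: Dict[str, Any], sep: str = ".", prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into a single-level dict with joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in params.items():
        # Build the full key by joining the parent prefix and the current key.
        full_key = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the key path along.
            flat.update(flatten_dict(value, sep=sep, prefix=full_key))
        else:
            flat[full_key] = value
    return flat


# Hypothetical nested parameter, as it could appear in a parameters file.
hyperparams = {"model": {"lr": 0.01, "layers": {"hidden": 64}}, "seed": 42}
flat_params = flatten_dict(hyperparams)
print(flat_params)
```

Each entry of ``flat_params`` could then be tracked as its own parameter instead of one opaque dictionary, which makes individual values easier to compare across runs.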

@@ -78,7 +78,7 @@ Setting the `mlflow_tracking_uri` key of `mlflow.yml` to the url of this server

You can refer to [this issue](https://github.com/Galileo-Galilei/kedro-mlflow/issues/15) for further details.

In ``kedro-mlflow==0.5.0`` you must configure these elements by yourself. Further releases will introduce helpers for configuration.
In ``kedro-mlflow==0.6.0`` you must configure these elements by yourself. Further releases will introduce helpers for configuration.

### Can I log an artifact in a specific run?

2 changes: 1 addition & 1 deletion docs/source/05_framework_ml/01_why_framework.md
@@ -6,7 +6,7 @@

It is very common to hear that "machine learning deployment is hard", and this is supposed to explain why so many firms fail to integrate ML models into their IT systems (and consequently do not make money despite substantial investments in ML).

On the other hand, you can find thousands of tutorials across the web to explain how to deploy a ML API in 5 mn, either locally or on the cloud. There is also a large number of training sessions which can teach you "how to become a machine learning engineer in 3 months".
On the other hand, you can find thousands of tutorials across the web to explain how to deploy a ML API in 5 min, either locally or on the cloud. There is also a large number of training sessions which can teach you "how to become a machine learning engineer in 3 months".

*Who is right then? Both!*

6 changes: 3 additions & 3 deletions docs/source/05_framework_ml/03_framework_solutions.md
@@ -2,7 +2,7 @@

## Reminder

We assume that we want to solve the following challenges among those described in ["Why we need a mlops framework"](docs\source\05_framework_ml\01_why_framework.md) section:
We assume that we want to solve the following challenges among those described in ["Why we need a mlops framework"](./01_why_framework.md) section:

- serve pipelines (which handles business objects) instead of models
- synchronize training and inference by packaging inference pipeline at training time
@@ -15,7 +15,7 @@ To solve the problem of desynchronization between training and inference, ``kedr

This class implements several methods to compare the ``DataCatalog``s associated to each of the two bound pipelines and performs subsetting operations. This makes it quite difficult to handle directly. Fortunately, ``kedro-mlflow`` provides a convenient API to create ``PipelineML`` objects: the ``pipeline_ml_factory`` function.

The use of ``pipeline_ml_factory`` is very straightforward, especially if you have used the [project architecture described previously](docs\source\05_framework_ml\02_ml_project_components.md). The best place to create such an object is your `hooks.py` file which will look like this:
The use of ``pipeline_ml_factory`` is very straightforward, especially if you have used the [project architecture described previously](./02_ml_project_components.md). The best place to create such an object is your `hooks.py` file which will look like this:

```python
# hooks.py
@@ -62,7 +62,7 @@ catalog = load_context(".").io
# artifacts are all the inputs of the inference pipelines that are persisted in the catalog
artifacts = pipeline_training.extract_pipeline_artifacts(catalog)

# (optiona) get the schema of the input dataset
# (optional) get the schema of the input dataset
input_data = catalog.load(pipeline_training.input_name)
model_signature = infer_signature(model_input=input_data)

9 changes: 4 additions & 5 deletions docs/source/05_framework_ml/04_example_project.md
@@ -4,7 +4,7 @@

If you don't want to read the entire explanations, here is a summary:

1. install ``kedro-mlflow`` ``MlflowPipelineHook`` (this is done automatically if you have installed ``kedro-mlflow`` in a ``kedro>=0.16.5`` project)
1. Install ``kedro-mlflow`` ``MlflowPipelineHook`` (this is done automatically if you have installed ``kedro-mlflow`` in a ``kedro>=0.16.5`` project)
2. Turn your training pipeline into a ``PipelineML`` object with the ``pipeline_ml_factory`` function in your ``hooks.py``:

```python
Expand Down Expand Up @@ -40,7 +40,7 @@ If you don't want to read the entire explanations, here is a summary:
}
```

3. persist your artifacts locally in the ``catalog.yml``
3. Persist your artifacts locally in the ``catalog.yml``

```yaml
label_encoder:
@@ -54,9 +54,9 @@ If you don't want to read the entire explanations, here is a summary:
kedro run --pipeline=training
```

**The inference pipeline will _automagically_ be logged as a mlflo model at the end!**
**The inference pipeline will _automagically_ be logged as a mlflow model at the end!**

5. Go to the UI, retrieve the run id of your "inference pipeline" model and use it as you want, e.g. in the catalog.yml:
5. Go to the UI, retrieve the run id of your "inference pipeline" model and use it as you want, e.g. in the `catalog.yml`:

```yaml
# catalog.yml
@@ -69,7 +69,6 @@ If you don't want to read the entire explanations, here is a summary:
run_id: <your-run-id>
```


## Complete step by step demo project with code

A step-by-step tutorial with code is available in the [kedro-mlflow-tutorial repository on GitHub](https://github.com/Galileo-Galilei/kedro-mlflow-tutorial#serve-the-inference-pipeline-to-a-end-user).