Feature/pipeline ml inputs #101

Galileo-Galilei · 2020-10-20T21:31:08Z

Description

Closes #71 and #100.

Development notes

PipelineML.extract_pipeline_catalog is renamed PipelineML._extract_pipeline_catalog to show that it is private
Update the doc to deprecate extract_pipeline_catalog in favor of extract_pipeline_artifacts
PipelineML now has a logger property
PipelineML now accepts that inference inputs may be among the training inputs (and not only among the outputs)
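To illustrate the last note, here is a minimal sketch of the relaxed check, written in plain Python sets rather than with the actual kedro-mlflow code. The function name `missing_inference_inputs` and its arguments are hypothetical and only serve the example; the assumption is that an inference input is acceptable when it is the inference entry point (`input_name`), a training output, or (new in this PR) a training input.

```python
def missing_inference_inputs(inference_inputs, training_inputs, training_outputs, input_name):
    """Return the inference inputs that the training pipeline cannot provide.

    Hypothetical helper, not the kedro-mlflow implementation.
    """
    # Data sets the inference pipeline may rely on: everything the training
    # pipeline reads or writes, plus the data passed at prediction time.
    available = set(training_inputs) | set(training_outputs) | {input_name}
    return set(inference_inputs) - available


# "encoder" is read by both pipelines: accepted once training inputs count.
unresolved = missing_inference_inputs(
    inference_inputs={"instances", "encoder", "model"},
    training_inputs={"raw_data", "encoder"},
    training_outputs={"model"},
    input_name="instances",
)
print(unresolved)
```

An empty result means every inference input can be traced back to the training pipeline or to the data supplied at prediction time.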

Checklist

Read the contributing guidelines
Open this PR as a 'Draft Pull Request' if it is work-in-progress
Update the documentation to reflect the code changes
Add a description of this change and add your name to the list of supporting contributions in the CHANGELOG.md file. Please respect Keep a Changelog guidelines.
Add tests to cover your changes

Notice

I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":
I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorised to submit this contribution on behalf of the original creator(s) or their licensees.
I certify that the use of this contribution as authorised by the Apache 2.0 license does not violate the intellectual property rights of anyone else.

codecov-io · 2020-10-20T21:32:48Z

Codecov Report

Merging #101 into develop will decrease coverage by 1.22%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #101      +/-   ##
===========================================
- Coverage    98.53%   97.31%   -1.23%     
===========================================
  Files           20       20              
  Lines          616      632      +16     
===========================================
+ Hits           607      615       +8     
- Misses           9       17       +8

Impacted Files	Coverage Δ
kedro_mlflow/framework/hooks/pipeline_hook.py	`98.79% <100.00%> (ø)`
kedro_mlflow/mlflow/kedro_pipeline_model.py	`100.00% <100.00%> (ø)`
kedro_mlflow/pipeline/pipeline_ml.py	`92.30% <100.00%> (-7.70%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e12a74c...ca7d520.

Galileo-Galilei · 2020-10-20T21:38:28Z

Known problem: the test coverage includes the tests folder. This will be solved once #98 is merged, so we should merge #98 first and I'll rebase on it.

takikadiri

I pointed out some typos and raised a question about how PipelineML manages kedro parameters.

kedro_mlflow/pipeline/pipeline_ml.py

takikadiri · 2020-10-25T20:21:57Z

kedro_mlflow/pipeline/pipeline_ml.py

        self._input_name = name

-    def extract_pipeline_catalog(self, catalog: DataCatalog) -> DataCatalog:
+    def _extract_pipeline_catalog(self, catalog: DataCatalog) -> DataCatalog:


Do we allow the use of parameters as inference / training inputs?
Kedro creates params:xxx inputs as MemoryDataSet instances. The following PipelineML code excludes them from our inference pipelines:

    if isinstance(data_set, MemoryDataSet):
        raise KedroMlflowPipelineMLDatasetsError(...)

I hesitated to deal with parameters automatically, but:

it is quite complicated: there are some edge cases to deal with, and we have to decide how / when / where to persist them

it is error prone: I don't want to persist parameters that are not explicitly intended to be persisted.

On the other hand, it is very easy for a user to enforce a parameter just by persisting it either as an input or output of a "training" node, e.g. by creating a YAMLDataSet, so I think we can just leave it to the user, to be sure that it is voluntary.
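The trade-off discussed above can be sketched without kedro installed. Everything below is a stand-in written for illustration: `InMemory` plays the role of MemoryDataSet, `Persisted` the role of a file-backed data set such as a YAMLDataSet, `DatasetsError` the role of KedroMlflowPipelineMLDatasetsError, and `extract_artifacts` is a hypothetical helper, not the library's method.

```python
from dataclasses import dataclass


class DatasetsError(Exception):
    """Stand-in for KedroMlflowPipelineMLDatasetsError."""


@dataclass
class InMemory:
    """Stand-in for a data set that only lives in memory (e.g. a params: entry)."""
    data: object


@dataclass
class Persisted:
    """Stand-in for a data set saved to disk (e.g. a YAML file)."""
    filepath: str


def extract_artifacts(catalog, inference_inputs, input_name):
    """Map each inference input to a file path, refusing in-memory data sets."""
    artifacts = {}
    for name in inference_inputs:
        if name == input_name:
            continue  # supplied by the caller at prediction time
        data_set = catalog[name]
        if isinstance(data_set, InMemory):
            raise DatasetsError(f"'{name}' is not persisted and cannot be packaged")
        artifacts[name] = data_set.filepath
    return artifacts


# The user opts in explicitly by persisting the parameter as a file-backed entry.
catalog = {
    "model": Persisted("data/model.pkl"),
    "threshold": Persisted("data/threshold.yml"),
}
print(extract_artifacts(catalog, ["instances", "model", "threshold"], "instances"))
```

In this sketch, a parameter left as an in-memory entry raises the error, whereas the same value saved by a "training" node to a file is packaged like any other artifact, which keeps the decision in the user's hands.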


Galileo-Galilei · 2020-10-25T22:02:49Z

It seems coverage has decreased when I rebased; I may have skipped a test. Do not merge it yet.

takikadiri · 2020-10-25T22:06:08Z

Ok! For merging multiple FIX PRs, do you prefer to pack the commits with a PR merge commit, or should I just rebase and merge?

Galileo-Galilei · 2020-10-25T22:14:46Z

I always "rebase and merge" to the develop branch. The only merges are from develop to master.


Galileo-Galilei · 2020-10-26T20:07:39Z

@takikadiri It's good to go!

Galileo-Galilei requested a review from takikadiri October 20, 2020 21:31

Galileo-Galilei linked an issue Oct 20, 2020 that may be closed by this pull request

The PipelineML extract_pipeline_catalog should be private #100

Closed

takikadiri reviewed Oct 25, 2020


Galileo-Galilei mentioned this pull request Oct 25, 2020

Feature/unpack predictions #98

Merged

9 tasks

Galileo-Galilei force-pushed the feature/pipeline-ml-inputs branch from e30f707 to ca7d520 Compare October 25, 2020 21:57

Galileo-Galilei marked this pull request as draft October 25, 2020 22:03

Galileo-Galilei added 2 commits October 26, 2020 20:31

FIX #100 - Make PipelineML._extract_pipeline_catalog private

6e94a8f

FIX #71 - Enable pipeline_ml to share inputs between inference and training pipelines

7b8393c

Galileo-Galilei force-pushed the feature/pipeline-ml-inputs branch from ca7d520 to 7b8393c Compare October 26, 2020 20:03

Galileo-Galilei marked this pull request as ready for review October 26, 2020 20:04

takikadiri merged commit 08f0645 into develop Oct 27, 2020

Galileo-Galilei deleted the feature/pipeline-ml-inputs branch October 27, 2020 07:45
