-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/pipeline ml inputs #101
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #101 +/- ##
===========================================
- Coverage 98.53% 97.31% -1.23%
===========================================
Files 20 20
Lines 616 632 +16
===========================================
+ Hits 607 615 +8
- Misses 9 17 +8
Continue to review full report at Codecov.
|
Known pb with test coverage which includes test folder. It will be solved once #98 is merged, so we should merge it before and I'll rebase on it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I point out some typo errors and the management of the kedro parameters by the PipelieML
self._input_name = name | ||
|
||
def extract_pipeline_catalog(self, catalog: DataCatalog) -> DataCatalog: | ||
def _extract_pipeline_catalog(self, catalog: DataCatalog) -> DataCatalog: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we allow the use of parameters as inference / training inputs?
kedro create params:xxx
inputs as a MemoryDataSet. The following PipelineML code exclude them from our inference pipelines :
if isinstance(data_set, MemoryDataSet):
raise KedroMlflowPipelineMLDatasetsError(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hesitated to deal with parameters automatically, but:
- it is quite complicated: there are a some edge case situation to deal with, we have to decide how / when / where to persist them
- it is error prone: I don't want to persist parameters that are not explictly intended to.
On the other hand, it is very easy for a user to enforce a parameter just by persisting either as an input or output of a "training" node, e.g. by creating a YAMLDataSet, so I think we can just let it to the user to be sure that it voluntary.
e30f707
to
ca7d520
Compare
It seems coverage has decreased when I reabsed, I may have skipped a test. Do not merge it yet. |
Ok ! For merging multi FIX PRs, do you prefer to pack the commits with a PR merge commit, or i just rebase and merge? |
I always "rebase and merge" to the develop branch. The only merges are from develop to master. |
ca7d520
to
7b8393c
Compare
@takikadiri It's good to go! |
Description
Closes #71 and #100.
Development notes
PipelineML.extract_pipeline_catalog
is renamedPipelineML._extract_pipeline_catalog
to show it is privateextract_pipeline_catalog
in favor ofextract_pipeline_artifacts
inference
inputs may be intraining
inputs (and not only in all outputsChecklist
CHANGELOG.md
file. Please respect Keep a Changelog guidelines.Notice
I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":
I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorised to submit this contribution on behalf of the original creator(s) or their licensees.
I certify that the use of this contribution as authorised by the Apache 2.0 license does not violate the intellectual property rights of anyone else.