Commit

🐛 Remove unnecessary dependency to input dataset in KedroPipelineModel (
Galileo-Galilei committed Feb 13, 2022
1 parent 46c7fcc commit c630e6b
Showing 2 changed files with 6 additions and 3 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@
 
 ### Fixed
 
+- :bug: Force the input dataset in ``KedroPipelineModel`` to be a ``MemoryDataSet`` to remove an unnecessary dependency on the underlying Kedro ``AbstractDataSet`` used during training ([#273](https://github.com/Galileo-Galilei/kedro-mlflow/issues/273))
 - :bug: Make ``MlflowArtifactDataset`` correctly log in mlflow Kedro DataSets without a ``_path`` attribute like ``kedro.io.PartitionedDataSet`` ([#258](https://github.com/Galileo-Galilei/kedro-mlflow/issues/258)).
 - :bug: Automatically persist pipeline parameters when calling the ``kedro mlflow modelify`` command for consistency with how ``PipelineML`` objects are handled and for ease of use ([#282](https://github.com/Galileo-Galilei/kedro-mlflow/issues/282)).

8 changes: 5 additions & 3 deletions kedro_mlflow/mlflow/kedro_pipeline_model.py
@@ -109,9 +109,11 @@ def _extract_pipeline_catalog(self, catalog: DataCatalog) -> DataCatalog:
         for data_set_name in self.pipeline.inputs():
             if data_set_name == self.input_name:
                 # there is no obligation that this dataset is persisted
-                # thus it is allowed to be an empty memory dataset
-                data_set = catalog._data_sets.get(data_set_name) or MemoryDataSet()
-                sub_catalog.add(data_set_name=data_set_name, data_set=data_set)
+                # and even if it is, we keep only an empty memory dataset to avoid
+                # extra unnecessary dependencies: this dataset will be replaced at
+                # inference time and we do not need to know the original type, see
+                # https://github.com/Galileo-Galilei/kedro-mlflow/issues/273
+                sub_catalog.add(data_set_name=data_set_name, data_set=MemoryDataSet())
             else:
                 try:
                     data_set = catalog._data_sets[data_set_name]
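
The effect of this hunk can be illustrated with a small standalone sketch. It does not import Kedro: `MemoryDataSet` here is a simplified stand-in for `kedro.io.MemoryDataSet`, while `FileBackedDataSet`, `extract_pipeline_catalog`, the dataset names, and the file paths are hypothetical names invented for the illustration, and a plain dict replaces `DataCatalog`. Error handling for missing datasets is omitted.

```python
from typing import Any, Dict, Iterable


class MemoryDataSet:
    """Simplified stand-in for kedro.io.MemoryDataSet (assumption, not the real class)."""

    def __init__(self, data: Any = None):
        self._data = data

    def save(self, data: Any) -> None:
        self._data = data

    def load(self) -> Any:
        return self._data


class FileBackedDataSet:
    """Hypothetical persisted dataset, standing in for any AbstractDataSet used in training."""

    def __init__(self, filepath: str):
        self.filepath = filepath


def extract_pipeline_catalog(
    pipeline_inputs: Iterable[str],
    input_name: str,
    catalog: Dict[str, Any],
) -> Dict[str, Any]:
    """Build the sub-catalog kept with the model (plain dicts instead of DataCatalog)."""
    sub_catalog: Dict[str, Any] = {}
    for data_set_name in pipeline_inputs:
        if data_set_name == input_name:
            # New behaviour: always an empty in-memory placeholder,
            # regardless of the dataset type used during training.
            sub_catalog[data_set_name] = MemoryDataSet()
        else:
            # Other inputs are taken from the training catalog unchanged.
            sub_catalog[data_set_name] = catalog[data_set_name]
    return sub_catalog


training_catalog = {
    "instances": FileBackedDataSet("data/instances.csv"),
    "model": FileBackedDataSet("data/model.pkl"),
}
sub_catalog = extract_pipeline_catalog(
    ["instances", "model"], "instances", training_catalog
)

# At inference time the placeholder is filled with the incoming data.
sub_catalog["instances"].save([1, 2, 3])
print(type(sub_catalog["instances"]).__name__)  # MemoryDataSet
print(sub_catalog["instances"].load())  # [1, 2, 3]
```

Before this commit, the input entry would have been the training-time dataset whenever the catalog declared one, so the stored model would have carried that dataset's type along with it. With the placeholder, only the other pipeline inputs keep their original dataset objects.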