Avoid systematic deepcopy of inference datasets #133

Closed
takikadiri opened this issue Dec 9, 2020 · 0 comments · Fixed by #152
Labels
enhancement New feature or request

Comments

takikadiri (Collaborator) commented Dec 9, 2020

Description

Currently, kedro_mlflow_model creates a new catalog called loaded_catalog, where it declares all the pipeline_ml artifacts with their new filepaths (see here).
Our current problem is that each of these datasets is deep-copied between the kedro nodes, and some artifacts/datasets take a long time to deep-copy (a Keras model, for example), which is not suitable for an API serving pattern.

We need a way to avoid deep-copying some (or all) datasets in the inference pipeline.
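For illustration, here is a minimal sketch of the copy behaviour at stake, using the `MemoryDataSet` API of Kedro at the time (the sample data is a hypothetical stand-in): with the default copy mode, every `load()` hands a fresh deep copy to the next node.

```python
from kedro.io import MemoryDataSet

# Stand-in for a heavy artifact such as a fitted Keras model.
model = {"weights": list(range(1_000_000))}

ds = MemoryDataSet(data=model)  # copy_mode is inferred as "deepcopy" for plain objects
loaded = ds.load()              # each load (i.e. each node input) triggers a full deep copy
assert loaded is not model      # a brand new object every time
```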

Possible Implementation

Defining the inference datasets as MemoryDataset(name, copy_mode="assign") solves the problem, but it can break inference pipelines that mutate the dataset state between nodes. We could configure the copy_mode at one of these two levels:

  • Propose an option to redefine the inference dataset types at the PipelineML level
  • Propose an option to redefine the inference dataset types at the kedro_pipeline_model level (see the sketch below)
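As a rough illustration of the second option, here is a minimal sketch of what a copy_mode argument on the kedro_pipeline_model side could do once the artifacts are loaded; the helper name and its exact placement are assumptions for illustration, not the actual kedro_mlflow API:

```python
from kedro.io import DataCatalog, MemoryDataSet

def _apply_copy_mode(loaded_catalog: DataCatalog, copy_mode: str = "assign") -> DataCatalog:
    """Hypothetical sketch: load each inference artifact once, then re-register it
    as a MemoryDataSet that hands out references (copy_mode="assign") instead of
    deep copies when the pipeline runs."""
    fast_catalog = DataCatalog()
    for name in loaded_catalog.list():
        data = loaded_catalog.load(name)  # e.g. the fitted Keras model
        fast_catalog.add(name, MemoryDataSet(data=data, copy_mode=copy_mode))
    return fast_catalog
```

With copy_mode="assign" the nodes share the same objects, which is why the option should stay configurable: an inference pipeline that mutates its inputs between nodes would still need the default deepcopy behaviour.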