Make the KedroPipelineModel more portable #67
Labels
enhancement
New feature or request
need-design-decision
Several ways of implementation are possible and one must be chosen
Hi,
When logging the KedroPipelineModel, we try to log everything the model needs, so that a downstream tool can recreate an appropriate Python environment for running it. The elements it needs are:
python version
Can be easily inferred
pickle version
Can be easily inferred
artifacts
We handle this part well by getting the appropriate ML Datasets from pipeline_ml
conda_env
What we can do
project source code
The MLflow pyfunc model (KedroPipelineModel) needs the project's src package code to be present in the PYTHONPATH / sys.path in order to load the model pickle.
Today we expect users to package their source code as a Python package and add it as a dependency in their conda_env, so that they can use the model on another machine (or in another environment).
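To make the current approach concrete, here is a minimal sketch of a conda_env dictionary that pins the running Python version and installs the project package through pip. The environment name, the channel, and the package requirement `my_kedro_project==0.1.0` are hypothetical placeholders, not values taken from the plugin.

```python
import platform


def build_conda_env(project_requirement: str) -> dict:
    """Build a conda environment spec that pins the current Python version
    and installs the project package (plus mlflow) through pip."""
    return {
        "name": "kedro_pipeline_model_env",  # hypothetical environment name
        "channels": ["defaults"],
        "dependencies": [
            # The Python version is inferred from the running interpreter.
            f"python={platform.python_version()}",
            "pip",
            # The user's packaged project is listed as a pip dependency.
            {"pip": ["mlflow", project_requirement]},
        ],
    }


# "my_kedro_project==0.1.0" is a hypothetical package the user must publish.
conda_env = build_conda_env("my_kedro_project==0.1.0")
```

This only works if the user has already built and published the package somewhere pip can reach, which is the setup overhead at stake here.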
Although treating the source code as a Python package is good practice, it adds some setup overhead for users, and some of them just want it to work out of the box. Bundling the source code with the model would streamline the user experience.
We can pass the project package source code at logging time (see here).
MLflow will prepend the source code paths to the system path before the model is loaded.
That will prevent issues like this one
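The mechanism described above can be illustrated without MLflow itself. The sketch below simulates it with the standard library only: a directory of bundled source code is prepended to `sys.path` so that the project package becomes importable before anything is unpickled. The helper `load_with_code_path` and the package `my_kedro_project` are hypothetical names used for illustration. As far as I know, the corresponding MLflow option is the `code_path` argument of `mlflow.pyfunc.log_model` (named `code_paths` in recent releases), but treat that as an assumption to check against the installed version.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_with_code_path(code_dir: str, module_name: str):
    """Prepend code_dir to sys.path, then import module_name from it."""
    if code_dir not in sys.path:
        sys.path.insert(0, code_dir)
    # Make sure the import system notices the freshly written files.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Simulate project source code bundled next to a logged model.
bundle_dir = tempfile.mkdtemp()
package_dir = Path(bundle_dir) / "my_kedro_project"  # hypothetical package
package_dir.mkdir()
(package_dir / "__init__.py").write_text("def predict(x):\n    return x * 2\n")

# The package is importable only because its parent directory is on sys.path.
project = load_with_code_path(bundle_dir, "my_kedro_project")
result = project.predict(21)
```

With this approach the model bundle carries its own code, so the user does not need to publish the project as a package first.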