Manage environments in conda YAML files #158

algattik · 2020-01-30T10:28:14Z

Closes #128

Changes:

Clear management of environments in conda YAML files rather than code.
Use conda rather than pip packages when possible (as recommended in AML docs).
Dev environment is hence also constrained to conda (no more pip install -r requirements.txt).
Pin versions for all packages (IaC best practice).
We do not yet manage explicit AML environments, but I am working with @sudivate to add that on top. This PR is a good base for that.
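As a rough illustration of the "pin versions for all packages" point above, here is a minimal sketch of a check that flags unpinned entries in a conda-style dependency list. It is not part of this PR: the function name `find_unpinned`, the regular expression, and the example package versions are all assumptions made for illustration, and the input is the already-parsed `dependencies` list rather than the YAML file itself.

```python
import re

# Hypothetical helper, not part of this PR.
# Accepts "name=1.2.3" (conda), "name==1.2.3" (pip/conda) and an optional
# conda build string ("name=1.2.3=py37_0"). Wildcards and ranges are rejected.
_PINNED = re.compile(r"^[A-Za-z0-9_.\-]+(\[[^\]]+\])?={1,2}\d[\w.]*(=\w+)?$")


def find_unpinned(dependencies):
    """Return the dependency specs that are not pinned to an exact version."""
    unpinned = []
    for dep in dependencies:
        if isinstance(dep, dict):
            # Conda YAML files nest pip packages under a {"pip": [...]} entry.
            unpinned.extend(find_unpinned(dep.get("pip", [])))
        elif not _PINNED.match(dep.strip()):
            unpinned.append(dep)
    return unpinned


# Illustrative data only; these are not the versions used in this PR.
example_dependencies = [
    "python=3.7.5",
    "numpy",
    "pytest=5.*",
    {"pip": ["azureml-sdk==1.0.79", "requests>=2.22"]},
]
print(find_unpinned(example_dependencies))
```

A check like this could run in CI against each environment file, so that a newly added package without a version fails the build instead of silently floating.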

This PR will fail to build until the mlopspython container is updated. Here is a build on a fork of this branch, where the only change is to use this branch's version of the mlopspython container.

bump pip requirements versions (microsoft#104)

tests/unit/code_test.py

ml_service/pipelines/diabetes_regression_build_train_pipeline.py

environment_setup/install_requirements.sh

environment_setup/ci_environment.yml

diabetes_regression/training/training_dependencies.yml

* Move all 3 conda files to a single dir
* Do not use conda-merge
* Pin package versions

sudivate · 2020-01-31T19:48:57Z

ml_service/pipelines/diabetes_regression_build_train_pipeline_with_r.py

    run_config.environment.docker.enabled = True
-    run_config.environment.docker.base_image = "mcr.microsoft.com/mlops/python"


We need this container with r-essentials.

We had it essentially to demonstrate the use of the container for training.

I've added to the doc instead:

You will also need to add the `r-essentials` Conda packages into `diabetes_regression/scoring_dependencies.yml` and `diabetes_regression/training_dependencies.yml`.

I think it's a much more robust solution, and guides R users to the right process for adding the additional packages they will usually need.
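For readers following the thread, the alternative to a custom base image can be sketched roughly as follows: the run configuration keeps Docker enabled but takes its packages (including `r-essentials`) from the conda YAML file. This is a sketch assuming the Azure ML SDK v1 (`azureml-core`) is installed; the function name `build_r_run_config` is hypothetical and the snippet is not copied from the PR's pipeline script.

```python
def build_r_run_config(conda_file="diabetes_regression/training_dependencies.yml"):
    """Hypothetical helper: build a run configuration from a conda YAML file."""
    # Imported lazily so the sketch can be loaded without the SDK installed.
    from azureml.core.conda_dependencies import CondaDependencies
    from azureml.core.runconfig import RunConfiguration

    run_config = RunConfiguration()
    run_config.environment.docker.enabled = True
    # No docker.base_image override: the default image is used, and all
    # packages come from the conda file (which must list r-essentials).
    run_config.environment.python.conda_dependencies = CondaDependencies(
        conda_dependencies_file_path=conda_file
    )
    return run_config
```

With this shape, adding an R package means editing one YAML file rather than rebuilding and publishing a container image.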

Tested, training seems to run fine:

    Starting the daemon thread to refresh tokens in background for process with pid = 137
    Entering Run History Context Manager.
    [1] "R version 3.6.1 (2019-07-05)"
    [1] "Reading file from weight_data.csv"
       height weight
    1      79    174
    2      63    250
    3      75    223
    4      75    130
    5      70    120
    6      76    239
    7      63    129
    8      64    185
    9      59    246
    10     80    241
    11     79    217
    12     65    212
    13     74    242
    14     71    223
    15     61    167
    16     78    148
    17     75    229
    18     75    116
    19     75    182
    20     72    237
    21     72    160
    22     79    169
    23     67    219
    24     61    202
    25     65    168
    26     79    181
    27     81    214
    28     78    216
    29     59    245
           1        2
    173.6420 222.3347
    Call:
    lm(formula = y ~ x)
    Coefficients:
    (Intercept)            x
       232.5858      -0.5126
    [1] "Completed"
    -rwxrwxrwx 1 root root 1740 Jan 31 20:10 model.rds
    The experiment completed successfully. Finalizing run...
    Cleaning up all outstanding Run operations, waiting 300.0 seconds
    1 items cleaning up...
    Cleanup took 0.0007724761962890625 seconds
    Starting the daemon thread to refresh tokens in background for process with pid = 137

Agree with you, and that's what we showcased in the Python training pipeline. For R, we wanted to demonstrate that one can bring their own base image for training as well :)

If we want to showcase it, I think it's better to do that in a doc than buried in a script.

docs/code_description.md

development_setup.md updated to use install_requirements.sh. See microsoft#158:
> Use conda rather than pip packages when possible (as recommended in AML docs).
> Dev environment is hence also constrained to conda (no more pip install -r requirements.txt).

* development_setup.md update
  development_setup.md updated to use install_requirements.sh. See #158:
  > Use conda rather than pip packages when possible (as recommended in AML docs).
  > Dev environment is hence also constrained to conda (no more pip install -r requirements.txt).
* Content of install_requirements.sh deleted
* build_train_pipeline.py filename fixed
* build_train_pipeline.py filename fixed

algattik added 8 commits November 28, 2019 18:16

Merge pull request #2 from microsoft/master

f69a2ca

bump pip requirements versions (microsoft#104)

Merge remote-tracking branch 'upstream/master'

7b370c1

Merge branch 'master' of https://github.com/microsoft/MLOpsPython

3ab3230

.

c840623

.

c2b953f

Update code_test.py

bceeba6

.

897b7d4

Update Dockerfile

e304fd2

algattik requested a review from sudivate January 30, 2020 10:29