Manage environments in conda YAML files #158

Merged
merged 18 commits into from
Jan 31, 2020

@algattik algattik commented Jan 30, 2020

Closes #128

Changes:

  • Clear management of environments in conda YAML files rather than code.
  • Use conda rather than pip packages when possible (as recommended in AML docs).
  • Dev environment is hence also constrained to conda (no more pip install -r requirements.txt).
  • Pin versions for all packages (IaC best practice).
  • We do not yet manage explicit AML environments, but I am working with @sudivate to add that on top. This PR is a good base for that.
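
As an illustration of the pinning rule in the list above, here is a minimal sketch (not part of this PR) of a check that flags dependency specs without an exact version pin. The function name `unpinned`, the regular expression, and the package versions in `example` are hypothetical and chosen only for illustration; the input is assumed to be the already-parsed `dependencies:` section of a conda YAML file.

```python
import re

# Matches conda-style "name=1.2.3" (optionally "name=1.2.3=build")
# and pip-style "name==1.2.3". Anything else counts as unpinned.
_PINNED = re.compile(r"^[A-Za-z0-9_.\-]+(\[[^\]]+\])?={1,2}\d[\w.]*(=[\w.]+)?$")


def unpinned(dependencies):
    """Return the dependency specs that lack an exact version pin."""
    missing = []
    for dep in dependencies:
        if isinstance(dep, dict):
            # A nested {"pip": [...]} entry, as used in conda environment files.
            missing.extend(unpinned(dep.get("pip", [])))
        elif not _PINNED.match(dep.strip()):
            missing.append(dep)
    return missing


# Hypothetical dependency list, shaped like the `dependencies:` section
# of a conda YAML file after parsing.
example = [
    "python=3.7.5",
    "numpy",
    "scikit-learn>=0.22",
    {"pip": ["azureml-sdk==1.0.85", "requests"]},
]
print(unpinned(example))
```

A check along these lines could run in a unit test so that an unpinned package fails the build, though whether that is worth enforcing is a project decision.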

This PR will fail to build until the mlopspython container is updated. Here is a build on a fork of this branch where the only change is to use this branch's version of the mlopspython container.

@algattik algattik requested a review from sudivate January 30, 2020 10:29
tests/unit/code_test.py Outdated
* Move all 3 conda files to a single dir
* Do not use conda-merge
* Pin package versions
@algattik algattik mentioned this pull request Jan 31, 2020
@algattik algattik requested a review from sudivate January 31, 2020 06:49
run_config.environment.docker.enabled = True
run_config.environment.docker.base_image = "mcr.microsoft.com/mlops/python"
Contributor

We need this container with r-essentials.

Contributor

We had it essentially to demonstrate the use of the container for training.

Contributor Author

I've added to the doc instead:

You will also need to add the
 `r-essentials` Conda packages into `diabetes_regression/scoring_dependencies.yml`
 and `diabetes_regression/training_dependencies.yml`.

I think it's a much more robust solution, and guides R users to the right process for adding the additional packages they will usually need.

Contributor Author

Tested, training seems to run fine:

Starting the daemon thread to refresh tokens in background for process with pid = 137
Entering Run History Context Manager.
[1] "R version 3.6.1 (2019-07-05)"
[1] "Reading file from weight_data.csv"
   height weight
1      79    174
2      63    250
3      75    223
4      75    130
5      70    120
6      76    239
7      63    129
8      64    185
9      59    246
10     80    241
11     79    217
12     65    212
13     74    242
14     71    223
15     61    167
16     78    148
17     75    229
18     75    116
19     75    182
20     72    237
21     72    160
22     79    169
23     67    219
24     61    202
25     65    168
26     79    181
27     81    214
28     78    216
29     59    245
       1        2 
173.6420 222.3347 

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x  
   232.5858      -0.5126  

[1] "Completed"
-rwxrwxrwx 1 root root 1740 Jan 31 20:10 model.rds


The experiment completed successfully. Finalizing run...
Cleaning up all outstanding Run operations, waiting 300.0 seconds
1 items cleaning up...
Cleanup took 0.0007724761962890625 seconds
Starting the daemon thread to refresh tokens in background for process with pid = 137

Contributor

Agree with you, and that's what we showcased in the Python training pipeline. For R, we wanted to demonstrate that one can bring in their own base image for training as well :)

Contributor Author

If we want to showcase it, I think it's better to do that in a doc than buried in a script.

docs/code_description.md Outdated
@sudivate sudivate merged commit 962778c into microsoft:master Jan 31, 2020
satonaoki added a commit to satonaoki/MLOpsPython that referenced this pull request Feb 10, 2021
development_setup.md updated to use install_requirements.sh.

See microsoft#158:

> Use conda rather than pip packages when possible (as recommended in AML docs).
> Dev environment is hence also constrained to conda (no more pip install -r requirements.txt).
j-so pushed a commit that referenced this pull request Feb 16, 2021
* development_setup.md update

development_setup.md updated to use install_requirements.sh.

See #158:

> Use conda rather than pip packages when possible (as recommended in AML docs).
> Dev environment is hence also constrained to conda (no more pip install -r requirements.txt).

* Content of install_requirements.sh deleted

* build_train_pipeline.py filename fixed

* build_train_pipeline.py filename fixed
Development

Successfully merging this pull request may close these issues.

Set versions of packages in training and scoring