This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
Pull requests (PRs) to this repo require review and approval by the Azure Machine Learning (AML) team to merge.
Important: PRs from forks of this repository are likely to fail automated workflows because forks do not have access to the repository's secrets. PRs from forks will be considered but may be delayed while they are tested.
- minimal prose
- minimalist code
- no azureml-* in training code
- examples (including notebooks) can be re-run without failing in less than 10 minutes
- tutorials must be re-run without failing at least daily
- `pip install --upgrade -r requirements.txt` remains <60s
If modifying existing examples, before a PR:
- run `python readme.py` from the root of the repo
  - this will generate the `README.md` file
  - this will generate the `run-examples` and `run-notebooks` workflow files
  - this will format Python code and notebooks
If you are adding new examples, see below.
PRs to add new examples should consider which type of example to add:
- `examples` is for general examples using AML and should run `code` examples
- `notebooks` is for general example notebooks using AML and should be interactive
- `tutorials` is for end to end tutorials using AML
PRs must follow the following naming conventions:
- naming must be logical
- under `notebooks` use the naming convention scenario-framework-etc-compute, where scenario is one of ["train", "deploy", "score", "dprep"]
- directories under `tutorials` must be words separated by hyphens
- tutorial workflows (and workflow files) use the naming convention `run-tutorial-*initials*`, where initials is the initials of the words
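The naming conventions above can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: the function names `is_valid_notebook_name` and `tutorial_workflow_name` are hypothetical, and it assumes that names are lowercase and that "initials" means the first letter of each hyphen-separated word in the tutorial directory name.

```python
import re

# Scenarios listed in the notebook naming convention.
SCENARIOS = ("train", "deploy", "score", "dprep")

# Assumption: tutorial directories are lowercase words separated by single hyphens.
HYPHENATED = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_notebook_name(stem: str) -> bool:
    """Check a notebook name (without .ipynb) against scenario-framework-etc-compute."""
    parts = stem.split("-")
    # Assumption: at least scenario, framework, and compute segments are present,
    # and no segment is empty.
    return len(parts) >= 3 and parts[0] in SCENARIOS and all(parts)


def tutorial_workflow_name(directory: str) -> str:
    """Derive the workflow name run-tutorial-<initials> from a tutorial directory name."""
    if not HYPHENATED.match(directory):
        raise ValueError(f"not hyphen-separated words: {directory!r}")
    initials = "".join(word[0] for word in directory.split("-"))
    return f"run-tutorial-{initials}"
```

For example, under these assumptions a tutorial directory named `using-dask` would map to the workflow name `run-tutorial-ud`.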
PRs must include necessary changes to any testing to ensure:
- `run-examples` runs on every push and PR to `main` (with changes to examples) and runs all examples under `examples/`
- `run-notebooks` runs on every push and PR to `main` (with changes to notebooks) and runs all examples under `notebooks/`
- `run-tutorial-initials` must be tested at least daily and on PR to `main` (with changes to the tutorial)
- `cleanup` runs daily and cleans up AML resources for the testing workspace
- `smoke` runs hourly and on every push and PR to `main` and performs sanity checks
- to modify `README.md`, you need to modify `readme.py` and accompanying markdown files
- the tables in the `README.md` are auto-generated, including description, via other files
- develop on a branch, not a fork, for workflows to run properly
- use an existing environment where possible
- use an existing dataset where possible
- don't register environments
- don't create compute targets
- don't modify `requirements.txt`
- you probably shouldn't modify any files in the root of the repo
- you can `!pip install --upgrade packages` as needed in notebooks
- `environment_name` = "framework-example|tutorial" e.g. "pytorch-example"
- `experiment_name` = "logical-words-example|tutorial" e.g. "hello-world-tutorial"
- `compute_name` = "compute-defined-in-setup.py" e.g. "gpu-K80-2"
- `ws = Workspace.from_config()`
- `dstore = ws.get_default_datastore()`
- `ds = Dataset.File.from_files(...)`
- `env = Environment.from_*(...)`
- `src = ScriptRunConfig(...)`
- `run = Experiment(ws, experiment_name).submit(src)`
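Put together, the names and variables above might be used in a control plane script along these lines. This is a sketch under stated assumptions rather than a required template: `example_names` and `submit_example` are hypothetical helpers, the source directory, script, and conda file are placeholders supplied by the caller, and `Environment.from_conda_specification` is just one of the `Environment.from_*` constructors. Submitting requires the `azureml-core` package and a workspace `config.json`.

```python
from pathlib import Path


def example_names(framework: str, words: str, kind: str = "example") -> dict:
    """Build environment and experiment names following the conventions above."""
    if kind not in ("example", "tutorial"):
        raise ValueError("kind must be 'example' or 'tutorial'")
    return {
        "environment_name": f"{framework}-{kind}",
        "experiment_name": f"{words}-{kind}",
    }


def submit_example(framework: str, words: str, source_dir: Path, script: str,
                   conda_file: Path, compute_name: str = "gpu-K80-2"):
    """Submit a script run; all path arguments are caller-supplied placeholders."""
    # Imported lazily so the naming helper works without azureml-core installed.
    from azureml.core import Environment, Experiment, ScriptRunConfig, Workspace

    names = example_names(framework, words)
    ws = Workspace.from_config()  # reads config.json for an existing workspace
    env = Environment.from_conda_specification(
        names["environment_name"], str(conda_file)
    )
    src = ScriptRunConfig(
        source_directory=str(source_dir),
        script=script,
        compute_target=compute_name,  # an existing compute target, not created here
        environment=env,
    )
    run = Experiment(ws, names["experiment_name"]).submit(src)
    return run
```

Keeping the `azureml.core` import inside `submit_example` means the user code it submits stays free of `azureml-*` dependencies, and the naming helper can be exercised on its own.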
An example consists of the control plane definition, currently written as a Python script, and user code, which is often Python.
Checklist:
- add control plane code under `examples/`
- add user code, preserving any licensing information, under `code/`
- run `readme.py`
- test
- submit PR, which will run `run-examples`
A notebook is a self-contained example written as a `.ipynb` file.
Checklist:
- is it interactive?
- does it need to be a notebook?
- are you sure? why?
- add notebook with description to `notebooks/`
- run `readme.py`
- test
- submit PR, which will run `run-notebooks`
Tutorials must include frequent automated testing through GitHub Actions. One-time setup for Azure resources and anything else a user needs must be written in the `README.md`. An AML team member with access to the testing resource group will follow the `README.md` to perform the required setup, and then rerun your tutorial workflow, which should now pass.
If it is a simple ML training example, it does not need to be a tutorial. Current themes for tutorials include:
- `using-*` for how to use ML frameworks and tools in Azure
- `deploy-*` for advanced deployment
- `work-with-*` for Azure integrations
- `automl-with-*` for automated ML
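As a quick self-check before opening a PR, a tutorial directory name can be matched against the themes listed above. This is only an illustrative sketch: `tutorial_theme` is a hypothetical helper, and since the themes are described as current rather than enforced, a `None` result is a prompt to reconsider the name, not an error.

```python
from typing import Optional

# Theme prefixes taken from the list of current tutorial themes.
THEME_PREFIXES = ("using-", "deploy-", "work-with-", "automl-with-")


def tutorial_theme(directory: str) -> Optional[str]:
    """Return the matching theme pattern (e.g. 'using-*'), or None if no theme matches."""
    for prefix in THEME_PREFIXES:
        # Require something after the prefix so a bare 'using-' does not count.
        if directory.startswith(prefix) and len(directory) > len(prefix):
            return prefix + "*"
    return None
```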
Checklist:
- add the tutorial directory under `tutorials/`, following naming conventions
- add tutorial files, which are usually notebooks and may be ordered
- add `README.md` in the tutorial directory with a description (see other tutorials for format)
- add `run-tutorial-initials`, where initials are the initials of the description directory (see other tutorial workflows)
- run `readme.py`
- test
- submit PR, which will run your tutorial if set up properly