bootstrap diabetes cleanup #189

Merged 6 commits on Feb 12, 2020
8 changes: 5 additions & 3 deletions bootstrap/README.md
@@ -2,8 +2,10 @@

To reuse this project's structure and scripts for a new ML project, you can bootstrap from the existing repository and create a template that fits your project. Bootstrapping prepares a similar directory structure for your project: it renames files and folders, deletes or cleans up some directories, and fixes imports and absolute paths based on your project name. This lets you reuse resources such as the pre-built pipelines and scripts in your new project.

To bootstrap from the existing MLOpsPython repository clone this repository and run bootstrap.py script as below
To bootstrap from the existing MLOpsPython repository, clone this repository, ensure Python is installed locally, and run the bootstrap.py script as shown below:
dtzar marked this conversation as resolved.

>python bootstrap.py --d [dirpath] --n [projectname]
`python bootstrap.py --d [dirpath] --n [projectname]`

Where [dirpath] is the absolute path to the root of your directory where MLOps repo is cloned and [projectname] is the name of your ML project
Where `[dirpath]` is the absolute path to the root of the directory where the MLOps repo is cloned, and `[projectname]` is the name of your ML project.

[This article](https://docs.microsoft.com/azure/machine-learning/tutorial-convert-ml-experiment-to-production#use-your-own-model-with-mlopspython-code-template) also explains how to use this code template for your own ML project.
70 changes: 37 additions & 33 deletions bootstrap/bootstrap.py
@@ -57,38 +57,6 @@ def deletedir(self):
os.system(
'rmdir /S /Q "{}"'.format(os.path.join(self._project_directory, dir))) # NOQA: E501
Review comment (Contributor): rmdir won't work for xplat (cross-platform), probably out of scope for this PR.
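The cross-platform concern raised in this comment could be addressed with the standard library instead of shelling out to `rmdir`. The following is only a sketch, not code from this PR: the function name `delete_dirs` and the directory names in the usage lines are hypothetical.

```python
import os
import shutil
import tempfile


def delete_dirs(project_directory, dirs):
    """Remove each named subdirectory of project_directory, if present."""
    for name in dirs:
        target = os.path.join(project_directory, name)
        # ignore_errors=True means a missing directory does not raise,
        # much as the os.system call in the PR never raises.
        shutil.rmtree(target, ignore_errors=True)


# Usage on a throwaway directory (names are illustrative only).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "docs", "images"))
os.makedirs(os.path.join(root, "keep_me"))
delete_dirs(root, ["docs", "not_there"])
remaining = sorted(os.listdir(root))
shutil.rmtree(root)
```

`shutil.rmtree` removes a directory tree recursively on Windows, Linux, and macOS, so no shell-specific command is needed.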


    def replaceprojectname(self):
        # Replace instances of diabetes_regression within files
        dirs = [r".env.example",
                r".pipelines\azdo-base-pipeline.yml",
                r".pipelines\azdo-pr-build-train.yml",
                r".pipelines\diabetes_regression-ci-build-train.yml",
                r".pipelines\diabetes_regression-ci-image.yml",
                r".pipelines\diabetes_regression-template-get-model-version.yml",  # NOQA: E501
                r".pipelines\diabetes_regression-variables.yml",
                r"environment_setup\Dockerfile",
                r"environment_setup\install_requirements.sh",
                r"ml_service\pipelines\diabetes_regression_build_train_pipeline_with_r_on_dbricks.py",  # NOQA: E501
                r"ml_service\pipelines\diabetes_regression_build_train_pipeline_with_r.py",  # NOQA: E501
                r"ml_service\pipelines\diabetes_regression_build_train_pipeline.py",  # NOQA: E501
                r"ml_service\pipelines\diabetes_regression_verify_train_pipeline.py",  # NOQA: E501
                r"ml_service\util\create_scoring_image.py",
                r"diabetes_regression\azureml_environment.json",
                r"diabetes_regression\conda_dependencies.yml",
                r"diabetes_regression\evaluate\evaluate_model.py",
                r"diabetes_regression\training\test_train.py"]  # NOQA: E501

        for file in dirs:
            fin = open(os.path.join(self._project_directory, file),
                       "rt", encoding="utf8")
            data = fin.read()
            data = data.replace("diabetes_regression", self.project_name)
            fin.close()
            fin = open(os.path.join(self._project_directory, file),
                       "wt", encoding="utf8")
            fin.write(data)
            fin.close()

def cleandir(self):
# Clean up directories
dirs = ["data", "experimentation"]
@@ -108,6 +76,40 @@ def validateargs(self):
raise Exception("Project name should be 3 to 15 chars long")
Review comment (Contributor): Now that we're replacing more things, I think our valid set of chars might be reduced. \w and _ are probably it.
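One way the stricter check suggested here might look is sketched below. It is an illustration only: `validate_project_name` is a hypothetical helper, and the 3 to 15 character limit is taken from the exception message in the hunk above. Note that `\w` already matches the underscore, so a single character class covers both.

```python
import re

# Hypothetical pattern: 3 to 15 characters drawn from ASCII letters,
# digits and underscore (re.ASCII stops \w from matching other alphabets).
PROJECT_NAME_RE = re.compile(r"\w{3,15}", re.ASCII)


def validate_project_name(project_name):
    """Return project_name unchanged if it is valid, else raise ValueError."""
    if not PROJECT_NAME_RE.fullmatch(project_name):
        raise ValueError(
            "Project name should be 3 to 15 chars long and contain only "
            "letters, digits and underscores")
    return project_name
```

Using `fullmatch` rather than `match` ensures that the whole name, not just a prefix, satisfies the pattern.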



def replaceprojectname(project_dir, project_name, rename_name):
    # Replace instances of rename_name within files with project_name
    dirs = [r".env.example",
            r".pipelines\azdo-base-pipeline.yml",

Review comment (Contributor): \ won't work for xplat (cross-platform), might be out of scope for this PR.

            r".pipelines\azdo-pr-build-train.yml",
            r".pipelines\diabetes_regression-ci-build-train.yml",
            r".pipelines\diabetes_regression-ci-image.yml",
            r".pipelines\diabetes_regression-template-get-model-version.yml",  # NOQA: E501
            r".pipelines\diabetes_regression-variables.yml",
            r"environment_setup\Dockerfile",
            r"environment_setup\install_requirements.sh",
            r"ml_service\pipelines\diabetes_regression_build_train_pipeline_with_r_on_dbricks.py",  # NOQA: E501
            r"ml_service\pipelines\diabetes_regression_build_train_pipeline_with_r.py",  # NOQA: E501
            r"ml_service\pipelines\diabetes_regression_build_train_pipeline.py",  # NOQA: E501
            r"ml_service\pipelines\diabetes_regression_verify_train_pipeline.py",  # NOQA: E501
            r"ml_service\util\create_scoring_image.py",
            r"diabetes_regression\azureml_environment.json",
            r"diabetes_regression\conda_dependencies.yml",
            r"diabetes_regression\evaluate\evaluate_model.py",
            r"diabetes_regression\register\register_model.py",
            r"diabetes_regression\training\test_train.py"]  # NOQA: E501

    for file in dirs:
        fin = open(os.path.join(project_dir, file),
                   "rt", encoding="utf8")
        data = fin.read()
        data = data.replace(rename_name, project_name)
        fin.close()
        fin = open(os.path.join(project_dir, file),
                   "wt", encoding="utf8")
        fin.write(data)
        fin.close()
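The reviewer's note that `\` separators will not work cross-platform could be handled by storing each path as a tuple of components and letting `os.path.join` pick the separator. The sketch below is not part of this PR; `replace_in_files` and the sample file name are hypothetical, and the `with` blocks are a swapped-in alternative to the explicit `open`/`close` pairs above.

```python
import os
import tempfile


def replace_in_files(project_dir, relative_paths, old, new):
    """Replace every occurrence of old with new in the listed files.

    relative_paths holds tuples of path components, for example
    (".pipelines", "azdo-base-pipeline.yml"), so no separator is hard-coded.
    """
    for parts in relative_paths:
        path = os.path.join(project_dir, *parts)
        with open(path, "rt", encoding="utf8") as fin:
            data = fin.read()
        with open(path, "wt", encoding="utf8") as fout:
            fout.write(data.replace(old, new))


# Usage on a temporary directory with one illustrative file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".pipelines"))
    sample = os.path.join(root, ".pipelines", "sample.yml")
    with open(sample, "wt", encoding="utf8") as f:
        f.write("name: diabetes_regression\n")
    replace_in_files(root, [(".pipelines", "sample.yml")],
                     "diabetes_regression", "my_project")
    with open(sample, "rt", encoding="utf8") as f:
        result = f.read()
```

The context managers also close each file if reading or writing raises, which the explicit `fin.close()` calls do not guarantee.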


def main(args):
parser = argparse.ArgumentParser(description='New Template')
parser.add_argument("--d", type=str,
Review comment (Contributor): We should also stick to the standard of -d or --directory, same for below
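The convention mentioned in this comment pairs a single-dash short flag with a double-dash long flag. A minimal sketch of what that could look like with `argparse` follows; the long names `--directory` and `--name`, the help strings, and `required=True` are assumptions, not taken from the PR.

```python
import argparse


def build_parser():
    """Build a parser using conventional short and long option names."""
    parser = argparse.ArgumentParser(description="New Template")
    # Assumed option names; the PR itself uses --d and --n.
    parser.add_argument("-d", "--directory", type=str, required=True,
                        help="Absolute path to the root of the cloned repo")
    parser.add_argument("-n", "--name", type=str, required=True,
                        help="Name of the new ML project")
    return parser


# Both spellings populate the same attributes.
short_args = build_parser().parse_args(["-d", "/tmp/repo", "-n", "my_project"])
long_args = build_parser().parse_args(
    ["--directory", "/tmp/repo", "--name", "my_project"])
```

With this layout `argparse` derives the attribute names from the long options, so the values are read as `args.directory` and `args.name`.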

@@ -122,7 +124,9 @@ def main(args):
helper.validateargs()
# helper.clonerepo()
helper.cleandir()
helper.replaceprojectname()
replaceprojectname(project_directory, project_name,
"diabetes_regression")
replaceprojectname(project_directory, project_name, "diabetes")
helper.deletedir()
helper.renamefiles()
helper.renamedir()
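One effect of calling `replaceprojectname` for `"diabetes_regression"` before `"diabetes"` is that the longer name is consumed whole. If the shorter name were replaced first, it would leave a `_regression` suffix behind. The sketch below illustrates this on a plain string; `rename_all` and the sample text are hypothetical and do not touch any files.

```python
def rename_all(text, project_name,
               old_names=("diabetes_regression", "diabetes")):
    """Apply str.replace for each old name, in the order given."""
    for old in old_names:
        text = text.replace(old, project_name)
    return text


sample = 'tags={"area": "diabetes_regression"}  # diabetes data'

# Longest name first, as in main(): both spellings map to the new name.
longest_first = rename_all(sample, "my_project")

# Shortest name first: the prefix is replaced and "_regression" survives.
shortest_first = rename_all(
    sample, "my_project", old_names=("diabetes", "diabetes_regression"))
```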
4 changes: 2 additions & 2 deletions diabetes_regression/register/register_model.py
@@ -119,15 +119,15 @@ def register_aml_model(
if (build_id != 'none'):
model_already_registered(model_name, exp, run_id)
run = Run(experiment=exp, run_id=run_id)
tagsValue = {"area": "diabetes", "type": "regression",
tagsValue = {"area": "diabetes_regression",
"BuildId": build_id, "run_id": run_id,
"experiment_name": exp.name}
if (build_uri is not None):
tagsValue["BuildUri"] = build_uri
else:
run = Run(experiment=exp, run_id=run_id)
if (run is not None):
tagsValue = {"area": "diabetes", "type": "regression",
tagsValue = {"area": "diabetes_regression",
"run_id": run_id, "experiment_name": exp.name}
else:
print("A model run for experiment", exp.name,
2 changes: 1 addition & 1 deletion ml_service/util/create_scoring_image.py
@@ -39,7 +39,7 @@
runtime="python",
conda_file="conda_dependencies.yml",
description="Image with ridge regression model",
tags={"area": "diabetes", "type": "regression"},
tags={"area": "diabetes_regression"},
)

image = Image.create(