From 24a5ffe4f8ff092cdf1192f4faf631342b2a83e4 Mon Sep 17 00:00:00 2001 From: Anton Schwaighofer Date: Mon, 7 Jun 2021 16:07:15 +0100 Subject: [PATCH 1/4] docs --- docs/environment.md | 26 ++-------- docs/innereye_as_submodule.md | 95 +++++++++++++++++++++++++++++++++++ docs/sample_tasks.md | 81 ++++++++--------------------- 3 files changed, 122 insertions(+), 80 deletions(-) create mode 100644 docs/innereye_as_submodule.md diff --git a/docs/environment.md b/docs/environment.md index 1c1cc613a..be17b507b 100644 --- a/docs/environment.md +++ b/docs/environment.md @@ -4,29 +4,13 @@ In order to work with the solution, your OS environment will need [git](https://git-scm.com/) and [git lfs](https://git-lfs.github.com/) installed. Depending on the OS that you are running the installation instructions may vary. Please refer to respective documentation sections on the tools' websites for detailed instructions. -## Using the InnerEye code as a git submodule of your project +We recommend using PyCharm or VSCode as the Python editor. + You have two options for working with our codebase: -* You can fork the InnerEye-DeepLearning repository, and work off that. +* You can fork the InnerEye-DeepLearning repository, and work off that. We recommend that because it is easiest to set up. * Or you can create your project that uses the InnerEye-DeepLearning code, and include InnerEye-DeepLearning as a git -submodule. - -If you go down the second route, here's the list of files you will need in your project (that's the same as those -given in [this document](building_models.md)) -* `environment.yml`: Conda environment with python, pip, pytorch -* `settings.yml`: A file similar to `InnerEye\settings.yml` containing all your Azure settings -* A folder like `ML` that contains your additional code, and model configurations. 
-* A file `ML/runner.py` that invokes the InnerEye training runner, but that points the code to your environment and Azure -settings; see the [Building models](building_models.md) instructions for details. - -You then need to add the InnerEye code as a git submodule, in folder `innereye-submodule`: -```shell script -git submodule add https://github.com/microsoft/InnerEye-DeepLearning innereye-submodule -``` -Then configure your Python IDE to consume *both* your repository root *and* the `innereye-submodule` subfolder as inputs. -In Pycharm, you would do that by going to Settings/Project Structure. Mark your repository root as "Source", and -`innereye-submodule` as well. - -We recommend using PyCharm or VSCode as the Python editor. +submodule. We only recommend that if you are very handy with Python. More details about this option +[are here](innereye_as_submodule.md). ## Windows Subsystem for Linux Setup When developing on a Windows machine, we recommend using [the Windows Subsystem for Linux, WSL2](https://docs.microsoft.com/en-us/windows/wsl/about). diff --git a/docs/innereye_as_submodule.md b/docs/innereye_as_submodule.md new file mode 100644 index 000000000..4fb7d0dd7 --- /dev/null +++ b/docs/innereye_as_submodule.md @@ -0,0 +1,95 @@ +# Using the InnerEye code as a git submodule of your project + +You can use InnerEye as a submodule in your own project. +If you go down that route, here's the list of files you will need in your project (that's the same as those +given in [this document](building_models.md)) +* `environment.yml`: Conda environment with python, pip, pytorch +* `settings.yml`: A file similar to `InnerEye\settings.yml` containing all your Azure settings +* A folder like `ML` that contains your additional code, and model configurations. +* A file like `myrunner.py` that invokes the InnerEye training runner, but that points the code to your environment +and Azure settings; see the [Building models](building_models.md) instructions for details. 
Please see below for what +`myrunner.py` should look like. + +You then need to add the InnerEye code as a git submodule, in folder `innereye-deeplearning`: +```shell script +git submodule add https://github.com/microsoft/InnerEye-DeepLearning innereye-deeplearning +``` +Then configure your Python IDE to consume *both* your repository root *and* the `innereye-deeplearning` subfolder as inputs. +In PyCharm, you would do that by going to Settings/Project Structure. Mark your repository root as "Source", and +`innereye-deeplearning` as well. + +Example commandline runner that uses the InnerEye runner (called `myrunner.py` above): +```python +import sys +from pathlib import Path + + +# This file here mimics how the InnerEye code would be used as a git submodule. + +# Ensure that this path correctly points to the root folder of your repository. +repository_root = Path(__file__).absolute().parent + + +def add_package_to_sys_path_if_needed() -> None: + """ + Checks if the Python paths in sys.path already contain the /innereye-deeplearning folder. If not, add it. + """ + is_package_in_path = False + innereye_submodule_folder = repository_root / "innereye-deeplearning" + for path_str in sys.path: + path = Path(path_str) + if path == innereye_submodule_folder: + is_package_in_path = True + break + if not is_package_in_path: + print(f"Adding {innereye_submodule_folder} to sys.path") + sys.path.append(str(innereye_submodule_folder)) + + +def main() -> None: + try: + from InnerEye import ML # noqa: 411 + except ImportError: + add_package_to_sys_path_if_needed() + + from InnerEye.ML import runner + print(f"Repository root: {repository_root}") + # Check here that yaml_config_file correctly points to your settings file + runner.run(project_root=repository_root, + yaml_config_file=Path("settings.yml"), + post_cross_validation_hook=None) + + +if __name__ == '__main__': + main() + +``` + +## Adding new models + +1. Set up a directory outside of InnerEye to hold your configs. 
In your repository root, you could have a folder +`InnerEyeLocal`, parallel to the InnerEye submodule, alongside `settings.yml` and `myrunner.py`. + +The example below creates a new flavour of the Glaucoma model in `InnerEye/ML/configs/classification/GlaucomaPublic`. +All that needs to be done is change the dataset. We will do this by subclassing GlaucomaPublic in a new config +stored in `InnerEyeLocal/configs` +1. Create folder `InnerEyeLocal/configs` +1. Create a config file called GlaucomaPublicExt.py there which extends the GlaucomaPublic class that looks like +```python +from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic + + +class GlaucomaPublicExt(GlaucomaPublic): + def __init__(self) -> None: + super().__init__() + self.azure_dataset_id="name_of_your_dataset_on_azure" +``` +1. In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.configs` so this config +is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`. + +#### Start Training +Run the following to start a job on AzureML: +``` +python myrunner.py --azureml=True --model=GlaucomaPublicExt +``` +See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles. diff --git a/docs/sample_tasks.md b/docs/sample_tasks.md index 7a6fe2ad5..371b11dd8 100644 --- a/docs/sample_tasks.md +++ b/docs/sample_tasks.md @@ -1,7 +1,8 @@ # Sample Tasks -Two sample tasks for the classification and segmentation pipelines. -This document will walk through the steps in [Training Steps](building_models.md), but with specific examples for each task. +This document contains two sample tasks for the classification and segmentation pipelines. + +The document will walk through the steps in [Training Steps](building_models.md), but with specific examples for each task. 
Before trying tp train these models, you should have followed steps to set up an [environment](environment.md) and [AzureML](setting_up_aml.md) ## Sample classification task: Glaucoma Detection on OCT volumes @@ -32,39 +33,9 @@ If you choose that, you can start training via ``` python InnerEye/ML/runner.py --model=GlaucomaPublic --azureml=True ``` -- Alternatively, you can create a separate runner and a separate model configuration folder. The steps described -below refer to this route. - -#### Setting up a second runner -1. Set up a directory outside of InnerEye to holds your configs, as in -[Setting Up Training](building_models.md#setting-up-training). After this step, you should have a folder InnerEyeLocal - beside InnerEye with files `settings.yml` and `ML/runner.py`. - -#### Creating the classification model configuration -The full configuration for the Glaucoma model is at `InnerEye/ML/configs/classification/GlaucomaPublic`. -All that needs to be done is change the dataset. We will do this by subclassing GlaucomaPublic in a new config -stored in `InnerEyeLocal/ML` -1. Create folder configs/classification under InnerEyeLocal/ML -1. Create a config file called GlaucomaPublicExt.py there which extends the GlaucomaPublic class that looks like -```python -from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic +- Alternatively, you can use InnerEye-DeepLearning via a submodule. Please check [here](innereye_as_submodule.md). -class GlaucomaPublicExt(GlaucomaPublic): - def __init__(self) -> None: - super().__init__() - self.azure_dataset_id="name_of_your_dataset_on_azure" -``` -1. In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.ML.configs` so this config -is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`. 
- -#### Start Training -Run the following to start a job on AzureML -``` -python InnerEyeLocal/ML/runner.py --azureml=True --model=GlaucomaPublicExt -``` -See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles. - ## Sample segmentation task: Segmentation of Lung CT This example is based on the [Lung CT Segmentation Challenge 2017](https://wiki.cancerimagingarchive.net/display/Public/Lung+CT+Segmentation+Challenge+2017) [[2]](#2). @@ -83,34 +54,26 @@ InnerEye.CreateDataset.Runner.exe dataset --datasetRootDirectory= None: - super().__init__(azure_dataset_id="name_of_your_dataset_on_azure") -``` -1. In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.ML.configs` so this config -is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`. - -### Start Training -Run the following to start a job on AzureML +class Lung(SegmentationModelBase): + def __init__(self, **kwargs: Any) -> None: + fg_classes = ["spinalcord", "lung_r", "lung_l", "heart", "esophagus"] + fg_display_names = ["SpinalCord", "Lung_R", "Lung_L", "Heart", "Esophagus"] + super().__init__( + azure_dataset_id="my_lung_dataset", + architecture="UNet3D", + feature_channels=[32], +... +``` +If you are using InnerEye as a submodule, please add a new configuration that is a subclass of `Lung`, and set +the `azure_dataset_id` field there, as described for the Glaucoma model [here](innereye_as_submodule.md). +1. You can now run the following command to start a job on AzureML: ``` -python InnerEyeLocal/ML/runner.py --azureml=True --model=LungExt --train=True +python InnerEye/ML/runner.py --azureml=True --model=Lung ``` See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles. 
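Note on the `myrunner.py` example in patch 1/4: the `sys.path` handling there can be sketched as a small standalone helper. This is only a minimal sketch, assuming the runner script sits directly in the repository root next to the `innereye-deeplearning` submodule folder. The helper name `ensure_on_sys_path` is hypothetical and not part of InnerEye. It compares resolved paths, so that a relative and an absolute entry for the same folder are not treated as different.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(folder: Path) -> bool:
    """
    Appends `folder` to sys.path unless an entry pointing to the same location is already present.
    Returns True if the folder was added, False if it was already there.
    Hypothetical helper for illustration only, not part of the InnerEye code base.
    """
    resolved = folder.resolve()
    for entry in sys.path:
        # Resolve each entry so that relative and absolute spellings of a folder compare equal.
        if Path(entry).resolve() == resolved:
            return False
    sys.path.append(str(resolved))
    return True
```

In a runner script, this could be called as `ensure_on_sys_path(Path(__file__).absolute().parent / "innereye-deeplearning")` before importing `InnerEye.ML.runner`.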
From 147c712adf226fa58a49c1e0a92ddb60caa2c403 Mon Sep 17 00:00:00 2001 From: Anton Schwaighofer Date: Tue, 8 Jun 2021 16:16:13 +0100 Subject: [PATCH 2/4] PR feedback --- docs/innereye_as_submodule.md | 8 ++++---- docs/sample_tasks.md | 23 ++++++++++++++++++----- 2 files changed, 22 insertions(+), 9 deletions(-) diff --git a/docs/innereye_as_submodule.md b/docs/innereye_as_submodule.md index 4fb7d0dd7..9b5c247a5 100644 --- a/docs/innereye_as_submodule.md +++ b/docs/innereye_as_submodule.md @@ -74,12 +74,12 @@ The example below creates a new flavour of the Glaucoma model in `InnerEye/ML/co All that needs to be done is change the dataset. We will do this by subclassing GlaucomaPublic in a new config stored in `InnerEyeLocal/configs` 1. Create folder `InnerEyeLocal/configs` -1. Create a config file called GlaucomaPublicExt.py there which extends the GlaucomaPublic class that looks like +1. Create a config file `InnerEyeLocal/configs/GlaucomaPublicExt.py` which extends the `GlaucomaPublic` class +like this: ```python from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic - -class GlaucomaPublicExt(GlaucomaPublic): +class MyGlaucomaModel(GlaucomaPublic): def __init__(self) -> None: super().__init__() self.azure_dataset_id="name_of_your_dataset_on_azure" @@ -90,6 +90,6 @@ is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`. #### Start Training Run the following to start a job on AzureML: ``` -python myrunner.py --azureml=True --model=GlaucomaPublicExt +python myrunner.py --azureml=True --model=MyGlaucomaModel ``` See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles. diff --git a/docs/sample_tasks.md b/docs/sample_tasks.md index 371b11dd8..750f2c33d 100644 --- a/docs/sample_tasks.md +++ b/docs/sample_tasks.md @@ -27,13 +27,26 @@ description below). 
### Setting up training You have two options for running the Glaucoma model: -- You can directly work on a fork of the InnerEye repository. In this case, you need to modify `AZURE_DATASET_ID` -in `GlaucomaPublic.py` to match the dataset upload location, called `name_of_your_dataset_on_azure` above. -If you choose that, you can start training via +- You can directly work on a fork of the InnerEye repository. Create a config file `InnerEye/ML/configs/MyGlaucoma.py` + which extends the GlaucomaPublic class like this: +```python +from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic + +class MyGlaucomaModel(GlaucomaPublic): + def __init__(self) -> None: + super().__init__() + self.azure_dataset_id="name_of_your_dataset_on_azure" +``` +The value for `self.azure_dataset_id` should match the dataset upload location, called +`name_of_your_dataset_on_azure` above. + +Once that config is in place, you can start training in AzureML via ``` -python InnerEye/ML/runner.py --model=GlaucomaPublic --azureml=True +python InnerEye/ML/runner.py --model=MyGlaucomaModel --azureml=True ``` -- Alternatively, you can use InnerEye-DeepLearning via a submodule. Please check [here](innereye_as_submodule.md). + +As an alternative to working with a fork of the repository, you can use InnerEye-DeepLearning via a submodule. +Please check [here](innereye_as_submodule.md) for details. 
## Sample segmentation task: Segmentation of Lung CT From 00db5b59ab0a85013a43dc18fef4b4f58a2fff2f Mon Sep 17 00:00:00 2001 From: Anton Schwaighofer Date: Wed, 9 Jun 2021 14:27:21 +0100 Subject: [PATCH 3/4] docu fixes --- docs/sample_tasks.md | 63 ++++++++++++++++++++++++-------------------- 1 file changed, 34 insertions(+), 29 deletions(-) diff --git a/docs/sample_tasks.md b/docs/sample_tasks.md index 750f2c33d..57b7c69b2 100644 --- a/docs/sample_tasks.md +++ b/docs/sample_tasks.md @@ -10,28 +10,26 @@ Before trying tp train these models, you should have followed steps to set up an This example is based on the paper [A feature agnostic approach for glaucoma detection in OCT volumes](https://arxiv.org/pdf/1807.04855v3.pdf). ### Downloading and preparing the dataset -1. The dataset is available [here](https://zenodo.org/record/1481223#.Xs-ehzPiuM_) [[1]](#1). +The dataset is available [here](https://zenodo.org/record/1481223#.Xs-ehzPiuM_) [[1]](#1). -1. After downloading and extracting the zip file, run the [create_glaucoma_dataset_csv.py](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/InnerEye/Scripts/create_glaucoma_dataset_csv.py) +After downloading and extracting the zip file, run the [create_glaucoma_dataset_csv.py](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/InnerEye/Scripts/create_glaucoma_dataset_csv.py) script on the extracted folder. ``` python create_dataset_csv.py /path/to/extracted/folder ``` This will convert the dataset to csv form and create a file `dataset.csv`. -1. Upload this folder (with the images and `dataset.csv`) to Azure Blob Storage. For details on creating a storage account, +Finally, upload this folder (with the images and `dataset.csv`) to Azure Blob Storage. For details on creating a storage account, see [Setting up AzureML](setting_up_aml.md#step-4-create-a-storage-account-for-your-datasets). 
The dataset should go into a container called `datasets`, with a folder name of your choice (`name_of_your_dataset_on_azure` in the description below). -### Setting up training +### Creating the model configuration and starting training -You have two options for running the Glaucoma model: -- You can directly work on a fork of the InnerEye repository. Create a config file `InnerEye/ML/configs/MyGlaucoma.py` +Next, you need to create a configuration file `InnerEye/ML/configs/MyGlaucoma.py` which extends the GlaucomaPublic class like this: ```python from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic - class MyGlaucomaModel(GlaucomaPublic): def __init__(self) -> None: super().__init__() @@ -55,38 +53,45 @@ This example is based on the [Lung CT Segmentation Challenge 2017](https://wiki. ### Downloading and preparing the dataset -1. The dataset [[3]](#3)[[4]](#4) can be downloaded [here](https://wiki.cancerimagingarchive.net/display/Public/Lung+CT+Segmentation+Challenge+2017#021ca3c9a0724b0d9df784f1699d35e2). -1. The next step is to convert the dataset from DICOM-RT to NIFTI. Before this, place the downloaded dataset in another - parent folder, which we will call `datasets`. This file structure is expected by the converison tool. -1. Use the [InnerEye-CreateDataset](https://github.com/microsoft/InnerEye-createdataset) to create a NIFTI dataset - from the downloaded (DICOM) files. +The dataset [[3]](#3)[[4]](#4) can be downloaded [here](https://wiki.cancerimagingarchive.net/display/Public/Lung+CT+Segmentation+Challenge+2017#021ca3c9a0724b0d9df784f1699d35e2). + +You need to convert the dataset from DICOM-RT to NIFTI. Before this, place the downloaded dataset in another + parent folder, which we will call `datasets`. This file structure is expected by the conversion tool. + +Next, use the +[InnerEye-CreateDataset](https://github.com/microsoft/InnerEye-createdataset) commandline tools to create a +NIFTI dataset from the downloaded (DICOM) files. 
After installing the tool, run ```batch InnerEye.CreateDataset.Runner.exe dataset --datasetRootDirectory= --niftiDatasetDirectory= --dicomDatasetDirectory= --geoNorm 1;1;3 ``` Now, you should have another folder under `datasets` with the converted Nifti files. The `geonorm` tag tells the tool to normalize the voxel sizes during conversion. -1. Upload this folder (with the images and dataset.csv) to Azure Blob Storage. For details on creating a storage account, + +Finally, upload this folder (with the images and dataset.csv) to Azure Blob Storage. For details on creating a storage account, see [Setting up AzureML](setting_up_aml.md#step-4-create-a-storage-account-for-your-datasets). All files should go -into a folder in the `datasets` container, for example `my_lung_dataset`. -1. You can then modify the example model configuration in [Lung.py](../InnerEye/ML/configs/segmentation/Lung.py), and -add the `azure_dataset_id` field, so that it looks like: +into a folder in the `datasets` container, for example `my_lung_dataset`. This folder name will need to go into the +`azure_dataset_id` field of the model configuration, see below. + +### Creating the model configuration and starting training +You can then create a new model configuration, based on the template +[Lung.py](../InnerEye/ML/configs/segmentation/Lung.py). To do this, create a file +`InnerEye/ML/configs/segmentation/MyLungModel.py`, where you create a subclass of the template Lung model, and +add the `azure_dataset_id` field (i.e., the name of the folder that contains the uploaded data from above), +so that it looks like: ```python -class Lung(SegmentationModelBase): - def __init__(self, **kwargs: Any) -> None: - fg_classes = ["spinalcord", "lung_r", "lung_l", "heart", "esophagus"] - fg_display_names = ["SpinalCord", "Lung_R", "Lung_L", "Heart", "Esophagus"] - super().__init__( - azure_dataset_id="my_lung_dataset", - architecture="UNet3D", - feature_channels=[32], -... 
+from InnerEye.ML.configs.segmentation.Lung import Lung +class MyLungModel(Lung): + def __init__(self) -> None: + super().__init__() + self.azure_dataset_id = "my_lung_dataset" ``` -If you are using InnerEye as a submodule, please add a new configuration that is a subclass of `Lung`, and set -the `azure_dataset_id` field there, as described for the Glaucoma model [here](innereye_as_submodule.md). -1. You can now run the following command to start a job on AzureML: +If you are using InnerEye as a submodule, please add this configuration in your private configuration folder, +as described for the Glaucoma model [here](innereye_as_submodule.md). + +You can now run the following command to start a job on AzureML: ``` -python InnerEye/ML/runner.py --azureml=True --model=Lung +python InnerEye/ML/runner.py --azureml=True --model=MyLungModel ``` See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles. From 2c3380fbbd6ef3c96c84f4c9a0305b3b2d4f3120 Mon Sep 17 00:00:00 2001 From: Anton Schwaighofer Date: Wed, 9 Jun 2021 14:33:34 +0100 Subject: [PATCH 4/4] readme --- README.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 00b82e8b9..2d26806a6 100644 --- a/README.md +++ b/README.md @@ -100,9 +100,18 @@ Further detailed instructions, including setup in Azure, are here: 1. [Debugging and monitoring models](docs/debugging_and_monitoring.md) 1. [Model diagnostics](docs/model_diagnostics.md) 1. [Move a model to a different workspace](docs/move_model.md) -1. [Deployment](docs/deploy_on_aml.md) 1. 
[Working with FastMRI models](docs/fastmri.md) +## Deployment +We offer a companion set of open-sourced tools that help to integrate trained CT segmentation models with clinical +software systems: +- The [InnerEye-Gateway](https://github.com/microsoft/InnerEye-Gateway) is a Windows service running in a DICOM network, +that can route anonymized DICOM images to an inference service. +- The [InnerEye-Inference](https://github.com/microsoft/InnerEye-Inference) component offers a REST API that integrates +with the InnerEye-Gateway, to run inference on InnerEye-DeepLearning models. + +Details can be found [here](docs/deploy_on_aml.md). + ![docs/deployment.png](docs/deployment.png) ## More information