
ENH: Publish lung segmentation model #808

Merged: 14 commits, Nov 7, 2022
56 changes: 29 additions & 27 deletions InnerEye/ML/configs/segmentation/Lung.py
@@ -25,48 +25,50 @@ class Lung(SegmentationModelBase):
def __init__(self, **kwargs: Any) -> None:
fg_classes = ["spinalcord", "lung_r", "lung_l", "heart", "esophagus"]
fg_display_names = ["SpinalCord", "Lung_R", "Lung_L", "Heart", "Esophagus"]

azure_dataset_id = kwargs.pop("azure_dataset_id", LUNG_AZURE_DATASET_ID)

super().__init__(
adam_betas=(0.9, 0.999),
architecture="UNet3D",
azure_dataset_id=azure_dataset_id,
check_exclusive=False,
class_weights=equally_weighted_classes(fg_classes, background_weight=0.02),
colours=[(255, 255, 255)] * len(fg_classes),
crop_size=(64, 224, 224),
feature_channels=[32],
fill_holes=[False] * len(fg_classes),
ground_truth_ids_display_names=fg_display_names,
ground_truth_ids=fg_classes,
image_channels=["ct"],
inference_batch_size=1,
inference_stride_size=(64, 256, 256),
kernel_size=3,
l_rate_polynomial_gamma=0.9,
l_rate=1e-3,
largest_connected_component_foreground_classes=["lung_l", "lung_r", "heart"],
level=-500,
loss_type=SegmentationLoss.SoftDice,
min_l_rate=1e-5,
momentum=0.9,
monitoring_interval_seconds=0,
norm_method=PhotometricNormalizationMethod.CtWindow,
num_dataload_workers=2,
num_epochs=300,
opt_eps=1e-4,
optimizer_type=OptimizerType.Adam,
roi_interpreted_types=["ORGAN"] * len(fg_classes),
test_crop_size=(112, 512, 512),
train_batch_size=3,
use_mixed_precision=True,
use_model_parallel=True,
weight_decay=1e-4,
window=2200,
)
self.add_and_validate(kwargs)

def get_model_train_test_dataset_splits(self, dataset_df: pd.DataFrame) -> DatasetSplits:
# The lowest-numbered subject IDs are the designated test subjects in this dataset.
test = list(map(str, range(0, 9)))
train_val = list(dataset_df[~dataset_df.subject.isin(test)].subject.unique())

val = list(map(str, numpy.random.choice(train_val, int(len(train_val) * 0.1), replace=False)))
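
For illustration only (not part of the diff), the snippet below sketches how the new `kwargs` handling can be used: because `azure_dataset_id` is popped from `kwargs` before the defaults are applied, a caller can point the `Lung` config at a different registered dataset without editing the file. The dataset name is a placeholder.

```python
# Hedged usage sketch, assuming the Lung config shown in the diff above.
from InnerEye.ML.configs.segmentation.Lung import Lung

# "my_lung_dataset" is a made-up Azure ML dataset name.
config = Lung(azure_dataset_id="my_lung_dataset")
print(config.azure_dataset_id)  # -> "my_lung_dataset"
```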
2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -22,7 +22,7 @@ InnerEye-DeepLearning Documentation
md/debugging_and_monitoring.md
md/model_diagnostics.md
md/move_model.md
rst/models

.. toctree::
:maxdepth: 1
69 changes: 6 additions & 63 deletions docs/source/md/hippocampus_model.md
@@ -1,79 +1,22 @@
# Hippocampus Segmentation Model

## Purpose

This documentation describes our pre-trained model for segmentation of the left and right hippocampi from brain MRI scans. The model was trained on data from the [ADNI](https://adni.loni.usc.edu/) dataset (for more information see the model card below). This data is publicly available via their website, but users must sign a Data Use Agreement in order to gain access. We do not provide access to the data. The following description assumes the user has their own dataset to evaluate or retrain the model on.

## Terms of use

Please note that this model is intended for research purposes only. You are responsible for the performance, the necessary testing, and if needed any regulatory clearance for any of the models produced by this toolbox.

## Download

The hippocampus segmentation model can be downloaded from [this release](https://github.com/microsoft/InnerEye-DeepLearning/releases/tag/v0.5).

## Connected components

It is possible to apply connected components as a post-processing step, although this is disabled by default. To enable it, update the property `largest_connected_component_foreground_classes` of the `Hippocampus` class in `InnerEye/ML/configs/segmentation/Hippocampus.py`.

---

## Model Card

### Model details

67 changes: 67 additions & 0 deletions docs/source/md/lung_model.md
@@ -0,0 +1,67 @@
# Lung Segmentation Model

## Purpose

This model is designed to perform segmentation of CT scans of human torsos. It is trained to identify 5 key structures: the left lung, right lung, heart, spinal cord and esophagus.

## Download

The lung segmentation model can be downloaded from [this release](https://github.com/microsoft/InnerEye-DeepLearning/releases/tag/v0.8).

## Connected Components

It is possible to apply connected components as a post-processing step, and by default this is performed on the 3 largest structures: both lungs and the heart. To alter this behaviour, update the property `largest_connected_component_foreground_classes` of the Lung class in `InnerEye/ML/configs/segmentation/Lung.py`.
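
As an illustration only, the sketch below shows one way that property could be overridden without editing `Lung.py`, by deriving a config class; the class name `LungLungsOnly` and the choice of structures are hypothetical.

```python
# Hedged sketch: restrict largest-connected-component post-processing to the lungs.
from typing import Any

from InnerEye.ML.configs.segmentation.Lung import Lung


class LungLungsOnly(Lung):
    """Lung config whose connected-component filter skips the heart."""

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        # Default is ["lung_l", "lung_r", "heart"]; keep only the two lungs.
        self.largest_connected_component_foreground_classes = ["lung_l", "lung_r"]
```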

## Model Card

### Model Details

- Organisation: Biomedical Imaging Team at Microsoft Research, Cambridge UK.
- Model date: 31st October 2022.
- Model version: 1.0.
- Model type: ensemble of 3D UNet models. Training details are as described in [this paper](https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2773292).
- Training details: 5-fold ensemble model, trained on the [LCTSC 2017 dataset](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=24284539) (described in detail below).
- License: The model is released under MIT license as described [here](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/LICENSE).
- Contact: innereyeinfo@microsoft.com.

### Terms of use

Please note that all models provided by InnerEye-DeepLearning are intended for research purposes only. You are responsible for the performance, the necessary testing, and if needed any regulatory clearance for any of the models produced by this toolbox.

### Limitations

The dataset used for training contains only 60 scans, 10 of which are withheld for testing. This limited amount of training data means that the model underperforms on the smaller structures (esophagus and spinal cord) and may not yet generalise well to data samples from outside the dataset.

Furthermore, the dataset description does not contain details on the population of patients used for creating the dataset. Therefore it is not possible to assess whether this model is suitable for use on a target population outside of the dataset.

### Intended Uses

This model is intended for research purposes only. It is intended to be used as a starting point for more challenging segmentation tasks, or for further training on more thorough and comprehensive segmentation datasets.

### Training Data

This model is trained on the [LCTSC 2017 dataset](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=24284539). For a detailed description of this data, including the contouring guidelines, see [this page](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=24284539#242845396723d79f9909442996e4dd0af5e56a30).

The following steps were carried out to create the dataset used for training this model:

1. Download the DICOM dataset from the above LCTSC 2017 link.
1. Use the [InnerEye-CreateDataset tool](https://github.com/microsoft/InnerEye-CreateDataset) to run the following command on the data (a quick sanity check for the converted volumes is sketched after this list):

```shell
.\InnerEye.CreateDataset.Runner.exe dataset --datasetRootDirectory=<path_to_DICOM_data> --niftiDatasetDirectory=lung_nifti --dicomDatasetDirectory=LCTSC --geoNorm 1 1 3 --groundTruthDescendingPriority esophagus spinalcord lung_r lung_l heart
```

1. Upload and register the NIfTI dataset to Azure by following the [dataset creation](creating_dataset.md) guide.
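
As an optional sanity check (not part of the original instructions), and assuming `--geoNorm 1 1 3` requests resampling to a 1 x 1 x 3 mm voxel spacing, the converted volumes can be inspected before uploading. The file path below is hypothetical and depends on your output layout.

```python
# Hedged sketch: check the voxel spacing of one converted NIfTI volume with nibabel.
import nibabel as nib

img = nib.load("lung_nifti/1/ct.nii.gz")  # made-up path; use any converted CT volume
print("Voxel spacing (mm):", img.header.get_zooms())  # expected to be close to (1.0, 1.0, 3.0)
```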

### Metrics

Metrics for the withheld test data (the first 10 scans in the dataset) can be seen in the following table:

| Structure | count | DiceNumeric_mean | DiceNumeric_std | DiceNumeric_min | DiceNumeric_max | HausdorffDistance_mm_mean | HausdorffDistance_mm_std | HausdorffDistance_mm_min | HausdorffDistance_mm_max | MeanDistance_mm_mean | MeanDistance_mm_std | MeanDistance_mm_min | MeanDistance_mm_max |
|---------------|---------|------------------|-----------------|-----------------|-----------------|---------------------------|--------------------------|--------------------------|--------------------------|----------------------|---------------------|---------------------|---------------------|
| lung_l | 10 | 0.984 | 0.009 | 0.958 | 0.990 | 11.642 | 4.868 | 6.558 | 19.221 | 0.344 | 0.266 | 0.167 | 1.027 |
| lung_r | 10 | 0.983 | 0.009 | 0.960 | 0.991 | 10.764 | 3.307 | 6.325 | 16.156 | 0.345 | 0.200 | 0.160 | 0.797 |
| spinalcord | 10 | 0.860 | 0.050 | 0.756 | 0.912 | 27.213 | 22.015 | 12.000 | 81.398 | 1.750 | 2.167 | 0.552 | 7.209 |
| heart | 10 | 0.935 | 0.015 | 0.908 | 0.953 | 17.550 | 14.796 | 9.000 | 17.550 | 2.022 | 0.661 | 1.456 | 3.299 |
| esophagus | 10 | 0.728 | 0.128 | 0.509 | 0.891 | 23.503 | 25.679 | 6.173 | 72.008 | 3.207 | 4.333 | 0.409 | 13.991 |
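
For readers unfamiliar with the Dice columns, the sketch below (not the InnerEye implementation) shows how a Dice overlap score between a predicted and a ground-truth binary mask can be computed with `numpy`.

```python
# Hedged sketch: Dice overlap between two binary masks, as reported in the
# DiceNumeric columns above. Not the InnerEye implementation.
import numpy as np


def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice = 2 * |A & B| / (|A| + |B|) for boolean masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denominator = pred.sum() + truth.sum()
    if denominator == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denominator


# Toy example with two overlapping 3D masks:
a = np.zeros((4, 4, 4), dtype=bool)
a[1:3, 1:3, 1:3] = True
b = np.zeros((4, 4, 4), dtype=bool)
b[1:4, 1:3, 1:3] = True
print(round(dice_score(a, b), 3))  # 0.8
```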
100 changes: 100 additions & 0 deletions docs/source/rst/models.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
Pre-Trained Models
==================

InnerEye-DeepLearning currently has two pre-trained models available for use
in segmentation tasks. This page describes how to set up and use these models.
For specific information on the models, please refer to the relevant model card:

.. toctree::
:maxdepth: 1

../md/hippocampus_model.md
../md/lung_model.md


Terms of use
------------

Please note that all models provided by InnerEye-DeepLearning are intended for
research purposes only. You are responsible for the performance, the necessary testing,
and if needed any regulatory clearance for any of the models produced by this toolbox.

Usage
-----

The following instructions assume you have completed the preceding setup
steps in the `InnerEye
README <https://github.com/microsoft/InnerEye-DeepLearning/>`__, in
particular, `Setting up Azure Machine Learning <setting_up_aml.md>`__.

Create an AzureML Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~

To evaluate pre-trained models on your own data, you will first need to register
an `Azure ML
Dataset <https://docs.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-register-datasets>`__.
You can follow the instructions for `creating
datasets <creating_dataset.md>`__ in order to do this.
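
As an illustrative sketch only (not an official InnerEye script), the snippet
below uses the ``azureml-core`` SDK to upload a local dataset folder to the
workspace's default datastore and register it as a ``FileDataset``. The folder
and dataset names are placeholders; the exact layout InnerEye expects is
described in the dataset creation guide linked above.

.. code:: python

   # Hedged sketch with placeholder names: upload and register a dataset folder.
   from azureml.core import Dataset, Workspace

   ws = Workspace.from_config()  # assumes a config.json for your AzureML workspace
   datastore = ws.get_default_datastore()

   # Upload the folder and register it under a name that can be used as the dataset ID.
   file_dataset = Dataset.File.upload_directory(
       src_dir="lung_nifti",
       target=(datastore, "lung_nifti"),
       show_progress=True,
   )
   file_dataset.register(workspace=ws, name="lung_nifti", create_new_version=True)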

Downloading the models
~~~~~~~~~~~~~~~~~~~~~~

The saved weights for each model can be found in their respective :ref:`model cards<Pre-Trained Models>`.
You will need to download the weights and source code for the model that you wish to use.

Registering a model in Azure ML
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To evaluate the model in Azure ML, you must first `register an Azure ML
Model <https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#remarks>`__.
To register the pre-trained model in your AML Workspace, unpack the
source code downloaded in the previous step and follow InnerEye's
`instructions to upload models to Azure ML <move_model.md>`__.

Run the following from a folder that contains both the ``ENVIRONMENT/``
and ``MODEL/`` folders (these exist inside the downloaded model files):

.. code:: shell

WORKSPACE="fill with your workspace name"
GROUP="fill with your resource group name"
SUBSCRIPTION="fill with your subscription ID"

python InnerEye/Scripts/move_model.py \
--action upload \
--path . \
--workspace_name $WORKSPACE \
--resource_group $GROUP \
--subscription_id $SUBSCRIPTION \
--model_id <Model Name>:<Model Version>

Evaluating the model
~~~~~~~~~~~~~~~~~~~~

You can evaluate the model either in Azure ML or locally using the
downloaded checkpoint files. These two scenarios are described in more
detail, along with instructions, in `testing an existing
model <building_models.md#testing-an-existing-model>`__.

For example, to evaluate the model on your Dataset in Azure ML, run the
following from within the directory ``*/MODEL/final_ensemble_model/``:

.. code:: shell

CLUSTER="fill with your cluster name"
DATASET_ID="fill with your dataset name"

python InnerEye/ML/runner.py \
--azure_dataset_id $DATASET_ID \
--model <Model Name> \
--model_id <Model Name>:<Model Version> \
--experiment_name <experiment name> \
--azureml \
--no-train \
--cluster $CLUSTER \
--restrict_subjects=0,0,+

Deploy with InnerEye Gateway
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To deploy a model using the InnerEye Gateway, see the instructions in the `Gateway Repo <https://github.com/microsoft/InnerEye-Gateway/>`__.