
TomoLIBRA: Deep Learning for Volumetric Breast Density Estimation Using 3D DBT Reconstructed Slices

Why use this?

TomoLIBRA is the first deep learning model that can estimate volumetric breast density (VBD) and absolute dense volume (ADV) from 3D DBT reconstructed slices. Using the GaNDLF framework, we trained a convolutional neural network (CNN) that performs dense breast tissue segmentation on 3D DBT reconstructed slices, which are more commonly archived in clinical centers than the "raw" or "for processing" images. We envision that this model can be used for large-scale retrospective breast density assessments and, eventually, prospective breast cancer risk assessment.

Documentation

This project is based on the GaNDLF framework (https://mlcommons.github.io/GaNDLF/), using a fork of the repository at commit e9d92ae; see below for more information on GaNDLF specifically. We used it to perform DL model development, inference, and the case-control analysis. In this repository, we include the relevant code to perform segmentation inference on a sample of images from the case-control analysis.

For more information on GaNDLF, please see below or go to https://mlcommons.github.io/GaNDLF/. Running this program requires a GPU with CUDA 11.8 or CUDA 12.2. Given the large size of the DBT images, it is unlikely that model training or inference can be performed on CPU-only hardware.
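As a quick sanity check before running anything below, the following snippet (a minimal sketch, not part of this repository) confirms that PyTorch can see a CUDA-capable GPU:

# Minimal sanity check (illustrative only): confirm PyTorch can see a CUDA-capable GPU.
import torch

if torch.cuda.is_available():
    print(f"CUDA {torch.version.cuda} available on {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA device found; DBT training/inference on CPU is unlikely to be practical.")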

Please cite our work as follows:

Ahluwalia VS, Doiphode N, Mankowski WC, Cohen EA, Pati S, Pantalone L, Bakas S, Brooks A, Vachon CM, Conant EF, Gastounioti A, Kontos D. Volumetric Breast Density Estimation From Three-Dimensional Reconstructed Digital Breast Tomosynthesis Images Using Deep Learning. JCO Clin Cancer Inform 8, e2400103 (2024). DOI: 10.1200/CCI.24.00103

Contents of Repository

1. output/: Inference output goes here. The trained model weights (imagenet_unet_best.pth.tar) are also stored in this folder.
2. config_file.yaml: Configuration file specifying necessary parameters for GaNDLF
3. sample_inference.csv: Sample CSV file used as input for GaNDLF
4. sample_data/: Contains three sample 3D DBT reconstructed image volumes from the case-control analysis. Note: every image volume's file name must contain one of the strings 'RCC', 'LCC', 'LMLO', or 'RMLO' specifying image laterality and view; these strings are case-sensitive. If none of the four strings is present, image preprocessing may not work as intended (see the sketch after this list).
5. preprocess_images.py: Preprocesses 3D reconstructed DBT image volumes so that they can be used with the DL algorithm. All DBT images must be preprocessed using this file before they can be sent to GaNDLF for training and/or inference.
6. calculate_vbd.py: Calculates VBD (%) from the DL algorithm's segmentation predictions and writes the results to a CSV file.
7. GaNDLF/: Subdirectory containing code necessary to run GaNDLF-based training and inference
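
As noted for sample_data/ above, preprocessing relies on the laterality/view string embedded in each file name. The sketch below is a hypothetical pre-flight check (it is not part of preprocess_images.py) that flags files missing one of the required strings:

# Illustrative pre-flight check (not part of the repository): verify each image volume
# carries one of the case-sensitive laterality/view strings required by preprocessing.
import os
import sys
from typing import Optional

REQUIRED_VIEWS = ("RCC", "LCC", "LMLO", "RMLO")

def find_view(filename: str) -> Optional[str]:
    # Return the first required view string found in the file name, or None.
    return next((view for view in REQUIRED_VIEWS if view in filename), None)

for name in os.listdir("sample_data"):
    if name.endswith(".nii.gz") and find_view(name) is None:
        print(f"Warning: {name} has no RCC/LCC/LMLO/RMLO tag; preprocessing may fail.", file=sys.stderr)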

Installation and Running Inference on Sample Data

(base) $> module unload cuda                            # Unload current cuda module
(base) $> module load cuda/12.2                         # Load compatible cuda module (can also be 11.8)
(base) $> cd GaNDLF                                     # Move to subdirectory
(base) $> conda create -n venv_gandlf python=3.9 -y     # Create virtual environment
(base) $> conda activate venv_gandlf                    # Activate virtual environment
(venv_gandlf) $> pip install -e .                       # Install dependencies
(venv_gandlf) $> gandlf verify-install                  # Verify installation success
(venv_gandlf) $> cd ..                                  # Move to parent directory
# Preprocess reconstructed DBT images (inputs must be in nii.gz format); this step writes a CSV file named 'padded_data_inference.csv'
(venv_gandlf) $> python preprocess_images.py 'sample_data' 'postprocessing_sample_data' 'inference'     
# Run inference using DL algorithm with sample data        
(venv_gandlf) $> gandlf_run -c config_file.yaml -i 'padded_data_inference.csv' -m output/ -t False -d cuda      
# Replace 'orig_path' with the timestamped subdirectory that GaNDLF inference creates under 'output/' (e.g., '/output/20241009_174458'); do not run this command until the previous command has finished executing.
(venv_gandlf) $> python calculate_vbd.py 'padded_data_inference.csv' 'vbd_output.csv' 'orig_path'               
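
For reference, VBD is the percentage of the segmented breast volume occupied by dense tissue, and ADV is the dense volume itself. The sketch below illustrates these quantities from hypothetical dense-tissue and whole-breast masks (using nibabel as an assumed NIfTI reader); it is not the implementation in calculate_vbd.py:

# Illustrative calculation of ADV and VBD (not the repository's implementation).
# Assumes binary NIfTI masks where nonzero voxels mark tissue; file names are hypothetical.
import nibabel as nib
import numpy as np

dense_img = nib.load("dense_mask.nii.gz")     # predicted dense-tissue segmentation
breast_img = nib.load("breast_mask.nii.gz")   # whole-breast mask
voxel_volume_mm3 = float(np.prod(dense_img.header.get_zooms()[:3]))  # mm^3 per voxel

dense_voxels = np.count_nonzero(dense_img.get_fdata())
breast_voxels = np.count_nonzero(breast_img.get_fdata())

adv_cm3 = dense_voxels * voxel_volume_mm3 / 1000.0   # absolute dense volume (cm^3)
vbd_percent = 100.0 * dense_voxels / breast_voxels   # volumetric breast density (%)
print(f"ADV = {adv_cm3:.1f} cm^3, VBD = {vbd_percent:.1f}%")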

GaNDLF (Adapted from GaNDLF Repository)


The Generally Nuanced Deep Learning Framework for segmentation, regression and classification.


Why use this?

  • Supports multiple
    • Deep Learning model architectures
    • Data dimensions (2D/3D)
    • Channels/images/sequences
    • Prediction classes
    • Domain modalities (i.e., Radiology Scans and Digitized Histopathology Tissue Sections)
    • Problem types (segmentation, regression, classification)
    • Multi-GPU (on same machine) training
  • Built-in
    • Nested cross-validation (and related combined statistics)
    • Support for parallel HPC-based computing
    • Support for training check-pointing
    • Support for Automatic mixed precision
  • Robust data augmentation, courtesy of TorchIO
  • Handles imbalanced classes (e.g., very small tumor in large organ)
  • Leverages robust open source software
  • No need to write any code to generate robust models

Citation

Please cite the following article for GaNDLF (full paper):

@article{pati2023gandlf,
    author={Pati, Sarthak and Thakur, Siddhesh P. and Hamamc{\i}, {\.{I}}brahim Ethem and Baid, Ujjwal and Baheti, Bhakti and Bhalerao, Megh and G{\"u}ley, Orhun and Mouchtaris, Sofia and Lang, David and Thermos, Spyridon and Gotkowski, Karol and Gonz{\'a}lez, Camila and Grenko, Caleb and Getka, Alexander and Edwards, Brandon and Sheller, Micah and Wu, Junwen and Karkada, Deepthi and Panchumarthy, Ravi and Ahluwalia, Vinayak and Zou, Chunrui and Bashyam, Vishnu and Li, Yuemeng and Haghighi, Babak and Chitalia, Rhea and Abousamra, Shahira and Kurc, Tahsin M. and Gastounioti, Aimilia and Er, Sezgin and Bergman, Mark and Saltz, Joel H. and Fan, Yong and Shah, Prashant and Mukhopadhyay, Anirban and Tsaftaris, Sotirios A. and Menze, Bjoern and Davatzikos, Christos and Kontos, Despina and Karargyris, Alexandros and Umeton, Renato and Mattson, Peter and Bakas, Spyridon},
    title={GaNDLF: the generally nuanced deep learning framework for scalable end-to-end clinical workflows},
    journal={Communications Engineering},
    year={2023},
    month={May},
    day={16},
    volume={2},
    number={1},
    pages={23},
    abstract={Deep Learning (DL) has the potential to optimize machine learning in both the scientific and clinical communities. However, greater expertise is required to develop DL algorithms, and the variability of implementations hinders their reproducibility, translation, and deployment. Here we present the community-driven Generally Nuanced Deep Learning Framework (GaNDLF), with the goal of lowering these barriers. GaNDLF makes the mechanism of DL development, training, and inference more stable, reproducible, interpretable, and scalable, without requiring an extensive technical background. GaNDLF aims to provide an end-to-end solution for all DL-related tasks in computational precision medicine. We demonstrate the ability of GaNDLF to analyze both radiology and histology images, with built-in support for k-fold cross-validation, data augmentation, multiple modalities and output classes. Our quantitative performance evaluation on numerous use cases, anatomies, and computational tasks supports GaNDLF as a robust application framework for deployment in clinical workflows.},
    issn={2731-3395},
    doi={10.1038/s44172-023-00066-3},
    url={https://doi.org/10.1038/s44172-023-00066-3}
}

Documentation

GaNDLF has extensive documentation, available at https://mlcommons.github.io/GaNDLF/.

Contributing

Please see the contributing guide for more information.

Weekly Meeting

The GaNDLF development team hosts a weekly meeting to discuss feature additions, issues, and general future directions. If you are interested in joining, please send us an email!

Disclaimer

  • The software has been designed for research purposes only and has neither been reviewed nor approved for clinical use by the Food and Drug Administration (FDA) or by any other federal/state agency.
  • This code (excluding dependent libraries) is governed by the Apache License, Version 2.0 provided in the LICENSE file unless otherwise specified.

Contact

For more information or any support, please post on the Discussions section.
