VinDr-Mammo: A large-scale benchmark dataset for computer-aided diagnosis in full-field digital mammography

Description

This repository provides the code used for de-identification and stratification of the VinDr-Mammo dataset, which can be downloaded via our project on PhysioNet. A Python script for visualizing DICOM images is also provided.

Installation

To install the required packages via pip, run:

pip install -r requirements.txt

De-identification

See the deidentification.py file for more details.

Data Stratification

Please refer to the stratification.py file and the split_data.ipynb notebook. You may need to change the GLOBAL_PATH and LOCAL_PATH variables in split_data.ipynb so that they point to your annotation files.
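
For readers unfamiliar with the idea, the following is a minimal sketch of a stratified split that keeps all images of one study in the same subset. It is a simplified, single-label illustration and not the code in stratification.py, which may use a different algorithm. The function name, the (study_id, label) record format, and the toy labels are assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_group_split(records, test_fraction=0.2, seed=0):
    """Split study IDs into train/test sets, stratified by label.

    `records` is an iterable of (study_id, label) pairs, one per image
    (hypothetical format). All images of a study share one subset.
    """
    # Assign one label per study: here, simply the first label seen.
    study_label = {}
    for study_id, label in records:
        study_label.setdefault(study_id, label)

    # Group study IDs by their label.
    by_label = defaultdict(list)
    for study_id, label in study_label.items():
        by_label[label].append(study_id)

    rng = random.Random(seed)
    train, test = set(), set()
    for label in sorted(by_label):
        ids = sorted(by_label[label])  # sort first so the shuffle is reproducible
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test.update(ids[:n_test])
        train.update(ids[n_test:])
    return train, test


# Toy data (hypothetical): 10 studies labelled "A", 5 labelled "B", 4 images each.
toy = [(f"s{i}", "A") for i in range(10) for _ in range(4)]
toy += [(f"t{i}", "B") for i in range(5) for _ in range(4)]
train_ids, test_ids = stratified_group_split(toy, test_fraction=0.2, seed=1)
```

Splitting at the study level rather than the image level avoids placing images of the same exam in both subsets.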

Visualization

Set the dicom_path variable in the visualize.py file to the DICOM file you want to view, then run:

python visualize.py

License

This source code is released under the Apache 2.0 License.

Citing

If you use the VinDr-Mammo dataset in your research, please cite it using the following BibTeX entry:

@article{Nguyen2022.03.07.22272009,
  author={Nguyen, Hieu T. 
    and Nguyen, Ha Q. 
    and Pham, Hieu H. 
    and Lam, Khanh 
    and Le, Linh T. 
    and Dao, Minh 
    and Vu, Van},
  title={VinDr-Mammo: A large-scale benchmark dataset for computer-aided diagnosis in full-field digital mammography},
  year={2022},
  doi={10.1101/2022.03.07.22272009},
  URL={https://www.medrxiv.org/content/early/2022/03/10/2022.03.07.22272009},
  journal={medRxiv}
}
