Kevin branch (#34)
* ops using segmentation masks now accept Segmentation

* simplified loader/writer specification in IO ops

* simplified loader/writer specification in IO ops

* updated imports in pipeline

* improved ops repr

* minor fixes in IO and segmentation

* matching roi names in StructureSet.to_segmentation is now case-insensitive

* fixed segmentation method call in some functional ops

* fixed typo in pipeline warning message

* StructureSet now correctly handles ROIs with no contour data

* fixed error when trying to detect CSV delimiter in MetadataWriter

* MetadataWriter can now remove existing files from previous runs

* MetadataWriter can now remove existing files from previous runs

* Segmentation object now correctly handles indexing

* fixed segmentation mask handling in image statistics computation

* removed old structure set code

* removed old structure set code

* fixed label handling in image statistics computation

* fixed label handling in image statistics computation

* [In Progress] Adding documentation to ops.py

* added support for dynamic path specification in writers

* image CSV loader now accepts pandas DataFrames

* fixed indentation error

* fixed index column handling in ImageCSVLoader

* Revert "fixed index column handling in ImageCSVLoader"

This reverts commit fab58d2.

* fixed index column handling in ImageCSVLoader

* fixed index column handling in ImageCSVLoader

* ImageCSVLoader now returns correct keys

* ImageCSVLoader now correctly handles globbing in paths

* more informative exception handling in pipeline

* pipeline can now warn on error instead of raising exception

* [Docs] Cleaned up extra whitespace in ops.py

* Added convenience method to get all ops in a pipeline

* Fix empty array check in structure set conversion

* Fix segmentation to label image conversion

* Update README.md

* Fix spurious mixin in ops

* Fix type error when passing Numpy array to rotate

* [Bugfix] Fix slice number rounding when generating a mask from RTSTRUCT

* Fix crop size issue in crop_to_bounding_box

* Make imports easier

* Improved handling of regular expressions in structure set

* Fix slice matching issue when generating binary mask from RTStruct

* Fix binary mask generation from RTStruct when missing labels are present

* little updates

* added seg.nrrd compatibility

* supports RTSTRUCT processing without roi_names

* Fixed contour (RTSTRUCT) handling to rasterize multiple contours better.

* Fixed contour (RTSTRUCT) handling to rasterize multiple contours better.

* fixed structureset.py:148

* RT dose pipeline completed and tested. Error in segmentation not resolved

* PET readability added

* PET readability added

* Head-Neck-PET-CT pipeline + `read_dicom_auto`

* final pipeline working for doses and PET

* corrected the view

* reading in sitk format from beginning

* Completed PET overlay

* Modified RT dose, pipeline working for PET_CT quebec dataset

* Added DataGraph, now fetching subset of dataset is supported using graph query

* added crawl.py

* Introduced changes in DataGraph. Made the pipeline fully general. Made wrapper classes

* Rectified some bugs, added reference to rtplan in the crawler

* Now missing RTDOSE references are completed by RTPLAN

* small change

* Fixed dataset.csv writing issues

* major refactoring

* more refactoring

* Update radcure_simple.py

* Update loaders.py

* Update pet.py

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

* ignore .DS_Store

* Update README.md

* quick README under ops

* tcia_sample

* example bash script with path parsing, fixed reference_frame

* housekeeping

* fixed dataset.csv

* sanity changes before purging dev-20200414

* Added test autopipeline and modalities, solved some autopipeline bugs, read_dicom_series and pet now supports series_id

* PT/RTDOSE metadata to csv

* fixed some bugs in autopipeline.py

* now the pipeline saves on exit

* deleted data

* now checks for existing subject id

* uncommented one line pytest

* uncommented one line pytest

* self.existing, dataset.csv fixed (#10)

* Added test autopipeline and modalities, solved some autopipeline bugs, read_dicom_series and pet now supports series_id

* PT/RTDOSE metadata to csv

* fixed some bugs in autopipeline.py

* now the pipeline saves on exit

* deleted data

* now checks for existing subject id

* uncommented one line pytest

* uncommented one line pytest

Co-authored-by: Vishwesh <vishweshramanathan@gmail.com>

* Added dataset class which can load from nrrds or directly from the dataset and convert to pytorch dataset

* Create build-ci.yml

* Update build-ci.yml

* Update requirements.txt

* bug fixes_1.0

* test and autopipe fixed

* bug fixes 2

* bug fixes 2

* added visualizations and some more bug fixes

* Create manual-test.yml

* Update build-ci.yml

* Update manual-test.yml

* PR tests - macos/ubuntu failing (#13)

* Added test autopipeline and modalities, solved some autopipeline bugs, read_dicom_series and pet now supports series_id

* PT/RTDOSE metadata to csv

* fixed some bugs in autopipeline.py

* now the pipeline saves on exit

* deleted data

* now checks for existing subject id

* uncommented one line pytest

* uncommented one line pytest

* Added dataset class which can load from nrrds or directly from the dataset and convert to pytorch dataset

* bug fixes_1.0

* test and autopipe fixed

* bug fixes 2

* fixed pipeline tests

* clean tests

* added workflow

* yml

* yml

* matplotlib

* trying other patient to avoid memoryerror

* set roi_names to avoid memoryerror

* cave

* indents

* Update manual-test.yml

Co-authored-by: Vishwesh <vishweshramanathan@gmail.com>

* fixed bugs regarding multiple connections, saving of metadata and loading of metadata

* small bug fix

* added demo.py

* Ready for

* Create main.yml (#15)

* Changed dataset class returns

* fix conflicts

* fixed test autopipe

* merging new features (#16)

* Added test autopipeline and modalities, solved some autopipeline bugs, read_dicom_series and pet now supports series_id

* PT/RTDOSE metadata to csv

* fixed some bugs in autopipeline.py

* now the pipeline saves on exit

* deleted data

* now checks for existing subject id

* uncommented one line pytest

* uncommented one line pytest

* Added dataset class which can load from nrrds or directly from the dataset and convert to pytorch dataset

* bug fixes_1.0

* test and autopipe fixed

* bug fixes 2

* bug fixes 2

* added visualizations and some more bug fixes

* fixed bugs regarding multiple connections, saving of metadata and loading of metadata

* small bug fix

* added demo.py

* Changed dataset class returns

* fix conflicts

* fixed test autopipe

Co-authored-by: Vishwesh <vishweshramanathan@gmail.com>

* fix path backslash issues

* fix path backslashes (#17)

* Added test autopipeline and modalities, solved some autopipeline bugs, read_dicom_series and pet now supports series_id

* PT/RTDOSE metadata to csv

* fixed some bugs in autopipeline.py

* now the pipeline saves on exit

* deleted data

* now checks for existing subject id

* uncommented one line pytest

* uncommented one line pytest

* Added dataset class which can load from nrrds or directly from the dataset and convert to pytorch dataset

* bug fixes_1.0

* test and autopipe fixed

* bug fixes 2

* bug fixes 2

* added visualizations and some more bug fixes

* fixed bugs regarding multiple connections, saving of metadata and loading of metadata

* small bug fix

* added demo.py

* Changed dataset class returns

* fix conflicts

* fixed test autopipe

* fix path backslash issues

Co-authored-by: Vishwesh <vishweshramanathan@gmail.com>

* Update main.yml

* Update main.yml

* Update README.md

* Update main.yml (#18)

* Update main.yml

* Update requirements.txt

* Update main.yml

* Update main.yml

* build binary/dist

* removed linter

* Update setup.py

* Update README.md

* Update README.md (#19)

* Update README.md

* added tests for Dataset class

* added tests for Dataset class

* Create LICENSE (#20)

* Create LICENSE

* Update setup.py

* Seg.nrrd quick fix

* Minor bug fixes

* test fix

* Added demo

* Update setup.py (#23)

* updated README

* Update README.md (#24)

* preliminary MRI functionality (MR-RTSTRUCT pairs)

* Skim2257 quick fix (#26)

* Updated crawler to force String on all meta fields

* Update setup.py

* first commit

* removed test files, changed gitignore

* changed file directory structure for imageautooutput

* split mask up into each contour

* change kwargs in put for basesubjectwriter

* still kinda failing...

* brought back basesubjectwriter

* .imgtools directory

* changed absolute paths to relative paths

* changed os.path.join to pathlib.Path.as_posix()

* removed unused cv2 import

* removed cv2 import

* append is deprecated, changed to concat

* debug print

* removed debug print

* added sparse mask class and generating function

* testing out sparse mask

* funky NaN problem

* commented sparse mask

* overwrite all subjects

* space

* overwrite false

* metadata stuffs

* metadata in dataset.csv

* added modalities, num rtstruct and pixel size to metadata

* metadata bugfix

* a

* fixed wrong variable names for metadata stuff

* fixed pathlib float error

* relative paths and output folder paths for dataset.csv

* put metadata stuff into a util file

* deal with empty metadata

* messing around with sparse mask

* tried to save sparse mask, did some stuff with nnunet output format

* compliant with nnunet directory structure

* CLI Interface, argparse moved to utils

* fixed formatting problems with folder names

* train test split

* train size and random state optional

* merge conflicts

* changed warnings to not interrupt

* changed to warnings.warn for generate_sparse_mask

* merge

* resolving conflicts?

* args

* changes for roi names as a dict

* added regex dictionary option for non nnunet runs

* sparse mask global labelling for contour name: index

* got rid of file_name_convention stuff

* conflicts resolved

* yaml thing

* added list capabilities for the roi names dictionary

* dataset.json for nnunet

* CLI "autonew"

* changed all mutable defaults to None

* moved autotest changes to autopipeline and added a few CLI args

* getting ready for merge to live

* test_components, test_modalities works with new AutoPipeline

* overwrite changes and error fix for nan paths again

* fixed if statement

* joblib parallel

* warnings for missing patients

* summary messages

* updated, passing tests. Updated version to 0.4

* update test

* yaml path cli

* yaml error check

* pandas error

* Fixed read_dicom_auto

* skips series check if series is None

* updated readme to reflect v0.4 changes

* updated readme

* minor change

* remove .idea

* remote .idea

* git ignore

* refactor nnunetutils to nnunet

Co-authored-by: przebieglykaziu <mkazmierski.poznan@gmail.com>
Co-authored-by: Minoru Nakano <minoru.nakano@gmail.com>
Co-authored-by: Sejin Kim <sejinkim@uhnslurmbuildbox.uhnh4h.cluster>
Co-authored-by: Sejin Kim <sejinkim@node88.uhnh4h.cluster>
Co-authored-by: Sejin Kim <sejinkim@node40.uhnh4h.cluster>
Co-authored-by: Sejin Kim <sejinkim@node97.uhnh4h.cluster>
Co-authored-by: Sejin Kim <sejinkim@h4huhnlogin2.uhnh4h.cluster>
Co-authored-by: Vishwesh Ramanathan <vishwesh@Vishweshs-Air.uhn.ca>
Co-authored-by: Sejin Kim <sejinkim@node38.uhnh4h.cluster>
Co-authored-by: Vishwesh Ramanathan <ramanav@node49.uhnh4h.cluster>
Co-authored-by: Vishwesh Ramanathan <ramanav@node53.uhnh4h.cluster>
Co-authored-by: Vishwesh <vishweshramanathan@gmail.com>
Co-authored-by: Benjamin Haibe-Kains <bhaibeka@bhaibek1.uhn.ca>
Co-authored-by: Sejin Kim <sejinkim@node89.uhnh4h.cluster>
Co-authored-by: Sejin Kim <hello@sejin.kim>
Co-authored-by: Sejin Kim <40668167+skim2257@users.noreply.github.com>
Co-authored-by: Vishwesh Ramanathan <vishwesh@Vishweshs-MacBook-Air.local>
Co-authored-by: Kevin Qu <kqu@uhnslurmbuildbox.uhnh4h.cluster>
Co-authored-by: Kevin Qu <kqu@node90.uhnh4h.cluster>
20 people authored Jun 25, 2022
1 parent 0cce0ee commit 66a85e2
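
Several items in the commit message above concern ROI name matching (case-insensitive matching, a regex dictionary option, and list values in the ROI names dictionary). The following is a minimal sketch of that idea using only the standard library. The function name `match_roi_names`, the dictionary shape, and the example ROI names are assumptions made for illustration; they are not taken from the package's code.

```python
import re
from typing import Dict, List, Union

# Hypothetical shape: output label -> one regex pattern or a list of patterns.
RoiPatterns = Dict[str, Union[str, List[str]]]


def match_roi_names(available: List[str], roi_patterns: RoiPatterns) -> Dict[str, List[str]]:
    """For each label, return the available ROI names matching any of its patterns."""
    matches: Dict[str, List[str]] = {}
    for label, patterns in roi_patterns.items():
        # Accept either a single pattern or a list of patterns.
        if isinstance(patterns, str):
            patterns = [patterns]
        # re.IGNORECASE makes the match case-insensitive.
        compiled = [re.compile(p, flags=re.IGNORECASE) for p in patterns]
        matches[label] = [
            name for name in available
            if any(rx.fullmatch(name) for rx in compiled)
        ]
    return matches


# Illustrative ROI names, as they might appear in an RTSTRUCT.
available = ["GTV", "gtv_node", "Brainstem", "SpinalCord"]
patterns = {"GTV": "gtv.*", "CORD": ["spinal[ _]?cord", "cord"]}
result = match_roi_names(available, patterns)
print(result)  # {'GTV': ['GTV', 'gtv_node'], 'CORD': ['SpinalCord']}
```

A YAML file passed on the command line would presumably be loaded into a dictionary of this shape, but the exact schema is not shown in this commit.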
Showing 31 changed files with 2,016 additions and 440 deletions.
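
The commit message mentions replacing `os.path.join` with `pathlib.Path.as_posix()` and fixing path backslash issues. As a rough illustration of why that helps, the sketch below compares how the same relative path is rendered on Windows and POSIX systems. The path components are hypothetical, and the pure path classes are used only so that the example behaves the same on any platform.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical relative output path for one subject's image.
parts = ("subject_001", "CT", "CT.nrrd")

windows_path = PureWindowsPath(*parts)
posix_path = PurePosixPath(*parts)

# str() uses the platform's separator, so a path written on Windows
# contains backslashes.
windows_str = str(windows_path)

# as_posix() always uses forward slashes, so a path stored in a CSV
# file looks the same regardless of where it was written.
windows_as_posix = windows_path.as_posix()
posix_as_posix = posix_path.as_posix()

print(windows_str)       # subject_001\CT\CT.nrrd
print(windows_as_posix)  # subject_001/CT/CT.nrrd
print(posix_as_posix)    # subject_001/CT/CT.nrrd
```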
9 changes: 9 additions & 0 deletions .gitignore
@@ -1,6 +1,15 @@
# test scripts
examples/adsf.py
.idea

#vscode files
/.idea

# data
data
examples/data/tcia_n*
scratch.ipynb
tests/temp

# macOS
.DS_Store
49 changes: 30 additions & 19 deletions README.md
@@ -1,11 +1,17 @@
# Med-Imagetools: Transparent and Reproducible Medical Image Processing Pipelines in Python

<!--- These are examples. See https://shields.io for others or to customize this set of shields. You might want to include dependencies, project status and licence info here --->
![GitHub repo size](https://img.shields.io/github/repo-size/bhklab/med-imagetools)
![GitHub contributors](https://img.shields.io/github/contributors/bhklab/med-imagetools)
![GitHub stars](https://img.shields.io/github/stars/bhklab/med-imagetools?style=social)
![GitHub forks](https://img.shields.io/github/forks/bhklab/med-imagetools?style=social)

+ ### Latest Updates (v0.4) - June 24th, 2022
+ New features include:
+ * AutoPipeline CLI
+ * nnU-Net compatibility mode (--nnunet)
+ * Built-in train/test split for both normal/nnU-Net modes
+ * Random seed for reproducible train/test splits
+ * Region of interest (ROI) yaml dictionary intake for RTSTRUCT processing
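
The feature list above pairs a built-in train/test split with a random seed. A minimal sketch of how a seeded split can be made reproducible is shown below, using only the standard library. The function name `split_subjects`, its defaults, and the subject IDs are assumptions for illustration; this commit does not show how the package implements its split.

```python
import random
from typing import List, Sequence, Tuple


def split_subjects(
    subject_ids: Sequence[str],
    train_size: float = 0.8,
    random_state: int = 42,
) -> Tuple[List[str], List[str]]:
    """Split subject IDs into train and test lists, reproducibly for a given seed."""
    if not 0.0 < train_size < 1.0:
        raise ValueError("train_size must be strictly between 0 and 1")
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(subject_ids)
    # A dedicated Random instance avoids touching the global random state.
    random.Random(random_state).shuffle(shuffled)
    n_train = round(len(shuffled) * train_size)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical subject IDs.
subjects = [f"subject_{i:03d}" for i in range(10)]
train_ids, test_ids = split_subjects(subjects, train_size=0.8, random_state=7)
print(len(train_ids), len(test_ids))  # 8 2
```

Calling the function again with the same seed gives the same partition, which is the property the seed is meant to provide.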

Med-Imagetools is a Python package that transforms messy medical dataset folders into a deep-learning-ready format in a few lines of code. It not only processes DICOMs of different modalities (such as CT, PET, RTDOSE, and RTSTRUCT) but also converts them into a deep-learning-ready, subject-based format that takes the dependencies between these modalities into account.

## Introduction
@@ -38,27 +44,21 @@ pip install -e git+https://github.com/bhklab/med-imagetools.git
```
This will install the package in editable mode, so that the installed package will update when the code is changed.

- ## Demo
- These Google Colab notebooks introduce the main functionality of med-imagetools. More information can be found [here](https://github.com/bhklab/med-imagetools/blob/master/examples/README.md)
- #### Tutorial 1: Forming Dataset with med-imagetools Autopipeline
-
- [![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/skim2257/tcia_samples/blob/main/notebooks/Tutorial_1_Forming_Dataset_with_Med_Imagetools.ipynb)
-
- #### Tutorial 2: Machine Learning with med-imagetools and torchio
-
- [![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/skim2257/tcia_samples/blob/main/notebooks/Tutorial_2_Machine_Learning_with_Med_Imagetools_and_torchio.ipynb)
-
## Getting Started
Med-Imagetools takes a two-step approach to turn a messy raw medical dataset into an ML-ready dataset.
1. ***Autopipeline***: Crawls the raw dataset, forms a network, and performs a graph query based on the user-defined modalities. The relevant DICOMs are processed and saved as nrrds
```
- python imgtools/autopipeline.py\
- [INPUT DIRECTORY] \
- [OUTPUT DIRECTORY] \
- --modalities [str: CT,RTSTRUCT,PT] \
- --spacing [Tuple: (int,int,int)]\
- --n_jobs [int]\
- --visualize [bool: True/False]\
+ autopipeline\
+ [INPUT DIRECTORY] \
+ [OUTPUT DIRECTORY] \
+ --modalities [str: CT,RTSTRUCT,PT] \
+ --spacing [Tuple: (int,int,int)]\
+ --n_jobs [int]\
+ --visualize [flag]\
+ --nnunet [flag]\
+ --train_size [float]\
+ --random_state [int]\
+ --roi_yaml_path [str]
```
2. ***class Dataset***: This class converts processed nrrds to torchio subjects, which can easily be converted to a torch dataset
```
@@ -69,12 +69,23 @@ Med-Imagetools takes two step approch to turn messy medical raw dataset to ML re
data_loader = torch.utils.data.DataLoader(data_set, batch_size=4, shuffle=True, num_workers=4)
```
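
The flags listed for the `autopipeline` command in step 1 could be parsed along the following lines. This is a hypothetical `argparse` sketch, not the package's actual parser: the defaults, the help strings, and the choice to pass `--spacing` as three space-separated integers are all assumptions, since the README only gives the flag names and types.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags named in the README (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="autopipeline")
    parser.add_argument("input_directory", help="folder containing the raw DICOM dataset")
    parser.add_argument("output_directory", help="folder for the processed output")
    parser.add_argument("--modalities", type=str, default="CT",
                        help="comma-separated list, e.g. CT,RTSTRUCT,PT")
    # Assumption: spacing is given as three separate integers.
    parser.add_argument("--spacing", type=int, nargs=3, default=None)
    parser.add_argument("--n_jobs", type=int, default=1)
    # In v0.4 these are plain flags rather than True/False values.
    parser.add_argument("--visualize", action="store_true")
    parser.add_argument("--nnunet", action="store_true")
    parser.add_argument("--train_size", type=float, default=1.0)
    parser.add_argument("--random_state", type=int, default=42)
    parser.add_argument("--roi_yaml_path", type=str, default=None)
    return parser


# Hypothetical invocation, passed as a list instead of reading sys.argv.
args = build_parser().parse_args([
    "raw_dicoms", "processed",
    "--modalities", "CT,RTSTRUCT",
    "--spacing", "1", "1", "3",
    "--nnunet",
    "--train_size", "0.8",
    "--random_state", "7",
])
modalities = args.modalities.split(",")
print(modalities, args.spacing, args.nnunet, args.visualize)
```

Using `store_true` for `--visualize` and `--nnunet` matches the README's change from `[bool: True/False]` to `[flag]`: the option is enabled by its presence and takes no value.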

+ ## Demo (Incompatible with v0.4)
+ These Google Colab notebooks introduce the main functionality of med-imagetools. More information can be found [here](https://github.com/bhklab/med-imagetools/blob/master/examples/README.md)
+ #### Tutorial 1: Forming Dataset with med-imagetools Autopipeline
+
+ [![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/skim2257/tcia_samples/blob/main/notebooks/Tutorial_1_Forming_Dataset_with_Med_Imagetools.ipynb)
+
+ #### Tutorial 2: Machine Learning with med-imagetools and torchio
+
+ [![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/skim2257/tcia_samples/blob/main/notebooks/Tutorial_2_Machine_Learning_with_Med_Imagetools_and_torchio.ipynb)
+
## Contributors

Thanks to the following people who have contributed to this project:

* [@mkazmier](https://github.com/mkazmier)
* [@skim2257](https://github.com/skim2257)
+ * [@fishingguy456](https://github.com/fishingguy456)
* [@Vishwesh4](https://github.com/Vishwesh4)
* [@mnakano](https://github.com/mnakano)
