Preprocessors to remove DICOM masks and generate segmentations.
pip install opencv-python pydicom matplotlib pandas
usage: [-h] --dcm-dir DCM_DIR
--label-dir LABEL_DIR
--target-dir TARGET_DIR
--mode {inspector,mask,roi,spreadsheet,cvat,transform,classify,label}
[--filterable-csv-file FILTERABLE_CSV_FILE]
[--filterable-dataset-type {train,valid,test}]
[--overwrite-label-type {cvat}]
[--overwrite-label-file OVERWRITE_LABEL_FILE]
[--new-shape NEW_SHAPE]
[--crop-image CROP_IMAGE]
[--jobs JOBS]
This is a preprocessor that removes DICOM masks and generates segmentations,
along with their inspectors, masks and ROIs.
optional arguments:
-h, --help show this help message and exit
--dcm-dir DCM_DIR The DICOM root directory
--label-dir LABEL_DIR
The JSON labels root directory
--target-dir TARGET_DIR
The destination root directory for outputs
--mode {inspector,mask,roi,spreadsheet,cvat,transform,classify,label}
inspector Generate four-in-one images that compare the mask, overlay
and noise-eliminated image with the original image
mask Generate binary masks that will be used as the dataset
for segmentation models
roi Generate region-of-interest images that will be used
as the dataset for the classification model
spreadsheet Generate CSV files that contain encrypted
patient identifiers and their file names
cvat Generate an XML file that contains segmentation mask polygons
to be uploaded to CVAT
transform Generate the original dataset images with the necessary transformations applied
classify Generate region-of-interest images that will be used
as the dataset for the classification model, sorted by phase label
label Generate a CSV file that contains file names and their cancer phases
--filterable-csv-file FILTERABLE_CSV_FILE
The CSV file to be used for filtering broken datasets out
--filterable-dataset-type {train,valid,test}
The type of dataset source directory for querying filterable CSV file
A flag to keep rows marked as having an issue in the filterable CSV file
--overwrite-label-type {cvat}
The type of overridable label format to parse
cvat CVAT 1.1 XML annotation format
Pass 'annotations.xml' file to --overwrite-label-file argument
--overwrite-label-file OVERWRITE_LABEL_FILE
The label file to be used for overwriting dataset labels
--new-shape NEW_SHAPE
WxH. Resize the output image to the desired width and height, e.g. 224x224
--crop-image CROP_IMAGE
X:Y,W:H. Crop the output image to the desired rectangle, e.g. 90:0,480:480
--jobs JOBS Number of workers
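The `WxH` and `X:Y,W:H` value formats accepted by `--new-shape` and `--crop-image` can be illustrated with a minimal parsing sketch. The function names `parse_new_shape` and `parse_crop_image` and the `CropBox` type are hypothetical and are not taken from this repository; the preprocessor's own parsing may differ.

```python
import re
from typing import NamedTuple, Tuple


class CropBox(NamedTuple):
    """Hypothetical container for a crop rectangle."""
    x: int
    y: int
    width: int
    height: int


def parse_new_shape(value: str) -> Tuple[int, int]:
    """Parse a 'WxH' string such as '224x224' into (width, height)."""
    match = re.fullmatch(r"(\d+)x(\d+)", value.strip())
    if match is None:
        raise ValueError(f"expected WxH, got {value!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        raise ValueError("width and height must be positive")
    return width, height


def parse_crop_image(value: str) -> CropBox:
    """Parse an 'X:Y,W:H' string such as '90:0,480:480' into a CropBox."""
    match = re.fullmatch(r"(\d+):(\d+),(\d+):(\d+)", value.strip())
    if match is None:
        raise ValueError(f"expected X:Y,W:H, got {value!r}")
    x, y, width, height = (int(group) for group in match.groups())
    return CropBox(x, y, width, height)


if __name__ == "__main__":
    print(parse_new_shape("224x224"))
    print(parse_crop_image("90:0,480:480"))
```

With a crop box in hand, an image loaded as a NumPy array could be cropped with `image[box.y:box.y + box.height, box.x:box.x + box.width]`, assuming the usual row-major (height, width) layout.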
usage: [-h] --csv-file CSV_FILE
--source-dirs SOURCE_DIRS
--target-dir TARGET_DIR
This is an assigner that distributes dataset validation jobs fairly among assignees.
optional arguments:
-h, --help show this help message and exit
--csv-file CSV_FILE The assignees CSV file
--source-dirs SOURCE_DIRS
dir1,dir2,dir3,... The source directories to be assigned
--target-dir TARGET_DIR
The destination directory where the assigned directories will be placed
- *
- *.json
- *
- *.dcm
means the row has an issue, and the image will be omitted from the result.
A proper assignees CSV file is required to divide the dataset fairly.
- John
- 00000001
- *.jpg
- 00000004
- *.jpg
- James
- 00000002
- *.jpg
- 00000005
- *.jpg
- 00000006
- *.jpg
- Alice
- 00000003
- *.jpg
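One simple way to distribute source directories among assignees is round-robin assignment, sketched below. This is only an illustration under stated assumptions: the function name `assign_round_robin` is hypothetical, the assignee names are assumed to have already been read from the CSV file, and the assigner's actual rule may differ (in the layout above, James receives three directories, which plain round-robin would not produce).

```python
from itertools import cycle
from typing import Dict, List


def assign_round_robin(assignees: List[str], source_dirs: List[str]) -> Dict[str, List[str]]:
    """Hand out directories to assignees one at a time, in order."""
    if not assignees:
        raise ValueError("at least one assignee is required")
    assignment: Dict[str, List[str]] = {name: [] for name in assignees}
    # cycle() repeats the assignee list; zip() stops when directories run out.
    for name, directory in zip(cycle(assignees), source_dirs):
        assignment[name].append(directory)
    return assignment


if __name__ == "__main__":
    dirs = [f"{i:08d}" for i in range(1, 7)]
    for name, assigned in assign_round_robin(["John", "James", "Alice"], dirs).items():
        print(name, assigned)
```

Round-robin keeps the number of directories per assignee within one of each other, but it ignores how many images each directory holds; balancing by image count would need a different strategy.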