DICOM Image De-Identifier

This is a simple DICOM image de-identifier. It takes a DICOM file, extracts its payload (one or more images), runs text detection on each image, and then removes the detected areas from that image.
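The flow described above can be sketched as follows. This is a minimal, library-free illustration and not the project's code: frames are plain nested lists of pixel values, `detect` is a hypothetical stand-in for whichever OCR engine is selected, and boxes are assumed to be axis-aligned `(x0, y0, x1, y1)` tuples.

```python
from typing import Callable, List, Tuple

Frame = List[List[int]]          # one grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive; assumed format


def blank_boxes(frame: Frame, boxes: List[Box], fill: int = 0) -> Frame:
    """Return a copy of `frame` with every box region overwritten by `fill`."""
    cleaned = [row[:] for row in frame]
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the frame so out-of-range boxes are harmless.
        for y in range(max(y0, 0), min(y1, len(cleaned))):
            row = cleaned[y]
            for x in range(max(x0, 0), min(x1, len(row))):
                row[x] = fill
    return cleaned


def deidentify_frames(frames: List[Frame],
                      detect: Callable[[Frame], List[Box]]) -> List[Frame]:
    """Run the detector on each frame and blank whatever it reports."""
    return [blank_boxes(frame, detect(frame)) for frame in frames]


# Toy detector (hypothetical): always reports one 2x1 box in the top-left corner.
frames = [[[9, 9, 9], [9, 9, 9]], [[5, 5, 5], [5, 5, 5]]]
cleaned = deidentify_frames(frames, lambda frame: [(0, 0, 2, 1)])
```

Passing the detector as a callable mirrors the idea of interchangeable OCR engines: only `detect` changes between engines, while the removal step stays the same.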

Installation

Run the following in a terminal:

python3 -m pip install -r requirements.txt

Results

Text detection and removal results: text is removed from the bounding boxes predicted by the CRAFT detector of Keras OCR.

Available Image Detector/OCR Engines

Text Removal

Pipeline

To select an image text removal pipeline, open main.py and, in the line

PIPELINE = <PIPELINE_FUNCTION>

replace <PIPELINE_FUNCTION> with one of the functions defined between the marker lines

## ! Pipelines: Begin

...

## ! Pipelines: End

that can be found inside dcm_img_text_remover.py.

To convert a single file, you can use

presidio_dicom_image_text_remover
pytesseract_dicom_image_text_remover
keras_ocr_dicom_image_text_remover

To convert multiple files, you can use

MassConversion
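Because PIPELINE is assigned a function object, switching pipelines amounts to rebinding one name. The sketch below illustrates the pattern with stand-in functions. The real pipeline functions live in dcm_img_text_remover.py and their signatures are not documented here, so the no-argument form and the string return values are assumptions made only for illustration.

```python
# Stand-ins for two of the pipeline functions named above. The real ones are
# defined in dcm_img_text_remover.py and may take arguments (assumed none here).
def presidio_dicom_image_text_remover():
    return "presidio"


def keras_ocr_dicom_image_text_remover():
    return "keras_ocr"


# Selecting a pipeline means binding PIPELINE to one of the functions
# without calling it (note the missing parentheses).
PIPELINE = keras_ocr_dicom_image_text_remover

# The script can later invoke whichever function was selected.
result = PIPELINE()
```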

Input Files

For One Input File

To select a single input DICOM file (e.g. pos2.dcm), first place it inside ../dataset/raw, then specify its path through the parameter IN_PATH in main.py, e.g.

IN_PATH = 'pos2.dcm'

For Multiple Input Files

To convert multiple DICOM files, place them in a directory (e.g. ../dataset/raw/direc) and specify that directory's path through the parameter IN_PATH of main.py by placing the following line at the beginning of the pipeline's function inside dcm_img_text_remover.py, e.g.

IN_PATH = '../dataset/raw/direc'
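As an illustration of what a directory-wide conversion has to do first, the sketch below collects the DICOM files in a directory using only the standard library. The helper name `list_dicom_files`, the `.dcm` extension filter, and the choice to ignore subdirectories are assumptions; the document does not state how MassConversion enumerates its inputs.

```python
from pathlib import Path
from typing import List


def list_dicom_files(directory: str, suffix: str = ".dcm") -> List[Path]:
    """Return the files directly inside `directory` whose extension matches `suffix`."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    # Compare extensions case-insensitively so that .DCM files are included too.
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )
```

For example, `list_dicom_files('../dataset/raw/direc')` would return the paths that a single-file pipeline could then be applied to one by one.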

Output Files

For One Input File

The cleaned DICOM file and its prediction plot are written to ./dataset/clean under the same filename as the input. If the plot is not needed, disable it by commenting out the lines

vis_obj = visuals.DetectionVisuals(...)

vis_obj.build_plt(...)

rw_obj.store_fig(...)

from the associated pipeline function inside dcm_img_text_remover.py.
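The naming rule for outputs described in this section can be expressed as a small path computation. The sketch below is hypothetical: the function name `output_paths` and the `.png` extension for the plot are assumptions, since the document only states that outputs keep the input's filename.

```python
from pathlib import Path
from typing import Tuple


def output_paths(in_path: str,
                 clean_dir: str = "./dataset/clean",
                 plot_ext: str = ".png") -> Tuple[Path, Path]:
    """Map an input DICOM path to (cleaned DICOM path, plot path) inside `clean_dir`."""
    name = Path(in_path).name                   # keep only the filename, e.g. pos2.dcm
    dicom_out = Path(clean_dir) / name          # same filename as the input
    plot_out = dicom_out.with_suffix(plot_ext)  # assumed plot extension
    return dicom_out, plot_out


dicom_out, plot_out = output_paths("pos2.dcm")
```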

Run Script

To run the script, navigate into ./src and execute

python3 main.py -p <input_directory_path> --gpu

To run without a GPU, omit the --gpu argument.
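A command line of this shape can be parsed with `argparse` from the standard library. The sketch below is an assumption about how such options might be declared, not the contents of main.py: the destination name `path`, the help strings, and making `-p` required are all illustrative choices.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="DICOM image text removal")
    # -p: input path, as in the command above (no long option name is documented).
    parser.add_argument("-p", dest="path", required=True, help="input directory path")
    # --gpu: boolean switch; leaving it out means the GPU is not requested.
    parser.add_argument("--gpu", action="store_true", help="use the GPU if available")
    return parser


args = build_parser().parse_args(["-p", "../dataset/raw/direc", "--gpu"])
cpu_args = build_parser().parse_args(["-p", "../dataset/raw/direc"])
```

With `action="store_true"`, `args.gpu` is `True` only when the flag is present, which matches the documented behavior of simply dropping `--gpu` to run on the CPU.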

Technical Description

Pipelines

Keras OCR

A high-level abstraction of the DICOM image text removal pipeline based on keras-ocr. In this illustrative example, the DICOM file contains exactly one image, denoted $I$. The output image $I'$ is the cleaned image, obtained from the trained CRAFT model's estimate of the bounding box locations in the image.
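The step from $I$ to $I'$ can be sketched in plain Python. keras-ocr reports each detected word as a quadrilateral of four (x, y) corner points; the sketch below covers each quadrilateral with its enclosing axis-aligned rectangle and fills that rectangle with a constant. The constant fill value and the use of nested lists instead of arrays are simplifying assumptions, and the project's actual removal strategy may differ.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def enclosing_rect(quad: Sequence[Point], width: int, height: int) -> Tuple[int, int, int, int]:
    """Smallest axis-aligned (x0, y0, x1, y1) rectangle covering `quad`, clipped to the image."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    x0 = max(0, math.floor(min(xs)))
    y0 = max(0, math.floor(min(ys)))
    x1 = min(width, math.ceil(max(xs)))
    y1 = min(height, math.ceil(max(ys)))
    return x0, y0, x1, y1


def clean_image(image: List[List[int]],
                quads: Sequence[Sequence[Point]],
                fill: int = 0) -> List[List[int]]:
    """Compute I' from I by filling the enclosing rectangle of every detected quadrilateral."""
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for quad in quads:
        x0, y0, x1, y1 = enclosing_rect(quad, width, height)
        if x1 <= x0 or y1 <= y0:
            continue  # box lies entirely outside the image
        for y in range(y0, y1):
            cleaned[y][x0:x1] = [fill] * (x1 - x0)
    return cleaned


original = [[7] * 4 for _ in range(3)]                      # stands in for I
quad = [(0.5, 0.2), (2.4, 0.2), (2.4, 1.6), (0.5, 1.6)]     # one detected text box
cleaned = clean_image(original, [quad])                     # stands in for I'
```

Rounding the rectangle outward (floor for the minimum, ceil for the maximum) errs on the side of removing slightly more than the detected region, which is the safer direction for de-identification.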
