Program that uses deep learning to detect and recognize texts. The training data is optimized for subtitle text images.
Latest Version of Microsoft Visual C++ Redistributable
For GPU
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124
For CPU and/or Other Packages
pip install -r requirements.txt
python -m build
pip install git+https://github.com/voun7/Subtitle_OCR.git
OCR Models -
Models will be downloaded and placed in saved models
folder
from sub_ocr.subtitle_ocr import SubtitleOCR
reader = SubtitleOCR("ch", "saved models")
result = reader.ocr("image_1.jpg")
The output will be in a list format, each item represents a bounding box, the text detected and confidence score, respectively.
[{'bbox': ((636, 69), (1284, 72), (1284, 156), (636, 138)), 'text': "Test image text", 'score': 0.8736287951469421},
{'bbox': ((552, 848), (1364, 864), (1366, 946), (552, 921)), 'text': 'another image text', 'score': 0.8997976183891296}]