Uses deep learning with PyTorch for specialized subtitle text detection and recognition.

voun7/Subtitle_OCR

Subtitle OCR

A program that uses deep learning to detect and recognize text. The training data is optimized for subtitle text images.

Training Setup Instructions

Download and Install:

Latest Version of Microsoft Visual C++ Redistributable

Install Packages:

For GPU

pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124
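After the GPU install, it can help to confirm that PyTorch is importable and can see a CUDA device before starting training. The snippet below is a minimal sketch, not part of this repository; `describe_torch_install` is a hypothetical helper name, and it uses only standard PyTorch calls (`torch.cuda.is_available`, `torch.cuda.get_device_name`).

```python
import importlib.util


def describe_torch_install():
    """Report whether PyTorch is importable and whether it can see a CUDA device.

    Hypothetical helper for illustration; not provided by Subtitle_OCR.
    """
    # Avoid an ImportError if torch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch

    if torch.cuda.is_available():
        # Device 0 is the first GPU visible to PyTorch.
        name = torch.cuda.get_device_name(0)
        return f"torch {torch.__version__} with CUDA device: {name}"
    return f"torch {torch.__version__} (CPU only)"


print(describe_torch_install())
```

If this reports "CPU only" on a machine with an NVIDIA GPU, the CPU build of torch was probably installed, or the GPU driver may not support the CUDA version of the wheel.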

For CPU and/or Other Packages

pip install -r requirements.txt

Build Package:

python -m build

Usage

Install using pip:

pip install git+https://github.com/voun7/Subtitle_OCR.git

OCR Models - Models are downloaded automatically and placed in the saved models folder

from sub_ocr.subtitle_ocr import SubtitleOCR

reader = SubtitleOCR("ch", "saved models")
result = reader.ocr("image_1.jpg")

The output is a list with one item per detected text region. Each item contains the bounding box ('bbox'), the detected text ('text'), and the confidence score ('score').

[{'bbox': ((636, 69), (1284, 72), (1284, 156), (636, 138)), 'text': 'Test image text', 'score': 0.8736287951469421},
{'bbox': ((552, 848), (1364, 864), (1366, 946), (552, 921)), 'text': 'another image text', 'score': 0.8997976183891296}]
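
A result list in this shape can be post-processed with plain Python. The sketch below is an illustration only: `bounding_rect` and `extract_text` are hypothetical helpers, not part of the package, and the 0.5 score threshold is an arbitrary assumption. It drops low-confidence items, orders the rest from top to bottom, and reduces each four-point box to an axis-aligned rectangle. The sample data is copied from the output above.

```python
# Sample data copied from the example output above.
results = [
    {"bbox": ((636, 69), (1284, 72), (1284, 156), (636, 138)),
     "text": "Test image text", "score": 0.8736287951469421},
    {"bbox": ((552, 848), (1364, 864), (1366, 946), (552, 921)),
     "text": "another image text", "score": 0.8997976183891296},
]


def bounding_rect(bbox):
    """Return (left, top, right, bottom) enclosing the four corner points."""
    xs = [x for x, _ in bbox]
    ys = [y for _, y in bbox]
    return min(xs), min(ys), max(xs), max(ys)


def extract_text(results, min_score=0.5):
    """Hypothetical helper: keep confident items, sort top to bottom, join text.

    min_score is an assumed threshold; tune it for your own images.
    """
    kept = [r for r in results if r["score"] >= min_score]
    # Sort by the top edge of each region so lines read in visual order.
    kept.sort(key=lambda r: bounding_rect(r["bbox"])[1])
    return "\n".join(r["text"] for r in kept)


print(extract_text(results))
print(bounding_rect(results[0]["bbox"]))  # (636, 69, 1284, 156)
```

The axis-aligned rectangle is convenient for cropping a region with an image library. It is slightly larger than the original quadrilateral when the detected text is tilted.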