Implementation of 'Scene Text Recognition with Sliding Convolutional Character Models' (pdf)
Sliding windows + CNN + CTC
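The pipeline named above can be illustrated with a small, dependency-free sketch. This is not the repository's code: the window width of 32, the stride of 4, the blank index of 0, and the two-letter alphabet are all assumed values chosen for illustration, and the per-window class scores that a CNN would produce are written by hand.

```python
from typing import List, Sequence

BLANK = 0  # index reserved for the CTC blank label (assumed convention)


def window_starts(image_width: int, window: int = 32, stride: int = 4) -> List[int]:
    """Left edges of the windows slid across a text-line image."""
    if image_width < window:
        return [0]  # a single window; the image would need padding
    return list(range(0, image_width - window + 1, stride))


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Pick the best class per window, merge repeats, then drop blanks."""
    best = [max(range(len(row)), key=row.__getitem__) for row in scores]
    chars = []
    previous = BLANK
    for index in best:
        if index != previous and index != BLANK:
            chars.append(alphabet[index - 1])  # class k maps to alphabet[k - 1]
        previous = index
    return "".join(chars)


# Hand-written scores for five windows over the classes (blank, 'a', 'b').
example_scores = [
    [0.1, 0.8, 0.1],    # 'a'
    [0.2, 0.7, 0.1],    # 'a' again, merged with the previous window
    [0.9, 0.05, 0.05],  # blank, separates the two 'a' characters
    [0.1, 0.8, 0.1],    # 'a'
    [0.1, 0.1, 0.8],    # 'b'
]
print(ctc_greedy_decode(example_scores, "ab"))  # -> aab
```

Greedy decoding is only the simplest way to read out a CTC model; training itself relies on the CTC loss, which in this project is provided by warp-ctc.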
While this implementation might work in many cases, it has only been tested in the environment below:
python == 3.7.0
torch == 0.4.1
tqdm
numpy
warp-ctc (for PyTorch 0.4)
CUDA 9.0.1
CUDNN 7.0.5
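Because only the versions listed above were tested, it can help to compare an installed environment against them before training. The sketch below is a hypothetical helper, not part of this repository: the names `numeric_prefix`, `mismatches`, and `TESTED` are made up for illustration, and the comparison rule (match on the leading numeric components) is an assumption.

```python
import re
from typing import Dict, List, Tuple


def numeric_prefix(version: str) -> Tuple[int, ...]:
    """Turn '0.4.1.post2' into (0, 4, 1), ignoring any non-numeric suffix."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def mismatches(installed: Dict[str, str], tested: Dict[str, str]) -> List[str]:
    """Names whose installed version is missing or differs from the tested one."""
    problems = []
    for name, wanted in tested.items():
        found = installed.get(name)
        want = numeric_prefix(wanted)
        if found is None or numeric_prefix(found)[: len(want)] != want:
            problems.append(name)
    return problems


# Versions taken from the tested environment listed above.
TESTED = {"python": "3.7.0", "torch": "0.4.1"}
```

The `installed` dictionary could be filled from `platform.python_version()` and `torch.__version__`; an empty result from `mismatches(installed, TESTED)` would mean the environment matches the tested one for those two entries.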
Follow this instruction
Note: The version of warp-ctc must match the installed PyTorch version. Related issue
Download the IIIT5K dataset and extract the files to the dataset folder.
Preprocess the IIIT5K dataset:
python3 prepare_IIIT5K_dataset.py
Train model:
python3 main.py --cuda=True --mode=train
Resume training:
python3 main.py --cuda=True --warm-up=True --mode=train
Test model:
python3 main.py --cuda=True --mode=test
Note: The model.bin file is a pre-trained model that achieves about 53% accuracy. The figure is low because the training dataset is small.
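The README does not say how the accuracy figure is computed. A common choice for scene text recognition is exact-match word accuracy, and the sketch below assumes that definition along with case-insensitive comparison; both are assumptions, and `word_accuracy` is a hypothetical helper rather than a function from this repository.

```python
from typing import Sequence


def word_accuracy(predictions: Sequence[str], targets: Sequence[str]) -> float:
    """Fraction of words predicted exactly right (case-insensitive, assumed)."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(p.lower() == t.lower() for p, t in zip(predictions, targets))
    return correct / len(targets)


# One of the two words matches, so the accuracy is 0.5.
print(word_accuracy(["hello", "wrld"], ["Hello", "world"]))  # -> 0.5
```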
If you find this work useful in your research, please consider citing:
@article{yin2017scene,
title={Scene text recognition with sliding convolutional character models},
author={Yin, Fei and Wu, Yi-Chao and Zhang, Xu-Yao and Liu, Cheng-Lin},
journal={arXiv preprint arXiv:1709.01727},
year={2017}
}