This project is a PyTorch implementation of FOTS text detection for OCR; we focus on the detection algorithm only. Paper: FOTS: Fast Oriented Text Spotting with a Unified Network. The code was originally created by Ning Lu, and we are grateful for his contributions.
- Changed some code to make the project work.
- Added a PSPNet model for experiments, but it does not yet work effectively (another project we are working on: code).
- Support visdom for training visualization (a usage sketch follows this list).
- Support PyTorch 0.4.1 or higher.
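As a reference for the visdom support, here is a minimal sketch of plotting a training loss curve; the window title and dummy loss values are illustrative and not taken from this repo's training code:

```python
# Minimal sketch of plotting a loss curve with visdom (assumed usage, not the repo's exact code).
# Start the server first with: python -m visdom.server
import numpy as np
from visdom import Visdom

viz = Visdom()  # connects to http://localhost:8097 by default

loss_win = None
for step, loss in enumerate([1.2, 0.9, 0.7, 0.6]):  # dummy loss values for illustration
    x, y = np.array([step]), np.array([loss])
    if loss_win is None:
        loss_win = viz.line(X=x, Y=y, opts=dict(title='training loss'))
    else:
        viz.line(X=x, Y=y, win=loss_win, update='append')
```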
We benchmark our code thoroughly on the ICDAR2015 dataset with a ResNet-50 backbone. Note that the network is trained with multi-scale inputs (a rough sketch of the idea follows the results table) and OHEM is not used. Below are the results:
1). ICDAR2015 (scale=512):
model | GPU | batch size | lr | Recall | Precision | Hmean |
---|---|---|---|---|---|---|
Res-50 | 1080Ti | 4 | 1e-3 | 69.72% | 80.09% | 74.54% |
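As a rough illustration of the multi-scale training mentioned above (not the exact implementation in train.py), each batch can simply be resized to a randomly chosen input size; the scale set and interpolation settings below are assumptions:

```python
# Illustrative sketch of multi-scale training: pick a random input size per batch.
# The scale list and interpolation choices are assumptions, not this repo's settings.
import random
import torch
import torch.nn.functional as F

SCALES = [384, 512, 640]  # example scales; 512 matches the benchmark above

def resize_batch(images: torch.Tensor) -> torch.Tensor:
    """Resize an NCHW image batch to a randomly selected square scale."""
    size = random.choice(SCALES)
    return F.interpolate(images, size=(size, size), mode='bilinear', align_corners=False)

batch = torch.randn(4, 3, 512, 512)   # dummy batch
print(resize_batch(batch).shape)      # e.g. torch.Size([4, 3, 640, 640])
```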
First, clone the code:
git clone https://github.com/Vipermdl/OCR_detection_IC15
- Python 3.6
- PyTorch 0.4.1
- CUDA 8.0 or higher
- ICDAR 2015: Please download the dataset into a folder named `dataset` in your project (any other location also works). After downloading the data, create softlinks to it in the `data/` folder (see the sketch below).
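A small sketch of how the softlinks might be created; the source and destination paths are placeholders for wherever you actually put the downloaded ICDAR 2015 data:

```python
# Sketch: link a downloaded ICDAR 2015 copy into data/ (paths are placeholders).
import os

src = os.path.abspath('dataset/ICDAR2015')   # wherever the dataset was downloaded
dst = os.path.join('data', 'ICDAR2015')      # expected location under data/

os.makedirs('data', exist_ok=True)
if not os.path.exists(dst):
    os.symlink(src, dst)                     # create the softlink
```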
Install all the Python dependencies using pip:
pip install -r requirements.txt
To train the network, run:
python train.py
If you want to evaluate the detection performance, simply run:
python eval.py
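For context, the Hmean reported in the benchmark table is the harmonic mean of precision and recall. Below is a minimal sketch of that computation; it is not the official ICDAR evaluation script:

```python
# Sketch: Hmean (harmonic mean of precision and recall), as reported in the benchmark table.
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(hmean(0.8009, 0.6972))  # ~0.745, consistent with the Hmean in the table above
```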
Below are some detection results:
This project is contributed to equally by Ning Lu and DongLiang Ma, with help from many others (thanks to them!).