An OCR server that runs in a Docker container and exposes a RESTful API. It supports Chinese and English, and the model integrates text line detection and text recognition out of the box. Model files and inference code come from ModelScope.
The line detection model was evaluated on the MTWI test set, with the following results (all values are percentages):
Backbone | Recall | Precision | F-score |
---|---|---|---|
ResNet18 | 68.1 | 84.9 | 75.6 |
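The F-score in the table is the harmonic mean of precision and recall. As a quick illustration (not part of the project code), the snippet below reproduces the reported 75.6 from the recall and precision values above:

```python
# F-score as the harmonic mean of precision and recall (MTWI figures from the table above).
recall, precision = 68.1, 84.9
f_score = 2 * precision * recall / (precision + recall)
print(round(f_score, 1))  # 75.6
```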
A benchmark for the text recognition model is not yet available.
- Requires Docker Engine or Docker Desktop.
- Since the container uses the GPU, NVIDIA Docker must also be installed so that the container can access the GPU. The installation process is documented in the NVIDIA Container Toolkit guide. (A quick sanity check for GPU pass-through is sketched below.)
All other dependencies and models are included in the pre-built Docker image, but of course you can build the exact same image from scratch based on the source code.
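To confirm that GPU pass-through works, you can run a minimal check inside the running container. This is only a sketch: it assumes the image ships PyTorch (the backend ModelScope typically uses), which is not stated above.

```python
# gpu_check.py -- sanity check that the container can see the GPU.
# Assumes PyTorch is installed in the image (an assumption, not documented above).
import torch

if torch.cuda.is_available():
    print("CUDA is available:", torch.cuda.get_device_name(0))
else:
    print("CUDA is NOT available -- check the NVIDIA Container Toolkit setup.")
```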
1. Clone the repo
git clone https://github.com/kenwaytis/OCR_modelscope.git
2. (Option) Start the server with the default Docker image
docker compose up
3. (Option) Build the image from scratch
3.1 In the docker-compose.yml file, change
image: paidax/ocr_modelscope:0.6.3
to
image: namespace/ocr_modelscope:0.6.3
3.2 Start the server
docker compose up
- Because the server is built with FastAPI, you can view the automatically generated API documentation at localhost:9533/docs.
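Once the container is running, you can check that the server is reachable before sending real requests. The snippet below fetches the OpenAPI schema that FastAPI serves at /openapi.json by default; the host and port are taken from the URL above, and the `requests` package is assumed to be installed on the client machine.

```python
# check_server.py -- verify the OCR server is up by fetching its OpenAPI schema.
import requests

resp = requests.get("http://localhost:9533/openapi.json", timeout=5)
resp.raise_for_status()
print("Server is up. Available paths:", list(resp.json()["paths"].keys()))
```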
- API description:
URL: localhost:9533/ocr_system
Request method: POST
Request body (JSON) description:
Field name | Required | Type | Note |
---|---|---|---|
images | yes | list[str] | base64-encoded images |
Request JSON example:
{
"images":["img1", "img2", "img3"]
}
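For illustration, the request can be assembled and sent with a short Python client such as the one below. The endpoint and the `images` field come from the description above; the file name `sample.jpg` and the use of the `requests` library are assumptions, and since the response format is not documented here, the reply is simply printed as returned.

```python
# ocr_client.py -- minimal example client for the /ocr_system endpoint.
import base64
import requests

def encode_image(path: str) -> str:
    """Read an image file and return its base64-encoded string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

# "sample.jpg" is a placeholder -- point it at any local image file.
payload = {"images": [encode_image("sample.jpg")]}

resp = requests.post("http://localhost:9533/ocr_system", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```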
Reference:
@article{tang2019seglink++,
  title={SegLink++: Detecting dense and arbitrary-shaped scene text by instance-aware component grouping},
  author={Tang, Jun and Yang, Zhibo and Wang, Yongpan and Zheng, Qi and Xu, Yongchao and Bai, Xiang},
  journal={Pattern Recognition},
  volume={96},
  pages={106954},
  year={2019},
  publisher={Elsevier}
}