OCR GO Microservice

API endpoints for calling Tesseract OCR functions.

Getting started

With Docker:

docker run -p5005:5005 ieferrari/ocr_go_microservice

Extract the text from an image URL:

curl -X POST \
     -d '{"msg": "https://pbs.twimg.com/media/EH-Pvo9WwAEKFwc?format=jpg&name=small"}' \
     -H "Content-Type: application/json" \
     http://127.0.0.1:5005/ocr_from_url
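
The same call can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the service is listening on 127.0.0.1:5005 as in the curl example; the helper names `build_ocr_request` and `ocr_from_url` are made up for illustration, and because the response format is not documented here, the body is returned as raw text.

```python
import json
import urllib.request

# Assumption: the service is reachable at the address used in the curl example.
DEFAULT_BASE_URL = "http://127.0.0.1:5005"


def build_ocr_request(image_url, base_url=DEFAULT_BASE_URL):
    """Build the POST request for /ocr_from_url with the {"msg": <url>} body."""
    body = json.dumps({"msg": image_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ocr_from_url",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ocr_from_url(image_url, base_url=DEFAULT_BASE_URL, timeout=30):
    """Send the request and return the raw response body as text.

    The response format is not documented here, so the body is only
    decoded as UTF-8 and left to the caller to interpret.
    """
    request = build_ocr_request(image_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the container running, `print(ocr_from_url("https://pbs.twimg.com/media/EH-Pvo9WwAEKFwc?format=jpg&name=small"))` should print whatever the service returns for that image.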

Basic installation

How to install Tesseract from source (Debian/Ubuntu):

sudo apt-get install automake ca-certificates g++ git libtool libleptonica-dev make pkg-config

git clone https://github.com/tesseract-ocr/tesseract.git

cd tesseract
./autogen.sh
./configure
make
sudo make install
sudo ldconfig

Alternatively, install a prebuilt package (for Tesseract 5.x packages, see https://notesalexp.org/tesseract-ocr/#tesseract_5.x):

sudo apt-get install tesseract-ocr

Install the Go wrapper for Tesseract:

go get github.com/otiai10/gosseract/v2

Download a language model (Spanish in this example) and check the installation from the command line:

wget https://github.com/tesseract-ocr/tessdata/raw/4.00/spa.traineddata
tesseract --tessdata-dir . example.png outputbase -l spa --psm 3
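
The last command reads example.png, uses the Spanish model found in the current directory, applies page segmentation mode 3 (fully automatic, the Tesseract default) and writes the text to outputbase.txt. As a rough sketch, the same invocation can be scripted from Python. The helper names `build_tesseract_command` and `run_tesseract` are illustrative, not part of this project, and the sketch assumes the `tesseract` binary is on the PATH.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="spa", psm=3, tessdata_dir="."):
    """Return the argv list for the tesseract CLI call shown above."""
    return [
        "tesseract",
        "--tessdata-dir", str(tessdata_dir),
        str(image),
        str(output_base),
        "-l", lang,
        "--psm", str(psm),
    ]


def run_tesseract(image, output_base, **options):
    """Run tesseract and return the text it wrote to <output_base>.txt."""
    command = build_tesseract_command(image, output_base, **options)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `run_tesseract("example.png", "outputbase")` mirrors the command above; passing `lang` or `psm` overrides the defaults.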

Load test

Measured on a Linode VPS with 1 CPU and 1 GB of RAM, running Ubuntu 20.04.2 LTS:

Service not running	Running idle	High load
428 MB RAM	530 MB RAM	638 MB RAM
4.4 % CPU	5 % CPU	60 % CPU
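
This README does not say how the load was generated. As one possible way to put a comparable load on the service, the sketch below fires a batch of calls from a thread pool and records their latencies. It is not the method behind the figures above; the function name `run_load` and its parameters are assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(send, total_requests=100, concurrency=10):
    """Call send() total_requests times from a thread pool.

    Returns (latencies_in_seconds, error_count); latencies are recorded
    only for calls that completed without raising.
    """
    def timed_call(_):
        start = time.perf_counter()
        try:
            send()
            return time.perf_counter() - start, None
        except Exception as exc:  # count any failure, keep the run going
            return time.perf_counter() - start, exc

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = [elapsed for elapsed, error in results if error is None]
    errors = sum(1 for _, error in results if error is not None)
    return latencies, errors
```

Here `send` can be any zero-argument callable, for example one that POSTs an image URL to http://127.0.0.1:5005/ocr_from_url. Memory and CPU usage would still have to be observed separately on the server, for instance with `top`.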

Alternatives in other languages

Depending on your architecture, it may be more efficient to call Tesseract through a wrapper in your preferred language. This container is an alternative if your team has trouble installing the Tesseract components for a specific language, or if you want a centralized OCR implementation in the first place.

Python example:

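import pytesseract
from PIL import Image

# Path to the tesseract binary; adjust it to your installation
pytesseract.pytesseract.tesseract_cmd = '/usr/local/bin/tesseract'
# OCR with the Spanish model, then replace the ordinal sign "º" with "o"
text = pytesseract.image_to_string(Image.open('./example.png'), lang='spa')
print(text.replace("º", "o"))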
