React Native library to perform OCR on images. It uses the Tesseract library for image processing on Android and the Vision library on iOS.
npm install rn-ocr-lib
# or
yarn add rn-ocr-lib
Create the folder app/src/main/assets/tessdata. Inside tessdata, place ${lang}.traineddata. You can get the trained data files from here.
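As a quick check of the Android setup described above, the expected file locations can be derived from the language codes you plan to use. This is a minimal sketch; `expectedTrainedDataFiles` and `TESSDATA_DIR` are hypothetical names for illustration and are not part of rn-ocr-lib.

```typescript
// Hypothetical helper, not part of rn-ocr-lib: lists the trained data files
// that the Android setup expects for a given set of language codes.
const TESSDATA_DIR = 'app/src/main/assets/tessdata';

function expectedTrainedDataFiles(langs: string[]): string[] {
  // Each language code maps to exactly one `${lang}.traineddata` file.
  return langs.map((lang) => `${TESSDATA_DIR}/${lang}.traineddata`);
}

// Prints the two paths that should exist for English and French.
console.log(expectedTrainedDataFiles(['eng', 'fra']));
```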
The Vision library is available on iOS 13 and above, so set the platform version in ios/Podfile:
platform :ios, '13.0'
Refer to the example project for usage on Android and iOS.
import { getText, useOCREventListener } from 'rn-ocr-lib';
getText
Call this method to start processing an image.
getText(data: string, dataInputType: DataInputType, options?: Partial<OCROptions>): void;
useOCREventListener
Call this hook to set up a listener for progress, result, and error events.
useOCREventListener(
  (event: OCREventType, ocrEventResponse: OCREventResponse) => {
    switch (event) {
      case OCREvent.FINISHED:
        return;
      case OCREvent.PROGRESS:
        return;
      case OCREvent.ERROR:
        return;
      default:
        return;
    }
  }
);
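The empty switch branches above are where application state would normally be updated. One possible way to fill them in is a small pure reducer, sketched below. The types `OCREventName`, `OCRResponse`, and `OCRState` are local stand-ins that mirror the event and ocrEventResponse tables in this README; they are assumptions, and the library's real types may differ. Setting progress to 100 on completion is also an assumption.

```typescript
// Local stand-ins mirroring the README tables; not the library's own types.
type OCREventName = 'finished' | 'progress' | 'error';

interface OCRResponse {
  text?: string;
  progress?: number;
  error?: string;
}

interface OCRState {
  status: 'idle' | 'running' | 'done' | 'failed';
  progress: number;
  text: string;
  error: string | null;
}

const initialOCRState: OCRState = { status: 'idle', progress: 0, text: '', error: null };

// Pure function: returns the next state for one incoming OCR event.
function reduceOCREvent(state: OCRState, event: OCREventName, response: OCRResponse): OCRState {
  switch (event) {
    case 'progress':
      // Keep the previous progress value if the event carries none.
      return { ...state, status: 'running', progress: response.progress ?? state.progress };
    case 'finished':
      // Assumption: a finished run is reported as 100 percent.
      return { ...state, status: 'done', progress: 100, text: response.text ?? '' };
    case 'error':
      return { ...state, status: 'failed', error: response.error ?? 'Unknown error' };
    default:
      return state;
  }
}
```

Inside a component, the listener callback could then call something like `setState((prev) => reduceOCREvent(prev, event, ocrEventResponse))`, assuming the event values match the strings in the event table.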
dataInputType
Input type | Value | Description |
---|---|---|
file | FILE | Path to image file |
base64 | BASE64 | Base64 string |
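When the image source can be either a path or a data URI, the input type can be chosen before calling getText. The sketch below is one way to do that. `prepareInput` and `InputKind` are hypothetical names, the string values are taken from the table above, and stripping the `data:...;base64,` prefix is an assumption about what the library expects, so verify it against the example project.

```typescript
// Values copied from the dataInputType table; the type name is a local stand-in.
type InputKind = 'FILE' | 'BASE64';

interface PreparedInput {
  data: string;
  dataInputType: InputKind;
}

// Hypothetical helper: data URIs are treated as base64, everything else as a file path.
function prepareInput(source: string): PreparedInput {
  const marker = ';base64,';
  const markerIndex = source.indexOf(marker);
  if (source.indexOf('data:') === 0 && markerIndex !== -1) {
    // Assumption: the library wants the bare base64 payload without the URI prefix.
    return { data: source.slice(markerIndex + marker.length), dataInputType: 'BASE64' };
  }
  return { data: source, dataInputType: 'FILE' };
}
```

The result could then be passed on as `getText(input.data, input.dataInputType)`, provided the library accepts these values for its `DataInputType` parameter.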
options
Option | Type | iOS | Android | Default | Description |
---|---|---|---|---|---|
ocrEngineMode | OCREngineMode | Yes | Yes | FAST | Engine mode: fast or accurate recognition |
pageSegMode | PageSegMode | No | Yes | PSM_OSD_ONLY | Tesseract page segmentation mode |
lang | string[] | Yes | Yes | ["eng"] | Languages to recognize |
ocrEngineMode
Engine Mode | Value | Description |
---|---|---|
FAST | 0 | Fast mode: recognition is quicker, but words may be misread |
ACCURATE | 1 | Accurate mode: processing takes longer, but the text is more accurate |
FAST_ACCURATE | 2 | Relevant for Tesseract on Android, where the trained data gives both fast and accurate results, but the traineddata file may be larger than in the previous modes |
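Because getText takes a `Partial<OCROptions>`, callers only override what they need. The sketch below illustrates how the documented defaults combine with overrides. `EngineMode`, `SketchOptions`, and `withDefaults` are local stand-ins built from the tables above, not the library's exports, and passing numeric values directly is an assumption.

```typescript
// Numeric values copied from the ocrEngineMode table; names are local stand-ins.
const EngineMode = { FAST: 0, ACCURATE: 1, FAST_ACCURATE: 2 } as const;
type EngineModeValue = (typeof EngineMode)[keyof typeof EngineMode];

interface SketchOptions {
  ocrEngineMode: EngineModeValue;
  pageSegMode: number;
  lang: string[];
}

// Defaults as listed in the options table (PSM_OSD_ONLY has the value 0).
const DEFAULT_OPTIONS: SketchOptions = {
  ocrEngineMode: EngineMode.FAST,
  pageSegMode: 0,
  lang: ['eng'],
};

// Merges caller overrides over the documented defaults.
function withDefaults(overrides: Partial<SketchOptions> = {}): SketchOptions {
  return { ...DEFAULT_OPTIONS, ...overrides };
}

// Only the engine mode changes; the other fields keep their defaults.
console.log(withDefaults({ ocrEngineMode: EngineMode.ACCURATE }));
```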
pageSegMode (Android)
By default Tesseract expects a page of text when it segments an image. If you’re just seeking to OCR a small region, try a different segmentation mode.
Segmentation Mode | Value | Description |
---|---|---|
PSM_OSD_ONLY | 0 | Orientation and script detection (OSD) only. |
PSM_AUTO_OSD | 1 | Automatic page segmentation with OSD. |
PSM_AUTO_ONLY | 2 | Automatic page segmentation, but no OSD, or OCR. |
PSM_AUTO | 3 | Fully automatic page segmentation, but no OSD. (Default) |
PSM_SINGLE_COLUMN | 4 | Assume a single column of text of variable sizes. |
PSM_SINGLE_BLOCK_VERT_TEXT | 5 | Assume a single uniform block of vertically aligned text. |
PSM_SINGLE_BLOCK | 6 | Assume a single uniform block of text. |
PSM_SINGLE_LINE | 7 | Treat the image as a single text line. |
PSM_SINGLE_WORD | 8 | Treat the image as a single word. |
PSM_CIRCLE_WORD | 9 | Treat the image as a single word in a circle. |
PSM_SINGLE_CHAR | 10 | Treat the image as a single character. |
PSM_SPARSE_TEXT | 11 | Sparse text. Find as much text as possible in no particular order. |
PSM_SPARSE_TEXT_OSD | 12 | Sparse text with OSD. |
PSM_RAW_LINE | 13 | Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific. |
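To make the advice about small regions concrete, the sketch below maps a rough description of the cropped region to one of the modes in the table. `RegionKind` and `pageSegModeFor` are hypothetical names, the numeric values are copied from the table above, and the mapping itself is only a suggested starting point that should be tuned against real images.

```typescript
// Numeric values copied from the segmentation-mode table above.
const PSM = {
  SINGLE_BLOCK: 6,
  SINGLE_LINE: 7,
  SINGLE_WORD: 8,
  SINGLE_CHAR: 10,
  SPARSE_TEXT: 11,
} as const;

// Hypothetical description of what the cropped image contains.
type RegionKind = 'block' | 'line' | 'word' | 'char' | 'scattered';

// Suggested starting mode for each kind of region.
function pageSegModeFor(kind: RegionKind): number {
  switch (kind) {
    case 'block':
      return PSM.SINGLE_BLOCK;
    case 'line':
      return PSM.SINGLE_LINE;
    case 'word':
      return PSM.SINGLE_WORD;
    case 'char':
      return PSM.SINGLE_CHAR;
    case 'scattered':
      return PSM.SPARSE_TEXT;
  }
}
```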
event
Event type | Value | Description |
---|---|---|
FINISHED | finished | Event when OCR completes |
PROGRESS | progress | Progress event when OCR is processing |
ERROR | error | Event emitted when OCR fails and an error is thrown |
ocrEventResponse
Key | Type | Description |
---|---|---|
text | string | Result text |
progress | number | Progress percent |
error | string | Error message |
Here is the list of supported languages for Android.
The Vision library currently has limited language support. It supports the following languages.
Language | Code |
---|---|
English | eng |
French | fra |
Italian | ita |
German | deu |
Spanish | spa |
Portuguese | por |
Chinese Simplified | chi_sim |
Chinese Traditional | chi_tra |
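Since the iOS list is short, it can be useful to check requested language codes before passing them in the `lang` option. The following is a minimal sketch; `IOS_SUPPORTED_LANGS` and `splitByIOSSupport` are hypothetical names, and the codes are copied from the table above rather than read from the library.

```typescript
// Codes copied from the iOS language table above.
const IOS_SUPPORTED_LANGS: string[] = [
  'eng', 'fra', 'ita', 'deu', 'spa', 'por', 'chi_sim', 'chi_tra',
];

interface LangSplit {
  supported: string[];
  unsupported: string[];
}

// Hypothetical helper: separates requested codes into those listed for iOS and the rest.
function splitByIOSSupport(langs: string[]): LangSplit {
  const result: LangSplit = { supported: [], unsupported: [] };
  for (const lang of langs) {
    if (IOS_SUPPORTED_LANGS.indexOf(lang) !== -1) {
      result.supported.push(lang);
    } else {
      result.unsupported.push(lang);
    }
  }
  return result;
}
```

On iOS, an app might warn about the `unsupported` codes or drop them before calling getText.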
See the contributing guide to learn how to contribute to the repository and the development workflow.
MIT
Made with create-react-native-library