GitHub - gnonio/korporize: OCR - Optical Character Recognition for any image you browse

korpora - OCR - Optical Character Recognition

Offline text recognition from any image. This web extension adds a context menu entry for extracting text from any image while browsing. Built on Tesseract.js.

Install

Install from the Mozilla Add-ons page

Alternate install for advanced users

Download this repository
Follow the instructions for temporary installation in Firefox

Usage

Right-click an image on a web page
Select "Extract Text from Image"
A popup opens with the korporize interface
Wait for Tesseract to work in the background
Read the results in the korporize panel
(Optional) Copy the results to the clipboard
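
The first two steps above rely on a context menu entry that appears only for images. A minimal sketch of how such an entry can be registered with the WebExtensions `contextMenus` API follows; it is not taken from korporize's source. The menu id, the `registerImageMenu` helper and the `onImage` callback are hypothetical names, and the `api` parameter stands in for the `browser` object so the function can be tried outside a browser.

```javascript
// Sketch only: `api` stands in for the WebExtensions `browser` object.
// MENU_ID, registerImageMenu and onImage are hypothetical names.
const MENU_ID = "extract-text-from-image";

function registerImageMenu(api, onImage) {
  // Show the entry only when the user right-clicks an image.
  api.contextMenus.create({
    id: MENU_ID,
    title: "Extract Text from Image",
    contexts: ["image"],
  });

  // For image clicks, info.srcUrl holds the URL of the clicked image.
  api.contextMenus.onClicked.addListener((info, tab) => {
    if (info.menuItemId === MENU_ID && info.srcUrl) {
      onImage(info.srcUrl, tab);
    }
  });
}
```

In a background script this could be called as `registerImageMenu(browser, (url) => { /* start OCR on url */ })`, provided the manifest requests the `contextMenus` permission.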

To obtain good results:

make sure the automatically detected language suits the characters in the loaded image
force another language via the Options page
increase the quality in the Options page (try Normal or Best; both take longer)
make sure the page segmentation mode suits the image (this choice will be made easier to reach in future releases)
choose a high-resolution version of the image

Features

Extracts text from any image while browsing
Works offline (network access is needed only the first time a language is used, to download and cache its dictionaries)
Automatic language detection (based on the visited web page)
Avoids re-downloading images that are already loaded
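
One way the language detection listed above could work is by mapping the page's declared language tag (for example the `lang` attribute of the `<html>` element) to a Tesseract language code. The sketch below assumes that approach; the README does not say how korporize actually detects the language. The `guessOcrLanguage` helper and the table contents are illustrative, and the table covers only a handful of languages.

```javascript
// Illustrative subset: page language subtag -> Tesseract traineddata code.
// The mapping korporize really uses is not documented in this README.
const TESSERACT_LANGS = new Map([
  ["en", "eng"],
  ["pt", "por"],
  ["de", "deu"],
  ["fr", "fra"],
  ["es", "spa"],
  ["it", "ita"],
  ["ja", "jpn"],
  ["ru", "rus"],
]);

// Hypothetical helper: pick an OCR language from a tag such as "pt-BR".
function guessOcrLanguage(pageLang, fallback = "eng") {
  if (typeof pageLang !== "string" || pageLang.trim() === "") {
    return fallback;
  }
  // Keep only the primary subtag: "pt-BR" -> "pt", "EN_us" -> "en".
  const primary = pageLang.trim().toLowerCase().split(/[-_]/)[0];
  return TESSERACT_LANGS.get(primary) ?? fallback;
}
```

In a content script this might be called as `guessOcrLanguage(document.documentElement.lang)`. Pages often omit or mislabel their language, which is why the Options page override remains useful.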

Notes

Be careful with the size of language dictionaries
Expect around 8 MB for Normal and 12 MB for Best quality per language
Aside from the dictionaries above, no other data is ever stored by korporize
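
To get a feel for how the cache grows, the approximate per-language sizes above can be turned into a rough estimate. The helper below is a hypothetical illustration and not part of korporize. It covers only the two quality levels whose sizes are quoted here.

```javascript
// Approximate sizes quoted in the notes above, in megabytes per language.
const APPROX_MB_PER_LANGUAGE = { normal: 8, best: 12 };

// Hypothetical helper: rough cache size for a number of cached languages.
function estimateCacheMb(quality, languageCount) {
  const perLanguage = APPROX_MB_PER_LANGUAGE[quality];
  if (perLanguage === undefined) {
    throw new Error(`No size estimate for quality "${quality}"`);
  }
  if (!Number.isInteger(languageCount) || languageCount < 0) {
    throw new Error("languageCount must be a non-negative integer");
  }
  return perLanguage * languageCount;
}
```

For example, `estimateCacheMb("best", 3)` gives 36, meaning three languages at Best quality need roughly 36 MB.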

Todo

More ways to access Tesseract functionality (image from link, PDF load and save, etc.)
Preloading of language dictionaries (via Options page)
Provide some cache management options
Expose an API for other web extensions
