
Wrapper script for tesseract accepting image pdf as input, outputting a searchable PDF.

flangknecht/OS-X-Tesseract-Multipage-PDF

Tesseract OCR Wrapper script for OS X

This script splits an input PDF file into parts, runs OCR on each part, and assembles the individually recognized PDF files back into a single searchable file.
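The split, OCR, and reassemble flow described above could be sketched roughly as follows. This is a minimal illustration and not the repository's actual script: it assumes that the "parts" are single pages, that Ghostscript (`gs`) renders each page to an image, and that the reassembly tool accepts `-o <output> <inputs...>`. The function name `plan_ocr_commands` and the `join_tool` parameter are hypothetical. The function only builds the command lines and does not run anything.

```python
from pathlib import Path
from typing import List


def plan_ocr_commands(
    input_pdf: str,
    page_count: int,
    work_dir: str,
    output_pdf: str,
    join_tool: str,
) -> List[List[str]]:
    """Return the commands for a split / OCR / reassemble run, in order.

    Nothing is executed here; the caller decides how to run the commands.
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")

    work = Path(work_dir)
    commands: List[List[str]] = []
    page_pdfs: List[str] = []

    for page in range(1, page_count + 1):
        base = work / f"page-{page:04d}"
        image = work / f"page-{page:04d}.png"

        # Split: render one page to a 300 dpi PNG.
        # Using Ghostscript here is an assumption, not the script's known method.
        commands.append([
            "gs", "-q", "-dNOPAUSE", "-dBATCH",
            "-sDEVICE=png16m", "-r300",
            f"-dFirstPage={page}", f"-dLastPage={page}",
            f"-sOutputFile={image}",
            input_pdf,
        ])

        # OCR: with the "pdf" config, tesseract writes <base>.pdf
        # containing the page image plus a searchable text layer.
        commands.append(["tesseract", str(image), str(base), "pdf"])
        page_pdfs.append(f"{base}.pdf")

    # Reassemble: join_tool is a hypothetical placeholder, and the
    # "-o <output> <inputs...>" argument order is an assumption.
    commands.append([join_tool, "-o", output_pdf, *page_pdfs])
    return commands
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)` in order, so that a failing page stops the run before the reassembly step.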

Relies on the Mac OS X `mdls` tool and on an OS X-specific Python script to reassemble the recognized files.
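One plausible use of `mdls` in a wrapper like this is reading the page count of the input PDF from its Spotlight metadata. The sketch below assumes that is the purpose, which the README does not state. It queries the `kMDItemNumberOfPages` attribute, and the helper names `parse_page_count` and `pdf_page_count` are hypothetical. It runs only on OS X, where `mdls` is available.

```python
import subprocess


def parse_page_count(raw: str) -> int:
    """Turn the raw output of `mdls -raw` into a positive page count."""
    text = raw.strip()
    # mdls prints "(null)" when the attribute is missing,
    # for example when the file has no Spotlight metadata.
    if not text.isdigit():
        raise ValueError(f"no usable page count in mdls output: {text!r}")
    count = int(text)
    if count < 1:
        raise ValueError("page count must be at least 1")
    return count


def pdf_page_count(pdf_path: str) -> int:
    """Ask mdls for the number of pages in a PDF (OS X only)."""
    result = subprocess.run(
        ["mdls", "-name", "kMDItemNumberOfPages", "-raw", pdf_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_page_count(result.stdout)
```

Keeping the parsing separate from the `mdls` call makes the missing-metadata case easy to check without an OS X machine.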
