Prometheus7435/python_pdf_ocr_to_text

python_pdf_ocr_to_text

Learning how to implement a Python script that reads a PDF file using Tesseract and outputs the recognized text to a text file.

Requirements

Python

  • pytesseract
  • pdf2image
  • Python Imaging Library (PIL, installed as Pillow), if you want to use saved image files as the Tesseract input

System

  • Tesseract
  • the Tesseract language data package for your language
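
Example

With the requirements above installed, the PDF-to-text flow can be sketched roughly as follows. This is a minimal sketch, not necessarily how the script in this repository is written: the function names `ocr_pages` and `pdf_to_text`, the default language `eng`, and the 300 dpi setting are all assumptions chosen for illustration. Note that pdf2image also relies on the poppler utilities being installed on the system.

```python
from pathlib import Path


def ocr_pages(pages, ocr, lang="eng"):
    """Run the given OCR callable on each page image and join the results.

    `ocr` is expected to behave like pytesseract.image_to_string, i.e.
    accept an image and a `lang` keyword and return a string.
    """
    # Separate pages with a blank line so page boundaries stay visible.
    return "\n\n".join(ocr(page, lang=lang).strip() for page in pages)


def pdf_to_text(pdf_path, txt_path, lang="eng", dpi=300):
    """Hypothetical helper: OCR every page of a PDF into one text file."""
    # Imported here so ocr_pages can be used without the OCR stack installed.
    from pdf2image import convert_from_path
    import pytesseract

    # Render each PDF page to a PIL image (dpi value is an assumption).
    pages = convert_from_path(str(pdf_path), dpi=dpi)

    # Recognize the text on every page with Tesseract.
    text = ocr_pages(pages, pytesseract.image_to_string, lang=lang)

    # Write the combined result as UTF-8 text.
    Path(txt_path).write_text(text, encoding="utf-8")
    return text
```

A call such as `pdf_to_text("scan.pdf", "scan.txt", lang="eng")` would then write the recognized text to `scan.txt`; both file names are placeholders. The `lang` value has to match a Tesseract language package that is installed on the system.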
