Skip to content

Latest commit

 

History

History
38 lines (30 loc) · 1.3 KB

README.org

File metadata and controls

38 lines (30 loc) · 1.3 KB

pentaplex

A receipt scanner and reader which makes use of tesseract-ocr and imagemagick. It executes five basic functionalities (hence the program’s name):

  1. scan receipt image (edge detection and warp transformation with opencv)
  2. preprocess scan (clean, sharpen, and contrast)
  3. run OCR (tesseract for optical character recognition)
  4. analyze OCR output (with fuzzy finder and preconfigured dictionary)
  5. summarize analysis in a csv file

To prepare for the scanning of the receipts, create a directory called imgs/ in the repository, and place pictures of the receipts in it; e.g. in Terminal (cd into the repository first) type something of the sort:

mkdir -p imgs/
cp ~/Downloads/*.JPG imgs/

Prerequisites

This program uses

Usage

To run pentaplex, type (of course cd into repository first):

./pentaplex [optional: auto]

Documentation

For code documentation visit: https://phdenzel.github.io/pentaplex/