Summarizer

A Flask server that summarizes Government Scheme PDFs.

Each PDF is first converted to images, and the text is extracted from those images using OCR. The extracted text is then processed to select the relevant information.
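The conversion and OCR steps described above could be sketched as follows. This is only an illustration: the README does not say which libraries app.py uses, so pdf2image and pytesseract are assumptions here, and the function names pdf_to_text and clean_ocr_text are hypothetical. Both libraries need external programs (Poppler and Tesseract respectively) installed on the machine.

```python
import re


def clean_ocr_text(raw):
    """Tidy raw OCR output (hypothetical helper, not taken from app.py)."""
    # Rejoin words that OCR split with a hyphen at a line break.
    text = re.sub(r"-\n(?=\w)", "", raw)
    # Turn single line breaks into spaces; keep blank lines as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


def pdf_to_text(pdf_path, dpi=300):
    """Render each PDF page to an image, OCR it, and return the cleaned text.

    Assumes pdf2image and pytesseract are installed; they are imported here
    so that the helper above stays usable without them.
    """
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(pdf_path, dpi=dpi)  # list of PIL images
    raw_pages = [pytesseract.image_to_string(page, lang="eng") for page in pages]
    return clean_ocr_text("\n\n".join(raw_pages))
```

A higher dpi value generally gives the OCR engine sharper glyphs at the cost of slower processing; 300 is a common starting point rather than a value taken from this project.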

We extract the following information from the document:

The subject of the document
Date of scheme release
The list of ministries involved
A one- to two-line summary of the PDF
3-5 important sentences from the document that convey its gist
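As an illustration of how two of these fields might be pulled out of the OCR text, the sketch below finds the first numeric date with a regular expression and picks the highest-scoring sentences by word frequency. The README does not describe the method app.py actually uses, so the scoring approach, the stopword list, the date format, and the names find_first_date and top_sentences are all assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the project).
STOPWORDS = {
    "the", "a", "an", "of", "to", "and", "in", "for", "is", "are", "was",
    "on", "by", "with", "this", "that", "be", "as", "at", "it", "will",
}

# Matches numeric dates such as 12/03/2020 or 12-03-2020 (assumed format).
DATE_PATTERN = re.compile(r"\b(\d{1,2}[./-]\d{1,2}[./-]\d{4})\b")


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def content_words(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def find_first_date(text):
    """Return the first numeric date found in the text, or None."""
    match = DATE_PATTERN.search(text)
    return match.group(1) if match else None


def top_sentences(text, k=3):
    """Return the k sentences with the highest average word frequency,
    in their original document order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]
```

Calling top_sentences(text, k=5) would give up to five sentences for the gist, and the same ranking with k=2 could serve as a short summary. Frequency scoring is a simple baseline that favors sentences repeating the document's dominant terms.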

This has been done using the following technologies

Dependencies

(All dependencies are listed in requirements.txt and can be installed from it, for example with pip install -r requirements.txt.)
