Docify classifies PDFs into ten categories: Legal, Medical, Finance, Education, Business, News, Technical, Creative, Scientific, and Government. Utilizing sklearn, PyTesseract, and Naive Bayes, it ensures precise, efficient document organization and retrieval, enhancing decision-making and workflow automation across various industries.

Shobhit141141/Docify

🔍Docify

Docify classifies PDFs into ten categories: Legal, Medical, Finance, Education, Business, News, Technical, Creative, Scientific, and Government. Utilizing sklearn, PyTesseract, and Naive Bayes, it ensures precise, efficient document organization and retrieval, enhancing decision-making and workflow automation across various industries.

🚀 Demo

💻 Built with

Technologies used in the project:

Python, Pandas, NumPy, scikit-learn, Flask, Render, HTML5

🌟 Features

Here are some of the project's best features:

| Feature | Description |
| --- | --- |
| Text Classification 📚 | The model classifies text into predefined categories such as Legal, Medical, and Finance, based on its content. |
| PDF to Text Conversion 📄➡️📝 | The application converts PDF files uploaded by users into text so that the model can analyze the content. |
| Custom Category Order 🧩 | The model uses a category order defined by the user, which controls how categories are prioritized and displayed. |
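
The text classification feature can be illustrated with a minimal sketch. It assumes a TF-IDF vectorizer feeding scikit-learn's `MultinomialNB`. The project names sklearn and Naive Bayes, but the exact vectorizer, variant, and training data are not stated here. The training sentences and the `classify` helper are made up for illustration and cover only three of the ten categories.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up training examples (assumption); a real model needs many
# documents per category and all ten categories.
train_texts = [
    "the court granted the motion and the plaintiff appealed the contract ruling",
    "the defendant signed the agreement under the statute",
    "the patient received a diagnosis and treatment at the clinic",
    "the doctor prescribed medication for the patient symptoms",
    "quarterly revenue and stock dividends increased for investors",
    "the bank raised interest rates on loans and investment accounts",
]
train_labels = ["Legal", "Legal", "Medical", "Medical", "Finance", "Finance"]

# TF-IDF turns each text into non-negative term weights, which
# multinomial Naive Bayes accepts as input features.
model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(train_texts, train_labels)


def classify(text):
    """Return the predicted category label for one piece of text (hypothetical helper)."""
    return model.predict([text])[0]


predicted = classify("the patient was given medication by the doctor")
print(predicted)
```

In the application, the text passed to a function like `classify` would come from the PDF-to-text step rather than from a literal string.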

📑 Document Categories

| Category | Emoji |
| --- | --- |
| Legal | ⚖️ |
| Medical | 🏥 |
| Finance | 💰 |
| Education | 📚 |
| Business | 🏢 |
| News | 📰 |
| Technical | 💻 |
| Creative | 🎨 |
| Scientific | 🧪 |
| Government | 🏛️ |
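
One way to read the custom category order feature is that per-category scores are displayed in a user-chosen sequence instead of by score. The sketch below shows that idea in plain Python. The names `DEFAULT_ORDER` and `order_scores`, and the example scores, are hypothetical and not taken from the project's code.

```python
# Default order follows the category table above (assumption: the
# application lets the user replace this list with their own).
DEFAULT_ORDER = [
    "Legal", "Medical", "Finance", "Education", "Business",
    "News", "Technical", "Creative", "Scientific", "Government",
]


def order_scores(scores, category_order=DEFAULT_ORDER):
    """Return (category, score) pairs arranged in category_order.

    Categories missing from scores get 0.0; categories present in
    scores but absent from category_order are appended alphabetically.
    """
    ranked = [(name, scores.get(name, 0.0)) for name in category_order]
    extras = sorted(name for name in scores if name not in category_order)
    ranked.extend((name, scores[name]) for name in extras)
    return ranked


# Example: the user wants Medical shown first, then Legal, then Finance.
example_scores = {"Finance": 0.7, "Legal": 0.2, "Medical": 0.1}
ordered = order_scores(example_scores, ["Medical", "Legal", "Finance"])
print(ordered)
```

Keeping the order as a plain list makes it easy to store per user and to pass into whatever renders the results.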

🛠️ Installation Steps:

1. Clone the repo

git clone https://github.com/Shobhit141141/Docify.git

2. Install required libraries

pip install -r requirements.txt

3. Run the project

python app.py
