Reading Companion - CV-based OCR Application with Gemini Integration

Reading Companion uses Computer Vision and Optical Character Recognition (OCR) to list open windows, capture the one you choose, extract its text, filter it, and suggest important phrases to search on Google through Gemini API integration. It is useful when reading PDFs or attending lectures: it highlights key points and lets you search any of them on Google with one click.

How to Use It

Download and extract the files:
- Clone the repository or download the file.zip file from GitHub and extract it.
- Navigate to the directory where the files are located.
Run the application:
- Execute the main.py script to start the application.
```
python main.py
```
Using the application:
- A window opens and displays a numbered list of the currently open windows.
- Enter the number corresponding to the window you want to capture and click "Go".
- The application will display important phrases detected from the window content. Click any phrase to automatically search it on Google.
Simple as that!
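
The one-click search step can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the helper names `build_search_url` and `search_on_google` are hypothetical, and the sketch assumes the phrase is simply URL-encoded into a standard Google search query and opened in the default browser.

```python
import webbrowser
from urllib.parse import quote_plus

GOOGLE_SEARCH_URL = "https://www.google.com/search?q="


def build_search_url(phrase: str) -> str:
    """Return a Google search URL for the given phrase (hypothetical helper)."""
    # Collapse runs of whitespace, then URL-encode; spaces become '+'.
    cleaned = " ".join(phrase.split())
    return GOOGLE_SEARCH_URL + quote_plus(cleaned)


def search_on_google(phrase: str) -> bool:
    """Open the search in the default browser; True if a browser was launched."""
    return webbrowser.open(build_search_url(phrase))
```

A GUI button for each suggested phrase could then call `search_on_google(phrase)` from its click handler.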

Used Technologies

Computer Vision: For capturing and processing window screenshots.
OCR (Optical Character Recognition): To extract text from images using Tesseract.
AI Technology - Gemini API Integration: To analyze extracted text and suggest important phrases for Google search.
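
As a rough sketch of the Gemini step, the code below builds a prompt from OCR text and parses the reply into a list of phrases. It is not taken from searchAI.py: the prompt wording, the function names, and the one-phrase-per-line reply format are all assumptions. The model call is passed in as a plain function (`generate`) so the sketch does not depend on a specific SDK version.

```python
from typing import Callable, List


def build_prompt(ocr_text: str, max_phrases: int = 5) -> str:
    """Ask for one search phrase per line (the wording is an assumption)."""
    return (
        f"From the text below, list up to {max_phrases} important phrases "
        "worth searching on Google. Return one phrase per line, no numbering.\n\n"
        f"{ocr_text}"
    )


def parse_phrases(reply: str, max_phrases: int = 5) -> List[str]:
    """Turn a line-per-phrase reply into a clean, de-duplicated list."""
    phrases: List[str] = []
    for line in reply.splitlines():
        # Strip bullet characters and whitespace the model may add anyway.
        phrase = line.strip().lstrip("-*•").strip()
        if phrase and phrase.lower() not in (p.lower() for p in phrases):
            phrases.append(phrase)
    return phrases[:max_phrases]


def suggest_phrases(
    ocr_text: str, generate: Callable[[str], str], max_phrases: int = 5
) -> List[str]:
    """`generate` sends a prompt to a model and returns its text reply."""
    reply = generate(build_prompt(ocr_text, max_phrases))
    return parse_phrases(reply, max_phrases)
```

With the `google-generativeai` package, `generate` could wrap a call such as `model.generate_content(prompt).text` on a configured `GenerativeModel`; the model name and API key handling depend on your setup.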

Features

Capture Open Windows: Lists and captures screenshots of currently open windows.
OCR Extraction: Extracts text from captured window screenshots.
AI-Powered Text Analysis: Uses Gemini API to analyze text and suggest important search phrases.
One-Click Google Search: Provides a simple GUI to search suggested phrases on Google with one click.
Always on Top: Keeps the application window always on top for easy access.
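
Raw OCR output from a window screenshot usually contains noise such as menu fragments and stray symbols, so some filtering is needed before the text is sent for analysis. The sketch below shows one possible heuristic; the function name `filter_ocr_text` and both thresholds are illustrative assumptions, not the rules used in text_detection.py.

```python
import re
from typing import List


def filter_ocr_text(
    raw_text: str, min_chars: int = 4, min_alpha_ratio: float = 0.6
) -> List[str]:
    """Keep OCR lines that look like real text (thresholds are guesses)."""
    kept: List[str] = []
    for line in raw_text.splitlines():
        # Collapse runs of whitespace that OCR often introduces.
        line = re.sub(r"\s+", " ", line).strip()
        if len(line) < min_chars:
            continue  # too short to be meaningful (stray icons, single glyphs)
        # Count letters and spaces; everything else is treated as noise.
        texty = sum(ch.isalpha() or ch.isspace() for ch in line)
        if texty / len(line) < min_alpha_ratio:
            continue  # mostly symbols or digits: likely OCR noise
        if line not in kept:
            kept.append(line)  # preserve order, drop exact duplicates
    return kept
```

The surviving lines can be joined with newlines and passed on to the phrase-suggestion step.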

Installation

Clone the repository:
```
git clone https://github.com/Sajitha-Madugalle/Reading_Companion_OpenCV.git
cd Reading_Companion_OpenCV
```

Install dependencies: Ensure you have Python installed, then install the required packages using:
```
pip install -r requirements.txt
```
Configure Tesseract:
- Download and install Tesseract OCR.
- Ensure the Tesseract executable path is set correctly. On Windows, it is usually:
```
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
```
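
If you would rather not hard-code the path, a small lookup like the one below may help. It is only a sketch: `find_tesseract` is a hypothetical helper, and the candidate locations are common defaults that may not match your installation.

```python
import os
import shutil
from typing import Iterable, Optional

# Common install locations; these are assumptions and may differ on your machine.
DEFAULT_CANDIDATES = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "/usr/bin/tesseract",
    "/usr/local/bin/tesseract",
    "/opt/homebrew/bin/tesseract",
)


def find_tesseract(candidates: Iterable[str] = DEFAULT_CANDIDATES) -> Optional[str]:
    """Return a path to the tesseract executable, or None if not found."""
    # Prefer whatever is already on PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the first candidate that exists on disk.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

When the function returns a path, assign it to `pytesseract.pytesseract.tesseract_cmd` before running OCR; when it returns `None`, Tesseract is probably not installed or is in a non-standard location.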
Run the application:
```
python main.py
```

Screenshots

First window, showing the currently open windows

Selected window and important phrases to search suggested by Gemini

Google Search Results

Contributing

Contributions are welcome!

Problems

Refresh rate and FPS tradeoff
GUI improvements
A floating icon to open the window would help, so the reader is not distracted.
Applying deep learning algorithms to identify maths expressions and solve them

Please feel free to submit a Pull Request or open an Issue to improve this project.

Fork the repository.
Create your feature branch:
```
git checkout -b feature/YourFeature
```
Commit your changes:
```
git commit -m 'Add your feature'
```
Push to the branch:
```
git push origin feature/YourFeature
```
Open a Pull Request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

References

"Fast Window Capture - OpenCV Object Detection in Games #4", by Learn Code By Gaming
"Realtime Text Detection in Images using Tesseract | OpenCV | Python | Tutorial for beginners", by DeepLearning_by_PhDScholar
