This is a simple yet powerful application that uses Optical Character Recognition (OCR) and Computer Vision to detect open windows, extract their content, filter the text, and suggest important phrases to search on Google using Gemini API integration. The application is useful in various scenarios, such as reading PDFs or attending lectures, by highlighting key points and providing one-click Google search functionality.
- Download and extract the files:
  - Clone the repository or download the file.zip file from GitHub and extract it.
  - Navigate to the directory where the files are located.
- Run the application:
  - Execute the main.py script to start the application.
    python main.py
- Using the application:
  - A window will open displaying a list of currently opened windows.
  - Enter the number corresponding to the window you want to capture and click "Go".
  - The application will display important phrases detected from the window content. Click any phrase to automatically search it on Google.
Simple as that!
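The selection and search steps above can be sketched in a few lines. This is a minimal illustration, not the code in main.py: the helper names `pick_window`, `google_search_url`, and `search_on_google` are hypothetical, and the 1-based numbering of the window list is an assumption.

```python
import webbrowser
from typing import List, Optional
from urllib.parse import quote_plus


def pick_window(titles: List[str], choice: str) -> Optional[str]:
    """Return the title matching the number the user typed, or None if invalid.

    Assumes the list is shown with 1-based numbers (an assumption, not
    confirmed by the project).
    """
    try:
        index = int(choice.strip())
    except ValueError:
        return None
    if 1 <= index <= len(titles):
        return titles[index - 1]
    return None


def google_search_url(phrase: str) -> str:
    """Build a Google search URL, URL-encoding spaces and symbols."""
    return "https://www.google.com/search?q=" + quote_plus(phrase.strip())


def search_on_google(phrase: str) -> bool:
    """Open the search in the default browser; False if none could be launched."""
    return webbrowser.open(google_search_url(phrase))
```

A GUI button for each suggested phrase could call `search_on_google(phrase)` in its click handler, which is one plausible way to get the one-click behaviour.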
- Computer Vision: For capturing and processing window screenshots.
- OCR (Optical Character Recognition): To extract text from images using Tesseract.
- AI Technology - Gemini API Integration: To analyze extracted text and suggest important phrases for Google search.
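Raw OCR output from a screenshot usually contains stray symbols, blank lines, and repeated menu text, so some filtering is needed before the text is analyzed. The sketch below shows one possible filter; the function name `filter_ocr_text` and the `min_letters` threshold are assumptions for illustration and may differ from what the project does. The input is expected to be a plain string such as the one `pytesseract.image_to_string` returns.

```python
import re
from typing import List


def filter_ocr_text(raw: str, min_letters: int = 3) -> List[str]:
    """Keep OCR lines that look like real text; drop noise and duplicates."""
    kept: List[str] = []
    seen = set()
    for line in raw.splitlines():
        # Collapse runs of whitespace that OCR often inserts between words.
        line = re.sub(r"\s+", " ", line).strip()
        # Lines with very few letters are usually borders, icons, or numbers.
        if sum(ch.isalpha() for ch in line) < min_letters:
            continue
        # Skip case-insensitive repeats (e.g. a menu bar captured twice).
        key = line.lower()
        if key in seen:
            continue
        seen.add(key)
        kept.append(line)
    return kept
```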
- Capture Open Windows: Lists and captures screenshots of currently open windows.
- OCR Extraction: Extracts text from captured window screenshots.
- AI-Powered Text Analysis: Uses Gemini API to analyze text and suggest important search phrases.
- One-Click Google Search: Provides a simple GUI to search suggested phrases on Google with one click.
- Always on Top: Keeps the application window always on top for easy access.
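The AI-powered analysis step can be pictured as "build a prompt, ask the model, parse the reply into phrases". The sketch below keeps the model call behind a hypothetical `ask_model` callable (any function that takes a prompt string and returns the model's text reply), so it makes no claims about the Gemini client the project actually uses. The prompt wording, the `suggest_phrases` name, and the `limit` parameter are all assumptions.

```python
import re
from typing import Callable, List

# Hypothetical prompt; the project's real prompt is not shown in this README.
PROMPT_TEMPLATE = (
    "From the text below, list up to {limit} short phrases worth searching "
    "on Google, one per line, with no extra commentary.\n\n{text}"
)

# Matches list markers models often add: "- ", "* ", "1. ", "2) ".
_MARKER = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def suggest_phrases(text: str, ask_model: Callable[[str], str], limit: int = 5) -> List[str]:
    """Ask the model for search phrases and return a clean, de-duplicated list."""
    reply = ask_model(PROMPT_TEMPLATE.format(limit=limit, text=text))
    phrases: List[str] = []
    seen = set()
    for line in reply.splitlines():
        phrase = _MARKER.sub("", line).strip()
        if phrase and phrase.lower() not in seen:
            seen.add(phrase.lower())
            phrases.append(phrase)
    return phrases[:limit]
```

Passing the model call in as a parameter also makes the parsing logic easy to test with a stub, without an API key or network access.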
- Clone the repository:
  git clone https://github.com/Sajitha-Madugalle/Reading_Companion_OpenCV.git
  cd Reading_Companion_OpenCV
- Install dependencies: Ensure you have Python installed, then install the required packages using:
  pip install -r requirements.txt
- Configure Tesseract:
  - Download and install Tesseract OCR from here.
  - Ensure the Tesseract executable path is set correctly. On Windows it is typically:
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
- Run the application:
  python main.py
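Because the Tesseract location differs between machines, the path from the configuration step can be resolved at startup instead of being hard-coded. The sketch below is one way to do that under stated assumptions: `find_tesseract` is a hypothetical helper, it first checks the system PATH and then falls back to the Windows default location quoted above.

```python
import os
import shutil
from typing import Callable, Iterable, Optional

# Default Windows install location quoted in the configuration step.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(
    candidates: Iterable[str] = (WINDOWS_DEFAULT,),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return a usable Tesseract executable path, or None if none is found."""
    # Prefer whatever is already on PATH (typical on Linux and macOS).
    found = which("tesseract")
    if found:
        return found
    # Otherwise try known install locations in order.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Usage (requires pytesseract to be installed):
#   import pytesseract
#   cmd = find_tesseract()
#   if cmd is not None:
#       pytesseract.pytesseract.tesseract_cmd = cmd
```

If `find_tesseract()` returns `None`, Tesseract is probably not installed or lives in a non-standard location, and the path still has to be set by hand.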
First Window showing the currently opened windows
Selected window and important phrases to search suggested by Gemini
Contributions are welcome!
- Refresh rate and FPS tradeoff
- Improved GUI
- A floating icon that opens the window would be great, so the reader does not get distracted.
- Applying deep learning algorithms to identify maths phrases and solve them
Please feel free to submit a Pull Request or open an Issue to improve this project.
- Fork the repository.
- Create your feature branch:
git checkout -b feature/YourFeature
- Commit your changes:
git commit -m 'Add your feature'
- Push to the branch:
git push origin feature/YourFeature
- Open a Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.
- Fast Window Capture - OpenCV Object Detection in Games #4, by Learn Code By Gaming
- Realtime Text Detection in Images using Tesseract | OpenCV | Python | Tutorial for beginners, by DeepLearning_by_PhDScholar