Added the arguments param options with good help message, and added better documentation for each file
moheladwy committed Dec 15, 2024
1 parent b907393 commit 33bf88c
Showing 4 changed files with 184 additions and 54 deletions.
6 changes: 5 additions & 1 deletion .gitignore
@@ -159,4 +159,8 @@ cython_debug/
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.idea/

# VS Code
# Add .vscode/ to the .gitignore file if you are using Visual Studio Code
.vscode/
36 changes: 31 additions & 5 deletions OCR4Linux.py
@@ -1,13 +1,39 @@
# ========================================================================================================================
# Author:
# Mohamed Hussein Al-Adawy
# Version: 1.1.0
# Description:
# This script takes a screenshot of a selected area, extracts the text from the image, and copies it to the clipboard.
# The script uses grimblast for Wayland and scrot for X11 to take screenshots.
# The script uses tesseract to extract text from the image.
# The script uses wl-copy and cliphist for Wayland and xclip for X11 to copy the extracted text to the clipboard.
# The script uses a python script to extract text from the image.
# OCR4Linux.py is a Python script that handles image preprocessing and text extraction using Tesseract OCR.
# The script takes an input image, processes it for optimal OCR accuracy, and extracts text while preserving
# line breaks and layout.
#
# Features:
# - Image preprocessing (grayscale conversion, thresholding, noise removal)
# - Text extraction with layout preservation
# - Confidence-based filtering for improved accuracy
# - Support for multiple image formats
# - UTF-8 text output
#
# Dependencies:
# - PIL (Python Imaging Library)
# - pytesseract
# - OpenCV (cv2)
# - numpy
#
# Class Structure:
# TesseractConfig:
# - preprocess_image(): Enhances image quality for better OCR
# - extract_text_with_lines(): Extracts text while preserving layout
# - help(): Displays usage instructions
# - main(): Orchestrates the OCR process
#
# Usage:
# python OCR4Linux.py <image_path> <output_path>
#
# Example:
# python OCR4Linux.py screenshot.png output.txt
# ========================================================================================================================

import sys
import os
from PIL import Image
158 changes: 123 additions & 35 deletions OCR4Linux.sh
@@ -1,52 +1,127 @@
#!/bin/bash
# ========================================================================================================================
# Author:
# Mohamed Hussein Al-Adawy
# Author: Mohamed Hussein Al-Adawy
# Version: 1.1.0
# Description:
# This script takes a screenshot of a selected area, extracts the text from the image, and copies it to the clipboard.
# The script uses grimblast for Wayland and scrot for X11 to take screenshots.
# The script uses tesseract to extract text from the image.
# The script uses wl-copy and cliphist for Wayland and xclip for X11 to copy the extracted text to the clipboard.
# The script uses a python script to extract text from the image.
# The script requires the following packages to be installed:
# - python
# - tesseract
# - grimblast or scrot
# - wl-clipboard or xclip
# - cliphist
# OCR4Linux is a versatile text extraction tool for Linux systems that:
# 1. Takes screenshots of selected areas using:
# - grimblast for Wayland sessions
# - scrot for X11 sessions
# 2. Performs Optical Character Recognition (OCR) using tesseract by passing the screenshot to a Python script
# 3. Copies extracted text to clipboard using:
# - wl-copy and cliphist for Wayland
# - xclip for X11
#
# Features:
# - Support for both Wayland and X11 sessions
# - Configurable screenshot directory
# - Optional logging functionality
# - Optional screenshot retention
# - Command-line interface with various options
#
# Dependencies:
# - tesseract-ocr: For text extraction
# - grimblast/scrot: For screenshot capture
# - wl-clipboard/xclip: For clipboard operations
# - Python 3.x: For image processing
#
# Usage:
# ./OCR4Linux.sh [-r] [-d DIRECTORY] [-l] [-h]
# See './OCR4Linux.sh -h' for more details
# ========================================================================================================================

screenshot_name="screenshot_$(date +%d%m%Y_%H%M%S).jpg"
screenshot_dir="$HOME/Pictures/screenshots"
python_script_path="$HOME/.config/OCR4Linux"
python_script_name="OCR4Linux.py"
output_file_name="output_text.txt"
waiting_time=0.5
SCREENSHOT_NAME="screenshot_$(date +%d%m%Y_%H%M%S).jpg"
SCREENSHOT_DIRECTORY="$HOME/Pictures/screenshots"
OCR4Linux_HOME="$HOME/.config/OCR4Linux"
OCR4Linux_PYTHON_NAME="OCR4Linux.py"
TEXT_OUTPUT_FILE_NAME="output_text.txt"
LOGS_FILE_NAME="OCR4Linux.log"
SLEEP_DURATION=0.5
REMOVE_SCREENSHOT=false
KEEP_LOGS=false

# Display help message
show_help() {
echo "Usage: $(basename "$0") [OPTIONS]"
echo "Options:"
echo " -r Remove screenshot in the screenshot directory"
echo " -d DIRECTORY Set screenshot directory (default: $SCREENSHOT_DIRECTORY)"
echo " -l Keep logs"
echo " -h Show this help message, then exit"
echo "Example:"
echo " OCR4Linux.sh -r -d $HOME/screenshots -l"
echo " OCR4Linux.sh -r -l"
echo " OCR4Linux.sh -h"
echo "Note:"
echo " If you run \`OCR4Linux.sh\` without any arguments, the screenshot is saved in the default directory $SCREENSHOT_DIRECTORY and kept."
}

# Parse command line arguments
while getopts "rd:lh" opt; do
case $opt in
r) REMOVE_SCREENSHOT=true ;;
d) SCREENSHOT_DIRECTORY="$OPTARG" ;;
l) KEEP_LOGS=true ;;
h)
show_help
exit 0
;;
*)
show_help
exit 1
;;
esac
done

# Add log function
log_message() {
local message
message="[$(date '+%Y-%m-%d %H:%M:%S')] $1"
echo "$message" >&2
if [ "$KEEP_LOGS" = true ]; then
{
echo "$message"
} >>"$OCR4Linux_HOME/$LOGS_FILE_NAME"
fi
}

# Check if the required files exist.
check_if_files_exist() {
# Check if the screenshot directory exists, if not create it.
if [ ! -d "$screenshot_dir" ]; then
mkdir -p "$screenshot_dir"
log_message "Checking required files and directories..."

# Validate screenshot directory
if [ ! -d "$SCREENSHOT_DIRECTORY" ]; then
log_message "Creating screenshot directory: $SCREENSHOT_DIRECTORY since it does not exist."
if ! mkdir -p "$SCREENSHOT_DIRECTORY"; then
log_message "ERROR: Failed to create directory $SCREENSHOT_DIRECTORY"
exit 1
fi
log_message "Successfully created screenshot directory: $SCREENSHOT_DIRECTORY"
fi

# Check if the directory is writable
if [ ! -w "$SCREENSHOT_DIRECTORY" ]; then
log_message "ERROR: $SCREENSHOT_DIRECTORY is not writable"
exit 1
fi

# Check if the python script exists.
if [ ! -f "$python_script_path/$python_script_name" ]; then
echo "Error: $python_script_name not found in $python_script_path"
if [ ! -f "$OCR4Linux_HOME/$OCR4Linux_PYTHON_NAME" ]; then
log_message "ERROR: $OCR4Linux_PYTHON_NAME not found in $OCR4Linux_HOME"
exit 1
fi
}

# take shots using grimblast for wayland
takescreenshot_wayland() {
sleep $waiting_time
grimblast --notify copysave area "$screenshot_dir/$screenshot_name"
sleep $SLEEP_DURATION
grimblast --notify copysave area "$SCREENSHOT_DIRECTORY/$SCREENSHOT_NAME"
}

# take shots using scrot for x11
takescreenshot_x11() {
sleep $waiting_time
scrot -s -Z 0 -o -F "$screenshot_dir/$screenshot_name"
sleep $SLEEP_DURATION
scrot -s -Z 0 -o -F "$SCREENSHOT_DIRECTORY/$SCREENSHOT_NAME"
}

# Run the screenshot functions based on the session type.
@@ -56,26 +131,26 @@ takescreenshot() {
else
takescreenshot_x11
fi
log_message "Screenshot saved to $SCREENSHOT_DIRECTORY/$SCREENSHOT_NAME"
}

# Pass the screenshot to OCR tool to extract text from the image.
extract_text() {
python "$python_script_path/$python_script_name" \
"$screenshot_dir/$screenshot_name" \
"$python_script_path/$output_file_name"
python "$OCR4Linux_HOME/$OCR4Linux_PYTHON_NAME" \
"$SCREENSHOT_DIRECTORY/$SCREENSHOT_NAME" \
"$OCR4Linux_HOME/$TEXT_OUTPUT_FILE_NAME"
log_message "Text extraction completed successfully"
}

# Copy the extracted text to clipboard using wl-copy and cliphist.
copy_to_wayland_clipboard() {
cliphist store < "$python_script_path/$output_file_name"
cliphist store <"$OCR4Linux_HOME/$TEXT_OUTPUT_FILE_NAME"
cliphist list | head -n 1 | cliphist decode | wl-copy
rm "$python_script_path/$output_file_name"
}

# Copy the extracted text to clipboard using xclip.
copy_to_x11_clipboard() {
xclip -selection clipboard -t text/plain -i "$python_script_path/$output_file_name"
rm "$python_script_path/$output_file_name"
xclip -selection clipboard -t text/plain -i "$OCR4Linux_HOME/$TEXT_OUTPUT_FILE_NAME"
}

# Run the copy to clipboard functions based on the session type.
@@ -85,6 +160,16 @@ run_copy_to_clipboard() {
else
copy_to_x11_clipboard
fi
rm "$OCR4Linux_HOME/$TEXT_OUTPUT_FILE_NAME"
log_message "The extracted text has been copied to the clipboard."
}

# Remove the screenshot if the -r option is passed.
remove_image() {
if [ "$REMOVE_SCREENSHOT" = true ]; then
rm "$SCREENSHOT_DIRECTORY/$SCREENSHOT_NAME"
log_message "Screenshot $SCREENSHOT_NAME has been deleted since you passed the -r option."
fi
}

# Run the functions
@@ -93,6 +178,9 @@ main() {
takescreenshot
extract_text
run_copy_to_clipboard
remove_image
log_message "The script has finished successfully."
log_message "====================================================================================================="
}

main
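
Review note on the option parsing in OCR4Linux.sh: the sketch below isolates the same `getopts "rd:lh"` pattern in a function so it can be exercised without taking a screenshot. The function name `parse_args` and the `local OPTIND=1` reset are assumptions made for illustration and are not part of this commit; the variable names and the option string are taken from the diff.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): same option string "rd:lh" as
# OCR4Linux.sh, wrapped in a function so the parsing can be tried in isolation.
REMOVE_SCREENSHOT=false
KEEP_LOGS=false
SCREENSHOT_DIRECTORY="$HOME/Pictures/screenshots"

parse_args() {
    # Reset OPTIND locally so the function can be called more than once.
    local opt OPTIND=1
    while getopts "rd:lh" opt; do
        case $opt in
            r) REMOVE_SCREENSHOT=true ;;
            d) SCREENSHOT_DIRECTORY="$OPTARG" ;;
            l) KEEP_LOGS=true ;;
            h)
                echo "Usage: $(basename "$0") [-r] [-d DIRECTORY] [-l] [-h]"
                return 0
                ;;
            *) return 1 ;; # unknown option or missing argument to -d
        esac
    done
}

parse_args -r -d /tmp/shots -l
echo "$REMOVE_SCREENSHOT $SCREENSHOT_DIRECTORY $KEEP_LOGS"
# prints: true /tmp/shots true
```

Only `-r`, `-d`, `-l`, and `-h` are accepted by this option string, so any other flag falls through to the `*)` branch.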
38 changes: 25 additions & 13 deletions setup.sh
@@ -2,18 +2,30 @@
# ========================================================================================================================
# Author:
# Mohamed Hussein Al-Adawy
# Version: 1.1.0
# Description:
# This script takes a screenshot of a selected area, extracts the text from the image, and copies it to the clipboard.
# The script uses grimblast for Wayland and scrot for X11 to take screenshots.
# The script uses tesseract to extract text from the image.
# The script uses wl-copy and cliphist for Wayland and xclip for X11 to copy the extracted text to the clipboard.
# The script uses a python script to extract text from the image.
# The script requires the following packages to be installed:
# - python
# - tesseract
# - grimblast or scrot
# - wl-clipboard or xclip
# - cliphist
# This setup script installs and configures OCR4Linux and its dependencies.
# It handles the installation of:
# 1. System requirements (tesseract, python packages)
# 2. Session-specific tools:
# - Wayland: grimblast, wl-clipboard, cliphist, rofi-wayland
# - X11: xclip, scrot, rofi
#
# Features:
# - Automatic detection and installation of AUR helper (yay)
# - Session-aware installation (Wayland/X11)
# - Configures necessary Python dependencies
# - Sets up required OCR language packs
#
# Requirements:
# - Arch Linux or Arch-based distribution
# - Internet connection for package downloads
# - sudo privileges for package installation
#
# Usage:
# chmod +x setup.sh
# ./setup.sh
# Follow the prompts to install dependencies
# ========================================================================================================================

# Define the required packages.
@@ -41,7 +53,7 @@ x11_session_apps=(

# Check if yay is installed.
check_yay() {
if ! command -v yay &> /dev/null; then
if ! command -v yay &>/dev/null; then
read -r -p "yay is not installed. Do you want to install yay? (y/n): " choice
if [ "$choice" = "y" ]; then
sudo pacman -S --needed --noconfirm git base-devel
@@ -76,4 +88,4 @@ main() {
cp -r ./* "$HOME/.config/OCR4Linux"
}

main
main
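
Review note on the session-aware behaviour described in both shell scripts: the collapsed hunks do not show how the Wayland/X11 check is written, so the following is only a minimal sketch of one common approach. It assumes the login manager exports `XDG_SESSION_TYPE`; the function name `detect_session` is hypothetical and does not appear in the commit.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): classify the current session.
# Assumption: $XDG_SESSION_TYPE is exported by the login manager.
detect_session() {
    case "${XDG_SESSION_TYPE:-}" in
        wayland) echo "wayland" ;;
        x11) echo "x11" ;;
        *) echo "unknown" ;; # e.g. a TTY login or an unset variable
    esac
}

session="$(detect_session)"
echo "Detected session: $session"
```

Returning an explicit `unknown` value lets a caller stop with a clear error instead of silently falling back to the X11 tools on an unsupported session.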
