
This tool is used to compare PDF files. With the pdf-comparator, you can streamline your PDF document analysis process and ensure the accuracy and consistency of your documents.


VintLin/pdf-comparator


pdf-comparator

English | Chinese | Japanese

📖 Overview

This tool is designed for people who spend a significant amount of time proofreading PDF files. It compares two PDF files and generates comparison results that let you quickly identify pixel and text discrepancies between them.

Sample Comparison Results:

❓ What Can PDF Comparator Do?

1. Image Difference Comparison

The tool produces a comparison result made up of four images, based on the pixel differences between two PDF files. In the top two images, a red overlay marks the areas where pixels differ. Two additional images below make the differences easier to see. If the bottom-left image is pure white or the bottom-right image is pure black, there are no pixel differences between the two PDFs.
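
The pixel comparison described here can be illustrated with a minimal sketch. It assumes each page has already been rasterized to a grid of grayscale values (plain nested lists); the names `pixel_diff` and `has_differences` are hypothetical, and this is not the project's actual implementation.

```python
def pixel_diff(page_a, page_b, threshold=0):
    """Return a boolean mask that is True where the two pages differ.

    page_a and page_b are equally sized grids (lists of rows) of
    grayscale values. threshold is an assumed tolerance, 0 by default.
    """
    same_shape = len(page_a) == len(page_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(page_a, page_b)
    )
    if not same_shape:
        raise ValueError("pages must have the same dimensions")
    return [
        [abs(a - b) > threshold for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(page_a, page_b)
    ]


def has_differences(mask):
    """True if at least one pixel in the mask is flagged as different."""
    return any(any(row) for row in mask)
```

An all-False mask plays the same role as the pure-white or pure-black difference image: it means the two pages match pixel for pixel.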

2. Text Difference Comparison

The tool marks all recognizable text in the PDF with colored masks. Each color has a specific meaning:

  • Green: The word is unchanged.
  • Orange: Both the font size and color of the word have changed.
  • Red: The word was added or modified.
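
As a rough illustration of the color scheme, the sketch below labels each word of the second document using Python's standard `difflib`. The `Word` class and `classify` function are hypothetical, and the sketch treats a change in either font size or color as a style change, which is an assumption; the tool's real matching logic may differ.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Word:
    # Hypothetical record for one extracted word and its style.
    text: str
    size: float
    color: str


def classify(old, new):
    """Return one (text, label) pair for every word in `new`."""
    matcher = SequenceMatcher(
        a=[w.text for w in old], b=[w.text for w in new], autojunk=False
    )
    labels = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Same text: green unless the style changed (assumed rule).
            for a, b in zip(old[i1:i2], new[j1:j2]):
                restyled = a.size != b.size or a.color != b.color
                labels.append((b.text, "orange" if restyled else "green"))
        elif tag in ("replace", "insert"):
            # Added or modified words are marked red.
            labels.extend((b.text, "red") for b in new[j1:j2])
        # "delete": the word is absent from `new`, so nothing is labeled.
    return labels
```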

🖥️ Quick Start

Please follow the steps below:

  1. Clone the GitHub Repository: Clone the repository using the following command:
     git clone https://github.com/VintLin/pdf-comparator.git
  2. Set up Python Environment: Open the "pdf-comparator" project directory and make sure you have Python 3.8 or higher. You can create and activate a virtual environment using the following commands, replacing "venv" with your preferred environment name (on Windows, activate with venv\Scripts\activate instead):
     cd pdf-comparator
     python3 -m venv venv
     source venv/bin/activate
  3. Install Dependencies: Install the required dependencies by running the following command:
     pip3 install -r requirements.txt
  4. Run the Code Directly: Compare PDF files by running the following command:
     python3 -m pdfcomparator "/compare_file_1.pdf" "/compare_file_2.pdf" "/result_folder/"
  5. Build an Executable: If needed, you can also build an executable using cx-Freeze (after a successful build, the executable can be found in "/build/"):
     python3 setup.py build
  6. Run the Executable: Compare PDF files by running the executable with the following command:
     ./pdfcomparator.exe "/compare_file_1.pdf" "/compare_file_2.pdf" "/result_folder/"
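
Before installing dependencies, it can help to confirm that the interpreter meets the Python 3.8 requirement. The following is a small sketch; the function name `python_is_supported` is hypothetical and not part of the project.

```python
import sys

# Minimum version stated in the Quick Start steps.
MIN_VERSION = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("pdf-comparator requires Python 3.8 or higher")
```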

Command Line Argument Usage

This program accepts the following command line arguments:

  • file1 (required): Path to the first PDF file to compare.

  • file2 (required): Path to the second PDF file to compare.

  • output_folder (required): Path to the output folder. Comparison results are saved in this folder.

  • --cache or -c (optional): Path to a cache location. If a cache path is specified, the program uses caching to speed up the comparison. Caching is not enabled by default.
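
For readers who want to see how such an interface is typically declared, here is a minimal `argparse` sketch that mirrors the arguments listed above. It is an illustration only, not the project's actual parser; `build_parser` and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative)."""
    parser = argparse.ArgumentParser(
        prog="pdfcomparator", description="Compare two PDF files."
    )
    parser.add_argument("file1", help="path to the first PDF file")
    parser.add_argument("file2", help="path to the second PDF file")
    parser.add_argument("output_folder", help="folder for comparison results")
    # Caching is off unless a path is given, matching the documented default.
    parser.add_argument(
        "--cache", "-c", default=None, metavar="PATH",
        help="optional cache path used to speed up the comparison",
    )
    return parser
```

With this sketch, `build_parser().parse_args(["a.pdf", "b.pdf", "out/"])` leaves `cache` as `None`, which corresponds to caching being disabled.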

Examples

Here are some usage examples:

# Perform comparison
python3 -m pdfcomparator file1.pdf file2.pdf output_folder/

# Perform comparison and enable caching
python3 -m pdfcomparator file1.pdf file2.pdf output_folder/ --cache /path/to/cache
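
The same commands can be driven from a Python script, for example to compare many file pairs in a loop. The sketch below only uses the command-line interface documented above; `build_command` and `compare` are hypothetical helper names, and creating the output folder beforehand is a precaution rather than a documented requirement.

```python
import subprocess
import sys
from pathlib import Path


def build_command(file1, file2, output_folder, cache=None):
    """Assemble the documented command line as an argument list."""
    cmd = [sys.executable, "-m", "pdfcomparator",
           str(file1), str(file2), str(output_folder)]
    if cache is not None:
        cmd += ["--cache", str(cache)]
    return cmd


def compare(file1, file2, output_folder, cache=None):
    """Run one comparison and return the process exit code."""
    # Precaution (assumption): make sure the output folder exists.
    Path(output_folder).mkdir(parents=True, exist_ok=True)
    cmd = build_command(file1, file2, output_folder, cache)
    return subprocess.run(cmd, check=False).returncode
```

Using `sys.executable` keeps the call inside the currently active virtual environment, so the installed dependencies are found.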


⚖️ License

  • Source Code Licensing: The project's source code is licensed under the MIT License, which permits use, modification, and distribution of the code subject to the conditions stated in that license.
  • Project Open-Source Status: The project is open-source, but this status is primarily intended for non-commercial purposes. We encourage collaboration and community contributions for research and non-commercial applications. Any use of the project's components for commercial purposes requires a separate licensing agreement.


📬 Contact

If you have any questions or feedback, or would like to get in touch, please email us at vintonlin@gmail.com.
