# PDF to JSON Transcriber User Manual

Welcome to the PDF to JSON Transcriber user manual. This software transcribes text from PDF documents, such as guides and manuals, into JSON. The resulting JSON documents stay true to the exact text of the PDF and are intended for use in training GPT models.

## System Requirements

- Python 3.6 or higher
- PyMuPDF 1.18.19

## Installation

Before running the application, you need to install the required dependencies. You can do this by running the following command in your terminal:

```bash
pip install -r requirements.txt
```

This will install PyMuPDF, which is necessary for reading PDF files.

## Starting the Application

To start the application, navigate to the directory containing the `main.py` file and run:

```bash
python main.py
```

This will open the graphical user interface (GUI) of the PDF to JSON Transcriber.

## Using the Software

### Importing a PDF

  1. Click on the "Import PDF" button.
  2. Navigate to the location of the PDF file you wish to transcribe.
  3. Select the file and click "Open".

The path of the imported PDF will be displayed in the application window.

### Exporting to JSON

  1. Once a PDF is imported, the "Export JSON" button will become active.
  2. Click on the "Export JSON" button.
  3. Choose the desired location to save the JSON file and provide a file name.
  4. Click "Save".

If the export succeeds, a success message appears. If an error occurs during text extraction, an error message is displayed instead.
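For readers curious about what an export involves, the following is a minimal sketch of the import-then-export flow using PyMuPDF. It is not taken from `main.py`: the function names (`extract_pages`, `pages_to_json`, `export_pdf_to_json`) and the JSON layout (a `source` field plus a list of `page`/`text` entries) are assumptions made for illustration, and the application's actual output format may differ. Older PyMuPDF releases may expose `get_text` under the name `getText`.

```python
import json


def extract_pages(pdf_path):
    """Return the plain text of each page in the PDF, in page order."""
    # PyMuPDF is imported under the name "fitz"; the import is local so that
    # pages_to_json below can be used without PyMuPDF installed.
    import fitz

    doc = fitz.open(pdf_path)
    try:
        return [page.get_text() for page in doc]
    finally:
        doc.close()


def pages_to_json(pages, source):
    """Serialize page texts to a JSON string (hypothetical schema)."""
    payload = {
        "source": source,
        "pages": [
            {"page": number, "text": text}
            for number, text in enumerate(pages, start=1)
        ],
    }
    # ensure_ascii=False keeps non-ASCII characters readable in the output
    return json.dumps(payload, ensure_ascii=False, indent=2)


def export_pdf_to_json(pdf_path, json_path):
    """Extract text from pdf_path and write it to json_path as UTF-8."""
    text = pages_to_json(extract_pages(pdf_path), source=pdf_path)
    with open(json_path, "w", encoding="utf-8") as handle:
        handle.write(text)
```

Calling `export_pdf_to_json("guide.pdf", "guide.json")` with a real PDF path would write one entry per page. Any exception raised while opening or reading the PDF could be caught by the caller and turned into the error message described above.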

## Main Functions

- Import PDF: Allows you to select and import a PDF file from your local storage.
- Export JSON: Once a PDF is imported, you can export the transcribed text to a JSON file.
- File Path Display: Shows the path of the currently imported PDF file.
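The three controls listed above could be wired together roughly as follows. This sketch assumes the GUI is built with `tkinter`, which this manual does not confirm, and the names `export_button_state` and `build_gui` are hypothetical. It shows only how the "Export JSON" button can stay disabled until a PDF has been chosen; the export handler is a placeholder.

```python
def export_button_state(pdf_path):
    """Return the Tk widget state for the export button."""
    return "normal" if pdf_path else "disabled"


def build_gui():
    """Build a window with the three controls (assumes tkinter)."""
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.title("PDF to JSON Transcriber")
    path_var = tk.StringVar(value="")

    def on_import():
        chosen = filedialog.askopenfilename(filetypes=[("PDF files", "*.pdf")])
        if chosen:  # an empty result means the dialog was cancelled
            path_var.set(chosen)
        export_btn.config(state=export_button_state(path_var.get()))

    def on_export():
        target = filedialog.asksaveasfilename(
            defaultextension=".json", filetypes=[("JSON files", "*.json")]
        )
        if target:
            # Placeholder: a real handler would run the extraction here
            print("would export", path_var.get(), "to", target)

    tk.Button(root, text="Import PDF", command=on_import).pack()
    export_btn = tk.Button(
        root,
        text="Export JSON",
        command=on_export,
        state=export_button_state(path_var.get()),
    )
    export_btn.pack()
    tk.Label(root, textvariable=path_var).pack()  # file path display
    return root
```

Running `build_gui().mainloop()` on a machine with a display would open the window. Keeping the enable/disable rule in a small pure function makes it easy to check without starting the GUI.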

## Troubleshooting

If you encounter any issues with the software, please ensure that you have the correct version of Python installed and that all dependencies from the `requirements.txt` file have been installed properly.
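Both checks can be automated with a short script. The sketch below is not part of the application; `find_setup_problems` is a hypothetical helper that compares the running interpreter against the minimum version from the System Requirements section and looks for the `fitz` module, which is the name PyMuPDF is imported under. It does not verify the exact PyMuPDF version.

```python
import importlib.util
import sys


def find_setup_problems(min_python=(3, 6), module="fitz"):
    """Return a list of readable problems; an empty list means both checks passed."""
    problems = []
    if sys.version_info < min_python:
        needed = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(
            "Python " + needed + " or higher is required, found " + found
        )
    # find_spec returns None when the top-level module cannot be located
    if importlib.util.find_spec(module) is None:
        problems.append(
            "PyMuPDF is not installed; run: pip install -r requirements.txt"
        )
    return problems
```

Printing each entry of `find_setup_problems()` before launching `main.py` would show which requirement, if any, is missing.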
