A CLI tool, written in Python, to extract images from PowerPoint, Word, and PDF files. This script extracts all images in your .pptx, .docx, or .pdf file into a local folder. The benefit of using this tool over taking screenshots is that you get each image at the highest resolution stored in the document.
- 1️⃣ Extract images from PowerPoint presentations
- 2️⃣ Extract images from Word (doc/docx) documents
- 3️⃣ Extract images from PDF files
- ⬇️ Extract and download all images within a PowerPoint, Word or PDF
- Supports all image file types (jpg, png, jp2, gif, tiff, ...)
- Supports extracting images from: PowerPoint (.pptx, .ppt), Word (.docx, .doc) and PDF (.pdf)
- High resolution images: Images are not compressed
- Runs locally: Your data stays on your machine
Create a Python virtual environment:
python3 -m venv env
Activate the virtual environment:
source env/bin/activate
Install all dependencies using pip:
pip3 install -r requirements.txt
You need to have 7-Zip installed because, under the hood, unzip is used to unarchive and archive the .pptx files.
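The reason an unarchiver is involved is that .pptx and .docx files are ZIP archives with the embedded media stored inside. The following is a minimal sketch of that idea, not this tool's actual implementation: it swaps in Python's standard `zipfile` module instead of an external unarchiver, and the function name `extract_media` and the suffix list are hypothetical. It assumes the media files sit in a `media` folder inside the archive (for example `ppt/media/` or `word/media/`), and it does not cover .ppt, .doc, or .pdf files.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Hypothetical suffix list for illustration; extend it as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".tif", ".tiff", ".bmp", ".jp2"}


def extract_media(document_path, output_dir):
    """Copy image files found in a 'media' folder of a ZIP-based document.

    Returns the list of files written to output_dir.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(document_path) as archive:
        for info in archive.infolist():
            # Member names inside a ZIP archive always use forward slashes.
            member = PurePosixPath(info.filename)
            if info.is_dir() or "media" not in member.parts:
                continue
            if member.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            # Write the raw bytes unchanged, so no recompression takes place.
            target = output_dir / member.name
            target.write_bytes(archive.read(info))
            written.append(target)
    return written
```

Because the bytes are copied as stored, the images come out exactly as they were embedded in the document.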
Run the script with the path to your document:
python3 image_extractor.py <INPUT_FILE_PATH>
The extracted images are saved to a folder named extracted_images in the same folder as the original document.
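As an illustration of where the output ends up, the sketch below derives the output folder from the input path. The helper name `output_folder_for` is hypothetical and is not taken from `image_extractor.py`; it only assumes the folder name `extracted_images` described above.

```python
from pathlib import Path


def output_folder_for(input_file):
    """Return the extracted_images folder located next to the input document."""
    return Path(input_file).parent / "extracted_images"


if __name__ == "__main__":
    # Example path chosen for illustration only.
    print(output_folder_for("/home/user/slides/deck.pptx"))
```

This can be handy when a wrapper script needs to pick up the extracted files after the tool has run.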
Apache License 2.0: See the LICENSE file.
Written and maintained by SlideSpeak.co