Refactor ImageWriter and add method for exporting an image from bytes. #737

pietermarsman · 2022-03-21T21:53:05Z

Pull request

Fixes #434. All images from the PDF attached to that issue are now properly extracted.

Refactor ImageWriter and add a method for exporting an image from raw bytes, e.g. when FlateDecode just results in a list of RGB bytes.
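To illustrate the FlateDecode case, here is a minimal standard-library sketch, not the code from this PR. It assumes the stream decompresses to plain 8-bit RGB samples with no predictor, and it wraps them in a binary PPM container only because that format needs no third-party library. The helper name `rgb_bytes_to_ppm` is hypothetical.

```python
import zlib


def rgb_bytes_to_ppm(raw: bytes, width: int, height: int) -> bytes:
    """Wrap raw 8-bit RGB samples in a binary PPM (P6) container.

    Hypothetical helper for illustration; assumes 3 bytes per pixel.
    """
    expected = width * height * 3
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + raw


# A 2x1 image (one red pixel, one blue pixel), zlib-compressed
# the way a FlateDecode stream would hold it.
pixels = bytes([255, 0, 0, 0, 0, 255])
stream = zlib.compress(pixels)

# Decompressing yields only the bare samples; the width, height and
# colour space have to come from the image dictionary in the PDF.
decoded = zlib.decompress(stream)
ppm = rgb_bytes_to_ppm(decoded, width=2, height=1)
```

The point of the sketch is that the decoded bytes carry no image header of their own, so the writer has to build the output file from the bytes plus the dimensions.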

Also removed the image size from the output file name. I'm not sure what its purpose was. This will break code that expects a specific image name, but the proper way to get the name is to call ImageWriter yourself and use the name it returns. So I'm OK with changing this.
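A rough sketch of that usage pattern, assuming `export_image` returns the file name it wrote and that each image object has a `name` attribute. The function `export_and_record` and the two stand-in classes are hypothetical and exist only so the example runs without a PDF.

```python
from typing import Any, Dict, Iterable


def export_and_record(writer: Any, images: Iterable[Any]) -> Dict[str, str]:
    """Export each image and map its name to the file name the writer chose."""
    written: Dict[str, str] = {}
    for image in images:
        # Use the returned name instead of rebuilding it from width/height.
        written[image.name] = writer.export_image(image)
    return written


class StandInWriter:
    """Stand-in for an ImageWriter-like object (assumption, not the real class)."""

    def export_image(self, image: Any) -> str:
        return f"{image.name}.bmp"


class StandInImage:
    """Stand-in for a layout image object with a `name` attribute."""

    def __init__(self, name: str) -> None:
        self.name = name


names = export_and_record(StandInWriter(), [StandInImage("Im0"), StandInImage("Im1")])
```

Code written this way keeps working when the naming scheme changes, because it never guesses the file name.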

How Has This Been Tested?

Tested with the PDF from the issue. The existing test cases still succeed. All existing code is refactored, but its behavior should not have changed.

Checklist

I have formatted my code with black.
I have added tests that prove my fix is effective or that my feature works
I have added docstrings to newly created methods and classes
I have optimized the code at least one time after creating the initial version
I have updated the README.md or verified that this is not necessary
I have updated the readthedocs documentation or verified that this is not necessary
I have added a concise human-readable description of the change to CHANGELOG.md

* commit '1bf3c42b59125f4491d863e1c11dca7ebbe96adc':
  Use charset-normalizer instead of chardet (pdfminer#744)
  Refactor ImageWriter and add method for exporting an image from bytes. (pdfminer#737)
  Log warning and continue gracefully if errors in cmap (pdfminer#731)
  Fix log.debug statement in lzw.py by ensuring that self.table is always set (pdfminer#732)
  Raise KeyError when name in name2unicode is not of type str (pdfminer#733)
  Convert fontname to str if it is bytes in HTMLConverter (pdfminer#734)
  Fix github actions tag regex
  Fix github actions tag regex
  Bump version
  Add github action for releasing to pypi if git tag is added. (pdfminer#727)

pietermarsman force-pushed the 434-extract-flate-decode branch from 6987004 to 526f20b Compare March 21, 2022 21:53

pietermarsman added 2 commits March 21, 2022 22:54

Refactor ImageWriter and add method for exporting an image from bytes.

cc29480

E.g. when FlateDecode just results in a list of RGB bytes.

Added docstrings

b71477c

pietermarsman force-pushed the 434-extract-flate-decode branch from 526f20b to b71477c Compare March 21, 2022 21:54

Add CHANGELOG.md

d7f3187

pietermarsman requested a review from jstockwin March 21, 2022 21:55

pietermarsman added 2 commits March 21, 2022 22:59

Run black

c3f1719

Run black

6fc4dbf

jstockwin approved these changes Mar 22, 2022

pietermarsman merged commit 617e4c8 into master Mar 22, 2022

pietermarsman deleted the 434-extract-flate-decode branch March 22, 2022 19:58

pietermarsman mentioned this pull request Aug 22, 2022

ValueError: unrecognized image mode #795

Closed
