Enable image compression #1546
Comments
@pubpub-zz Running the following code compresses all content streams with DEFLATE, including the images, right?

from pypdf import PdfReader, PdfWriter

reader = PdfReader("example.pdf")
writer = PdfWriter()

for page in reader.pages:
    page.compress_content_streams()  # This is CPU intensive!
    writer.add_page(page)

with open("out.pdf", "wb") as f:
    writer.write(f)
There are also quite a lot of image compression algorithms.
Yes, it does, but as mentioned in the docs, it does not work on all PDF files (at least it does not increase the size as GS does).
The pypdf column uses the method you quoted.
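One likely reason DEFLATE does not help on every file: it is lossless, so it only pays off when the data still has redundancy. The sketch below is a rough illustration using only the standard library, not pypdf itself. The repeated text operators stand in for a page content stream, and the seeded pseudo-random bytes are an assumed stand-in for image data that is already compressed (for example JPEG); `deflate_ratio` is a hypothetical helper name.

```python
import random
import zlib


def deflate_ratio(data: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)


# Text-like payload resembling a page content stream: repetitive operators.
content_stream = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET\n" * 500

# Assumption: seeded pseudo-random bytes stand in for already-compressed
# image data, which has very little redundancy left for DEFLATE to remove.
rng = random.Random(0)
image_like = bytes(rng.getrandbits(8) for _ in range(len(content_stream)))

text_ratio = deflate_ratio(content_stream)
image_ratio = deflate_ratio(image_like)

print(f"content stream ratio:  {text_ratio:.3f}")
print(f"image-like data ratio: {image_ratio:.3f}")
```

If most of a file's bytes sit in already-compressed images, shrinking it further generally means re-encoding those images lossily (lower quality or resolution), which is what the rest of this thread is about.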
Having a quick look at the code, it seems that only the page content is deflated, but not the image XObjects, which is where the big images should be. @finevine,
Yeah, but the idea could be to implement a setter for images and let people change the images and act on each one, as pikepdf offers, but without too many options (pikepdf is too complicated, I think).
That was the idea of my original feature request title.
I'm willing to help but don't know how to...
It's only an idea. Having only a getter on images is frustrating!
Some notes/ideas about image setting:

from io import BytesIO

from PIL import Image
from pypdf import PdfReader, PdfWriter
from pypdf.generic import NameObject, NullObject

w = PdfWriter()
w.append("resources/labeled-edges-center-image.pdf")

for p in w.pages:
    for image_file_object in p.images:
        print(image_file_object.name)

        # Re-encode the image with Pillow as a one-page PDF held in memory.
        ii = Image.open(BytesIO(image_file_object.data))
        b = BytesIO()
        ii.save(b, "pdf", quality=60, resolution=19.0, optimize=True)
        rrr = PdfReader(b)

        # XObject key: the image file name without its extension.
        n = NameObject("/" + "".join(image_file_object.name.split(".")[:-1]))
        ind = p["/Resources"]["/XObject"].raw_get(n)

        # To clean up the file. _objects is a 0-based list, idnum starts at 1.
        w._objects[ind.idnum - 1] = NullObject()

        # Point the page resource at the re-encoded image.
        p["/Resources"]["/XObject"][n] = (
            rrr.pages[0]["/Resources"]["/XObject"]["/image"].clone(w).indirect_reference
        )

w.write("tt.pdf")

Edit: code updated.
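The snippet above derives the XObject key from the extracted image name with `"".join(name.split(".")[:-1])`. That also removes any dots inside the name and yields a bare `/` when the name has no extension. A minimal sketch of that one step in isolation, assuming image names have the form `<key>.<extension>`; the helper name `xobject_key_from_image_name` is hypothetical and not part of pypdf:

```python
def xobject_key_from_image_name(image_name: str) -> str:
    """Map an extracted image name such as "image.png" to the key "/image"."""
    # rpartition splits on the LAST dot only, so inner dots are preserved.
    stem, dot, _extension = image_name.rpartition(".")
    # Without any dot, treat the whole name as the key.
    return "/" + (stem if dot else image_name)


for name in ("image.png", "Im0.a.jpg", "noext"):
    print(name, "->", xobject_key_from_image_name(name))
```

The result can be wrapped in `NameObject(...)` exactly as in the snippet above.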
Having the capability to replace images trivially extends to reducing a PDF's file size by shrinking the images it contains. Closes #1546
Explanation
I want to replace images in a PDF with compressed ones.
Getting the images and saving them to disk works like a charm with the example in the docs.
But I cannot change them in the PDF.
Code Example
How would your feature be used?
I have found a bunch of code that aims to implement this feature:
but it doesn't work as expected.