Processing weasyprint generated PDF with ghostscript raises warnings #615

Closed
hjpotter92 opened this issue Apr 12, 2018 · 2 comments · Fixed by #665
Labels
bug Existing features not working as expected

hjpotter92 commented Apr 12, 2018

This is an awe inspiring tool. Really!

I processed around 200 pages using WeasyPrint pretty easily. However, the user wanted the generated PDFs bound into a single file, so I ran the following Ghostscript command:

gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=../january.pdf *.pdf

I should mention that all the files matched by *.pdf open fine in every PDF reader I have. However, after running the command above, the output was as follows:

GPL Ghostscript 9.20 (2016-09-26)
Copyright (C) 2016 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
Processing pages 1 through 2.
Page 1
Page 2
Processing pages 1 through 2.
Page 1
Page 2
Processing pages 1 through 2.
Page 1
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.
Page 2
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by: 
   **** >>>> WeasyPrint 0.42.3 (http://weasyprint.org/) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

   **** The rendered output from this file may be incorrect.
Processing pages 1 through 3.
Page 1
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.
Page 2
Page 3
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by: 
   **** >>>> WeasyPrint 0.42.3 (http://weasyprint.org/) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

   **** The rendered output from this file may be incorrect.
Processing pages 1 through 2.
Page 1
Page 2
Processing pages 1 through 2.
Page 1
Page 2
Processing pages 1 through 2.
Page 1
Page 2
Processing pages 1 through 2.
Page 1
Page 2
Processing pages 1 through 2.
Page 1
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.
Page 2

   **** This file had errors that were repaired or ignored.
   **** The file was produced by: 
   **** >>>> WeasyPrint 0.42.3 (http://weasyprint.org/) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

   **** The rendered output from this file may be incorrect.
Processing pages 1 through 2.
Page 1
Page 2
Processing pages 1 through 2.
Page 1
Page 2
Processing pages 1 through 1.
Page 1
Processing pages 1 through 1.
Page 1
Processing pages 1 through 1.
Page 1
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by: 
   **** >>>> WeasyPrint 0.42.3 (http://weasyprint.org/) <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

   **** The rendered output from this file may be incorrect.

Opening the combined january.pdf, I see that it now contains a lot of blank pages.

I have attached one of the files produced by WeasyPrint. Running the command above on just this file shows that, while the PDF opens without errors on its own, the Ghostscript output contains blank pages.

This might be caused by a very large image in the file itself.

Any help to resolve this would be appreciated. Thanks

weasyprint-article.pdf
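
To narrow down which input files trigger the errors in a log like the one above, the output can be split at each "Processing pages" line. The following is only a rough sketch: the function name `files_with_errors` and the file names `a.pdf` and `b.pdf` are hypothetical, and it assumes that Ghostscript prints exactly one "Processing pages" line per input file, in the same order as the files were passed on the command line.

```python
import re

# Assumption: every input file starts a new "Processing pages A through B." section.
SECTION_RE = re.compile(r"^Processing pages \d+ through \d+\.\s*$")
# Matches lines such as "   **** Error reading a content stream. ..."
ERROR_RE = re.compile(r"^\s*\*{4} Error")


def files_with_errors(log_text, filenames):
    """Return the file names whose log section contains a '**** Error' line."""
    flagged = []
    index = -1  # index of the file whose section is currently being read
    for line in log_text.splitlines():
        if SECTION_RE.match(line):
            index += 1
        elif ERROR_RE.match(line) and 0 <= index < len(filenames):
            name = filenames[index]
            if name not in flagged:
                flagged.append(name)
    return flagged


# Shortened excerpt in the same format as the log above; file names are made up.
sample_log = """Processing pages 1 through 1.
Page 1
Processing pages 1 through 2.
Page 1
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.
Page 2
"""

print(files_with_errors(sample_log, ["a.pdf", "b.pdf"]))
```

The file list passed in has to match the order in which the shell expanded `*.pdf`, which this sketch does not check.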

liZe (Member) commented Apr 18, 2018

> This is an awe inspiring tool. Really!

😄

> However, after executing the above script, the output was as follows: […]

This problem is well known and discussed in #565, #596, #550 and #523. TL;DR: for some unknown reason, using the pdfrw library (used by WeasyPrint 0.41+) with Cairo 1.14.x generates PDF files that are broken for some PDF readers (at least Ghostscript).

There's no perfect solution for now. I'd love to find some PDF guru to help us find what's wrong in the PDF file or in Ghostscript, so that we can fix WeasyPrint, Cairo, pdfrw or Ghostscript. Until then, the only choices we have are to use WeasyPrint 0.40 or to use Cairo 1.15.4+.
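
As a rough way to check whether an installation falls in the affected Cairo range, the version can be compared against 1.15.4. This is a minimal sketch, not part of WeasyPrint: the helper names are hypothetical, and it assumes cairo's usual integer version encoding (major * 10000 + minor * 100 + micro), which is what `cairocffi.cairo_version()` returns.

```python
# First Cairo version reported above as producing files Ghostscript accepts.
CAIRO_FIXED = (1, 15, 4)


def decode_cairo_version(encoded):
    """Decode cairo's integer version, e.g. 11412 -> (1, 14, 12)."""
    return (encoded // 10000, encoded // 100 % 100, encoded % 100)


def cairo_probably_affected(encoded):
    """True if the encoded version is older than 1.15.4 (assumed threshold)."""
    return decode_cairo_version(encoded) < CAIRO_FIXED


def installed_cairo_version():
    """Return the encoded version of the Cairo library found by cairocffi."""
    import cairocffi  # imported lazily so the helpers above work without it

    return cairocffi.cairo_version()


print(cairo_probably_affected(11412))  # a Cairo 1.14.x value
print(cairo_probably_affected(11504))  # Cairo 1.15.4
```

Calling `cairo_probably_affected(installed_cairo_version())` only says that the Cairo version is in the range discussed here; whether a given PDF is actually rejected still depends on WeasyPrint and on the reader.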

liZe added the bug (Existing features not working as expected) label Apr 18, 2018
liZe added this to the 43 milestone Apr 18, 2018
liZe (Member) commented Apr 21, 2018

I get this error with Ghostscript 9.20 but not with 9.21. 🎉 Could you try to update your version of Ghostscript?
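
To find out which Ghostscript version is installed, `gs --version` prints just the version number. The sketch below wraps that call; the helper names are hypothetical, and treating 9.21 as the threshold is only an assumption based on the single observation above.

```python
import subprocess


def parse_gs_version(text):
    """Turn a version string such as '9.20' into a comparable tuple (9, 20)."""
    return tuple(int(part) for part in text.strip().split("."))


def ghostscript_version(executable="gs"):
    """Run `gs --version` and return the parsed version tuple."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gs_version(result.stdout)


# Assumed threshold: the error was seen with 9.20 but not with 9.21.
print(parse_gs_version("9.20") < (9, 21))
```

On a machine with Ghostscript installed, `ghostscript_version() < (9, 21)` would indicate a version at or below the one that showed the error here.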

liZe added a commit that referenced this issue Aug 6, 2018
pdfrw is a great piece of software, but we don't know PDF well enough to debug the
problems we've met. It's safer to use the new cairo API and get back to manual
editing for attachments and bleed boxes.

We only have two regressions for now:
- some internal links are broken,
- PDF producer is not overwritten.

A mail has been sent to cairo's mailing-list about that:
https://lists.cairographics.org/archives/cairo/2018-August/028694.html

Fix #639, #615, fix #596, fix #565.
liZe closed this as completed in #665 Aug 17, 2018