Generate PDF files without embedding default PDF fonts #551

FelixSchwarz · 2017-12-15T09:38:55Z

I'm generating a few hundred PDF files with WeasyPrint and the total size of the PDF files starts to add up. That in turn puts quite a strain on some other legacy systems, so I'm looking for ways to reduce the final PDF size.

I noticed that proprietary PDF tools are able to "optimize" my generated PDFs, which results in much smaller PDF files. Now I'm looking for ways to achieve similar effects with WeasyPrint.

From what I can see in Evince each of my PDFs has some embedded fonts (even though I don't have any special font requirements). My hope was that I could somehow tell WeasyPrint to use Helvetica which is (AFAIK) a PDF default font which doesn't need to be embedded.

Unfortunately so far I was unable to achieve this. WeasyPrint always embeds a font. Is there any way to tell WeasyPrint it should NOT embed Helvetica/ensure only standard PDF fonts are used?

brnosouza · 2020-03-25T18:00:18Z

Do you guys have any update on this issue? I would love to use this feature

liZe · 2020-03-27T15:30:52Z

> Do you guys have any update on this issue? I would love to use this feature

The best way to get this is to get rid of Cairo (see #841), but that’s hard to do.

bl-ue · 2021-01-25T21:50:04Z

Any progress since March?

FelixSchwarz · 2021-01-25T22:05:06Z

@bl-ue I think #1232 and https://github.com/CourtBouillon/pydyf should provide a path forward so this can be implemented.

grewn0uille · 2021-01-25T22:09:58Z

> @bl-ue I think #1232 and https://github.com/CourtBouillon/pydyf should provide a path forward so this can be implemented.

That’s right!
We replaced Cairo with our own PDF generator on the master branch. There is some work to do before we can make a release. You can follow #1232 to track the progress.
After a release without Cairo, we will be able to work on features like this one!

bl-ue · 2021-01-25T22:14:09Z

Wow! Thank you for the fast response!! 🤩

We over at @tldr-pages use WeasyPrint several times a day, and the PDF that contains all of our (4000?) pages is getting a little big...:smile:

I'll be sure to track that issue and introduce the rest of the team to it. Again, thank you guys for your extremely fast responses!!!

liZe · 2021-01-26T22:39:34Z

> I'll be sure to track that issue and introduce the rest of the team to it. Again, thank you guys for your extremely fast responses!!!

You’re welcome!

I’ve read your script, and your problem is not the one described by this issue. Your problem is that fonts are embedded multiple times when pages from multiple WeasyPrint documents are put in a single PDF. Could you please open a new issue with a link to your Python script? Thank you!

The hb_face value is different when the Pango font is different, and it causes problems when multiple Pango contexts are used (for example when pages from multiple documents are mixed). Related to #551.

liZe · 2021-02-01T21:48:57Z

@bl-ue Your problem is fixed (I hope) with the current master branch. If you can test TLDR with the next version of WeasyPrint, I’d be glad to know if it works for you (and of course get your bug reports 😉). If it doesn’t work, please open a new issue!

bl-ue · 2021-02-01T21:51:14Z

Okay @liZe, wonderful! I'm sorry I never got to the issue—I got busy :)

P.S. Your team is really responsive! Good choice to pick WeasyPrint! :D

liZe · 2021-02-01T22:38:23Z

Here’s the result for French pages.

tldr-pages-pydyf.pdf

PDF size was 3.4MB, it’s now 400kB 🎉.

> P.S. Your team is really responsive! Good choice to pick WeasyPrint! :D

❤️

mailq · 2024-11-04T20:57:08Z

I'd like to point again at this issue, as it is almost seven years old and the preconditions were met four years ago.

I'm not a Python developer, but I would argue that this fix should be only a few lines of code in fonts.py. If it encounters one of the 14 default PDF fonts, it should be handled differently. This request is not compatible with PDF/A creation, where all fonts have to be included. But it should be possible to get it working for "normal", "simple" PDFs, where I need this functionality.

liZe · 2024-11-05T16:17:02Z

> I'm not a Python developer, but I would argue that this fix should be only a few lines of code in fonts.py.

I bet it’s not. 😄

Even the test to find if a font matches the embedded font names is probably not trivial. PDF/A is another detail, but there are many open questions to solve.

For example, what happens when a document uses a character in Helvetica that’s outside the Latin character set defined by the PDF specification? We have to embed the Helvetica font, because these characters are not supported by the Helvetica font provided by PDF readers. It means that for each font, we have to check the list of glyphs used in a document to determine whether the font has to be embedded or not. Of course, the supported encodings depend on the font. And for Latin characters, it’s not one encoding, it’s actually 4 different encodings proposed by the specification.
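To make the per-font check described above concrete, here is a rough Python sketch. It is not WeasyPrint code, and every name in it (`STANDARD_14`, `fits_encoding`, `could_skip_embedding`) is hypothetical. It assumes the font has already been resolved to one of the 14 standard PDF font names, which is itself a non-trivial step, and it uses Python's `cp1252` codec as a rough stand-in for the PDF `WinAnsiEncoding` only. The other Latin encodings in the specification have no exact Python codec, so a real implementation would need its own tables.

```python
# Names of the 14 standard PDF fonts.
STANDARD_14 = frozenset({
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique",
    "Helvetica-BoldOblique",
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Symbol", "ZapfDingbats",
})

# Symbol and ZapfDingbats use their own built-in encodings, so this
# sketch only considers the 12 Latin text fonts.
LATIN_STANDARD = STANDARD_14 - {"Symbol", "ZapfDingbats"}


def fits_encoding(text, codec="cp1252"):
    """Return True if every character of text is encodable with codec.

    Assumption: cp1252 approximates WinAnsiEncoding; it is not identical.
    """
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def could_skip_embedding(font_name, text):
    """Return True if the font might be left out of the PDF for this text."""
    return font_name in LATIN_STANDARD and fits_encoding(text)
```

With this sketch, `could_skip_embedding("Helvetica", "Résumé")` is true, while the same call with Cyrillic text is false and would still require embedding. PDF/A output, which requires all fonts to be embedded, would have to bypass this check entirely.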

And of course, we’ll need quite a lot of tests. 😄

If someone is interested in opening a pull request, it would be wonderful, we could discuss and improve the PR. But pretending that a bug is "only a few lines of code in fonts.py" minimizes the problem a little bit.

liZe added the "feature" label (New feature that should be supported) Jan 4, 2018

liZe changed the title ~~how to use PDF default fonts (= reduced PDF size)?~~ Generate PDF files without embedding default PDF fonts Jan 4, 2018

older-pack mentioned this issue Dec 4, 2020

Update the PDF version? tldr-pages/tldr#4969

Closed

older-pack mentioned this issue Jan 25, 2021

Reduce the PDF file size tldr-pages/tldr#5186

Closed

liZe added a commit that referenced this issue Feb 1, 2021

Use face content hash as key for fonts

05b9c71

It’s probably slow, but at least it’s reliable. We can find a better solution later. Related to #551.
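The idea named in this commit title can be illustrated with a short Python sketch. This is not the code from commit 05b9c71: the `FontRegistry` class and its methods are hypothetical, and SHA-256 is an assumed choice of hash. The point is that keying fonts by a hash of the font file content, rather than by a per-context object identity, makes the same font map to the same key even when pages come from several documents.

```python
import hashlib


class FontRegistry:
    """Hypothetical registry that stores each distinct font file once."""

    def __init__(self):
        self._fonts = {}  # content hash -> font file bytes

    def add(self, font_bytes):
        """Register a font and return its content-based key."""
        # Hashing reads the whole file, so it costs time proportional to
        # the font size, but identical files always get the same key.
        key = hashlib.sha256(font_bytes).hexdigest()
        self._fonts.setdefault(key, font_bytes)
        return key

    def __len__(self):
        return len(self._fonts)
```

Adding the same font bytes twice, for example once per source document, returns the same key and stores the data only once, which is the property needed to avoid embedding a font several times in a merged PDF.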
