Reduce the Size of the PDFs
In this new WeasyPrint version, a lot of work has been done to reduce the size of the generated documents.
Size optimizations have been added thanks to the financial support of Code & Co. It’s been a real pleasure to work with them 😻.
A New pydyf Version
In order to reduce the size of the generated PDFs, a new version of pydyf has been released.
This pydyf version avoids writing useless characters in the PDF (like spaces, for example). This works for all types of PDFs (PDF/UA, PDF/A, regular PDFs) and thus all of your documents will be lighter!
This version also adds support for creating compressed PDFs.
Generated PDFs are now compressed by default, but you can disable that with the
uncompressed-pdf option in WeasyPrint.
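To get a feel for why compression helps, here is a minimal sketch using Python’s standard zlib module, which implements the Deflate algorithm commonly used for PDF streams. The sample content stream is made up for illustration, and this is not pydyf’s actual code.

```python
import zlib

# Hypothetical PDF-like content stream: drawing operators repeat a lot,
# which is why Deflate-style compression tends to work well on them.
stream = b"".join(
    b"BT /F1 12 Tf 72 %d Td (Hello WeasyPrint) Tj ET\n" % (720 - 14 * line)
    for line in range(40)
)

compressed = zlib.compress(stream)

print(f"uncompressed: {len(stream)} bytes")
print(f"compressed:   {len(compressed)} bytes")

# Decompressing gives back the exact same bytes: the compression is lossless.
assert zlib.decompress(compressed) == stream
```

Real documents mix text, fonts and images, so the gain you get in practice depends on the content.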
New Optimize Options
The uncompressed-pdf option is part of an API change ⚠️.
Until now, you could choose several optimizations through the
-O option. This option is now deprecated and has been
replaced by separate options with clearer names, which are more intuitive and
make it easier for you to customize your optimizations!
These new options include:
The --optimize-images option, formerly -O images, reduces
the size of the images in the documents. To do that, we rely on Pillow.
It’s a lossless optimization for your PDFs.
By default, fonts are optimized to take up less space in the documents. You can
disable that and include the full fonts by using the
Other information related to the fonts can be stored in the PDF, like the hinting.
By default, WeasyPrint doesn’t include this information, but you can now do
it by using the
For now, it’s more like we have new options to increase the size of the PDFs
but not really to reduce it… So let’s talk about
One of the easiest ways to reduce the size of a document is to reduce the size of the images it contains.
For that, we’ve added a new --jpeg-quality option, which lets you
choose the quality of the JPEG images in your PDFs.
The quality is between 0 (worst) and 95 (best). The lower the quality is, the smaller the PDF is. You can try different values and see which one works well for you!
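If you want to try several quality values in a row, a small script can help. The sketch below only builds the command lines; the helper name build_weasyprint_command and the file names are hypothetical, and it assumes the usual "weasyprint [options] input output" call form.

```python
import shlex


def build_weasyprint_command(source, target, jpeg_quality=None,
                             optimize_images=False):
    """Return the argument list for a weasyprint call (hypothetical helper)."""
    # The post gives 0 (worst) to 95 (best) as the valid quality range.
    if jpeg_quality is not None and not 0 <= jpeg_quality <= 95:
        raise ValueError("jpeg_quality must be between 0 and 95")
    command = ["weasyprint"]
    if optimize_images:
        command.append("--optimize-images")
    if jpeg_quality is not None:
        command += ["--jpeg-quality", str(jpeg_quality)]
    command += [source, target]
    return command


# Print one command per quality value, ready to be compared by file size.
for quality in (95, 60, 30):
    command = build_weasyprint_command(
        "report.html", f"report-q{quality}.pdf",
        jpeg_quality=quality, optimize_images=True)
    print(shlex.join(command))
```

With WeasyPrint installed, each list can be passed to subprocess.run(command, check=True), and the resulting files compared to find the value that works for you.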
Another new option has been added for images:
This option allows you to set the maximum resolution for embedded images in the PDFs.
That’s all for the size of the generated PDFs. Now let’s talk about memory use.
Reduce the Memory Used
Images are among the most memory-consuming things in WeasyPrint.
As said before, we use Pillow to deal with images in WeasyPrint. Until now, we kept each image in memory for the whole duration of the document’s generation.
Now, when a Pillow image is created, it’s transformed into a ready-to-store-in-PDF image and then discarded, so WeasyPrint consumes less memory.
On top of that, a related new option has been added:
With this option, you can specify a folder to store the images on the disk instead of storing them in memory.
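The general idea of trading memory for disk space can be sketched in a few lines. The DiskImageStore class below is purely hypothetical and is not how WeasyPrint implements its cache; it only shows that the image bytes live in a folder while small path entries stay in memory.

```python
import tempfile
from pathlib import Path


class DiskImageStore:
    """Hypothetical store keeping image bytes in a folder, not in RAM."""

    def __init__(self, folder):
        self.folder = Path(folder)
        self.folder.mkdir(parents=True, exist_ok=True)
        self._paths = {}  # only small path entries stay in memory

    def put(self, key, data):
        # Reuse the existing file for a known key, else pick a new name.
        path = self._paths.get(key) or self.folder / f"{len(self._paths)}.bin"
        path.write_bytes(data)
        self._paths[key] = path

    def get(self, key):
        # The bytes are read back from disk only when they are needed.
        return self._paths[key].read_bytes()


with tempfile.TemporaryDirectory() as folder:
    store = DiskImageStore(folder)
    store.put("cat.jpg", b"\xff\xd8 fake image data")
    assert store.get("cat.jpg") == b"\xff\xd8 fake image data"
```

The price is extra disk reads and writes, so this kind of storage mostly pays off for documents with many or large images.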
That’s a lot of changes to reduce the size of the generated PDFs. Numbers are always nice, so let’s compare options on different documents 🤓.
For this, we’re going to use the following documents:
- Odyssey: sample from WeasyPerf, with a lot of text
- HTML5: sample from WeasyPerf, with a lot of links
- All Cats Are Beautiful: book sample, with a lot of images
- Report: report sample, a quite common document with text and images
The different comparisons are going to be made on:
- previous WeasyPrint and pydyf versions
- new WeasyPrint and pydyf versions
Results are in bytes.
| | Odyssey | HTML5 | All Cats Are Beautiful | Report |
|---|---|---|---|---|
| WeasyPrint v58.1 (ref.) | 1'157'953 | 1'435'069 | 5'266'989 | 685'147 |
| with the new pydyf | 971'714 (-16%) | 291'305 (-79%) | 5'280'512 (+0.2%) | 667'042 (-2%) |
| | 971'713 (-16%) | 291'306 (-79%) | 4'864'745 (-7%) | 573'802 (-16%) |
| | 971'711 (-16%) | 291'309 (-79%) | 4'241'525 (-19%) | 500'068 (-27%) |
| with | 971'710 (-16%) | 291'309 (-79%) | 1'372'049 (-73%) | 374'106 (-45%) |
The most impressive results are on the HTML5 document, which contains a lot of text and links.
The image optimizations make virtually no difference (a few bytes) for the Odyssey and HTML5 samples, as these documents don’t contain any images.
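The percentages in the table can be recomputed from the byte counts. The sketch below does that for the reference row and the last row; it assumes that the columns follow the order of the document list above, and that the percentages are truncated toward zero rather than rounded, which is what the published figures suggest.

```python
def truncated_change(reference, new):
    """Relative size change in percent, truncated toward zero."""
    return int((new - reference) / reference * 100)


# Sizes in bytes, copied from the table (reference row and last row).
# The mapping of columns to document names is an assumption.
reference = {
    "Odyssey": 1_157_953,
    "HTML5": 1_435_069,
    "All Cats Are Beautiful": 5_266_989,
    "Report": 685_147,
}
last_row = {
    "Odyssey": 971_710,
    "HTML5": 291_309,
    "All Cats Are Beautiful": 1_372_049,
    "Report": 374_106,
}

for name, ref_size in reference.items():
    print(f"{name}: {truncated_change(ref_size, last_row[name])}%")
```

The same function can be applied to your own documents by comparing file sizes before and after upgrading.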
What’s Coming Next?
What’s coming next is a hard question for us. Every time we’ve announced something, we’ve been wrong, so no more assumptions!
What’s coming next is rather a question for you 😝. If you want new features or bug fixes, don’t hesitate to contact us to share your needs. You can also become a sponsor on OpenCollective: it really helps us secure time to work on WeasyPrint and its dependencies.
Have fun with this new WeasyPrint version, we hope that all of these improvements will be useful for you!