WeasyPrint v66 Has Been Released
WeasyPrint v66 has been released, and thanks to the wonderful NLnet Foundation, we now have solid PDF/UA generation… among other good news!
Solid PDF/UA Generation
Accessibility, Both for Legal Reasons and Real People
Accessibility in technology is a large topic. Even with good will, it’s really hard to get perfect results, because specifications and validation tools are sometimes really complex. 🤯
Moreover, getting documents that are accessible for computers is not the same as getting documents that are accessible for people. Blindly following the rules is not enough: we have to understand them deeply, so that the content can actually be understood using a screen reader, for example.
Our first PDF/UA implementation was built with validation in mind. We knew that it wasn’t enough, but it was a very useful first step to learn more about this topic. Of course, some of the choices made at that time were not suited to clean and solid support, but at least we got a broad overview of the topic and an implementation that could meet some of the legal requirements.
NLnet gave us an amazing opportunity to go further: have the time to focus, learn from our mistakes, and provide better accessibility results for humans.
Technical Challenges
The cornerstone of PDF accessibility is tagging: we must store whether a text is a title, a paragraph, a table column header… or just a list bullet. For that, we have a reliable source of information, as the HTML provides a structure we can reuse in our PDF document.
The same applies to content nesting. HTML tags are nested in a way we want to keep in the PDF. Unfortunately, this nesting hierarchy is not the same as the hierarchy used while drawing the content on the PDF, because drawing relies on stacking contexts, which are often unrelated to the HTML nesting.
That’s why linking the logical and drawing structures together is a challenge. With our first implementation, the document structure was based on the drawing structure, mainly because it was easier technically: we could build the tag tree while drawing the PDF. The problem is that the drawing structure is not the logical structure, and this was leading to obvious errors in the way content was nested. Even if the document was valid according to the specification, its content was not correctly ordered and organized for people.
The only reasonable solution we had was to rewrite the whole tagging system, to build the tree according to the semantic HTML tree, and then link the drawing elements to this tree.
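To make the idea concrete, here is a toy sketch of a tag tree built from the semantic structure, with drawn chunks attached to it afterwards. It is not WeasyPrint’s actual code: the `TagNode` class, the `build_tag_tree` and `find` helpers, and the tiny sample document are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TagNode:
    """One node of the logical (semantic) tree, mirroring an HTML element."""
    tag: str
    children: list["TagNode"] = field(default_factory=list)
    # IDs of the drawn chunks attached to this node (hypothetical numbering).
    content_ids: list[int] = field(default_factory=list)


def build_tag_tree(element):
    """Build the logical tree from a nested (tag, children) description."""
    tag, children = element
    return TagNode(tag, [build_tag_tree(child) for child in children])


def find(node, tag):
    """Return the first node with the given tag, depth-first."""
    if node.tag == tag:
        return node
    for child in node.children:
        found = find(child, tag)
        if found is not None:
            return found
    return None


# Hypothetical document: a heading followed by a paragraph.
html = ("body", [("h1", []), ("p", [])])
tree = build_tag_tree(html)

# Hypothetical drawing order: the paragraph is painted before the heading,
# for example because the heading lives in a later stacking context.
drawing_order = ["p", "h1"]
for content_id, tag in enumerate(drawing_order):
    find(tree, tag).content_ids.append(content_id)

# Reading order follows the HTML tree, not the painting order.
reading_order = [(child.tag, child.content_ids) for child in tree.children]
print(reading_order)  # [('h1', [1]), ('p', [0])]
```

The heading comes first in the reading order even though it was painted last, which is the property the rewritten tagging system is meant to guarantee.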
Separating the Content from the Layout
This approach is harder to code, but it leads to much better results, as we keep the original tag nesting order independently of the way the content is drawn. It considers the HTML as the only source of truth to define the document semantics.
Actually, this question is not new in the world of web pages. Separating the content from the graphical layout is an old dream, and CSS has been a wonderful tool to achieve this goal. We, CourtBouillon, have always followed this mantra when designing documents for clients: create the content first, semantically, without thinking about the way it will be displayed.
This choice may require slightly better CSS skills, but we deeply think that the result is worth the investment: it clearly draws a line between the content structure and the design, and it helps us greatly to create HTML documents that are easy to read and to maintain. And now, thanks to the new implementation of PDF/UA, it makes it possible to generate accessible documents with a single command-line option and no extra work.
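As a rough sketch of what that can look like from a script: this post does not name the option, so the example assumes WeasyPrint’s `--pdf-variant` command-line option with the `pdf/ua-1` value. The `pdfua_command` helper and the file names are hypothetical; check `weasyprint --help` for the options available in your version.

```python
import os
import shutil
import subprocess


def pdfua_command(source, target, variant="pdf/ua-1"):
    """Return the argument list for a WeasyPrint run with a PDF variant.

    Assumption: the installed WeasyPrint accepts --pdf-variant with this value.
    """
    return ["weasyprint", "--pdf-variant", variant, source, target]


# Hypothetical input and output file names.
command = pdfua_command("report.html", "report.pdf")
print(" ".join(command))

# Only run the command when the executable and the input file are available.
if shutil.which("weasyprint") and os.path.exists("report.html"):
    subprocess.run(command, check=True)
```

Building the argument list separately keeps the example safe to run on a machine where WeasyPrint is not installed.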
What’s More?
Flex Layout
Following our work on the flex layout for the previous version, we committed many small fixes to correctly handle margins, basis sizes, blockification, etc. More unit tests have been added, and 9 more tests of the W3C test suite pass!
Interestingly, many small fixes were actually related to other parts of the code, and will also benefit other layouts such as floats or grid. We’re happy to see that our implementation from the last version is reliable and has received a lot of positive feedback from our users! 😄
Better Footnotes
In this new version, we fixed many details related to footnotes:
- footnotes are now postponed to the next page when they would force page breaks because of orphans,
- style is now correctly applied to footnote markers,
- footnote calls now inherit from footnotes, improving interoperability with other renderers,
- bottom margins on footnotes don’t break the layout anymore.
Fine-tuned Page Breaks for Tables
Handling page breaks in long, large, complex tables can be hard. We fixed a couple of bugs that were allowing unwanted page breaks after headers and page groups, giving more possibilities to define page-breaking rules in complex tables.
And More…
We can’t list here all the changes brought to you by version 66, but here are some features and bug fixes 🐛 that could be useful for you:
- support of lh and rlh (relative to line-height) units,
- support of transform-origin in SVG,
- circles drawn instead of rectangles for dotted borders and lines,
- better position of outside markers,
- management of no-break spaces for hyphenation…
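For readers who have not met the lh and rlh units yet, here is a small worked example of how they resolve: 1lh is the computed line height of the element itself, and 1rlh is the computed line height of the root element. The `to_pixels` helper and the pixel values are hypothetical and only illustrate the arithmetic, not WeasyPrint’s internals.

```python
def to_pixels(value, unit, line_height_px, root_line_height_px):
    """Resolve a length in lh or rlh to pixels.

    1lh is the computed line height of the element itself,
    1rlh is the computed line height of the root element.
    """
    if unit == "lh":
        return value * line_height_px
    if unit == "rlh":
        return value * root_line_height_px
    raise ValueError(f"unsupported unit: {unit}")


# Hypothetical values: a 24px line height on the root, 30px on the element.
margin = to_pixels(2, "lh", line_height_px=30, root_line_height_px=24)
gap = to_pixels(1.5, "rlh", line_height_px=30, root_line_height_px=24)
print(margin, gap)  # 60 36.0
```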
What’s Next?
We hope you’ll have fun with this new version and that all these new features will be useful for you!
What’s next? Thanks to NLnet, version 67 will bring CMYK and color profiles support for advanced printing. Color management is a topic we have a lot to learn about, and we’re excited to bring this feature that a lot of users are waiting for. If you’re interested in testing this feature, don’t hesitate to contact us! 💌
You use WeasyPrint in your company? You’d like to get personalized support? It’s time to subscribe to our consulting packages! You can also become a proud WeasyPrint sponsor to give us more time to work on amazing new features 🚀 that will be useful for you in the future… even if you don’t know that yet!
If you’re interested in contributing to WeasyPrint, opening issues and pull requests is a good way to start. If you want to dive into the code, a list of good first issues is waiting for you! Choose your favorite issue and write a short comment: we’ll be happy to help you.