Work with Puppeteer for a while and you will see how powerful it is. Because it uses the Chrome rendering engine, it renders HTML with very high fidelity, which is why we at cloudlayer.io chose Puppeteer from the start. However, it also has significant limitations that can cause headaches when you try to do certain things with Puppeteer instead of a tool like wkhtmltopdf or PrinceXML. That is why we are busy augmenting Puppeteer to add these capabilities to our service.
One of Puppeteer's limitations is creating intricate headers. Since you are working with HTML, you could build the header into your page and use print CSS (@media print rules) to get things looking right, with some success. But you will quickly run into severe drawbacks. One is how and where page breaks occur and how the content is laid out around them. In many cases, your content overlaps the header or the header overlaps your content. Worse, this may happen in only a tiny percentage of your generated PDFs, which makes the issue hard to track down.
So how did we solve this? We created a system that lets you define your header in your own HTML. You wrap the header in a div or any other HTML element, and you can even include an image or logo of your choice. Then, when calling our URL to PDF API endpoint, you pass in parameters for the headerTemplate.
For example, let's say your HTML looks like the following:
For the sake of simplicity, assume this page is hosted at "https://example.com/header".
Note: This path doesn't exist. This is for illustration purposes only.
Here is an example payload you would pass to our URL to PDF endpoint:
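As a rough sketch in TypeScript, a request body along these lines could be built as follows. Only `headerTemplate`, `selector`, and `"method": "extract"` are named in this article; the `url` field name, the `.page-header` class, and the `buildPayload` helper are assumptions made for illustration, so check the API documentation for the exact schema.

```typescript
// Hypothetical request shape. Only `headerTemplate`, `selector`, and
// `method: "extract"` come from the article; `url` is an assumed field name.
interface HeaderTemplate {
  method: "extract"; // the extraction technique
  selector: string;  // any CSS selector that matches the header element
}

interface UrlToPdfPayload {
  url: string; // assumed name for the page to render
  headerTemplate: HeaderTemplate;
}

// Hypothetical helper that assembles the payload for one page.
function buildPayload(pageUrl: string, headerSelector: string): UrlToPdfPayload {
  return {
    url: pageUrl,
    headerTemplate: { method: "extract", selector: headerSelector },
  };
}

// `.page-header` is an illustrative class name, not one defined by the service.
const payload = buildPayload("https://example.com/header", ".page-header");

// JSON text that would be sent as the body of the POST request.
const body: string = JSON.stringify(payload, null, 2);
```

Typing `method` as a string literal keeps the sketch easy to extend into a union if further techniques are added later.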
How it works
You may be wondering how this works. We extract the HTML element that matches the selector you specify. Note: This can be any selector; in our example, we use a class selector. Because Puppeteer's header template does not load external resources, everything must be converted to an embedded or inline format. After the element is extracted from the DOM, it is sent through a custom parsing engine that performs this work. Images, for example, are converted to embedded images, and styles are applied based on the settings specified in the parameters.
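To make the embedding step concrete, here is a minimal TypeScript sketch of one part of it: rewriting image references into base64 data URIs. It is not our parsing engine. The `inlineImages` and `toBase64` helpers, the `Asset` type, the `.page-header` markup, and the placeholder image bytes are all assumptions for illustration. The sketch also assumes the image bytes have already been fetched, and it uses a regular expression where a real implementation would use an HTML parser.

```typescript
// Sketch only: shows one step, rewriting <img src="..."> references
// into embedded base64 data URIs. All names here are hypothetical.

interface Asset {
  mime: string;      // e.g. "image/png"
  bytes: Uint8Array; // raw file contents, fetched elsewhere
}

const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Dependency-free base64 encoder so the sketch runs without any runtime APIs.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const has1 = i + 1 < bytes.length;
    const has2 = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = has1 ? bytes[i + 1] : 0;
    const b2 = has2 ? bytes[i + 2] : 0;
    out += B64_ALPHABET.charAt(b0 >> 2);
    out += B64_ALPHABET.charAt(((b0 & 3) << 4) | (b1 >> 4));
    out += has1 ? B64_ALPHABET.charAt(((b1 & 15) << 2) | (b2 >> 6)) : "=";
    out += has2 ? B64_ALPHABET.charAt(b2 & 63) : "=";
  }
  return out;
}

// Swaps each known image URL for a data URI and leaves unknown ones untouched.
// A regex keeps the sketch short; real markup calls for an HTML parser.
function inlineImages(html: string, assets: Record<string, Asset>): string {
  return html.replace(
    /(<img\b[^>]*?\bsrc=")([^"]+)(")/g,
    (match: string, pre: string, src: string, post: string): string => {
      const asset: Asset | undefined = assets[src];
      if (!asset) return match;
      return pre + "data:" + asset.mime + ";base64," + toBase64(asset.bytes) + post;
    }
  );
}

// Illustrative header markup; the three bytes are placeholders, not a real PNG.
const headerHtml = '<div class="page-header"><img src="logo.png" alt="Logo"></div>';
const inlined = inlineImages(headerHtml, {
  "logo.png": { mime: "image/png", bytes: new Uint8Array([77, 97, 110]) },
});
```

After this step the header fragment no longer depends on any external request, which is the property the header template needs.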
What about the "method": "extract" parameter? It turns out that creating headers that cover everyone's scenario is complicated, so the method parameter selects which technique to use. The "extract" value selects the extraction technique described above. We will be adding further techniques to cover more header and footer templates.
How do you get started?
If you are interested in testing our service, create an account, which lets you generate 100 documents at no cost. If you like our service, you can upgrade to one of our affordable packages. Check out our pricing for more details.