Puppeteer waitUntil Options

Here at cloudlayer.io, we use Puppeteer to render HTML and websites and generate high-fidelity results. No tool other than a web browser can produce the exact results you're looking for, no matter what HTML, JavaScript, or CSS you use.

But we don't use just any web browser; we use Chrome, specifically a headless version of Chrome that we have customized for our own needs to make it more performant, resilient, and scalable.

If you have been using Puppeteer on your own, you will know it comes with many intricacies and pain points. One of those is ensuring all your content is fully loaded before outputting your result as a PDF or an image.

So what is the best way to ensure your content is all there?

Puppeteer has an option called waitUntil that accepts several values. Each value changes how Puppeteer decides that the page has finished loading, and therefore when it completes the rendering of your page and returns the results.

Below are the options available as of this writing:

  • load - consider navigation to be finished when the load event is fired.
  • domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired.
  • networkidle0 - consider navigation finished when there are no more than 0 network connections for at least 500 ms.
  • networkidle2 - consider navigation finished when there are no more than 2 network connections for at least 500 ms.
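
As a minimal sketch of how one of these values is passed, the helper below forwards a waitUntil value to page.goto. PageLike and navigate are hypothetical names introduced here so the snippet stands on its own without Puppeteer installed; it assumes a real Puppeteer page, whose goto method takes a URL and an options object, fits the same shape.

```typescript
// The four waitUntil values listed above.
type WaitUntil = "load" | "domcontentloaded" | "networkidle0" | "networkidle2";

// Hypothetical minimal shape of the one page method this sketch uses.
interface PageLike {
  goto(url: string, options: { waitUntil: WaitUntil; timeout?: number }): Promise<unknown>;
}

// Navigate and resolve once the chosen waitUntil condition is met.
// "load" is used as the default here, matching Puppeteer's own default.
async function navigate(
  page: PageLike,
  url: string,
  waitUntil: WaitUntil = "load",
): Promise<void> {
  await page.goto(url, { waitUntil });
}
```

With Puppeteer installed, this could be called with a page from browser.newPage(), for example navigate(page, "https://example.com", "networkidle0"), before producing the PDF or image.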

So that raises the question: why doesn't it just wait until the page is "finished" and then render the result? After all, when I go to a page, the browser loads it, and it's done, right? Well, no, actually. The progress bar in your browser may have stopped and appear to have finished, but in reality, many websites still hold open connections to the server.

These connections deliver real-time updates, notifications, and similar features, so they are necessary. Not every website uses them, but many do, and because of this the browser has no way of knowing whether the website has actually finished. There is no definitive test, but several indicators can be used to decide when a page is probably finished.

So you have load and domcontentloaded, mentioned above. They are based on fixed browser events, so they behave very consistently. However, if your content loads inconsistently with those events, you should move on to networkidle0 and networkidle2, which use heuristics to determine whether a page is fully loaded. Because they are heuristics, they are imperfect and will not cover every scenario.
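
To make the heuristic concrete, here is a simplified model of the rule behind networkidle0 and networkidle2: the page counts as finished once the number of open connections has stayed at or below a threshold for 500 ms. This is not Puppeteer's or Chrome's actual implementation. The event format, the firstIdleAt function, and the 30-second default timeout used in the model are all assumptions made for illustration.

```typescript
// One change in the number of open connections:
// +1 when a request starts, -1 when it finishes.
interface ConnectionEvent {
  atMs: number; // time since navigation started, in milliseconds
  delta: 1 | -1;
}

// Returns the time at which the page would be considered idle, or null if
// that never happens before timeoutMs. Events must be sorted by atMs.
function firstIdleAt(
  events: ConnectionEvent[],
  maxConnections: number, // 0 models networkidle0, 2 models networkidle2
  idleMs = 500,
  timeoutMs = 30000,
): number | null {
  let open = 0;
  let quietSince: number | null = 0; // start of the current quiet window, if any
  for (const event of events) {
    // The quiet window already lasted long enough before this event arrived.
    if (quietSince !== null && event.atMs - quietSince >= idleMs) break;
    open += event.delta;
    if (open > maxConnections) {
      quietSince = null; // too many connections: the quiet window is broken
    } else if (quietSince === null) {
      quietSince = event.atMs; // back under the threshold: a new window starts
    }
  }
  if (quietSince === null) return null;
  const idleAt = quietSince + idleMs;
  return idleAt <= timeoutMs ? idleAt : null;
}
```

In this model, a page that opens one request at 100 ms and closes it at 300 ms is idle at 800 ms with a threshold of 0. A page that opens a connection and never closes it is never idle with a threshold of 0, but can still become idle with a threshold of 2.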

Note: Some edge cases that these options don't cover, and what you can do about them, are discussed further below.

So when should you use each one?

  • networkidle0 is tailored to SPAs and other applications whose code explicitly closes its connections when finished, for example anything that uses fetch requests.

  • networkidle2 is tailored more towards pages that use streams or long-lived connections, such as polling or background tasks that involve network connections. It's important to note that if the website keeps more than 2 active connections open, the idle condition is never met and this option will time out instead of reporting that the page has completed; in Puppeteer, the navigation then fails with a timeout error.
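
Because a wait on networkidle0 or networkidle2 can run into the navigation timeout, one option is to catch that case and let the caller decide what to do next. The sketch below is one possible approach, not the way our service handles it. It assumes timeout errors carry the name TimeoutError, as Puppeteer's exported TimeoutError class does; NavigablePage and gotoAllowingTimeout are hypothetical names.

```typescript
// Hypothetical minimal shape of the one page method this sketch uses.
interface NavigablePage {
  goto(
    url: string,
    options: { waitUntil: "networkidle0" | "networkidle2"; timeout: number },
  ): Promise<unknown>;
}

// Returns true if the idle condition was met, false if the wait timed out.
// Any other navigation error is rethrown unchanged.
async function gotoAllowingTimeout(
  page: NavigablePage,
  url: string,
  waitUntil: "networkidle0" | "networkidle2",
  timeoutMs: number,
): Promise<boolean> {
  try {
    await page.goto(url, { waitUntil, timeout: timeoutMs });
    return true;
  } catch (error) {
    // Assumption: timeouts are identified by the error name "TimeoutError".
    if (error instanceof Error && error.name === "TimeoutError") {
      return false;
    }
    throw error;
  }
}
```

A caller might choose to render anyway when this returns false, on the assumption that whatever has loaded by then is acceptable, or retry with a less strict value.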

Some edge cases are not fixed by any of these options. The list below is by no means complete, but it covers the most common ones:

  • Lazy-loaded images
  • Lazy-loaded content based on scrolling position
  • Videos or animated content

The fix is relatively simple for lazy-loaded images and lazy-loaded content: scroll the page to the end, then render the result. You will most likely also want to use networkidle0 or networkidle2 in conjunction. One catch is that infinitely scrolling sites could cause a memory exception during excessively large renderings. You will want to build in some safeguards to prevent that from occurring, or you could use our service, where we have done all the hard work for you. All you have to do is pass in the options.

  • autoScroll simulates scrolling the page. It attempts to scroll down the page, which causes the lazy-loaded elements on the page to render themselves. Combining this with networkidle0 or networkidle2 could give you proper results. Each scenario behaves differently, so there is no guarantee.
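
If you are doing this by hand with Puppeteer rather than through our autoScroll option, the scroll-then-render approach could look roughly like the sketch below. It is not how our service implements autoScroll. It relies on page.evaluate accepting a string expression, which Puppeteer supports; ScrollablePage, scrollToEnd, and the default limits are hypothetical. The maxScrolls cap is one simple safeguard against infinitely scrolling sites.

```typescript
// Hypothetical minimal shape of the one page method this sketch uses.
interface ScrollablePage {
  evaluate(expression: string): Promise<unknown>;
}

// Scrolls one viewport at a time until the bottom of the page is reached
// or maxScrolls steps have been taken. Returns the number of steps taken.
async function scrollToEnd(
  page: ScrollablePage,
  maxScrolls = 50,
  pauseMs = 100,
): Promise<number> {
  let steps = 0;
  while (steps < maxScrolls) {
    // Evaluated inside the page: has the viewport reached the bottom?
    const atBottom = await page.evaluate(
      "Math.ceil(window.innerHeight + window.scrollY) >= document.body.scrollHeight",
    );
    if (atBottom === true) break;
    await page.evaluate("window.scrollBy(0, window.innerHeight)");
    steps += 1;
    // Give lazy-loaded content a moment to start loading before checking again.
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
  return steps;
}
```

In practice this would run after page.goto has resolved with networkidle0 or networkidle2, and before the call to page.pdf() or page.screenshot().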

Now that you know how to use these options, you can check out our service with a free account and begin using our Puppeteer-backed API endpoints.
