Connecting Puppeteer
Puppeteer can connect to Browserless through Chrome's DevTools Protocol (CDP) using websockets. This is the primary and recommended way to connect to Browserless, as it provides a stable and reliable connection.
JavaScript
import puppeteer from "puppeteer-core";

const TOKEN = "YOUR_API_TOKEN_HERE";

// Launching Chrome locally (puppeteer-core needs an executablePath or channel for this)
// const browser = await puppeteer.launch();

// Connecting to Browserless
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io?token=${TOKEN}`,
});

Python
import asyncio
from pyppeteer import connect, launch

TOKEN = "YOUR_API_TOKEN_HERE"

async def main():
    # Launching Chrome locally
    # browser = await launch()

    # Connecting to Browserless
    browser = await connect(
        browserWSEndpoint=f"wss://production-sfo.browserless.io?token={TOKEN}"
    )

asyncio.run(main())
Basic Usage
In order to use the Browserless service, simply change the following:
Before Browserless
import puppeteer from "puppeteer";
const browser = await puppeteer.launch();
const page = await browser.newPage();
// ...
After Browserless
import puppeteer from "puppeteer-core";
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE`,
});
const page = await browser.newPage();
// ...
Get Started
Sign up for an API key
To use Browserless, you'll need an API key. To get one:
- Sign up for, or log in to, your Browserless account
- Navigate to your dashboard
- Copy your API key from the dashboard
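Once you have a key, it is usually safer to keep it out of source code. The sketch below reads it from an environment variable. The variable name BROWSERLESS_TOKEN and the helper getBrowserlessToken are assumptions made for illustration, not something Browserless prescribes.

```javascript
// Hypothetical helper: reads the API token from an environment variable.
// The variable name BROWSERLESS_TOKEN is an assumption made for this example.
function getBrowserlessToken(env = process.env) {
  const token = (env.BROWSERLESS_TOKEN || "").trim();
  if (!token) {
    // Fail early with a clear message instead of connecting with "undefined".
    throw new Error("BROWSERLESS_TOKEN is not set");
  }
  return token;
}
```

The returned value can then be interpolated into the connection URL in place of a hard-coded token.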
Add Puppeteer to your project
When using Browserless, you should use puppeteer-core instead of puppeteer. This is because:
- puppeteer-core doesn't download Chromium binaries, which aren't needed when connecting to a remote browser
- It's lighter and faster to install
- It's more suitable for production environments
Install it using:
npm install puppeteer-core
# or
yarn add puppeteer-core
Code Snippet
Here's a sample snippet that demonstrates how to use Puppeteer with Browserless:
import puppeteer from "puppeteer-core";

async function main() {
  const url = "https://www.example.com";
  const token = "YOUR_API_TOKEN_HERE";

  // Connect to the remote browser over WebSocket
  const browser = await puppeteer.connect({
    browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${token}`,
  });

  const page = await browser.newPage();
  await page.goto(url);

  const title = await page.title();
  console.log(`The page's title is: ${title}`);

  await browser.close();
}

main().catch((error) => {
  console.error("Unhandled error in main function:", error);
});
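The snippet above closes the browser only when every step succeeds. One possible way to make sure the remote browser is closed on failures too is a small wrapper built on try/finally. This is a sketch, and the helper name withBrowser and its connect parameter are hypothetical.

```javascript
// Sketch: run a task against a connected browser and always close it.
// `connect` is any function returning a Promise for a browser-like object,
// e.g. () => puppeteer.connect({ browserWSEndpoint }).
async function withBrowser(connect, task) {
  const browser = await connect();
  try {
    return await task(browser);
  } finally {
    // Runs on success and on failure, so the browser is not left open.
    await browser.close();
  }
}

// Usage with Puppeteer (commented out so the sketch has no dependencies):
// const title = await withBrowser(
//   () => puppeteer.connect({ browserWSEndpoint }),
//   async (browser) => {
//     const page = await browser.newPage();
//     await page.goto("https://www.example.com");
//     return page.title();
//   }
// );
```

Passing the connect step in as a function keeps the wrapper independent of how the endpoint URL is built.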
If your Puppeteer scripts are getting blocked by bot detectors, you can use BrowserQL to generate a browser instance with advanced stealth features, which you can then connect to with the reconnect method.
Reduce awaits as much as possible
Most of Puppeteer is async, meaning any command with await in front of it (or a .then) makes a round trip from your application to Browserless and back. You can only do so much to limit this, but it is worth minimizing wherever you can. For instance, prefer page.evaluate over page.$(selector), since you can accomplish a lot in one evaluate call versus multiple page.$ calls.
DON'T DO
const $button = await page.$(".buy-now");
const buttonText = await $button.getProperty("innerText");
const clicked = await $button.click();
DO
const buttonText = await page.evaluate(() => {
  const $button = document.querySelector(".buy-now");
  $button.click();
  return $button.innerText;
});
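The DO pattern above can be wrapped in a reusable helper. The following is a sketch rather than a prescribed API: the function name clickAndReadText is hypothetical, and the selector is passed as an argument to page.evaluate because the callback runs in the browser and cannot see Node variables.

```javascript
// Sketch: read the button text and click it in a single round trip.
// `page` is expected to be a Puppeteer Page; the selector is passed as an
// argument because the callback runs in the browser, not in Node.
async function clickAndReadText(page, selector) {
  return page.evaluate((sel) => {
    const button = document.querySelector(sel);
    if (!button) {
      return null; // Nothing matched: report it instead of throwing in-page.
    }
    const text = button.innerText;
    button.click();
    return text;
  }, selector);
}

// Usage:
// const buttonText = await clickAndReadText(page, ".buy-now");
```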
Using Proxies
When using Puppeteer with Browserless, you can set up a proxy by adding the proxy parameter to your connection URL. You can also choose the country your proxy IP address is located in with the proxyCountry parameter.
import puppeteer from "puppeteer-core";
const browser = await puppeteer.connect({
browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${TOKEN}&proxy=residential&proxyCountry=us`,
});
const page = await browser.newPage();
await page.goto("https://icanhazip.com/");
console.log(await page.content());
await browser.close();
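If the connection URL is assembled in several places, a small builder can keep the query string consistent. This is a minimal sketch: the host and the parameter names (token, proxy, proxyCountry) are taken from the example above, while the helper buildEndpoint itself is hypothetical.

```javascript
// Sketch: build the connection URL with optional proxy parameters.
// The host and the parameter names (token, proxy, proxyCountry) come from the
// example above; the helper itself is hypothetical.
function buildEndpoint({ token, proxy, proxyCountry, host = "production-sfo.browserless.io" }) {
  // URLSearchParams percent-encodes values, so special characters are safe.
  const params = new URLSearchParams({ token });
  if (proxy) params.set("proxy", proxy);
  if (proxyCountry) params.set("proxyCountry", proxyCountry);
  return `wss://${host}/?${params.toString()}`;
}

// Usage:
// const browserWSEndpoint = buildEndpoint({
//   token: TOKEN,
//   proxy: "residential",
//   proxyCountry: "us",
// });
```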
Using one or both of these flags will cost 6 units per MB of traffic, so consider rejecting media to save on MB consumed.
// Place this before navigating to the site.
await page.setRequestInterception(true);
page.on("request", (request) => {
  // Block certain resources
  if (request.url().endsWith(".jpg") || request.url().endsWith(".png")) {
    request.abort();
  } else {
    request.continue();
  }
});
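The handler above matches on file extensions, which can miss image URLs that carry a query string or use another format. An alternative sketch filters on Puppeteer's request.resourceType() instead. The set of blocked types below is an illustrative choice, not a Browserless recommendation.

```javascript
// Sketch: decide by resource type instead of by file extension.
// request.resourceType() is part of Puppeteer's HTTPRequest API; the set of
// types blocked here is an illustrative choice.
const BLOCKED_TYPES = new Set(["image", "media", "font"]);

function handleRequest(request) {
  if (BLOCKED_TYPES.has(request.resourceType())) {
    return request.abort();
  }
  return request.continue();
}

// Usage (before navigating):
// await page.setRequestInterception(true);
// page.on("request", handleRequest);
```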
For more detailed information about using built-in or third-party proxies with Puppeteer, see our Proxies documentation.