Puppeteer Customizations
This page covers advanced Puppeteer configurations and customizations when working with Browserless.
Migrating from Local Chrome to Browserless
To migrate from local Chrome to Browserless, the main change is switching from puppeteer.launch() to puppeteer.connect() with a WebSocket endpoint. It is recommended to use puppeteer-core instead of puppeteer, as it's more lightweight and doesn't include Chromium binaries:
Before Browserless
import puppeteer from "puppeteer";
const browser = await puppeteer.launch();
const page = await browser.newPage();
// ... your automation code
After Browserless
import puppeteer from "puppeteer-core";
const browser = await puppeteer.connect({
browserWSEndpoint: `wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE`,
});
const page = await browser.newPage();
// ... your automation code
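During a migration it can help to keep one entry point that works both locally and against Browserless. The following is a minimal sketch, not part of the Browserless API: the helper name openBrowser and the BROWSERLESS_TOKEN environment variable are assumptions made for illustration, and the Puppeteer module is passed in as an argument so the helper works with either package.

```javascript
// Hypothetical helper: connects to Browserless when a token is configured,
// otherwise falls back to launching a local browser.
// `puppeteer` is the imported puppeteer or puppeteer-core module.
// Note: puppeteer-core needs an executablePath or channel to launch locally.
async function openBrowser(puppeteer, env = process.env) {
  const token = env.BROWSERLESS_TOKEN; // assumed variable name
  if (token) {
    const endpoint = `wss://production-sfo.browserless.io?token=${encodeURIComponent(token)}`;
    return puppeteer.connect({ browserWSEndpoint: endpoint });
  }
  return puppeteer.launch();
}
```

The rest of the script stays the same in both modes, since connect() and launch() both resolve to a Browser object.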
If bot detectors block your Puppeteer scripts, you can use BrowserQL to generate a browser instance with advanced stealth features, which you can then connect to with the reconnect method.
Performance Optimization: Reduce await Operations
Most Puppeteer operations are async, meaning each await command makes a network round-trip from your application to Browserless and back. These network round-trips add latency and slow down your automation. To minimize this, batch operations together whenever possible.
The key principle: use page.evaluate() to run multiple DOM operations in a single round-trip, rather than making separate calls for each operation. Inside page.evaluate(), code runs directly in the browser context, so multiple DOM queries and manipulations happen without additional network calls.
Examples
DON'T DO (3 network round-trips):
const $button = await page.$(".buy-now");
const buttonText = await $button.getProperty("innerText");
const clicked = await $button.click();
DO (1 network round-trip):
const buttonText = await page.evaluate(() => {
const $button = document.querySelector(".buy-now");
$button.click();
return $button.innerText;
});
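The same idea extends to reading several values at once. The sketch below uses illustrative helper names (collectFields, readFields) and assumes the caller supplies the selectors. It passes them into page.evaluate() as an argument, because the evaluated function is serialized and cannot see variables from the Node.js scope.

```javascript
// Runs inside the browser: collects several values in one pass over the DOM.
function collectFields(selectors) {
  const result = {};
  for (const [name, selector] of Object.entries(selectors)) {
    const el = document.querySelector(selector);
    // Missing elements are reported as null instead of throwing
    result[name] = el ? el.textContent.trim() : null;
  }
  return result;
}

// One await, one round-trip, regardless of how many selectors are read.
async function readFields(page, selectors) {
  return page.evaluate(collectFields, selectors);
}
```

A call such as readFields(page, { label: ".buy-now" }) resolves to an object with one key per selector. Arguments and return values must be serializable, so return strings or plain objects, not DOM nodes.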
Using Proxies
When using Puppeteer with Browserless, you can set up a proxy by adding the proxy parameter to your connection URL. You can also set the proxy's country with the proxyCountry parameter.
Basic Proxy Setup
import puppeteer from "puppeteer-core";
// Connect with residential proxy located in the US
const browser = await puppeteer.connect({
browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${TOKEN}&proxy=residential&proxyCountry=us`,
});
const page = await browser.newPage();
// Visit a site that shows your IP address to verify proxy is working
await page.goto("https://icanhazip.com/");
console.log(await page.content());
await browser.close();
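Building the connection URL by hand gets error-prone as parameters accumulate. As a sketch, the hypothetical helper below assembles the endpoint with the standard URL class, which also takes care of escaping. The function and option names are illustrative; the host and query parameters are the ones used in the example above.

```javascript
// Builds a Browserless WebSocket endpoint with optional proxy parameters.
function buildEndpoint({ token, proxy, proxyCountry }) {
  const url = new URL("wss://production-sfo.browserless.io/");
  url.searchParams.set("token", token);
  // Only add the proxy parameters when they are provided
  if (proxy) url.searchParams.set("proxy", proxy);
  if (proxyCountry) url.searchParams.set("proxyCountry", proxyCountry);
  return url.toString();
}
```

The result can be passed directly as browserWSEndpoint to puppeteer.connect().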
Cost Optimization
Using either or both of the proxy and proxyCountry parameters costs 6 units per MB of traffic, so consider blocking images and other media to reduce the data transferred:
// Enable request interception before navigating to any page
await page.setRequestInterception(true);
page.on('request', (request) => {
// Block image resources to reduce bandwidth usage and costs
if (request.url().endsWith('.jpg') || request.url().endsWith('.png')) {
request.abort();
} else {
// Allow all other requests to proceed
request.continue();
}
});
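Matching on file extensions misses images served from URLs that carry a query string or have no extension. A possible alternative, sketched here with a hypothetical createBlockingHandler helper, filters on request.resourceType() and keeps a simple count so you can see how much is being blocked. The default set of blocked types is an assumption; adjust it to what your pages actually need.

```javascript
// Resource types to block by default (an assumption for this sketch)
const DEFAULT_BLOCKED = new Set(["image", "media"]);

// Returns a request handler plus running totals of blocked/allowed requests.
function createBlockingHandler(blockedTypes = DEFAULT_BLOCKED) {
  const stats = { blocked: 0, allowed: 0 };
  const handler = (request) => {
    if (blockedTypes.has(request.resourceType())) {
      stats.blocked += 1;
      return request.abort();
    }
    stats.allowed += 1;
    return request.continue();
  };
  return { handler, stats };
}
```

To use it, enable interception with page.setRequestInterception(true) and register the handler with page.on('request', handler) before navigating.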
For more detailed information about using built-in or third-party proxies with Puppeteer, see our Proxies documentation.
Session Management
Creating Persistent Sessions
For long-running automation tasks, you can create persistent sessions that maintain state across multiple operations:
import puppeteer from "puppeteer-core";
const browser = await puppeteer.connect({
browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${TOKEN}`,
});
// Create an isolated browser context
// (Puppeteer v22+; earlier versions call this createIncognitoBrowserContext())
const context = await browser.createBrowserContext();
const page = await context.newPage();
// Your automation code here
await page.goto("https://example.com");
// ... perform actions
// Keep the session alive for future use
// await browser.close(); // Don't close if you want to reuse
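A script that skips browser.close() still holds an open WebSocket, so it usually needs to end its own connection explicitly. The sketch below uses a hypothetical release helper to choose between browser.disconnect(), which drops only this connection, and browser.close(), which shuts the browser down. Whether Browserless keeps the remote session available after a disconnect depends on your session settings, which this sketch does not cover.

```javascript
// Hypothetical helper: ends the script's use of a browser.
// keepAlive = true  -> disconnect(): closes only this WebSocket connection
// keepAlive = false -> close(): closes the browser and all of its pages
async function release(browser, { keepAlive = false } = {}) {
  if (keepAlive) {
    await browser.disconnect();
  } else {
    await browser.close();
  }
}
```

Calling release(browser, { keepAlive: true }) in a finally block makes the intent explicit at the end of the script.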
Managing Multiple Pages
const pages = [];
for (let i = 0; i < 3; i++) {
const page = await browser.newPage();
pages.push(page);
}
// Work with multiple pages simultaneously
await Promise.all(pages.map(async (page, index) => {
await page.goto(`https://example${index}.com`);
// ... perform actions
}));
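Promise.all rejects as soon as any one page fails, which discards the results of the others. As a sketch of one alternative, the hypothetical visitAll helper below uses Promise.allSettled so each URL reports its own outcome, and it closes every page it opens.

```javascript
// Visits each URL in its own page and returns one entry per URL, in order.
// A failure on one page does not reject the whole batch, and every page is
// closed whether or not its navigation succeeded.
async function visitAll(browser, urls) {
  const settled = await Promise.allSettled(
    urls.map(async (url) => {
      const page = await browser.newPage();
      try {
        await page.goto(url);
        return await page.title();
      } finally {
        await page.close();
      }
    })
  );
  return settled.map((outcome, i) =>
    outcome.status === "fulfilled"
      ? { url: urls[i], title: outcome.value }
      : { url: urls[i], error: outcome.reason.message }
  );
}
```

All pages still load concurrently; for large batches you may want to process the URLs in smaller chunks to limit memory use.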
Error Handling and Retry Logic
Implement robust error handling for production automation:
async function robustPageOperation(page, operation, maxRetries = 3) {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
return await operation(page);
} catch (error) {
console.log(`Attempt ${attempt} failed:`, error.message);
if (attempt === maxRetries) {
throw error;
}
// Wait before retrying
await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
}
}
}
// Usage example
const result = await robustPageOperation(page, async (page) => {
await page.goto("https://example.com");
return await page.title();
});
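Retrying every error can hide real bugs, such as a wrong selector that will never succeed. One possible refinement, sketched below with a hypothetical retryIf helper, retries only errors that a predicate marks as transient. Treating errors named "TimeoutError" as the only retryable kind is an assumption made for this example.

```javascript
// Default predicate (an assumption): only timeouts are worth retrying.
const isTimeout = (error) => error.name === "TimeoutError";

// Retries `operation` while the error is retryable and attempts remain;
// any other error is rethrown immediately.
async function retryIf(operation, { maxRetries = 3, delayMs = 1000, isRetryable = isTimeout } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= maxRetries || !isRetryable(error)) throw error;
      // Linear backoff: wait longer after each failed attempt
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}
```

For example, retryIf(() => page.goto("https://example.com")) retries a slow navigation but surfaces other failures on the first attempt.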
Advanced Browser Options
Custom Launch Parameters
const browser = await puppeteer.connect({
browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${TOKEN}&--disable-web-security&--disable-features=VizDisplayCompositor`,
});
Setting Viewport and User Agent
const page = await browser.newPage();
await page.setViewport({ width: 1920, height: 1080 });
await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36');
Monitoring and Debugging
Performance Monitoring
// Enable the CDP Performance domain
const client = await page.target().createCDPSession();
await client.send('Performance.enable');
// Read the current metrics (an array of { name, value } entries)
const { metrics } = await client.send('Performance.getMetrics');
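Performance.getMetrics responds with an array of { name, value } pairs, which is awkward to query directly. The sketch below, using hypothetical helper names, converts that array into a plain object and detaches the CDP session when done.

```javascript
// Converts the CDP response shape [{ name, value }, ...] into a plain
// object keyed by metric name.
function metricsToObject(metrics) {
  return Object.fromEntries(metrics.map(({ name, value }) => [name, value]));
}

// Reads the current metrics for a page through a short-lived CDP session.
async function readMetrics(page) {
  const client = await page.target().createCDPSession();
  try {
    await client.send("Performance.enable");
    const { metrics } = await client.send("Performance.getMetrics");
    return metricsToObject(metrics);
  } finally {
    await client.detach();
  }
}
```

Calling readMetrics(page) before and after an action lets you compare individual values, for example the JSHeapUsedSize entry.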
Network Request Monitoring
// Monitor network requests
page.on('request', request => {
console.log('Request:', request.url());
});
page.on('response', response => {
console.log('Response:', response.url(), response.status());
});
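Logging every URL becomes noisy on busy pages. As an illustrative alternative, the hypothetical trackNetwork helper below keeps running totals instead: requests started, requests that failed, and responses grouped by status class.

```javascript
// Attaches listeners to a page and returns an object whose counters
// update as network events arrive.
function trackNetwork(page) {
  const summary = { requests: 0, failed: 0, byStatus: {} };
  page.on("request", () => {
    summary.requests += 1;
  });
  page.on("response", (response) => {
    // Group statuses into classes such as "2xx" or "4xx"
    const bucket = `${Math.floor(response.status() / 100)}xx`;
    summary.byStatus[bucket] = (summary.byStatus[bucket] || 0) + 1;
  });
  page.on("requestfailed", () => {
    summary.failed += 1;
  });
  return summary;
}
```

Call trackNetwork(page) before page.goto() and inspect the returned object once navigation finishes.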
Best Practices
- Always close resources: Ensure you close pages and browsers when done
- Use try-catch blocks: Wrap operations in proper error handling
- Implement timeouts: Set reasonable timeouts for operations
- Monitor memory usage: Be aware of memory consumption in long-running scripts
- Use connection pooling: Reuse browser connections when possible
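Several of these practices can be combined in one small wrapper. The sketch below uses a hypothetical withPage helper and an assumed default of 30 seconds: it applies a timeout to the page, runs your task, and closes the page in a finally block so it is released even when the task throws.

```javascript
// Opens a page, applies a default timeout, runs the task, and always closes
// the page afterwards, even when the task throws.
async function withPage(browser, task, { timeoutMs = 30000 } = {}) {
  const page = await browser.newPage();
  page.setDefaultTimeout(timeoutMs);
  try {
    return await task(page);
  } finally {
    await page.close();
  }
}
```

A call such as withPage(browser, (page) => page.title()) returns the task's result and leaves the browser connection open for reuse.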
For more advanced configurations and troubleshooting, see our Launch Options and Troubleshooting documentation.