Best Practices
This document outlines essential best practices for creating robust and reliable Puppeteer and Playwright scripts when using Browserless. These practices will help you avoid common pitfalls, improve performance, and create more maintainable automation code.
Wait for the page to be ready
When is a page fully ready? That depends on the site. Some pages are usable early in the loading process, while others need more resources to load and scripts to execute. You may need to wait for a load event, wait for a selector to appear, or, as a last resort, wait a hardcoded amount of time, which is best avoided. Here are a few things you can wait for to make sure the information is available:
- Load events (load, domcontentloaded, networkidle)
- Specific selectors to appear or become visible
- Network requests to complete
- Custom events fired by the page
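When none of the built-in waits fits, polling a condition against a deadline is usually a better fallback than a fixed sleep, because it returns as soon as the condition holds. The sketch below is a generic, library-independent helper; `wait_until` and its parameters are illustrative names, not part of Puppeteer, Playwright, or Browserless.

```python
import asyncio
import time


async def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    `predicate` may be a plain function or a coroutine function.
    Returns the truthy value; raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if asyncio.iscoroutine(result):
            result = await result
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        await asyncio.sleep(interval)
```

With Playwright for Python, a predicate such as `lambda: page.locator('.dynamic-content').count()` could be passed in; treat that pairing as an assumption to adapt, and prefer the dedicated waits described in the following sections when they apply.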
Wait Until the Page has Loaded
- JavaScript
- Python
import puppeteer from "puppeteer-core";

const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io/?token=YOUR_API_TOKEN_HERE`,
});
const page = await browser.newPage();

// Fast navigation - use when you only need DOM content
await page.goto('https://example.com', {
  waitUntil: 'domcontentloaded',
  timeout: 30000
});

// Use networkidle2 only when you need all resources loaded
await page.goto('https://spa-app.com', {
  waitUntil: 'networkidle2',
  timeout: 60000
});
import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp(
            "wss://production-sfo.browserless.io/?token=YOUR_API_TOKEN_HERE"
        )
        page = await browser.new_page()

        # Fast navigation - use when you only need DOM content
        await page.goto('https://example.com',
                        wait_until='domcontentloaded',
                        timeout=30000)

        # Use networkidle only when you need all resources loaded
        await page.goto('https://spa-app.com',
                        wait_until='networkidle',
                        timeout=60000)

        # Close the session so it does not count against your concurrency limit
        await browser.close()

asyncio.run(main())
For detailed guidance on choosing the right waitUntil option, see our waitUntil blog post and timeout troubleshooting guide.
Wait for Specific Selectors
Waiting for specific selectors to appear is one of the most reliable methods to ensure your target content is ready. This approach is particularly useful for dynamic content that loads asynchronously.
- JavaScript
- Python
// Wait for a specific element to appear in the DOM
await page.waitForSelector('.dynamic-content');
// Wait for the loading spinner to disappear
await page.waitForSelector('.loading-spinner', { hidden: true });
// Wait for at least one product item inside the product list
await page.waitForSelector('.product-list .product-item');
# Wait for a specific element to appear (Playwright waits for it to be visible by default)
await page.wait_for_selector('.dynamic-content')
# Wait for the loading spinner to disappear
await page.wait_for_selector('.loading-spinner', state='hidden')
# Wait for at least one product item inside the product list
await page.wait_for_selector('.product-list .product-item')
Wait for Network Requests
Sometimes you need to wait for specific network requests to complete before proceeding. This is especially useful when your target data comes from API calls.
- JavaScript
- Python
// Wait for a specific response to arrive
await page.waitForResponse(response =>
  response.url().includes('/api/data') && response.status() === 200
);
// Wait for the network to go idle (no in-flight requests)
await page.waitForNetworkIdle();
// Wait for multiple responses
await Promise.all([
  page.waitForResponse(response => response.url().includes('/api/users')),
  page.waitForResponse(response => response.url().includes('/api/posts'))
]);
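# Wait for a specific response; the block exits once a matching response arrives
async with page.expect_response(
    lambda response: '/api/data' in response.url and response.status == 200
) as response_info:
    pass  # replace with the action that triggers the request
response = await response_info.value
# Wait for all requests to finish
await page.wait_for_load_state('networkidle')
# Wait for multiple responses
async with page.expect_response(lambda r: '/api/users' in r.url) as users_info, \
           page.expect_response(lambda r: '/api/posts' in r.url) as posts_info:
    pass  # replace with the action that triggers both requests
users_response = await users_info.value
posts_response = await posts_info.value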
Wait for Custom Events
Many modern web applications fire custom events when specific actions are complete. You can listen for these events to know exactly when your target functionality is ready.
- JavaScript
- Python
// Wait for a custom event
await page.evaluate(() => {
  return new Promise((resolve) => {
    document.addEventListener('dataLoaded', resolve, { once: true });
  });
});
// Wait for multiple custom events
await Promise.all([
  page.evaluate(() => new Promise(resolve =>
    document.addEventListener('userDataReady', resolve, { once: true })
  )),
  page.evaluate(() => new Promise(resolve =>
    document.addEventListener('contentReady', resolve, { once: true })
  ))
]);
# Wait for a custom event
await page.evaluate("""
    () => new Promise((resolve) => {
        document.addEventListener('dataLoaded', resolve, { once: true });
    })
""")
# Wait for multiple custom events
await asyncio.gather(
    page.evaluate("""
        () => new Promise(resolve =>
            document.addEventListener('userDataReady', resolve, { once: true })
        )
    """),
    page.evaluate("""
        () => new Promise(resolve =>
            document.addEventListener('contentReady', resolve, { once: true })
        )
    """)
)
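A promise that waits for a custom event never settles if the event fired before the listener was attached, or never fires at all, so it can help to bound the wait on the client side. The following is a minimal sketch using only asyncio; `with_timeout` and its `label` parameter are hypothetical names introduced for illustration.

```python
import asyncio


async def with_timeout(awaitable, seconds, label="operation"):
    """Await `awaitable`, raising TimeoutError with a readable message if it takes too long."""
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        # Re-raise as the built-in TimeoutError with context about what was being awaited.
        raise TimeoutError(f"{label} did not complete within {seconds}s") from None
```

In a Playwright script this could wrap the `page.evaluate(...)` call shown above, for example `await with_timeout(page.evaluate(js_source), 10, "dataLoaded event")`, where `js_source` stands for the event-listener snippet and the 10-second budget is an arbitrary choice.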
One of the most impactful optimizations you can make is changing how you handle page navigation. Instead of waiting for all network activity to cease, use domcontentloaded for faster execution when you don't need all resources loaded.
Avoid network latency
It's important to call the nearest regional endpoint to reduce network latency and improve performance. Using an endpoint that's geographically closer to your application will significantly reduce the time it takes to establish connections and transfer data. For a list of available regional endpoints and load balancing options, see our load balancers documentation.
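If you are unsure which region is closest, you can measure it. The sketch below times a TCP handshake to each candidate host and picks the fastest. It is an illustration under assumptions: `tcp_connect_ms` and `pick_fastest` are hypothetical helper names, a TCP handshake is only a rough proxy for WebSocket latency, and the list of hosts should come from the load balancers documentation rather than from this example.

```python
import socket
import time


def tcp_connect_ms(host, port=443, timeout=3.0):
    """Time a single TCP handshake to `host` in milliseconds (a rough latency proxy)."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def pick_fastest(hosts, probe=tcp_connect_ms, attempts=3):
    """Return (host, best_ms) for the host with the lowest measured latency.

    Hosts whose probe raises OSError are skipped; ValueError is raised if none respond.
    """
    best = None
    for host in hosts:
        samples = []
        for _ in range(attempts):
            try:
                samples.append(probe(host))
            except OSError:
                continue  # unreachable or timed out on this attempt
        if samples and (best is None or min(samples) < best[1]):
            best = (host, min(samples))
    if best is None:
        raise ValueError("no endpoint responded")
    return best
```

Calling `pick_fastest(["production-sfo.browserless.io", ...])` with your candidate hosts returns the one to use in the WebSocket URL. Taking the minimum of a few samples reduces the effect of one-off network jitter.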
Concurrency limits
Always close your browser sessions properly to avoid hitting concurrency limits:
- Puppeteer
- Playwright
import puppeteer from "puppeteer-core";

const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io/?token=YOUR_API_TOKEN_HERE`,
});

try {
  const page = await browser.newPage();
  await page.goto('https://example.com', { waitUntil: 'domcontentloaded' });
  // Your automation logic here
} catch (error) {
  console.error('Automation failed:', error.message);
} finally {
  // Always close the browser, even on errors
  if (browser.isConnected()) {
    await browser.close();
  }
}
import playwright from "playwright";

const browser = await playwright.chromium.connectOverCDP(
  `wss://production-sfo.browserless.io/?token=YOUR_API_TOKEN_HERE`
);

try {
  const page = await browser.newPage();
  await page.goto('https://example.com', { waitUntil: 'domcontentloaded' });
  // Your automation logic here
} catch (error) {
  console.error('Automation failed:', error.message);
} finally {
  // Always close the browser, even on errors
  if (browser.isConnected()) {
    await browser.close();
  }
}
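When a single process launches many jobs, it can also help to cap how many sessions are open at once on the client side, in addition to always closing them. The sketch below combines an asyncio semaphore with a context manager that guarantees `close()` runs. `SessionPool`, `connect`, and `max_sessions` are hypothetical names; the right cap depends on your plan and is not specified here.

```python
import asyncio
from contextlib import asynccontextmanager


class SessionPool:
    """Caps how many browser sessions this process holds open at once."""

    def __init__(self, connect, max_sessions):
        # `connect` is an async callable returning an object with an async close().
        self._connect = connect
        self._slots = asyncio.Semaphore(max_sessions)

    @asynccontextmanager
    async def session(self):
        async with self._slots:          # wait here if the cap is already reached
            browser = await self._connect()
            try:
                yield browser
            finally:
                await browser.close()    # runs on success, error, or cancellation
```

With Playwright for Python, `connect` might be a small coroutine that calls `p.chromium.connect_over_cdp(...)`, and each job would run inside `async with pool.session() as browser:`. Create the pool inside your running event loop so the semaphore is bound to it.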
Reduce Network Round-trips
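Each awaited call against a remote browser costs at least one network round-trip. Minimize await calls by using page.evaluate() to batch multiple DOM operations into a single call: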
- Puppeteer
- Playwright
// DON'T DO - Multiple round-trips
const button = await page.$('.buy-now');
// getProperty() would return a handle, not a string, so read the text with evaluate()
const buttonText = await button.evaluate(el => el.innerText);
const isVisible = await button.isIntersectingViewport();
await button.click();
// DO - Single round-trip
const result = await page.evaluate(() => {
  const button = document.querySelector('.buy-now');
  const buttonText = button.innerText;
  const isVisible = button.offsetParent !== null;
  button.click();
  return { buttonText, isVisible, clicked: true };
});
// DON'T DO - Multiple round-trips
const button = page.locator('.buy-now'); // locator() is synchronous; no await needed
const buttonText = await button.textContent();
const isVisible = await button.isVisible();
await button.click();
// DO - Single round-trip
const result = await page.evaluate(() => {
  const button = document.querySelector('.buy-now');
  const buttonText = button.textContent;
  const isVisible = button.offsetParent !== null;
  button.click();
  return { buttonText, isVisible, clicked: true };
});
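To see why batching matters, it helps to put rough numbers on it. The arithmetic below assumes an 80 ms round-trip time between your application and the remote browser, which is an illustrative figure rather than a measured one, and it counts one round-trip per awaited call, which is a lower bound because a single API call can issue several protocol messages.

```python
def network_wait_ms(sequential_calls: int, rtt_ms: float) -> float:
    """Time spent purely on network round-trips for calls awaited one after another."""
    return sequential_calls * rtt_ms


ASSUMED_RTT_MS = 80.0  # assumption: round-trip time to the remote browser

# The Puppeteer "DON'T DO" example awaits four calls; the "DO" example awaits one.
separate_calls = network_wait_ms(4, ASSUMED_RTT_MS)
single_evaluate = network_wait_ms(1, ASSUMED_RTT_MS)

print(f"separate: {separate_calls:.0f} ms, batched: {single_evaluate:.0f} ms")
# prints "separate: 320 ms, batched: 80 ms"
```

The gap grows linearly with the number of operations, so loops that touch many elements benefit the most from a single page.evaluate() call.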
Use Helper Classes
Create a helper class to abstract monitoring, logging, and common operations from your business logic. This makes your code more maintainable and reusable.
- Puppeteer
- Playwright
import puppeteer from "puppeteer-core";

class BrowserlessHelper {
  constructor(token, region = 'production-sfo') {
    this.token = token;
    this.region = region;
    this.browser = null;
    this.page = null;
  }

  async connect() {
    this.browser = await puppeteer.connect({
      browserWSEndpoint: `wss://${this.region}.browserless.io/?token=${this.token}`,
    });
    this.setupBrowserMonitoring();
    return this.browser;
  }

  async createPage() {
    if (!this.browser) {
      throw new Error('Browser not connected. Call connect() first.');
    }
    this.page = await this.browser.newPage();
    this.setupPageMonitoring();
    return this.page;
  }

  setupBrowserMonitoring() {
    this.browser.on('disconnected', () => {
      console.error('Browser disconnected unexpectedly');
    });
    this.browser.on('targetdestroyed', target => {
      console.warn(`Target destroyed: ${target.url()}`);
    });
  }

  setupPageMonitoring() {
    this.page.on('requestfailed', request => {
      // failure() can return null, so guard before reading errorText
      console.error(`Request failed: ${request.url()} - ${request.failure()?.errorText}`);
    });
    this.page.on('response', response => {
      if (!response.ok()) {
        console.error(`Response error: ${response.url()} - ${response.status()}`);
      }
    });
    this.page.on('pageerror', error => {
      console.error(`Page error: ${error.message}`);
    });
  }

  async navigateWithRetry(url, options = {}) {
    const defaultOptions = {
      waitUntil: 'domcontentloaded',
      timeout: 30000,
      ...options
    };
    const maxRetries = 3;
    let lastError;
    for (let i = 0; i < maxRetries; i++) {
      try {
        await this.page.goto(url, defaultOptions);
        return;
      } catch (error) {
        lastError = error;
        console.warn(`Navigation attempt ${i + 1} failed: ${error.message}`);
        if (i < maxRetries - 1) {
          await this.delay(1000 * (i + 1)); // Linear backoff: 1s, then 2s
        }
      }
    }
    throw lastError;
  }

  async screenshotOnError(error, filename = 'error-screenshot.png') {
    try {
      if (this.page) {
        await this.page.screenshot({ path: filename, fullPage: true });
        console.log(`Screenshot saved: ${filename}`);
      }
    } catch (screenshotError) {
      console.error('Failed to take screenshot:', screenshotError.message);
    }
  }

  async cleanup() {
    try {
      if (this.browser && this.browser.isConnected()) {
        await this.browser.close();
      }
    } catch (error) {
      console.error('Error during cleanup:', error.message);
    }
  }

  delay(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage example
const helper = new BrowserlessHelper('YOUR_API_TOKEN_HERE');
try {
  await helper.connect();
  const page = await helper.createPage();
  await helper.navigateWithRetry('https://example.com');
  // Your automation logic here
  const title = await page.title();
  console.log('Page title:', title);
} catch (error) {
  console.error('Automation failed:', error.message);
  await helper.screenshotOnError(error);
} finally {
  await helper.cleanup();
}
import playwright from "playwright";

class BrowserlessHelper {
  constructor(token, region = 'production-sfo') {
    this.token = token;
    this.region = region;
    this.browser = null;
    this.page = null;
  }

  async connect() {
    this.browser = await playwright.chromium.connectOverCDP(
      `wss://${this.region}.browserless.io/?token=${this.token}`
    );
    this.setupBrowserMonitoring();
    return this.browser;
  }

  async createPage() {
    if (!this.browser) {
      throw new Error('Browser not connected. Call connect() first.');
    }
    this.page = await this.browser.newPage();
    this.setupPageMonitoring();
    return this.page;
  }

  setupBrowserMonitoring() {
    this.browser.on('disconnected', () => {
      console.error('Browser disconnected unexpectedly');
    });
  }

  setupPageMonitoring() {
    this.page.on('requestfailed', request => {
      // failure() returns an object (or null), so log its errorText field
      console.error(`Request failed: ${request.url()} - ${request.failure()?.errorText}`);
    });
    this.page.on('response', response => {
      if (!response.ok()) {
        console.error(`Response error: ${response.url()} - ${response.status()}`);
      }
    });
    this.page.on('pageerror', error => {
      console.error(`Page error: ${error.message}`);
    });
  }

  async navigateWithRetry(url, options = {}) {
    const defaultOptions = {
      waitUntil: 'domcontentloaded',
      timeout: 30000,
      ...options
    };
    const maxRetries = 3;
    let lastError;
    for (let i = 0; i < maxRetries; i++) {
      try {
        await this.page.goto(url, defaultOptions);
        return;
      } catch (error) {
        lastError = error;
        console.warn(`Navigation attempt ${i + 1} failed: ${error.message}`);
        if (i < maxRetries - 1) {
          await this.delay(1000 * (i + 1)); // Linear backoff: 1s, then 2s
        }
      }
    }
    throw lastError;
  }

  async screenshotOnError(error, filename = 'error-screenshot.png') {
    try {
      if (this.page) {
        await this.page.screenshot({ path: filename, fullPage: true });
        console.log(`Screenshot saved: ${filename}`);
      }
    } catch (screenshotError) {
      console.error('Failed to take screenshot:', screenshotError.message);
    }
  }

  async cleanup() {
    try {
      if (this.browser && this.browser.isConnected()) {
        await this.browser.close();
      }
    } catch (error) {
      console.error('Error during cleanup:', error.message);
    }
  }

  delay(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage example
const helper = new BrowserlessHelper('YOUR_API_TOKEN_HERE');
try {
  await helper.connect();
  const page = await helper.createPage();
  await helper.navigateWithRetry('https://example.com');
  // Your automation logic here
  const title = await page.title();
  console.log('Page title:', title);
} catch (error) {
  console.error('Automation failed:', error.message);
  await helper.screenshotOnError(error);
} finally {
  await helper.cleanup();
}
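The delay in navigateWithRetry, 1000 * (i + 1), grows linearly with each attempt. If you want the delay to double instead, a retry helper along the following lines is one option. This is a sketch rather than part of the helper classes above: `retry` and its parameters are hypothetical names, and the injectable `sleep` exists only to make the function easy to test.

```python
import asyncio
import random


async def retry(operation, attempts=3, base_delay=1.0, factor=2.0, jitter=0.0,
                sleep=asyncio.sleep):
    """Call the async `operation` until it succeeds or `attempts` are used up.

    The delay before retry n (1-based) is base_delay * factor**(n - 1),
    plus a random amount of up to `jitter` seconds.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return await operation()
        except Exception as error:
            last_error = error
            if attempt < attempts - 1:
                delay = base_delay * (factor ** attempt) + random.uniform(0, jitter)
                await sleep(delay)
    raise last_error
```

In a Playwright for Python script, a navigation could be wrapped as `await retry(lambda: page.goto(url, wait_until='domcontentloaded'))`. A small non-zero `jitter` helps avoid many clients retrying at the same instant.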
Getting Help
If you continue to experience issues after implementing these best practices:
- Check your account dashboard for usage metrics
- Review our troubleshooting guides for specific issues
- Contact Browserless support for assistance
These best practices will help you create more reliable, maintainable, and efficient browser automation scripts with Browserless.
Next Steps
Explore these key areas: