Batch DOM Queries in a Single Browser Call
When you connect to Browserless over WebSocket, every page.$eval() call is a separate network round-trip between your Node process and a remote Chrome instance. Bundle all DOM reads into one page.evaluate() call and you pay that cost once, no matter how many selectors you need. For more performance tips with remote browsers, see BQL Best Practices.
Prerequisites
- A Browserless API token from your account dashboard
Steps
- REST API
- Frameworks
- BQL
REST API
The /function endpoint runs the entire script server-side inside the same browser process, so there's naturally only one round-trip per request. No special batching needed.
- cURL
- JavaScript
- Python
- Java
- C#
cURL
1. Send the request
curl -X POST \
  "https://production-sfo.browserless.io/function?token=YOUR_API_TOKEN_HERE" \
  -H "Content-Type: application/javascript" \
  --data-raw 'module.exports = async ({ page }) => {
  await page.goto("https://example.com");
  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector("h1")?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll("a[href]")).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    paragraphCount: document.querySelectorAll("p").length,
  }));
  return data;
};'
2. Check the output
{
  "title": "Example Domain",
  "heading": "Example Domain",
  "links": [
    { "text": "More information...", "href": "https://www.iana.org/domains/reserved" }
  ],
  "paragraphCount": 2
}
JavaScript
1. Send the request
const code = `module.exports = async ({ page }) => {
  await page.goto('https://example.com');
  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    paragraphCount: document.querySelectorAll('p').length,
  }));
  return data;
};`;

const response = await fetch(
  'https://production-sfo.browserless.io/function?token=YOUR_API_TOKEN_HERE',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/javascript' },
    body: code,
  }
);
const data = await response.json();
console.log(data);
2. Check the output
Run with node batch-dom-queries.mjs. The response is a flat JSON object with title, heading, links[], and paragraphCount.
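If you call /function from several places, the request pieces shown above can be assembled in one spot. The following is a minimal sketch, not part of any Browserless SDK: the helper name buildFunctionRequest and the baseUrl default are assumptions for illustration. It only builds the arguments for fetch() and does not send anything.

```javascript
// Hypothetical helper: builds the URL and fetch() options for a /function call.
// The default baseUrl mirrors the endpoint used in the examples above.
function buildFunctionRequest(token, code, baseUrl = 'https://production-sfo.browserless.io') {
  const url = new URL('/function', baseUrl);
  url.searchParams.set('token', token); // URL-encodes the token value

  return {
    url: url.toString(),
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/javascript' },
      body: code, // the module.exports = async ({ page }) => { ... } script
    },
  };
}
```

With it, the call becomes `const { url, init } = buildFunctionRequest(token, code); const response = await fetch(url, init);`.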
Python
1. Install dependencies
pip install requests
2. Send the request
import requests

code = """
module.exports = async ({ page }) => {
  await page.goto('https://example.com');
  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    paragraphCount: document.querySelectorAll('p').length,
  }));
  return data;
};
"""

response = requests.post(
    'https://production-sfo.browserless.io/function?token=YOUR_API_TOKEN_HERE',
    headers={'Content-Type': 'application/javascript'},
    data=code.encode('utf-8'),
)
print(response.json())
3. Check the output
Run with python batch_dom_queries.py. The printed dict has title, heading, links[], and paragraphCount, the same shape as the JSON output above.
Java
1. Send the request
import java.net.URI;
import java.net.http.*;

// The class name is illustrative; text blocks require Java 15 or later.
public class BatchDomQueries {
    public static void main(String[] args) throws Exception {
        String token = "YOUR_API_TOKEN_HERE";
        String endpoint = "https://production-sfo.browserless.io/function?token=" + token;
        String code = """
            module.exports = async ({ page }) => {
              await page.goto('https://example.com');
              const data = await page.evaluate(() => ({
                title: document.title,
                heading: document.querySelector('h1')?.textContent?.trim() ?? null,
                links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
                  text: a.textContent.trim(),
                  href: a.href,
                })),
                paragraphCount: document.querySelectorAll('p').length,
              }));
              return data;
            };
            """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint))
            .header("Content-Type", "application/javascript")
            .POST(HttpRequest.BodyPublishers.ofString(code))
            .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
2. Check the output
Run the class. The response body is JSON with title, heading, links, and paragraphCount.
C#
1. Send the request
using System;
using System.Net.Http;
using System.Text;

string token = "YOUR_API_TOKEN_HERE";
string endpoint = $"https://production-sfo.browserless.io/function?token={token}";
string code = @"module.exports = async ({ page }) => {
  await page.goto('https://example.com');
  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    paragraphCount: document.querySelectorAll('p').length,
  }));
  return data;
};";

using (HttpClient httpClient = new HttpClient())
{
    var content = new StringContent(code, Encoding.UTF8, "application/javascript");
    var response = await httpClient.PostAsync(endpoint, content);
    string result = await response.Content.ReadAsStringAsync();
    Console.WriteLine(result);
}
2. Check the output
Run the program. The response body is JSON with title, heading, links, and paragraphCount.
Frameworks
With a remote WebSocket connection, each page.$eval() is a separate network call. Your Node process serializes the selector, sends it over the wire, waits for Chrome to run it, then gets the result back. Do that four times in sequence and you've added four round-trips of latency. One page.evaluate() with all four queries inside cuts that to a single trip.
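To put rough numbers on the difference, here is a back-of-the-envelope sketch. The 80 ms round-trip time is an assumed figure for illustration only, not a measured Browserless latency, and the helper name addedLatencyMs is hypothetical. It ignores the time Chrome spends running each query, which is the same either way.

```javascript
// Estimates the network latency added by sequential calls to a remote browser.
// rttMs is an assumed round-trip time; measure your own connection for real numbers.
function addedLatencyMs(roundTrips, rttMs) {
  return roundTrips * rttMs;
}

const rttMs = 80; // assumption for illustration only
const sequential = addedLatencyMs(4, rttMs); // four separate calls, one per query
const batched = addedLatencyMs(1, rttMs);    // one page.evaluate() with all four queries

console.log({ sequential, batched, saved: sequential - batched });
// { sequential: 320, batched: 80, saved: 240 }
```

The saving grows linearly with the number of queries, so pages that need dozens of selectors benefit the most.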
- Puppeteer
- Playwright
Puppeteer
1. Install dependencies
npm install puppeteer-core
2. Connect and query
import puppeteer from 'puppeteer-core';

const browser = await puppeteer.connect({
  browserWSEndpoint: 'wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE',
});

try {
  const page = await browser.newPage();
  await page.goto('https://example.com', { waitUntil: 'networkidle2' });

  // Don't do this: four separate round-trips between Node and Chrome.
  // const title = await page.title();
  // const heading = await page.$eval('h1', el => el.textContent.trim());
  // const links = await page.$$eval('a[href]', els => els.map(a => a.href));
  // const paragraphCount = await page.$$eval('p', els => els.length);

  // Do this instead: one round-trip, all data comes back together.
  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    paragraphCount: document.querySelectorAll('p').length,
  }));

  console.log(data);
} finally {
  // Always close to release the session even on error.
  await browser.close();
}
3. Check the output
Run with node batch-dom-queries.mjs. You get a single object with a string title, a nullable string heading, an array of {text, href} link objects, and an integer paragraphCount.
{
  "title": "Example Domain",
  "heading": "Example Domain",
  "links": [
    { "text": "More information...", "href": "https://www.iana.org/domains/reserved" }
  ],
  "paragraphCount": 2
}
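Because everything arrives in one object, it can be worth checking the shape once before using it. The sketch below encodes the shape described above (string title, nullable string heading, array of {text, href} objects, integer paragraphCount). The function name isBatchResult is hypothetical, and the check is deliberately minimal.

```javascript
// Hypothetical validator for the object returned by the batched page.evaluate().
function isBatchResult(data) {
  if (data === null || typeof data !== 'object') return false;

  // heading is null when the page has no <h1>.
  const headingOk = data.heading === null || typeof data.heading === 'string';

  const linksOk =
    Array.isArray(data.links) &&
    data.links.every(
      (link) =>
        link !== null &&
        typeof link === 'object' &&
        typeof link.text === 'string' &&
        typeof link.href === 'string'
    );

  return (
    typeof data.title === 'string' &&
    headingOk &&
    linksOk &&
    Number.isInteger(data.paragraphCount)
  );
}
```

A call such as `if (!isBatchResult(data)) throw new Error('Unexpected result shape');` right after page.evaluate() catches selector or page changes early.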
Playwright
1. Install dependencies
npm install playwright-core
2. Connect and query
Playwright (like Puppeteer) stringifies the evaluate() callback before sending it to Chrome, so closures don't work: any Node-land variable the function references does not exist inside Chrome, and the call fails with a ReferenceError. Pass Node-land values as the second argument to evaluate() instead.
import { chromium } from 'playwright-core';

// connectOverCDP speaks CDP, so it uses the same CDP endpoint as the Puppeteer example.
const browser = await chromium.connectOverCDP(
  'wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE'
);

try {
  const context = browser.contexts()[0];
  const page = await context.newPage();
  await page.goto('https://example.com', { waitUntil: 'networkidle' });

  // One evaluate call returns all the data you need:
  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    paragraphCount: document.querySelectorAll('p').length,
  }));

  console.log(data);
} finally {
  // Always close to release the session even on error.
  await browser.close();
}
3. Check the output
Run with node batch-dom-queries.mjs. Same shape as the Puppeteer output: a flat object with title, heading, links[], and paragraphCount.
{
  "title": "Example Domain",
  "heading": "Example Domain",
  "links": [
    { "text": "More information...", "href": "https://www.iana.org/domains/reserved" }
  ],
  "paragraphCount": 2
}
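The closure caveat from step 2 can be reproduced without a browser. The sketch below only simulates serialization in plain Node by rebuilding a function from its source text. It is not how Playwright or Puppeteer transport the callback, and runRemotely is a hypothetical stand-in, but the scoping effect is the same: the rebuilt function cannot see variables from the place where it was written.

```javascript
// Stand-in for the browser side: rebuild the callback from its source text,
// which drops any link to the scope where it was originally defined.
function runRemotely(fn, arg) {
  const rebuilt = new Function(`return (${fn.toString()})`)();
  return rebuilt(arg);
}

function demo() {
  const selector = 'a[href]'; // a Node-land value

  // Closure: the rebuilt function references `selector`, which no longer exists.
  let closureResult;
  try {
    closureResult = runRemotely(() => selector.length);
  } catch (err) {
    closureResult = err.name;
  }

  // Argument: the value is handed over explicitly, so it is available.
  const argResult = runRemotely((sel) => sel.length, selector);

  return { closureResult, argResult };
}

console.log(demo());
// { closureResult: 'ReferenceError', argResult: 7 }
```

Against a real page, the working pattern looks like `page.evaluate((sel) => document.querySelectorAll(sel).length, selector)`.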
BQL
1. Write the mutation
BQL runs entirely server-side. There's no client process making sequential calls over a WebSocket: the whole query executes in one browser pass, so the round-trip problem doesn't exist and there's nothing to batch. See Writing BrowserQL for full details.
mutation BatchDOMQueries {
  goto(url: "https://example.com", waitUntil: domContentLoaded) {
    status
  }
  heading: text(selector: "h1")
  links: mapSelector(selector: "a[href]") {
    innerText
    href: attribute(name: "href")
  }
}
2. Run it
Paste into the BQL IDE and click Run.
3. Check the output
The response wraps everything in a data object. goto returns the HTTP status, heading is a string, and links is an array of {innerText, href} objects. Note that the link text field is named innerText here rather than text, as in the page.evaluate() examples. It reflects the computed text content, including nested elements.
{
  "data": {
    "goto": { "status": 200 },
    "heading": "Example Domain",
    "links": [
      { "innerText": "More information...", "href": "https://www.iana.org/domains/reserved" }
    ]
  }
}
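If the same downstream code consumes both the page.evaluate() result and the BQL response, the two shapes can be aligned with a small adapter. This is a minimal sketch under the assumption that the response has the structure shown above. The helper name normalizeBqlResult is hypothetical, and it covers only heading and links, since this mutation does not return a title or paragraph count.

```javascript
// Hypothetical adapter: reshapes the BQL response into
// { heading, links: [{ text, href }] } to match the page.evaluate() examples.
function normalizeBqlResult(response) {
  const { heading = null, links = [] } = response.data ?? {};

  return {
    heading,
    links: links.map(({ innerText, href }) => ({
      text: innerText.trim(), // rename innerText to text
      href,
    })),
  };
}
```

Passing the sample response above yields an object with heading "Example Domain" and one link whose text is "More information...".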