Batch DOM queries in a single browser call

When you connect to Browserless over WebSocket, every page.$eval() call is a separate network round-trip between your Node process and a remote Chrome instance. Bundle all DOM reads into one page.evaluate() call and you pay that cost once, no matter how many selectors you need. For more performance tips with remote browsers, see BQL Best Practices.

Prerequisites

A Browserless API token from your account dashboard

Steps

REST API

The /function endpoint runs the entire script server-side inside the same browser process, so there's naturally only one round-trip per request. No special batching needed.

cURL

1. Send the request

curl -X POST \
  "https://production-sfo.browserless.io/function?token=YOUR_API_TOKEN_HERE" \
  -H "Content-Type: application/javascript" \
  --data-raw 'module.exports = async ({ page }) => {
    await page.goto("https://scraping-sandbox.netlify.app/products");

    const data = await page.evaluate(() => ({
      title: document.title,
      heading: document.querySelector("h1")?.textContent?.trim() ?? null,
      links: Array.from(document.querySelectorAll("a[href]")).map(a => ({
        text: a.textContent.trim(),
        href: a.href,
      })),
      productCount: document.querySelectorAll(".product-title").length,
    }));

    return data;
  };'

2. Check the output

{
  "title": "Product Catalog | Scraping Sandbox",
  "heading": "Product Catalog",
  "links": [
    { "text": "Contact Us", "href": "https://scraping-sandbox.netlify.app/contact-us" }
  ],
  "productCount": 8
}

JavaScript

1. Send the request

const code = `module.exports = async ({ page }) => {
  await page.goto('https://scraping-sandbox.netlify.app/products');

  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    productCount: document.querySelectorAll('.product-title').length,
  }));

  return data;
};`;

const response = await fetch(
  'https://production-sfo.browserless.io/function?token=YOUR_API_TOKEN_HERE',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/javascript' },
    body: code,
  }
);

const data = await response.json();
console.log(data);

2. Check the output

Save the script as batch-dom-queries.mjs and run it with node batch-dom-queries.mjs. The response is a single JSON object with title, heading, links[], and productCount.
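The request above parses the body without checking whether the call succeeded. As a minimal sketch, the helper below wraps the same POST and throws on a non-2xx status. The name runFunction and the injectable fetchImpl parameter are hypothetical, and reading the error body as text is an assumption, since the error format is not described here.

```javascript
// Hypothetical helper: posts a script to /function and fails loudly on non-2xx.
// fetchImpl is injectable so the helper can be exercised without network access.
async function runFunction(code, token, fetchImpl = globalThis.fetch) {
  const url =
    'https://production-sfo.browserless.io/function?token=' +
    encodeURIComponent(token);

  const response = await fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/javascript' },
    body: code,
  });

  if (!response.ok) {
    // Assumption: the error body is readable as text; its format is not specified here.
    const detail = await response.text();
    throw new Error(`Request failed with HTTP ${response.status}: ${detail}`);
  }

  return response.json();
}
```

Call it as const data = await runFunction(code, 'YOUR_API_TOKEN_HERE'); an invalid token or a script error then surfaces as an exception instead of a confusing JSON parse failure.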

Python

1. Install dependencies

pip install requests

2. Send the request

import requests

code = """
module.exports = async ({ page }) => {
  await page.goto('https://scraping-sandbox.netlify.app/products');

  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    productCount: document.querySelectorAll('.product-title').length,
  }));

  return data;
};
"""

response = requests.post(
    'https://production-sfo.browserless.io/function?token=YOUR_API_TOKEN_HERE',
    headers={'Content-Type': 'application/javascript'},
    data=code.encode('utf-8'),
)

print(response.json())

3. Check the output

Run with python batch_dom_queries.py. The printed dict has title, heading, links[], and productCount, the same shape as the JSON output above.

Java

1. Send the request

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BatchDomQueries {
    public static void main(String[] args) throws Exception {
        String token = "YOUR_API_TOKEN_HERE";
        String endpoint = "https://production-sfo.browserless.io/function?token=" + token;

        // Script that Browserless runs server-side (text blocks need Java 15+).
        String code = """
            module.exports = async ({ page }) => {
              await page.goto('https://scraping-sandbox.netlify.app/products');

              const data = await page.evaluate(() => ({
                title: document.title,
                heading: document.querySelector('h1')?.textContent?.trim() ?? null,
                links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
                  text: a.textContent.trim(),
                  href: a.href,
                })),
                productCount: document.querySelectorAll('.product-title').length,
              }));

              return data;
            };
            """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint))
            .header("Content-Type", "application/javascript")
            .POST(HttpRequest.BodyPublishers.ofString(code))
            .build();

        // send() throws checked exceptions, so main declares throws Exception.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}

2. Check the output

Run the class. The response body is JSON with title, heading, links, and productCount.

C#

1. Send the request

using System;
using System.Net.Http;
using System.Text;

string token = "YOUR_API_TOKEN_HERE";
string endpoint = $"https://production-sfo.browserless.io/function?token={token}";

string code = @"module.exports = async ({ page }) => {
  await page.goto('https://scraping-sandbox.netlify.app/products');

  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    productCount: document.querySelectorAll('.product-title').length,
  }));

  return data;
};";

using (HttpClient httpClient = new HttpClient())
{
    var content = new StringContent(code, Encoding.UTF8, "application/javascript");
    var response = await httpClient.PostAsync(endpoint, content);
    string result = await response.Content.ReadAsStringAsync();
    Console.WriteLine(result);
}

2. Check the output

Run the program. The response body is JSON with title, heading, links, and productCount.

Frameworks

With a remote WebSocket connection, each page.$eval() is a separate network call. Your Node process serializes the selector, sends it over the wire, waits for Chrome to run it, then gets the result back. Do that four times in sequence and you've added four round-trips of latency. One page.evaluate() with all four queries inside cuts that to a single trip.
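To make the cost concrete, here is a small simulation that needs no browser. createFakeRemote and compareRoundTrips are hypothetical names, and the 20 ms delay is an arbitrary stand-in for network latency, not a measured Browserless figure. Each call() pays one simulated round-trip, the way each page.$eval() does.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Stand-in for a remote browser: every call() costs one simulated round-trip.
function createFakeRemote(roundTripMs) {
  let trips = 0;
  return {
    async call(work) {
      trips += 1;
      await sleep(roundTripMs);
      return work();
    },
    get trips() {
      return trips;
    },
  };
}

async function compareRoundTrips(roundTripMs = 20) {
  // Four sequential calls, like four separate page.$eval() calls.
  const sequential = createFakeRemote(roundTripMs);
  await sequential.call(() => 'title');
  await sequential.call(() => 'heading');
  await sequential.call(() => ['link']);
  await sequential.call(() => 8);

  // One batched call, like a single page.evaluate() returning everything.
  const batched = createFakeRemote(roundTripMs);
  await batched.call(() => ({
    title: 'title',
    heading: 'heading',
    links: ['link'],
    productCount: 8,
  }));

  return { sequentialTrips: sequential.trips, batchedTrips: batched.trips };
}
```

Because the sequential calls are awaited one after another, their latency adds up: roughly four times the round-trip delay, against one for the batched call.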

Puppeteer

1. Install dependencies

npm install puppeteer-core

2. Connect and query

import puppeteer from 'puppeteer-core';

const browser = await puppeteer.connect({
  browserWSEndpoint: 'wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE',
});

try {
  const page = await browser.newPage();
  await page.goto('https://scraping-sandbox.netlify.app/products', { waitUntil: 'networkidle2' });

  // Don't do this — four separate round-trips between Node and Chrome:
  // const title = await page.title();
  // const heading = await page.$eval('h1', el => el.textContent.trim());
  // const links = await page.$$eval('a[href]', els => els.map(a => a.href));
  // const productCount = await page.$$eval('.product-title', els => els.length);

  // Do this instead — one round-trip, all data comes back together:
  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    productCount: document.querySelectorAll('.product-title').length,
  }));

  console.log(data);
} finally {
  // Always close to release the session even on error.
  await browser.close();
}

3. Check the output

Run with node batch-dom-queries.mjs. You get a single object with a string title, a nullable string heading, an array of {text, href} link objects, and an integer productCount.

{
  "title": "Product Catalog | Scraping Sandbox",
  "heading": "Product Catalog",
  "links": [
    { "text": "Contact Us", "href": "https://scraping-sandbox.netlify.app/contact-us" }
  ],
  "productCount": 8
}

Playwright

1. Install dependencies

npm install playwright-core

2. Connect and query

Playwright (like Puppeteer) stringifies the evaluate() callback before sending it to Chrome. That means closures don't work: a variable from Node-land that the function references does not exist inside Chrome, so the call fails with a ReferenceError. Pass Node-land values as a second argument to evaluate() instead.

import { chromium } from 'playwright-core';

// connectOverCDP speaks CDP, so it targets the CDP WebSocket endpoint.
const browser = await chromium.connectOverCDP(
  'wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE'
);

try {
  const context = browser.contexts()[0];
  const page = await context.newPage();
  await page.goto('https://scraping-sandbox.netlify.app/products', { waitUntil: 'networkidle' });

  // One evaluate call returns all the data you need:
  const data = await page.evaluate(() => ({
    title: document.title,
    heading: document.querySelector('h1')?.textContent?.trim() ?? null,
    links: Array.from(document.querySelectorAll('a[href]')).map(a => ({
      text: a.textContent.trim(),
      href: a.href,
    })),
    productCount: document.querySelectorAll('.product-title').length,
  }));

  console.log(data);
} finally {
  // Always close to release the session even on error.
  await browser.close();
}
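The note on closures can be illustrated with a short sketch. The function names collect and scrape and the shape of the selectors object are hypothetical. The only API relied on is page.evaluate(fn, arg), which both Puppeteer and Playwright accept with a single serializable argument.

```javascript
// Runs inside the browser. Everything it needs arrives through `selectors`,
// because the function is serialized and loses access to Node-side variables.
function collect(selectors) {
  return {
    heading:
      document.querySelector(selectors.heading)?.textContent?.trim() ?? null,
    productCount: document.querySelectorAll(selectors.product).length,
  };
}

// `page` is a Puppeteer or Playwright Page that has already navigated.
async function scrape(page, selectors) {
  // The second argument is serialized and handed to collect() in the browser.
  return page.evaluate(collect, selectors);
}
```

Inside the try block above you would call const data = await scrape(page, { heading: 'h1', product: '.product-title' });. The selectors still travel in the same single round-trip as the queries.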

3. Check the output

Run with node batch-dom-queries.mjs. Same shape as the Puppeteer output: a single object with title, heading, links[], and productCount.

{
  "title": "Product Catalog | Scraping Sandbox",
  "heading": "Product Catalog",
  "links": [
    { "text": "Contact Us", "href": "https://scraping-sandbox.netlify.app/contact-us" }
  ],
  "productCount": 8
}

BQL

1. Write the mutation

BQL runs entirely server-side. There's no client process making sequential calls over a WebSocket. The whole query executes in one browser pass, so the round-trip problem doesn't exist and there's nothing to batch. See Writing BrowserQL for full details.

mutation BatchDOMQueries {
  goto(url: "https://scraping-sandbox.netlify.app/products", waitUntil: domContentLoaded) {
    status
  }
  heading: text(selector: "h1") {
    text
  }
  links: mapSelector(selector: "a[href]") {
    innerText
    href: attribute(name: "href") {
      value
    }
  }
}

2. Run it

Paste into the BQL IDE and click Run.

3. Check the output

The response wraps everything in a data object. goto returns the HTTP status, heading is an object with a text field, and links is an array of objects that each hold innerText and an href value. Both text and attribute return objects, so the query has to select a subfield (text, value) from each. Note that mapSelector exposes innerText rather than text. It reflects the rendered text of the element, including nested elements.

{
  "data": {
    "goto": { "status": 200 },
    "heading": { "text": "Product Catalog" },
    "links": [
      { "innerText": "Contact Us", "href": { "value": "https://scraping-sandbox.netlify.app/contact-us" } }
    ]
  }
}
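If the same downstream code consumes both BQL and evaluate() results, the BQL response can be mapped onto the { heading, links: [{ text, href }] } shape used in the other examples. This is a sketch only: normalizeBqlResponse and unwrap are hypothetical names, and the helper deliberately accepts a field either as a bare value or as an object such as { text } or { value }, because the exact response shape is treated as an assumption here.

```javascript
// Unwraps a field that may be a bare value or an object such as { text } or { value }.
function unwrap(field, key) {
  if (field !== null && typeof field === 'object') {
    return field[key] ?? null;
  }
  return field ?? null;
}

// Maps a BQL response onto the shape returned by the evaluate() examples.
function normalizeBqlResponse(response) {
  const data = response?.data ?? {};
  const links = data.links ?? [];

  return {
    heading: unwrap(data.heading, 'text'),
    links: links.map((link) => ({
      // BQL reports innerText; the other examples call this field text.
      text: (link.innerText ?? '').trim(),
      href: unwrap(link.href, 'value'),
    })),
  };
}
```

Keeping the mapping in one place means the rest of the pipeline does not need to know which integration produced the data.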
