How it Works
Browserless V2 is available in production via two domains: production-sfo.browserless.io and production-lon.browserless.io.
Browserless works almost identically to how most libraries and web drivers work when run locally. There's no additional software to install on your production machines and no complicated setup. In fact, the only thing you need to do when using the Browserless service is change where your code references the browser.
Browserless runs browsers in a cloud environment and exposes most of the Chrome DevTools Protocol and the Playwright protocols to you. On top of exposing these commands, it also:
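As a rough sketch of what "changing where your code references the browser" can look like, the helper below decides between a local launch and a hosted connection from configuration. The function name `resolveBrowserTarget`, the `token` and `region` fields, and the shape of the returned object are illustrative assumptions, not part of any Browserless API; only the two hostnames come from this page.

```javascript
// Hypothetical helper: decide where the browser lives based on configuration.
// Only the two hostnames are taken from this page; the rest is illustrative.
const REGION_HOSTS = new Map([
  ["sfo", "production-sfo.browserless.io"],
  ["lon", "production-lon.browserless.io"],
]);

function resolveBrowserTarget({ token, region = "sfo" } = {}) {
  // No token configured: keep launching a local browser as before.
  if (!token) {
    return { mode: "launch" };
  }
  const host = REGION_HOSTS.get(region);
  if (!host) {
    throw new Error(`Unknown region: ${region}`);
  }
  // Token configured: point the library at the hosted browser instead.
  return {
    mode: "connect",
    browserWSEndpoint: `wss://${host}/?token=${encodeURIComponent(token)}`,
  };
}
```

With Puppeteer, a `connect` result would typically be handed to `puppeteer.connect({ browserWSEndpoint })`, while a `launch` result falls back to `puppeteer.launch()`. The rest of the automation code stays the same either way.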
- Isolates your session from all others.
- Can run concurrent requests without interfering with others.
- Cleans up sessions after 30 seconds.
- Starts a clean copy of a browser for each session.
- Restarts automatically if anything crashes.
- Queues requests if thresholds are met.
- Helps bypass bot detectors.
You can use the hosted service either by connecting an automation library as described below, or by using one of our HTTP APIs such as retrieving HTML from protected sites with BrowserQL.
How sessions work
Via Puppeteer.connect()
Libraries like Puppeteer and chrome-remote-interface can attach to an existing Chrome instance over a WebSocket. The hosted Browserless service only supports this type of interface, since it lets you pass in tokens and other query parameters. Typically you only need to replace the call that starts Chrome with a connect-like statement:
import puppeteer from "puppeteer-core";

// Replace with your Browserless API token
const TOKEN = "YOUR_API_TOKEN_HERE";

// Before: launching Chrome locally
// const browser = await puppeteer.launch();

// After: connecting to Browserless over a WebSocket (wss://) and using a proxy
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${TOKEN}&proxy=residential`,
});
After that your code should remain exactly the same.
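Because the token and options travel as query parameters, it can help to build the endpoint with the standard `URL` API rather than by string concatenation, so that values are encoded correctly. The sketch below is only an illustration: `buildBrowserlessEndpoint` is a hypothetical name, and `proxy=residential` is the one option taken from the example above.

```javascript
// Hypothetical helper: append the token and optional query parameters
// to the hosted WebSocket endpoint, letting URL handle the encoding.
function buildBrowserlessEndpoint(token, params = {}) {
  const url = new URL("wss://production-sfo.browserless.io/");
  url.searchParams.set("token", token);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Same endpoint as the Puppeteer example above, built from parts.
const endpoint = buildBrowserlessEndpoint("YOUR_API_TOKEN_HERE", {
  proxy: "residential",
});
```

The resulting string is what you would pass as `browserWSEndpoint` to `puppeteer.connect()`.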
Via Playwright.BrowserType.connect()
We support all Playwright protocols and, just like with Puppeteer, you can easily switch to Browserless. The standard connect method uses Playwright's built-in browser server to handle the connection. This is generally a faster and more fully featured method, since it supports most Playwright parameters (such as using a proxy).
To connect to Browserless using Chrome, WebKit or Firefox, just make sure that the connection string matches the browser:
- JavaScript
- Python
- Java
- C#
import playwright from "playwright";

// Replace with your Browserless API token
const TOKEN = "YOUR_API_TOKEN_HERE";

// Connecting to Firefox locally
const localBrowser = await playwright.firefox.launch();

// Connecting to Firefox via Browserless (wss://) and using a proxy
const browser = await playwright.firefox.connect(
  `wss://production-sfo.browserless.io/firefox/playwright?token=${TOKEN}&proxy=residential`
);
from playwright.sync_api import sync_playwright

# Replace with your Browserless API token
TOKEN = "YOUR_API_TOKEN_HERE"

# Connecting to Firefox locally
with sync_playwright() as p:
    browser = p.firefox.launch()

# Connecting to Firefox via Browserless with a proxy.
# connect() speaks the Playwright protocol; connect_over_cdp() is Chromium-only.
with sync_playwright() as p:
    browser = p.firefox.connect(
        f"wss://production-sfo.browserless.io/firefox/playwright?token={TOKEN}&proxy=residential"
    )
package org.example;

import com.microsoft.playwright.*;

public class Main {
  // Replace with your Browserless API token
  private static final String TOKEN = "YOUR_API_TOKEN_HERE";

  public static void main(String[] args) {
    try (Playwright playwright = Playwright.create()) {
      // Connecting to Firefox locally
      Browser browserLocal = playwright.firefox().launch();

      // Connecting to Firefox via Browserless and using a proxy
      String wsEndpoint = String.format(
          "wss://production-sfo.browserless.io/firefox/playwright?token=%s&proxy=residential",
          TOKEN
      );
      Browser browserRemote = playwright.firefox().connect(wsEndpoint);
    }
  }
}
using System;
using System.Threading.Tasks;
using Microsoft.Playwright;

namespace PlaywrightExample
{
    class Program
    {
        // Replace with your Browserless API token
        private const string TOKEN = "YOUR_API_TOKEN_HERE";

        public static async Task Main(string[] args)
        {
            // One Playwright instance serves both the local and the remote browser
            using var playwright = await Playwright.CreateAsync();

            // Connecting to Firefox locally
            var browserLocal = await playwright.Firefox.LaunchAsync();

            // Connecting to Firefox via Browserless and using a proxy
            string wsEndpoint = $"wss://production-sfo.browserless.io/firefox/playwright?token={TOKEN}&proxy=residential";
            var browserRemote = await playwright.Firefox.ConnectAsync(wsEndpoint);
        }
    }
}
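To illustrate keeping the connection string in step with the browser, the sketch below derives the endpoint from the browser name. It assumes that every browser follows the `/<browser>/playwright` path pattern seen in the Firefox examples; only `/firefox/playwright` appears on this page, so treat the `chromium` and `webkit` paths as assumptions to verify. `playwrightEndpoint` is a hypothetical helper name.

```javascript
// Assumption: each browser uses the "/<browser>/playwright" path pattern.
// Only "/firefox/playwright" is confirmed by the examples on this page.
const PLAYWRIGHT_BROWSERS = new Set(["chromium", "firefox", "webkit"]);

function playwrightEndpoint(browserName, token, params = {}) {
  if (!PLAYWRIGHT_BROWSERS.has(browserName)) {
    throw new Error(`Unsupported browser: ${browserName}`);
  }
  const url = new URL(
    `wss://production-sfo.browserless.io/${browserName}/playwright`
  );
  url.searchParams.set("token", token);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}
```

Deriving both the path and the `playwright[browserName]` lookup from one variable, as in `playwright[browserName].connect(playwrightEndpoint(browserName, TOKEN))`, avoids pairing a Firefox client with a path meant for another browser.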
Via host and port (Chrome DevTools Protocol)
Many libraries for the Chrome DevTools Protocol will issue an HTTP request to one of the /json endpoints exposed by the protocol. When this request happens, Browserless responds with the resulting payload so that remote programs can interact with the browser.
If you're looking to use the Browserless service with a non-Node language, it's better to use the REST APIs and the /function endpoint, as Browserless can run Puppeteer code for you. Our blog post about this interface has more detail.
Introspection Request
$ curl "https://production-sfo.browserless.io/json/list?token=YOUR_API_TOKEN_HERE"
[
  {
    "description": "",
    "devtoolsFrontendUrl": "/devtools/inspector.html?ws=138.197.93.72:3000/devtools/page/da78a5e7-1db5-4d47-a2a5-07885088ad07",
    "id": "da78a5e7-1db5-4d47-a2a5-07885088ad07",
    "title": "about:blank",
    "type": "page",
    "url": "about:blank",
    "webSocketDebuggerUrl": "ws://138.197.93.72:3000/devtools/page/da78a5e7-1db5-4d47-a2a5-07885088ad07"
  }
]
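A client library typically reads this payload and picks the WebSocket URL of a page target. A minimal sketch of that step, using the sample response above as input, might look as follows. `findPageSocketUrl` is a hypothetical name, and real libraries may apply different selection rules.

```javascript
// Sample payload copied from the introspection response above.
const targets = [
  {
    description: "",
    devtoolsFrontendUrl:
      "/devtools/inspector.html?ws=138.197.93.72:3000/devtools/page/da78a5e7-1db5-4d47-a2a5-07885088ad07",
    id: "da78a5e7-1db5-4d47-a2a5-07885088ad07",
    title: "about:blank",
    type: "page",
    url: "about:blank",
    webSocketDebuggerUrl:
      "ws://138.197.93.72:3000/devtools/page/da78a5e7-1db5-4d47-a2a5-07885088ad07",
  },
];

// Hypothetical helper: return the debugger URL of the first page target,
// or null when the list contains no page target with a WebSocket URL.
function findPageSocketUrl(targetList) {
  const page = targetList.find(
    (target) => target.type === "page" && target.webSocketDebuggerUrl
  );
  return page ? page.webSocketDebuggerUrl : null;
}

const socketUrl = findPageSocketUrl(targets);
```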
The WebSocket endpoints are where protocol commands are sent and where Chrome emits its responses. Browserless does not modify or alter any of these messages. Once your session and its underlying WebSocket are closed, Browserless automatically clears that Target and its session data.
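For orientation, Chrome DevTools Protocol traffic on such a socket consists of JSON messages: commands carry an `id`, a `method` and `params`, replies echo the `id` with a `result` or `error`, and events arrive with no `id`. The sketch below shows one way a client might pair replies with commands. It is independent of any WebSocket library, and `createCommandTracker` is a hypothetical name rather than anything Browserless or Puppeteer provides.

```javascript
// Minimal sketch of CDP request/response bookkeeping.
function createCommandTracker() {
  let nextId = 1;
  const pending = new Map();

  return {
    // Serialize a command and remember the callback waiting for its reply.
    send(method, params, onResult) {
      const id = nextId++;
      pending.set(id, onResult);
      return JSON.stringify({ id, method, params });
    },
    // Handle a raw message from the socket.
    // Returns true if the message answered a pending command.
    receive(raw) {
      const message = JSON.parse(raw);
      if (typeof message.id !== "number" || !pending.has(message.id)) {
        return false; // An event (no id) or a reply nobody is waiting for.
      }
      const onResult = pending.get(message.id);
      pending.delete(message.id);
      onResult(message.error || null, message.result);
      return true;
    },
  };
}
```

In use, the string returned by `send` would be written to the socket, and every incoming frame would be passed to `receive`.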