BaaS Quick Start
Welcome to the Browser as a Service quick start! Here are some options for connecting your scripts or using our REST APIs for common tasks:
BaaS features are best suited for sites where bot detection isn't an issue, especially your own site. For getting past detectors, we'd recommend checking out BrowserQL.
API Token Setup
When you sign up for a Browserless account, we create a unique token that allows you to interact with the service. Once your worker(s) are ready, use this token any time you interact with the service.
You can use this token with most of our integrations by appending ?token=YOUR_API_TOKEN_HERE as a query-string parameter.
For the purposes of illustrating these examples, we'll assume your API token is 094632bb-e326-4c63-b953-82b55700b14c.
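As a minimal sketch of the query-string approach described above, the token can be appended with the standard URL API rather than by string concatenation, which keeps any existing parameters intact. The helper name withToken is hypothetical and not part of any Browserless library.

```javascript
// Hypothetical helper: adds (or replaces) the token query parameter on an endpoint.
function withToken(endpoint, token) {
  const url = new URL(endpoint);
  url.searchParams.set("token", token);
  return url.toString();
}

// Using the example token from this guide.
const wsEndpoint = withToken(
  "wss://production-sfo.browserless.io",
  "094632bb-e326-4c63-b953-82b55700b14c"
);

// An endpoint that already carries other query parameters keeps them.
const proxiedEndpoint = withToken(
  "wss://production-sfo.browserless.io/?proxy=residential",
  "094632bb-e326-4c63-b953-82b55700b14c"
);

console.log(wsEndpoint);
console.log(proxiedEndpoint);
```

The resulting string can be passed wherever the examples in this guide use a browserWSEndpoint.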
Connecting Puppeteer
Libraries like puppeteer and chrome-remote-interface can hook into an existing Chrome instance by websocket. The hosted Browserless service only supports this type of interface since you can pass in tokens and other query-params. Typically you only need to replace how you start Chrome with a connect-like statement:
import puppeteer from "puppeteer-core";
const TOKEN = "YOUR_API_TOKEN_HERE";
// Before: launching Chrome locally
// const browser = await puppeteer.launch();
// After: connecting to Browserless and using a proxy
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${TOKEN}&proxy=residential`,
});
More Puppeteer Details
Currently, Browserless V2 is available in production via two domains: production-sfo.browserless.io and production-lon.browserless.io.
Puppeteer is well supported by Browserless, and it is easy to upgrade an existing service or app to use it.
Basic Usage
In order to use the Browserless service, simply change the following:
Before browserless
import puppeteer from "puppeteer";
const browser = await puppeteer.launch();
const page = await browser.newPage();
// ...
After browserless
import puppeteer from "puppeteer";
const browser = await puppeteer.connect({
browserWSEndpoint: `wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE`,
});
const page = await browser.newPage();
If you're running the Docker container, replace wss://production-sfo.browserless.io/ with the address where your container is running.
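One way to switch between the hosted service and a local container without editing code is to read the endpoint from configuration. The sketch below assumes a hypothetical environment variable named BROWSERLESS_WS_URL and, purely for illustration, a container listening at ws://localhost:3000; adjust both to match your own setup.

```javascript
// Hosted endpoint used throughout this guide.
const HOSTED_ENDPOINT = "wss://production-sfo.browserless.io";

// Hypothetical helper: prefers a self-hosted address when one is configured.
// BROWSERLESS_WS_URL is an assumed variable name, not an official setting.
function resolveEndpoint(env, token) {
  const base = env.BROWSERLESS_WS_URL || HOSTED_ENDPOINT;
  const url = new URL(base);
  url.searchParams.set("token", token);
  return url.toString();
}

// Hosted service (no override configured).
console.log(resolveEndpoint({}, "YOUR_API_TOKEN_HERE"));

// Local container (address is an assumption for this example).
console.log(
  resolveEndpoint({ BROWSERLESS_WS_URL: "ws://localhost:3000" }, "YOUR_API_TOKEN_HERE")
);
```

In a real script you would pass process.env as the first argument and hand the result to puppeteer.connect as the browserWSEndpoint.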
Code Snippet
Below is a copy-paste example (remember to replace the API token with your own!) that makes a good starting point, since it shows how to use Puppeteer's methods with basic exception handling and file saving:
import puppeteer from "puppeteer-core";
import fs from "fs";
async function main() {
let browser = null;
try {
const url = "https://www.example.com";
const token = "YOUR_API_TOKEN_HERE";
const launchArgs = JSON.stringify({
args: [`--window-size=1920,1080`],
headless: false,
stealth: true,
timeout: 30000
});
console.log("Connecting to browser...");
browser = await puppeteer.connect({
browserWSEndpoint: `wss://production-sfo.browserless.io/?token=${token}&launch=${launchArgs}`,
});
console.log("Creating new page...");
const page = await browser.newPage();
await page.setViewport({ width: 1920, height: 1080 }); // it's best to use this in addition to --window-size
await page.setUserAgent('My Custom User Agent/1.0');
console.log(`User agent: ${await page.evaluate(() => navigator.userAgent)}`);
console.log(`Viewport size: ${JSON.stringify(await page.viewport())}`);
console.log("Navigating to example.com...");
await page.goto(url);
const title = await page.title();
console.log(`The page's title is: ${title}`);
const html = await page.content();
fs.writeFileSync("example.html", html);
console.log(`HTML file saved.`);
await page.screenshot({ path: "example.png" });
console.log(`Screenshot saved.`);
const pdfBuffer = await page.pdf({ format: "A4" });
fs.writeFileSync("example.pdf", pdfBuffer);
console.log(`PDF file saved.`);
} catch (error) {
console.error("An error occurred:", error.message);
} finally {
if (browser) {
try {
console.log("Closing browser...");
await browser.close();
} catch (closeError) {
console.error("Error while closing browser:", closeError.message);
}
}
}
}
// Run the main function
main().catch(error => {
console.error("Unhandled error in main function:", error);
});
Specifying launch flags
You can specify launch arguments through an object sent on a query string inside the browserWSEndpoint. As an example, if you want to start the browser with a pre-defined width and height, in headful mode, and with stealth enabled, you can specify it like so:
import puppeteer from "puppeteer-core";
const launchArgs = JSON.stringify({
args: [`--window-size=1920,1080`, `--user-data-dir=/tmp/chrome/data-dir`],
headless: false,
stealth: true,
timeout: 5000,
});
const browser = await puppeteer.connect({
browserWSEndpoint: `wss://production-sfo.browserless.io/?token=YOUR_API_TOKEN_HERE&launch=${launchArgs}`,
});
//...
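Because the launch object is serialized as JSON, it contains characters such as quotes and braces that are not safe in a raw query string. The following sketch percent-encodes the value with the standard URL API and then decodes it again to show that nothing is lost. The helper name buildLaunchEndpoint is hypothetical, and the sketch assumes the server decodes query parameters in the usual way.

```javascript
// Hypothetical helper: builds an endpoint with a URL-encoded launch parameter.
function buildLaunchEndpoint(base, token, launchOptions) {
  const url = new URL(base);
  url.searchParams.set("token", token);
  // searchParams.set percent-encodes the JSON string for us.
  url.searchParams.set("launch", JSON.stringify(launchOptions));
  return url.toString();
}

const endpoint = buildLaunchEndpoint(
  "wss://production-sfo.browserless.io",
  "YOUR_API_TOKEN_HERE",
  {
    args: ["--window-size=1920,1080"],
    headless: false,
    stealth: true,
    timeout: 5000,
  }
);

// Round trip: decoding the parameter yields the original options.
const decoded = JSON.parse(new URL(endpoint).searchParams.get("launch"));
console.log(endpoint);
console.log(decoded);
```

Building the URL this way also avoids mistakes when an argument value itself contains characters like & or =.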
Connecting to unlocked browser sessions
If your Puppeteer scripts are getting blocked by bot detectors, you can use BrowserQL to generate a browser instance with advanced stealth features, which you can then connect to using the provided browserWSEndpoint and cookies.
Connecting Playwright
We support all Playwright protocols, and, just like with Puppeteer, you can easily switch to Browserless. The standard connect method uses Playwright's built-in browser-server to handle the connection. This is generally the faster and more fully featured method, since it supports most Playwright parameters (such as using a proxy).
- Javascript
- Python
- Java
- C#
import playwright from "playwright";
const TOKEN = "YOUR_API_TOKEN_HERE";
// Before: launching Firefox locally
// const browser = await playwright.firefox.launch();
// After: connecting to Firefox via Browserless and using a proxy
const browser = await playwright.firefox.connect(`wss://production-sfo.browserless.io/firefox/playwright?token=${TOKEN}&proxy=residential`);
from playwright.sync_api import sync_playwright
TOKEN = "YOUR_API_TOKEN_HERE"
# Launching Firefox locally
with sync_playwright() as p:
    browser = p.firefox.launch()
# Connecting to Firefox via Browserless with a proxy
with sync_playwright() as p:
    # Firefox is reached over the Playwright protocol, so use connect() rather than connect_over_cdp()
    browser = p.firefox.connect(f"wss://production-sfo.browserless.io/firefox/playwright?token={TOKEN}&proxy=residential")
package org.example;
import com.microsoft.playwright.*;
public class Main {
    private static final String TOKEN = "YOUR_API_TOKEN_HERE";
    public static void main(String[] args) {
        try (Playwright playwright = Playwright.create()) {
            // Launching Firefox locally
            Browser browserLocal = playwright.firefox().launch();
            // Connecting to Firefox via Browserless and using a proxy
            String wsEndpoint = String.format(
                "wss://production-sfo.browserless.io/firefox/playwright?token=%s&proxy=residential",
                TOKEN
            );
            Browser browserRemote = playwright.firefox().connect(wsEndpoint);
        }
    }
}
using System;
using System.Threading.Tasks;
using Microsoft.Playwright;
namespace PlaywrightExample
{
    class Program
    {
        private const string TOKEN = "YOUR_API_TOKEN_HERE";
        public static async Task Main(string[] args)
        {
            using var playwright = await Playwright.CreateAsync();
            // Launching Firefox locally
            var browserLocal = await playwright.Firefox.LaunchAsync();
            // Connecting to Firefox via Browserless and using a proxy
            string wsEndpoint = $"wss://production-sfo.browserless.io/firefox/playwright?token={TOKEN}&proxy=residential";
            var browserRemote = await playwright.Firefox.ConnectAsync(wsEndpoint);
        }
    }
}
More Playwright Details
Playwright is a cross-browser library written by Microsoft to aid in cross-browser testing and development.
Warning: To avoid hard-to-diagnose errors, make sure your Playwright version is compatible with one of the versions Browserless supports.
Using the Playwright Protocol
The standard connect method uses Playwright's built-in browser-server protocol to handle the connection. This is generally the faster and more fully featured method, since it supports most Playwright parameters (such as using a proxy). However, because it requires running Playwright on our servers, your client's Playwright version should match ours.
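A quick pre-flight check can catch a version mismatch before a connection fails in a confusing way. The sketch below assumes that "matching" means sharing the same major and minor version numbers, with the patch number allowed to differ. The helper names are hypothetical, and the version numbers are placeholders for illustration only, not the actual list supported by Browserless.

```javascript
// Hypothetical helper: true when two versions share major and minor numbers.
function sameMajorMinor(clientVersion, serverVersion) {
  const [clientMajor, clientMinor] = clientVersion.split(".").map(Number);
  const [serverMajor, serverMinor] = serverVersion.split(".").map(Number);
  return clientMajor === serverMajor && clientMinor === serverMinor;
}

// Hypothetical helper: true when the client matches any supported version.
function isSupported(clientVersion, supportedVersions) {
  return supportedVersions.some((version) => sameMajorMinor(clientVersion, version));
}

// Placeholder values; substitute the versions listed in the Browserless docs.
const placeholderSupported = ["1.41.0", "1.42.0"];

console.log(isSupported("1.42.3", placeholderSupported)); // patch differs, minor matches
console.log(isSupported("1.40.0", placeholderSupported)); // minor does not match
```

The client version can be read from the version field of the installed playwright-core package.json.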
Take a screenshot in Playwright with Firefox
import playwright from "playwright-core";
const pwEndpoint = `wss://production-sfo.browserless.io/firefox/playwright?token=YOUR_API_TOKEN_HERE`;
const browser = await playwright.firefox.connect(pwEndpoint);
const context = await browser.newContext();
const page = await context.newPage();
await page.goto("https://www.nexcess.net/web-tools/browser-information/");
await new Promise((resolve) => setTimeout(resolve, 50000));
await page.screenshot({
path: `firefox.png`,
});
await browser.close();
Similarly, if you need to use another browser, just make sure the Playwright Browser object matches the endpoint.
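To keep the browser object and the endpoint in step, both can be derived from a single browser name. This is a sketch only: the /firefox/playwright path comes from the example above, while the chromium and webkit paths are assumed to follow the same pattern and should be checked against the Browserless documentation. The helper name playwrightEndpoint is hypothetical.

```javascript
// Endpoint paths keyed by Playwright browser name.
// Only the firefox path appears in this guide; the others are assumptions.
const PLAYWRIGHT_PATHS = new Map([
  ["chromium", "/chromium/playwright"],
  ["firefox", "/firefox/playwright"],
  ["webkit", "/webkit/playwright"],
]);

// Hypothetical helper: builds the endpoint that matches a browser name.
function playwrightEndpoint(browserName, token, host = "production-sfo.browserless.io") {
  const path = PLAYWRIGHT_PATHS.get(browserName);
  if (!path) {
    throw new Error(`Unsupported browser: ${browserName}`);
  }
  return `wss://${host}${path}?token=${encodeURIComponent(token)}`;
}

console.log(playwrightEndpoint("firefox", "YOUR_API_TOKEN_HERE"));

// Usage with Playwright (not executed here):
// const browser = await playwright[browserName].connect(
//   playwrightEndpoint(browserName, "YOUR_API_TOKEN_HERE")
// );
```

Deriving both from one name makes it harder to connect a Firefox browser object to a Chromium endpoint by accident.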
Using the Chrome DevTools Protocol
The connectOverCDP method allows Playwright to connect through Chrome's DevTools Protocol. While this is functionally closer to how Puppeteer operates, it comes with a slight performance hit, since sessions are more "chatty" over the network than with Playwright's connect. Furthermore, you can only use Chrome for these connections.
Take a screenshot in Playwright
import playwright from "playwright";
const browser = await playwright.chromium.connectOverCDP(
"wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE"
);
const context = await browser.newContext();
const page = await context.newPage();
await page.goto("https://www.example.com/");
await page.screenshot({ path: "cdp.png" });
await browser.close();
Using 3rd Party Proxies with Playwright
When using Playwright with Browserless, you can set up a 3rd party proxy by passing a proxy configuration to the newContext() method. This differs from how proxies are handled in Puppeteer, because Playwright lets you specify proxy settings directly at the context level:
import playwright from "playwright-core";
const browser = await playwright.chromium.connectOverCDP(
"wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE"
);
const context = await browser.newContext({
proxy: {
server: "http://domain:port",
username: "username",
password: "password",
},
});
const page = await context.newPage();
await page.goto("https://icanhazip.com/");
console.log(await page.content());
await browser.close();
This approach applies the proxy configuration at the context level, which means all pages created from that context will use the specified proxy. For more detailed information about using proxies with Playwright, see our Proxies documentation.
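Proxy providers often hand out credentials as a single URL. As a sketch, such a URL can be split into the server, username, and password fields used in the newContext() example above. The helper name toPlaywrightProxy and the proxy address are hypothetical.

```javascript
// Hypothetical helper: converts a proxy URL into a Playwright-style proxy object.
function toPlaywrightProxy(proxyUrl) {
  const url = new URL(proxyUrl);
  const proxy = { server: `${url.protocol}//${url.host}` };
  // Credentials are percent-encoded inside URLs, so decode them before use.
  if (url.username) proxy.username = decodeURIComponent(url.username);
  if (url.password) proxy.password = decodeURIComponent(url.password);
  return proxy;
}

// Example address with an encoded "@" in the password (made up for illustration).
const proxy = toPlaywrightProxy("http://user:p%40ss@proxy.example.com:8080");
console.log(proxy);

// Usage with Playwright (not executed here):
// const context = await browser.newContext({ proxy });
```

Decoding matters when a password contains reserved characters, since the encoded form would otherwise be sent to the proxy verbatim.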
What's Next?
BaaS provides a wide range of functionalities that help your web scraping process. To discover all the capabilities Browserless has to offer, start with the following guides:
Advanced Features
Learn about more advanced features that you can take advantage of when using BaaS: