/unblock API
Currently, Browserless V2 is available in production via two domains: production-sfo.browserless.io and production-lon.browserless.io.
The /unblock API is designed to bypass bot detection mechanisms such as Cloudflare, Datadome and other passive CAPTCHAs, allowing you to get a bot-protected website's content, cookies, and even a screenshot of the bypassed result.
You can use the exported HTML or screenshot from the response directly, for example by parsing the HTML with Scrapy or Beautiful Soup. If you want to perform further automations with Playwright, Puppeteer or another CDP library, you can connect to the browser instance and inject cookies. Using the /unblock API is charged at 10 units per page.
This API is particularly useful for developers who need to automate web interactions on sites that employ sophisticated bot detection and blocking techniques. It offers four different ways to wait for preconditions to be met before returning a response.
You can check the full OpenAPI schema for all properties and documentation.
Basic Usage
JSON Payload
This is a JSON object containing the URL of the site you wish to unblock, along with the parameters you want in the response. Ideally, you should always set ttl, browserWSEndpoint, and cookies.
{
"url": "https://example.com",
"browserWSEndpoint": true,
"cookies": true,
"content": false,
"screenshot": false,
"ttl": 30000
}
cURL Request
curl --request POST \
--url 'https://production-sfo.browserless.io/unblock?token=GOES-HERE' \
--header 'content-type: application/json' \
--data '{
"url": "https://example.com",
"browserWSEndpoint": true,
"cookies": true,
"content": false,
"screenshot": false,
"ttl": 30000
}'
Which will return a JSON response like this:
{
"browserWSEndpoint": "wss://production-sfo.browserless.io/p/53616c7465645f5f0d2e4012516859fdda7cc1ae0b16c6c5ec739d5d9f19a3d3c9b49c8a814b0fd1beae934b2e8050a0/devtools/browser/102ea3e9-74d7-42c9-a856-1bf254649b9a",
"content": null,
"cookies": [
{
"name": "session_id",
"value": "XYZ123",
"domain": "example.com",
"path": "/",
"secure": true,
"httpOnly": true
}
],
"screenshot": null,
"ttl": 30000
}
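The same request can also be issued from JavaScript. A minimal sketch, assuming Node 18+ (which provides a global `fetch`); the `unblock` function name, its options object, the injectable `fetchImpl` parameter, and the `BROWSERLESS_TOKEN` environment variable are illustrative assumptions rather than part of the Browserless API:

```javascript
// Endpoint taken from the cURL request above.
const UNBLOCK_URL = "https://production-sfo.browserless.io/unblock";

// Hypothetical helper: POSTs the payload shown above and resolves with the parsed JSON.
// `fetchImpl` is injectable only so the function can be exercised without network access.
async function unblock(
  url,
  { token = process.env.BROWSERLESS_TOKEN, ttl = 30000, fetchImpl = fetch } = {}
) {
  const endpoint = `${UNBLOCK_URL}?token=${encodeURIComponent(token)}`;
  const res = await fetchImpl(endpoint, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      url,
      browserWSEndpoint: true,
      cookies: true,
      content: false,
      screenshot: false,
      ttl,
    }),
  });
  if (!res.ok) {
    // Surface the status and body instead of trying to parse an error as JSON.
    throw new Error(`/unblock failed with status ${res.status}: ${await res.text()}`);
  }
  return res.json();
}
```

A helper of this shape resolves with the same object as the JSON response above, so `browserWSEndpoint` and `cookies` can be destructured from its result.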
After receiving the response with the browserWSEndpoint and cookies, you can use Puppeteer, Playwright or another CDP library to connect to the browser instance and inject the cookies to continue your scraping process:
Puppeteer Connection
import puppeteer from "puppeteer";
import unblock from "./utils";
// unblock() is assumed to call the /unblock API and resolve with the parsed JSON response
const { browserWSEndpoint, cookies } = await unblock("https://example.com");
// Connect to the browser instance returned by the API
const browser = await puppeteer.connect({ browserWSEndpoint });
const page = await browser.newPage();
// Inject cookies into the page
await page.setCookie(...cookies);
await page.goto("https://www.targetsite.com");
await page.screenshot({ path: "screenshot.png" });
await browser.close();
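Since Playwright is mentioned as an alternative, here is a rough Playwright equivalent of the connection step. It is a sketch under assumptions: `openUnblockedPage` is a hypothetical name, `chromium` is the object exported by the `playwright` (or `playwright-core`) package and is passed in as an argument, and the returned `browserWSEndpoint` is assumed to be usable as-is with `connectOverCDP`:

```javascript
// Connects to the remote browser over CDP, injects the cookies, and opens the target page.
// `unblockResponse` is the parsed JSON returned by the /unblock API.
async function openUnblockedPage(chromium, unblockResponse, targetUrl) {
  const { browserWSEndpoint, cookies } = unblockResponse;
  const browser = await chromium.connectOverCDP(browserWSEndpoint);
  // Reuse the existing default context if there is one, otherwise create a new one.
  const context = browser.contexts()[0] ?? (await browser.newContext());
  // Playwright expects each cookie to carry either a url or a domain/path pair.
  await context.addCookies(cookies);
  const page = await context.newPage();
  await page.goto(targetUrl);
  return { browser, page };
}
```

Usage would look like `const { chromium } = require("playwright");` followed by `await openUnblockedPage(chromium, response, "https://www.targetsite.com")`, with `browser.close()` called once the work is done.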
Waiting for Things
Browserless offers 4 different ways to wait for preconditions to be met on the page before returning the response. These are events, functions, selectors and timeouts.
waitForEvent
Waits for an event to happen on the page before continuing:
Example
JSON payload
// Will fail since the event never fires on this page,
// but used for demonstration purposes
{
"url": "https://example.com/",
"waitForEvent": {
"event": "fullscreenchange",
"timeout": 5000
}
}
cURL request
curl -X POST \
'https://production-sfo.browserless.io/unblock?token=MY_API_TOKEN' \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/",
"waitForEvent": {
"event": "fullscreenchange",
"timeout": 5000
}
}'
waitForFunction
Waits for the provided function to return before continuing. The function can be any valid JavaScript function, including async functions.
Example
JS function
async () => {
const res = await fetch("https://jsonplaceholder.typicode.com/todos/1");
const json = await res.json();
document.querySelector("h1").innerText = json.title;
};
JSON payload
{
"url": "https://example.com/",
"waitForFunction": {
"fn": "async()=>{let t=await fetch('https://jsonplaceholder.typicode.com/todos/1'),e=await t.json();document.querySelector('h1').innerText=e.title}",
"timeout": 5000
}
}
cURL request
curl -X POST \
'https://production-sfo.browserless.io/unblock?token=MY_API_TOKEN' \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/",
"waitForFunction": {
"fn": "async()=>{let t=await fetch('https://jsonplaceholder.typicode.com/todos/1'),e=await t.json();document.querySelector('h1').innerText=e.title}",
"timeout": 5000
}
}'
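Because fn is sent as a string, one way to avoid hand-minifying the function is to serialize it with `Function.prototype.toString()`. A minimal sketch; `buildWaitForFunctionPayload` is a hypothetical helper name, and it assumes the API accepts unminified source in the same way as the minified string above:

```javascript
// Builds a /unblock payload whose waitForFunction.fn is the source text of `fn`.
function buildWaitForFunctionPayload(url, fn, timeout = 5000) {
  return {
    url,
    waitForFunction: { fn: fn.toString(), timeout },
  };
}

// The function is never called here; only its source text is captured.
const payload = buildWaitForFunctionPayload("https://example.com/", async () => {
  const res = await fetch("https://jsonplaceholder.typicode.com/todos/1");
  const json = await res.json();
  document.querySelector("h1").innerText = json.title;
});

// Request body, ready to POST to the /unblock endpoint.
const body = JSON.stringify(payload);
```

Only the source text is transmitted, so the function cannot rely on variables from the surrounding script; anything it needs has to be written inside the function body.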
waitForSelector
Waits for a selector to appear on the page. If the selector already exists when the API is called, the method returns immediately. If the selector doesn't appear within timeout milliseconds, the API returns a non-200 response code with an error message as the body of the response.
The object can have any of these values:
selector: String, required. A valid CSS selector.
hidden: Boolean, optional. Wait for the selected element to not be found in the DOM or to be hidden, i.e. have display: none or visibility: hidden CSS properties.
timeout: Number, optional. Maximum number of milliseconds to wait for the selector before failing.
visible: Boolean, optional. Wait for the selected element to be present in the DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties.
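To make the expected shape concrete, the field list can be expressed as a small client-side check run before sending a request. This is a sketch only: `checkWaitForSelector` is a hypothetical helper, and treating an empty selector or a negative timeout as invalid is an assumption that the description above does not state.

```javascript
// Checks a waitForSelector object against the documented field types.
// Returns a list of problems; an empty list means the object looks valid.
function checkWaitForSelector(options) {
  const problems = [];
  // selector: String, required (non-empty is an assumption).
  if (typeof options.selector !== "string" || options.selector.length === 0) {
    problems.push("selector must be a non-empty string");
  }
  // hidden / visible: Boolean, optional.
  for (const key of ["hidden", "visible"]) {
    if (key in options && typeof options[key] !== "boolean") {
      problems.push(`${key} must be a boolean`);
    }
  }
  // timeout: Number, optional (non-negative is an assumption).
  if ("timeout" in options && !(Number.isFinite(options.timeout) && options.timeout >= 0)) {
    problems.push("timeout must be a non-negative number of milliseconds");
  }
  return problems;
}
```

For instance, `checkWaitForSelector({ selector: "h1", timeout: 5000 })` yields an empty list, while an object without selector yields one problem.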
Example
JSON payload
{
"url": "https://example.com/",
"waitForSelector": {
"selector": "h1",
"timeout": 5000
}
}
cURL request
curl -X POST \
'https://production-sfo.browserless.io/unblock?token=MY_API_TOKEN' \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/",
"waitForSelector": {
"selector": "h1",
"timeout": 5000
}
}'