/content API
Currently, Browserless V2 is available in production via two domains: production-sfo.browserless.io and production-lon.browserless.io.
The content API allows for simple navigation to a site and capturing the page's content (including the <head> section). Browserless will respond with a Content-Type of text/html and a string of the site's HTML after it has been rendered and evaluated inside the browser. This is useful for capturing the content of a page that has a lot of JavaScript or other interactivity.
You can check the full OpenAPI schema here.
If bot detectors are blocking the /content API, we recommend trying BrowserQL.
Basic Usage
JSON payload
{
"url": "https://example.com/"
}
cURL request
curl -X POST \
https://production-sfo.browserless.io/content?token=MY_API_TOKEN \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/"
}'
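For callers working from Node.js rather than cURL, the same request can be sketched with the built-in fetch (Node 18 or later). The helper names buildContentRequest and fetchContent are illustrative and not part of Browserless; the sketch assumes only the endpoint, token query parameter, and JSON body shown above, and that a non-2xx status signals failure.

```javascript
// Hypothetical helpers; only the endpoint, the token query parameter,
// and the JSON body shape come from the examples above.
const DEFAULT_BASE_URL = "https://production-sfo.browserless.io";

// Build the URL and fetch options for a POST to /content.
function buildContentRequest(token, payload, baseUrl = DEFAULT_BASE_URL) {
  const endpoint = new URL("/content", baseUrl);
  endpoint.searchParams.set("token", token);
  return {
    url: endpoint.toString(),
    init: {
      method: "POST",
      headers: {
        "Cache-Control": "no-cache",
        "Content-Type": "application/json",
      },
      body: JSON.stringify(payload),
    },
  };
}

// Send the request and resolve with the rendered HTML as a string.
async function fetchContent(token, payload, baseUrl = DEFAULT_BASE_URL) {
  const { url, init } = buildContentRequest(token, payload, baseUrl);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Request to /content failed with status ${response.status}`);
  }
  return response.text();
}
```

A call would then look like fetchContent("MY_API_TOKEN", { url: "https://example.com/" }), which resolves to the page's HTML. Keeping the request construction separate from the network call makes the payload easy to inspect before it is sent.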
Rejecting Undesired Requests
You can use rejectResourceTypes and rejectRequestPattern to block undesired content, resources, and requests.
JSON payload
// Will reject any images and .css files
{
"url": "https://browserless.io/",
"rejectResourceTypes": ["image"],
"rejectRequestPattern": ["/^.*\\.(css)"]
}
cURL Request
curl -X POST \
https://production-sfo.browserless.io/content?token=MY_API_TOKEN \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://browserless.io/",
"rejectResourceTypes": ["image"],
"rejectRequestPattern": ["/^.*\\.(css)"]
}'
Navigation Options
You can use gotoOptions to modify the default navigation behavior for the requested URL. The object mirrors Puppeteer's GoToOptions interface.
JSON payload
{
"url": "https://example.com/",
"gotoOptions": { "waitUntil": "networkidle2" }
}
cURL request
curl -X POST \
https://production-sfo.browserless.io/content?token=MY_API_TOKEN \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/",
"gotoOptions": { "waitUntil": "networkidle2" }
}'
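Because gotoOptions mirrors Puppeteer's GoToOptions, a typo in waitUntil is easy to catch locally before the request is sent. The following is a minimal sketch: buildGotoPayload is a hypothetical helper, and the list of accepted values is Puppeteer's set of lifecycle events, not something the /content API is confirmed to validate itself.

```javascript
// Lifecycle events that Puppeteer accepts for waitUntil.
const WAIT_UNTIL_VALUES = ["load", "domcontentloaded", "networkidle0", "networkidle2"];

// Hypothetical helper: build a /content payload after checking gotoOptions.
function buildGotoPayload(url, gotoOptions = {}) {
  const { waitUntil, timeout } = gotoOptions;
  // Puppeteer allows a single event name or an array of names.
  const events = waitUntil === undefined ? [] : [].concat(waitUntil);
  for (const event of events) {
    if (!WAIT_UNTIL_VALUES.includes(event)) {
      throw new TypeError(`Unknown waitUntil value: ${event}`);
    }
  }
  if (timeout !== undefined && (!Number.isFinite(timeout) || timeout < 0)) {
    throw new TypeError("timeout must be a non-negative number of milliseconds");
  }
  return { url, gotoOptions };
}

// Produces the same JSON body as the payload above.
console.log(JSON.stringify(buildGotoPayload("https://example.com/", { waitUntil: "networkidle2" })));
```

The check is deliberately narrow: it covers only waitUntil and timeout and passes any other GoToOptions fields through untouched.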
Continue on error
You can use bestAttempt to make Browserless attempt to proceed when async events fail or time out. This includes things like the goto or waitForSelector properties in the JSON payload.
JSON payload
{
"url": "https://example.com/",
"bestAttempt": true,
// This would fail without bestAttempt
"waitForSelector": { "selector": "table", "timeout": 500 }
}
cURL request
curl -X POST \
https://production-sfo.browserless.io/content?token=MY_API_TOKEN \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/",
"bestAttempt": true,
"waitForSelector": { "selector": "table", "timeout": 500 }
}'
Waiting for Things
Browserless offers 4 different ways to wait for preconditions to be met on the page: events, functions, selectors, and timeouts.
waitForEvent
Waits for an event to happen on the page before continuing.
Example
JSON payload
// Will fail since the event never fires
{
"url": "https://example.com/",
"waitForEvent": {
"event": "fullscreenchange",
"timeout": 5000
}
}
cURL request
curl -X POST \
https://production-sfo.browserless.io/content?token=MY_API_TOKEN \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/",
"waitForEvent": {
"event": "fullscreenchange",
"timeout": 5000
}
}'
waitForFunction
Waits for the provided function to return before continuing. The function can be any valid JavaScript function, and async functions are supported.
Example
JS function
async () => {
const res = await fetch('https://jsonplaceholder.typicode.com/todos/1');
const json = await res.json();
document.querySelector("h1").innerText = json.title;
}
JSON payload
{
"url": "https://example.com/",
"waitForFunction": {
"fn": "async()=>{let t=await fetch('https://jsonplaceholder.typicode.com/todos/1'),e=await t.json();document.querySelector('h1').innerText=e.title}",
"timeout": 5000
}
}
cURL request
curl -X POST \
https://production-sfo.browserless.io/content?token=MY_API_TOKEN \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/",
"waitForFunction": {
"fn": "async()=>{let t=await fetch('https://jsonplaceholder.typicode.com/todos/1'),e=await t.json();document.querySelector('h1').innerText=e.title}",
"timeout": 5000
}
}'
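The fn field is a string, so hand-minifying the function as in the payload above is one option; another is to let JavaScript serialize it. The sketch below assumes the function is written as an arrow function or function expression, so that Function.prototype.toString yields source text that is valid on its own. buildWaitForFunctionPayload is a hypothetical helper name.

```javascript
// Hypothetical helper: turn a function into the string the "fn" field expects.
function buildWaitForFunctionPayload(url, fn, timeout = 5000) {
  if (typeof fn !== "function") {
    throw new TypeError("fn must be a function");
  }
  return {
    url,
    // toString() returns the function's source text.
    waitForFunction: { fn: fn.toString(), timeout },
  };
}

// The function is never called here; only its source text is sent,
// so it can reference browser globals such as document.
const payload = buildWaitForFunctionPayload("https://example.com/", async () => {
  const res = await fetch("https://jsonplaceholder.typicode.com/todos/1");
  const json = await res.json();
  document.querySelector("h1").innerText = json.title;
});

console.log(JSON.stringify(payload, null, 2));
```

Only the source text travels with the request, so the function cannot rely on variables captured from the surrounding Node.js scope.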
waitForSelector
Waits for a selector to appear in the page. If the selector already exists when the method is called, the method returns immediately. If the selector doesn't appear within the timeout (in milliseconds), the function will throw.
The object can have any of these values:
selector: String, required. A valid CSS selector.
hidden: Boolean, optional. Wait for the selected element to not be found in the DOM or to be hidden, i.e. have display: none or visibility: hidden CSS properties.
timeout: Number, optional. Maximum number of milliseconds to wait for the selector before failing.
visible: Boolean, optional. Wait for the selected element to be present in the DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties.
Example
JSON payload
// Waits up to 5 seconds for an h1 element to appear
{
"url": "https://example.com/",
"waitForSelector": {
"selector": "h1",
"timeout": 5000
}
}
cURL request
curl -X POST \
https://production-sfo.browserless.io/content?token=MY_API_TOKEN \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{
"url": "https://example.com/",
"waitForSelector": {
"selector": "h1",
"timeout": 5000
}
}'
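The field list for the waitForSelector object can also be checked locally before a request is made. This is a minimal sketch based only on the types listed above (selector as a required string, hidden and visible as optional booleans, timeout as an optional number); buildWaitForSelectorPayload is a hypothetical helper, and the API may apply further validation of its own.

```javascript
// Hypothetical helper: check a waitForSelector object against the documented fields.
function buildWaitForSelectorPayload(url, options) {
  const { selector, hidden, visible, timeout } = options;
  if (typeof selector !== "string" || selector.length === 0) {
    throw new TypeError("selector is required and must be a non-empty string");
  }
  for (const [name, value] of Object.entries({ hidden, visible })) {
    if (value !== undefined && typeof value !== "boolean") {
      throw new TypeError(`${name} must be a boolean`);
    }
  }
  if (timeout !== undefined && typeof timeout !== "number") {
    throw new TypeError("timeout must be a number of milliseconds");
  }
  return { url, waitForSelector: options };
}

// Wait up to 5 seconds for a visible h1 element.
const selectorPayload = buildWaitForSelectorPayload("https://example.com/", {
  selector: "h1",
  visible: true,
  timeout: 5000,
});

console.log(JSON.stringify(selectorPayload));
```

Failing early on a malformed object avoids spending a browser session on a request that cannot succeed.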