Scrape Structured Data

Extract structured data from fully rendered JavaScript pages using CSS selectors, BrowserQL, or your preferred framework.

Prerequisites

Steps

Use the /scrape REST endpoint to extract structured data from a page. No WebSocket connection needed.

1. Build the request

Append your API token to the scrape endpoint as the token query parameter. The selectors you want to extract go in the JSON request body, shown in step 2:

https://production-sfo.browserless.io/scrape?token=YOUR_API_TOKEN_HERE

2. Send the request

curl -X POST \
  "https://production-sfo.browserless.io/scrape?token=YOUR_API_TOKEN_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "elements": [
      { "selector": "h1" },
      { "selector": "p" }
    ]
  }'
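
If you would rather send the same request from a script, the curl command above can be sketched in Python using only the standard library. The helper names build_scrape_request and scrape are illustrative assumptions, not part of any Browserless SDK; the endpoint, token placeholder, and body shape are copied from the steps above.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from step 1; the token is passed as a query parameter.
SCRAPE_URL = "https://production-sfo.browserless.io/scrape"


def build_scrape_request(token, page_url, selectors):
    """Build (but do not send) the POST request for the /scrape endpoint.

    Hypothetical helper: it mirrors the curl command above one-to-one.
    """
    body = {
        "url": page_url,
        # One {"selector": ...} object per CSS selector, as in the curl body.
        "elements": [{"selector": selector} for selector in selectors],
    }
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{SCRAPE_URL}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(token, page_url, selectors, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_scrape_request(token, page_url, selectors)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling scrape("YOUR_API_TOKEN_HERE", "https://example.com", ["h1", "p"]) with a real token should return the same JSON that the curl command prints. Building the request separately from sending it makes it possible to inspect the URL and body without a network call.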

3. Check the output

The response is JSON with a data array. Each item corresponds to one selector, and its results array lists the matched elements with their attributes, text, HTML, dimensions, and position:

{
  "data": [
    {
      "selector": "h1",
      "results": [
        {
          "attributes": [],
          "height": 38,
          "html": "Example Domain",
          "left": 28,
          "text": "Example Domain",
          "top": 160,
          "width": 716
        }
      ]
    },
    {
      "selector": "p",
      "results": [
        {
          "attributes": [],
          "height": 48,
          "html": "This domain is for use in illustrative examples...",
          "left": 28,
          "text": "This domain is for use in illustrative examples...",
          "top": 220,
          "width": 716
        }
      ]
    }
  ]
}
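
To pull just the text out of a response shaped like the one above, a small helper can walk the data array and collect the text of each match. This is a minimal sketch: texts_by_selector is a hypothetical name, and the sample dict is a trimmed copy of the response shown above rather than live output.

```python
def texts_by_selector(response):
    """Map each selector to the list of text values of its matched elements."""
    return {
        item["selector"]: [match["text"] for match in item.get("results", [])]
        for item in response.get("data", [])
    }


# Trimmed copy of the sample response above (layout fields omitted).
sample = {
    "data": [
        {"selector": "h1", "results": [{"attributes": [], "text": "Example Domain"}]},
        {
            "selector": "p",
            "results": [
                {
                    "attributes": [],
                    "text": "This domain is for use in illustrative examples...",
                }
            ],
        },
    ]
}

print(texts_by_selector(sample))
# {'h1': ['Example Domain'], 'p': ['This domain is for use in illustrative examples...']}
```

A selector that matches nothing is assumed here to come back with an empty or missing results list, which the helper maps to an empty list.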

Next steps