Search API

🚧 BETA 🚧

The Search API is currently in beta. Parameters and response shapes may change in future releases.

The Web Search API is only available on Cloud plans. Contact us for more information.

Search the web and optionally scrape each result page, returning structured, LLM-ready data. Combine multiple sources (web, news, images), geo-targeting, time filters, and category filters in a single request. When scraping is enabled, each result URL is fetched and processed into your preferred format: clean markdown, raw HTML, extracted links, or a screenshot.

Endpoint

  • Method: POST
  • Path: /search
  • Auth: token query parameter (?token=)
  • Content-Type: application/json
  • Response: application/json
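The endpoint details above map directly onto a plain HTTP client. The following is a minimal sketch using only Python's standard library. It assumes the regional base URL used in the curl examples on this page, and `build_search_request` and `send` are hypothetical helper names, not part of any Browserless SDK.

```python
import json
import urllib.parse
import urllib.request

# Base URL copied from the curl examples on this page (an assumption for your account).
BASE_URL = "https://production-sfo.browserless.io"


def build_search_request(token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /search request."""
    # Auth is passed as the `token` query parameter.
    url = f"{BASE_URL}/search?{urllib.parse.urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the request does not touch the network; call send(request) to execute it.
request = build_search_request("YOUR_API_TOKEN_HERE", {"query": "browserless.io"})
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any credits are spent.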

Quickstart

curl --request POST \
--url 'https://production-sfo.browserless.io/search?token=YOUR_API_TOKEN_HERE' \
--header 'Content-Type: application/json' \
--data '{
"query": "browserless.io"
}'

Response

{
"success": true,
"data": {
"web": [
{
"title": "Browserless - Headless Browser Automation",
"url": "https://www.browserless.io/",
"description": "Headless browser automation, without the hosting headaches.",
"position": 1
},
{
"title": "Browserless Documentation",
"url": "https://docs.browserless.io/",
"description": "Official documentation for the Browserless platform.",
"position": 2
}
// ...more results up to the specified limit
]
},
"totalResults": 10
}
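To consume a response like the one above, a small helper can pull out the ranked web results. This is a sketch only: the sample is trimmed to the two results shown (with `totalResults` adjusted to match), `ranked_web_results` is a hypothetical name, and treating a falsy `success` as an error is an assumption, since failure responses are not documented here.

```python
import json

# Trimmed copy of the quickstart response (only the two results shown above).
SAMPLE = json.loads("""
{
  "success": true,
  "data": {
    "web": [
      {"title": "Browserless - Headless Browser Automation",
       "url": "https://www.browserless.io/",
       "description": "Headless browser automation, without the hosting headaches.",
       "position": 1},
      {"title": "Browserless Documentation",
       "url": "https://docs.browserless.io/",
       "description": "Official documentation for the Browserless platform.",
       "position": 2}
    ]
  },
  "totalResults": 2
}
""")


def ranked_web_results(response: dict) -> list:
    """Return (position, title, url) tuples for web results, ordered by position."""
    # Assumption: an unsuccessful search carries a falsy "success" flag.
    if not response.get("success"):
        raise ValueError("search did not succeed")
    results = response.get("data", {}).get("web", [])
    return sorted((r["position"], r["title"], r["url"]) for r in results)


for position, title, url in ranked_web_results(SAMPLE):
    print(f"{position}. {title} <{url}>")
```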

Plan limits

The maximum number of results per search request (limit) varies by plan. The same cap applies whether you are performing a plain search, a search with scraping, or an image search.

| Plan | Max results per search |
| --- | --- |
| Free | 1 |
| Prototyping | 3 |
| Starter | 5 |
| Scale and above | 10 |

If you request a limit higher than your plan allows, the API returns a 400 error. When limit is omitted, it defaults to 10 (or your plan's maximum, whichever is lower).

For self-hosted Enterprise deployments, each of these limits defaults to 10 and can be configured via the MAX_SEARCH_RESULTS, MAX_SCRAPE_RESULTS, and MAX_SCRAPE_IMAGE_RESULTS environment variables.
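Because an over-limit request is rejected with a 400, it can help to check `limit` on the client first. The sketch below encodes the Cloud plan caps from the table above. The plan keys (`"free"`, `"scale"`, and so on) and the `effective_limit` helper are hypothetical labels chosen for illustration; the API does not take a plan name.

```python
# Caps copied from the plan table; "scale" stands for "Scale and above".
PLAN_MAX_RESULTS = {"free": 1, "prototyping": 3, "starter": 5, "scale": 10}
DEFAULT_LIMIT = 10


def effective_limit(plan: str, requested=None) -> int:
    """Return the limit the API would apply, or raise if it would answer 400."""
    cap = PLAN_MAX_RESULTS[plan]
    if requested is None:
        # Omitted limit: 10 or the plan maximum, whichever is lower.
        return min(DEFAULT_LIMIT, cap)
    if requested < 1:
        raise ValueError("limit must be a positive integer")
    if requested > cap:
        raise ValueError(
            f"limit {requested} exceeds the {plan} plan cap of {cap}; "
            "the API would return a 400 error"
        )
    return requested


print(effective_limit("starter"))      # omitted limit on Starter
print(effective_limit("scale", 7))     # explicit limit within the cap
```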

Multiple sources

Search across web, news, and images simultaneously. Each source returns its own array in the data object.

curl --request POST \
--url 'https://production-sfo.browserless.io/search?token=YOUR_API_TOKEN_HERE' \
--header 'Content-Type: application/json' \
--data '{
"query": "latest AI news",
"sources": ["web", "news", "images"],
"limit": 1
}'

Response

{
"success": true,
"data": {
"web": [
{
"title": "AI News - Latest Developments",
"url": "https://example.com/ai-news",
"description": "Breaking AI and machine learning news.",
"position": 1
}
],
"news": [
{
"title": "OpenAI Announces New Model",
"url": "https://example.com/openai-announcement",
"description": "OpenAI has released a new language model...",
"date": "2025-01-15",
"position": 1
}
],
"images": [
{
"title": "AI Generated Art",
"imageUrl": "https://example.com/image.jpg",
"imageWidth": 1200,
"imageHeight": 800,
"url": "https://example.com/gallery",
"position": 1
}
]
},
"totalResults": 3
}
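Since each source arrives as its own array under `data`, downstream code often wants a single flat list. The following sketch flattens the sample response above; `flatten_sources` is a hypothetical helper, and it relies only on the `position`, `title`, and `url` fields that appear in every result shown in that sample.

```python
# The multi-source sample response from above, as a Python dict.
SAMPLE = {
    "success": True,
    "data": {
        "web": [{
            "title": "AI News - Latest Developments",
            "url": "https://example.com/ai-news",
            "description": "Breaking AI and machine learning news.",
            "position": 1,
        }],
        "news": [{
            "title": "OpenAI Announces New Model",
            "url": "https://example.com/openai-announcement",
            "description": "OpenAI has released a new language model...",
            "date": "2025-01-15",
            "position": 1,
        }],
        "images": [{
            "title": "AI Generated Art",
            "imageUrl": "https://example.com/image.jpg",
            "imageWidth": 1200,
            "imageHeight": 800,
            "url": "https://example.com/gallery",
            "position": 1,
        }],
    },
    "totalResults": 3,
}


def flatten_sources(response: dict) -> list:
    """Merge the per-source arrays into one list, tagging each row with its source."""
    rows = []
    for source, items in response["data"].items():
        for item in items:
            rows.append({
                "source": source,
                "position": item["position"],
                "title": item["title"],
                "url": item["url"],
            })
    return rows


for row in flatten_sources(SAMPLE):
    print(row["source"], row["position"], row["title"])
```

In this sample, `totalResults` equals the number of flattened rows, which suggests it counts results across all sources.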

Search with scraping

When scrapeOptions is provided, Browserless fetches each result URL and returns the content in your requested format. This is useful for building RAG pipelines, knowledge bases, and other LLM-powered workflows.

curl --request POST \
--url 'https://production-sfo.browserless.io/search?token=YOUR_API_TOKEN_HERE' \
--header 'Content-Type: application/json' \
--data '{
"query": "web scraping best practices",
"limit": 3,
"scrapeOptions": {
"formats": ["markdown"],
"onlyMainContent": true
}
}'

Response

{
"success": true,
"data": {
"web": [
{
"title": "Web Scraping Best Practices Guide",
"url": "https://example.com/scraping-guide",
"description": "A comprehensive guide to ethical web scraping.",
"position": 1,
"markdown": "# Web Scraping Best Practices\n\nWeb scraping is a powerful technique for extracting data...",
"metadata": {
"statusCode": 200,
"strategy": "http-fetch"
}
},
// ...more results up to the specified limit
]
},
"totalResults": 3
}
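For a RAG pipeline, the scraped results usually need to become plain documents. The sketch below shows one way to do that. The second sample entry is invented to illustrate a failed scrape; how the API actually reports such failures is an assumption here, as is the `to_documents` helper name.

```python
SAMPLE = {
    "success": True,
    "data": {
        "web": [
            {
                "title": "Web Scraping Best Practices Guide",
                "url": "https://example.com/scraping-guide",
                "description": "A comprehensive guide to ethical web scraping.",
                "position": 1,
                "markdown": "# Web Scraping Best Practices\n\nWeb scraping is a powerful technique for extracting data...",
                "metadata": {"statusCode": 200, "strategy": "http-fetch"},
            },
            {
                # Hypothetical failed scrape, added only for illustration.
                "title": "Unavailable page",
                "url": "https://example.com/gone",
                "description": "This entry is not from the documentation.",
                "position": 2,
                "metadata": {"statusCode": 404, "strategy": "http-fetch"},
            },
        ]
    },
    "totalResults": 2,
}


def to_documents(response: dict) -> list:
    """Keep results that were scraped successfully and carry markdown content."""
    documents = []
    for result in response["data"].get("web", []):
        status = result.get("metadata", {}).get("statusCode")
        text = result.get("markdown")
        # Assumption: a usable scrape has status 200 and a non-empty markdown field.
        if status != 200 or not text:
            continue
        documents.append({"source": result["url"], "title": result["title"], "text": text})
    return documents


for doc in to_documents(SAMPLE):
    print(doc["source"], len(doc["text"]), "characters")
```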

Language targeting

Use the lang parameter to return results in a specific language. It accepts standard language codes (e.g. "en", "es", "de", "fr", "ja") and defaults to "en" when not specified.

curl --request POST \
--url 'https://production-sfo.browserless.io/search?token=YOUR_API_TOKEN_HERE' \
--header 'Content-Type: application/json' \
--data '{
"query": "tutorial de Python",
"lang": "es"
}'

Response

{
"success": true,
"data": {
"web": [
{
"title": "El tutorial de Python - documentación de Python - 3.14.3",
"url": "https://docs.python.org/es/3/tutorial/",
"description": "El intérprete de Python es fácilmente extensible con funciones y tipos de datos implementados en C o C++...",
"position": 1
}
// ...more results up to the specified limit
]
},
"totalResults": 10
}

Category filters

Focus your search on specific content types. Categories modify the search query to target GitHub repositories, research papers, or PDF documents.

curl --request POST \
--url 'https://production-sfo.browserless.io/search?token=YOUR_API_TOKEN_HERE' \
--header 'Content-Type: application/json' \
--data '{
"query": "puppeteer automation",
"categories": ["github"]
}'

Response

{
"success": true,
"data": {
"web": [
{
"title": "GitHub - puppeteer/puppeteer: JavaScript API for Chrome and Firefox · GitHub",
"url": "https://github.com/puppeteer/puppeteer",
"description": "import puppeteer from 'puppeteer'; // Or import puppeteer from 'puppeteer-core'; // Launch the browser and open a new blank page. const browser = await puppeteer.launch(); const page = await browser.newPage(); // Navigate the page to a URL. await page.goto('https://developer.chrome.com/'); // Set screen size. await page.setViewport({width: 1080, height: 1024}); // Open the search menu using the keyboard. await page.keyboard.press('/'); // Type into search box using accessible input name. await page.locator('::-p-aria(Search)').fill('automate beyond recorder'); // Wait and click on first result.",
"position": 1
},
{
"title": "GitHub - addyosmani/puppeteer-webperf: Automating Web Performance testing with Puppeteer 🎪",
"url": "https://github.com/addyosmani/puppeteer-webperf",
"description": "🕹 Puppeteer is a Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. This repository has recipes for automating Web Performance measurement with Puppeteer.",
"position": 2
},
// ...more results up to the specified limit
]
},
"totalResults": 10
}

Available categories:

| Category | Description |
| --- | --- |
| "github" | Restricts results to GitHub repositories |
| "research" | Targets academic sources (arxiv.org, scholar.google.com, etc.) |
| "pdf" | Filters for PDF documents only |

Time-based filtering

Restrict results to a specific time range with the tbs parameter. This is useful for finding recent content or filtering out outdated results.

curl --request POST \
--url 'https://production-sfo.browserless.io/search?token=YOUR_API_TOKEN_HERE' \
--header 'Content-Type: application/json' \
--data '{
"query": "browserless.io blog",
"tbs": "month"
}'

Response

{
"success": true,
"data": {
"web": [
{
"title": "Browserless blog",
"url": "https://www.browserless.io/blog",
"description": "4 days ago - Discover insights, tips, and updates on our headless browser platform, empowering seamless automation at scale. Dive into our blog now!",
"position": 1
},
{
"title": "GraphQL vs. REST for Web Scraping APIs: A Practical Guide",
"url": "https://www.browserless.io/blog/graphql-vs-rest",
"description": "4 days ago - This guide is a practical GraphQL vs REST decision framework for scraping and automation. We'll cover how data shape, rate limits, latency, pagination, ...",
"position": 2
},
// ...more results up to the specified limit
]
},
"totalResults": 10
}

Request body

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | (required) | Search query string |
| limit | number | 10 | Max results per source (capped by plan limits) |
| lang | string | "en" | Language code (e.g. "en", "de", "fr") |
| country | string | none | Two-letter country code (e.g. "us", "gb") |
| location | string | none | Location for local results (e.g. "San Francisco") |
| tbs | string | none | Time filter: "day", "week", "month", "year" (or "qdr:d", "qdr:w", "qdr:m", "qdr:y") |
| categories | string[] | none | Category filters: "github", "research", "pdf" |
| sources | string[] | ["web"] | Source types: "web", "news", "images" |
| timeout | number | server default | Request timeout in milliseconds |
| scrapeOptions | object | none | When provided, scrapes each result URL (see below) |
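The parameters above can be assembled and checked on the client before a request is sent. The following is a sketch, not an official client: `build_search_body` and its keyword names are hypothetical, the allowed values are copied from the table, and rejecting unknown values locally is a design choice rather than documented API behavior.

```python
from typing import Optional, Sequence

# Allowed values copied from the request body table.
TIME_FILTERS = {"day", "week", "month", "year", "qdr:d", "qdr:w", "qdr:m", "qdr:y"}
CATEGORIES = {"github", "research", "pdf"}
SOURCES = {"web", "news", "images"}


def _check_all(name: str, values: Sequence[str], allowed: set) -> list:
    """Return values as a list, raising if any entry is not in the allowed set."""
    unknown = sorted(set(values) - allowed)
    if unknown:
        raise ValueError(f"unsupported {name}: {unknown}")
    return list(values)


def build_search_body(
    query: str,
    *,
    limit: Optional[int] = None,
    lang: Optional[str] = None,
    country: Optional[str] = None,
    location: Optional[str] = None,
    tbs: Optional[str] = None,
    categories: Optional[Sequence[str]] = None,
    sources: Optional[Sequence[str]] = None,
    timeout: Optional[int] = None,
    scrape_options: Optional[dict] = None,
) -> dict:
    """Build the JSON body for POST /search, omitting parameters left unset."""
    if not query.strip():
        raise ValueError("query is required")
    if tbs is not None and tbs not in TIME_FILTERS:
        raise ValueError(f"unsupported tbs: {tbs!r}")
    optional = {
        "limit": limit,
        "lang": lang,
        "country": country,
        "location": location,
        "tbs": tbs,
        "categories": None if categories is None else _check_all("categories", categories, CATEGORIES),
        "sources": None if sources is None else _check_all("sources", sources, SOURCES),
        "timeout": timeout,
        "scrapeOptions": scrape_options,
    }
    body = {"query": query}
    # Unset parameters are left out so the server-side defaults apply.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


print(build_search_body("puppeteer automation", categories=["github"], tbs="month", limit=3))
```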

scrapeOptions

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| formats | string[] | (required) | Output formats: "markdown", "html", "links", "screenshot" |
| onlyMainContent | boolean | false | Extract article body using Readability |
| stripNonContentTags | boolean | false | Remove <script>, <style>, <noscript> elements |
| removeBase64Images | boolean | false | Strip inline base64 data URIs from images |
| includeTags | string[] | none | CSS selectors to include |
| excludeTags | string[] | none | CSS selectors to exclude |
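A scrapeOptions object can be built the same way. This sketch uses a hypothetical `build_scrape_options` helper with the format names and defaults from the table above. Treating an empty `formats` list as an error is an assumption; the table only states that the field is required.

```python
from typing import Optional, Sequence

# Output formats copied from the scrapeOptions table.
FORMATS = {"markdown", "html", "links", "screenshot"}


def build_scrape_options(
    formats: Sequence[str],
    *,
    only_main_content: bool = False,
    strip_non_content_tags: bool = False,
    remove_base64_images: bool = False,
    include_tags: Optional[Sequence[str]] = None,
    exclude_tags: Optional[Sequence[str]] = None,
) -> dict:
    """Build a scrapeOptions object, validating the requested formats."""
    # Assumption: an empty list is treated the same as a missing required field.
    if not formats:
        raise ValueError("formats is required")
    unknown = sorted(set(formats) - FORMATS)
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")
    options = {
        "formats": list(formats),
        "onlyMainContent": only_main_content,
        "stripNonContentTags": strip_non_content_tags,
        "removeBase64Images": remove_base64_images,
    }
    # Selector lists are only sent when provided.
    if include_tags:
        options["includeTags"] = list(include_tags)
    if exclude_tags:
        options["excludeTags"] = list(exclude_tags)
    return options


print(build_scrape_options(["markdown"], only_main_content=True, exclude_tags=["nav", "footer"]))
```

The resulting dict can be passed as the `scrapeOptions` value of the request body.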