Map API

Discover all URLs on a website. Send a base URL and receive a deduplicated list of pages with optional titles and descriptions. Use the search parameter to rank results by relevance, and sitemap to control whether discovery uses the site's XML sitemap, on-page links, or both.

Endpoint

  • Method: POST
  • Path: /map
  • Auth: token query parameter (?token=)
  • Content-Type: application/json
  • Response: application/json

Quickstart

curl --request POST \
  --url 'https://production-sfo.browserless.io/map?token=YOUR_API_TOKEN_HERE' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://www.browserless.io"
  }'

Response

{
  "success": true,
  "links": [
    {
      "url": "https://www.browserless.io",
      "title": "Browserless - Headless Browser Automation",
      "description": "Headless browser automation, without the hosting headaches."
    },
    {
      "url": "https://www.browserless.io/pricing",
      "title": "Pricing",
      "description": "Simple, transparent pricing for headless browser automation."
    },
    {
      "url": "https://www.browserless.io/blog",
      "title": "Blog",
      "description": "Insights, tips, and updates on headless browser automation."
    }
    // ...more URLs up to the specified limit
  ]
}
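
The same request can be sent from Python using only the standard library. The following is a minimal sketch rather than an official client: the helper names `build_map_request` and `map_site` are hypothetical, and the endpoint is the one used in the curl example. Building the request is kept separate from sending it so the payload can be inspected without any network access. How a failed call is reported is not documented here, so the `success` check is an assumption.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example; adjust it if your account uses another host.
MAP_ENDPOINT = "https://production-sfo.browserless.io/map"


def build_map_request(token: str, url: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST /map request.

    `options` are passed through as extra body fields, e.g. limit=10.
    """
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps({"url": url, **options}).encode("utf-8")
    return urllib.request.Request(
        f"{MAP_ENDPOINT}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def map_site(token: str, url: str, **options) -> list:
    """Send the request and return the `links` array from the response."""
    request = build_map_request(token, url, **options)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: an unsuccessful call still returns JSON with success=false.
    if not payload.get("success"):
        raise RuntimeError("map request did not report success")
    return payload.get("links", [])
```

Calling `map_site("YOUR_API_TOKEN_HERE", "https://www.browserless.io", limit=10)` would perform the live request; `build_map_request` alone is enough to check the URL, headers, and body.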

Search relevance

Use the search parameter to order results by relevance to a query. This is useful when you only need URLs related to a specific topic from a large site.

curl --request POST \
  --url 'https://production-sfo.browserless.io/map?token=YOUR_API_TOKEN_HERE' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://www.browserless.io",
    "search": "pricing",
    "limit": 10
  }'

Response

{
  "success": true,
  "links": [
    {
      "url": "https://www.browserless.io/pricing",
      "title": "Pricing",
      "description": "Simple, transparent pricing for headless browser automation."
    },
    {
      "url": "https://www.browserless.io/blog/pricing-update"
    }
    // URLs ordered by relevance to "pricing"
  ]
}

Sitemap control

The sitemap parameter controls how the API uses sitemaps for URL discovery:

| Mode | Description |
| --- | --- |
| "include" | (default) Uses both sitemap and on-page links for discovery. |
| "skip" | Ignores the sitemap. Only discovers URLs found on the page itself. |
| "only" | Returns only URLs listed in the sitemap. |

curl --request POST \
  --url 'https://production-sfo.browserless.io/map?token=YOUR_API_TOKEN_HERE' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://www.browserless.io",
    "sitemap": "only"
  }'

Geo-targeting

Use the location parameter to route requests through a proxy in a specific country. This is useful for discovering region-specific URLs or bypassing geographic restrictions.

curl --request POST \
  --url 'https://production-sfo.browserless.io/map?token=YOUR_API_TOKEN_HERE' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://example.com",
    "location": {
      "country": "gb"
    }
  }'
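
One way to use geo-targeting is to map the same site from two countries and compare what comes back. The sketch below assumes you already hold two `links` arrays from separate /map calls; `region_only_urls` is a hypothetical helper, not part of the API, and the sample data is made up for illustration.

```python
def region_only_urls(links_a: list, links_b: list) -> tuple:
    """Return (URLs only in A, URLs only in B), each sorted for stable output."""
    urls_a = {link["url"] for link in links_a}
    urls_b = {link["url"] for link in links_b}
    return sorted(urls_a - urls_b), sorted(urls_b - urls_a)


# Illustrative data, not real API output.
gb_links = [{"url": "https://example.com/"}, {"url": "https://example.com/uk-offers"}]
us_links = [{"url": "https://example.com/"}, {"url": "https://example.com/us-offers"}]

only_gb, only_us = region_only_urls(gb_links, us_links)
print(only_gb)  # ['https://example.com/uk-offers']
print(only_us)  # ['https://example.com/us-offers']
```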

Filtering options

Control which URLs are included in results:

curl --request POST \
  --url 'https://production-sfo.browserless.io/map?token=YOUR_API_TOKEN_HERE' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://www.browserless.io",
    "limit": 100,
    "includeSubdomains": false,
    "ignoreQueryParameters": true
  }'
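
To see why ignoring query parameters reduces duplicates, the sketch below shows a client-side approximation of the idea. It is only an illustration: the function names are hypothetical, and the server's actual normalization rules are not documented here and may differ (for example, in how fragments or trailing slashes are treated).

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Drop the query string from a URL, keeping scheme, host, path, and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


def dedupe_ignoring_query(urls: list) -> list:
    """Deduplicate URLs after removing query strings, preserving first-seen order."""
    return list(dict.fromkeys(strip_query(u) for u in urls))


# Two paginated variants of the same page collapse into a single entry.
pages = [
    "https://example.com/blog?page=1",
    "https://example.com/blog?page=2",
    "https://example.com/pricing",
]
print(dedupe_ignoring_query(pages))
# ['https://example.com/blog', 'https://example.com/pricing']
```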

Request body

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | | The base URL to discover links from. Must be http:// or https://. |
| search | string | No | | Search query to order results by relevance. |
| limit | number | No | 5000 | Maximum number of URLs to return. Must be between 1 and 5000. |
| timeout | number | No | Server default | Request timeout in milliseconds. Capped at the server-configured maximum. |
| sitemap | string | No | "include" | Sitemap behavior: "include", "skip", or "only". |
| includeSubdomains | boolean | No | true | Whether to include URLs from subdomains of the base URL. |
| ignoreQueryParameters | boolean | No | true | Exclude query parameters from discovered URLs, reducing duplicates. |
| location | object | No | | Geo-targeting settings. See below. |
| location.country | string | No | "us" | Country code for proxy routing (e.g., "us", "gb", "de"). |
| location.languages | string[] | No | | Preferred languages for the request. |
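
The constraints in the table can be checked on the client before a request is sent. The sketch below is one possible approach, not the server's validation logic: `validate_map_body` is a hypothetical helper, it covers only the rules stated above (URL scheme, limit range, sitemap mode), and it assumes `limit` is a whole number.

```python
from urllib.parse import urlsplit

# Allowed values for the sitemap field, as listed in the table.
SITEMAP_MODES = ("include", "skip", "only")


def validate_map_body(body: dict) -> list:
    """Return a list of problems found in a /map request body (empty if none)."""
    problems = []

    url = body.get("url")
    if not isinstance(url, str) or urlsplit(url).scheme not in ("http", "https"):
        problems.append("url is required and must be http:// or https://")

    limit = body.get("limit")
    if limit is not None:
        # Assumption: limit is an integer; bool is excluded because it subclasses int.
        is_int = isinstance(limit, int) and not isinstance(limit, bool)
        if not (is_int and 1 <= limit <= 5000):
            problems.append("limit must be between 1 and 5000")

    sitemap = body.get("sitemap")
    if sitemap is not None and sitemap not in SITEMAP_MODES:
        problems.append('sitemap must be "include", "skip", or "only"')

    return problems
```

An empty list means the body passed these checks; the server may still reject it for reasons not covered here.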

Response fields

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | Whether the mapping operation succeeded. |
| links | MapLink[] | Array of discovered URLs with optional metadata. |
| links[].url | string | The discovered URL. |
| links[].title | string or undefined | Page title, when available. |
| links[].description | string or undefined | Page description, when available. |
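
Because `title` and `description` may be absent, client code should treat them as optional. The sketch below parses a response into typed records. The `MapLink` dataclass mirrors the field names in the table, but the class itself and `parse_map_response` are hypothetical, and the shape of a failed response is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class MapLink:
    url: str
    title: Optional[str] = None        # missing when the page title is unavailable
    description: Optional[str] = None  # missing when no description is available


def parse_map_response(raw: str) -> list:
    """Parse a /map JSON response body into a list of MapLink records."""
    payload = json.loads(raw)
    # Assumption: a failed call is signalled by success=false.
    if not payload.get("success"):
        raise ValueError("response did not report success")
    return [
        MapLink(url=item["url"], title=item.get("title"), description=item.get("description"))
        for item in payload.get("links", [])
    ]


# Trimmed sample based on the search example in this document.
sample = json.dumps({
    "success": True,
    "links": [
        {"url": "https://www.browserless.io/pricing", "title": "Pricing"},
        {"url": "https://www.browserless.io/blog/pricing-update"},
    ],
})
links = parse_map_response(sample)
```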

Configuration options

The /map API supports a timeout query parameter to control the maximum time allowed for the mapping operation:

POST /map?token=YOUR_API_TOKEN_HERE&timeout=30000

The timeout value is in milliseconds. If not specified, the server default timeout is used.
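
When building the request URL in code, it is safer to encode the query string than to concatenate it by hand. A minimal sketch, assuming the host from the curl examples; `map_url` is a hypothetical helper:

```python
from typing import Optional
from urllib.parse import urlencode


def map_url(token: str, timeout_ms: Optional[int] = None,
            base: str = "https://production-sfo.browserless.io") -> str:
    """Build the /map URL with the token and an optional timeout in milliseconds."""
    params = {"token": token}
    if timeout_ms is not None:
        params["timeout"] = str(timeout_ms)
    return f"{base}/map?{urlencode(params)}"


print(map_url("YOUR_API_TOKEN_HERE", timeout_ms=30000))
# https://production-sfo.browserless.io/map?token=YOUR_API_TOKEN_HERE&timeout=30000
```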