/export API
Currently, Browserless V2 is available in production via two domains: production-sfo.browserless.io and production-lon.browserless.io.
The export API allows you to capture and save a webpage as a complete archive, including all resources (HTML, CSS, JavaScript, images, etc.) in a single downloadable file. This is particularly useful for creating offline copies of web pages or preserving web content for archival purposes.
The full OpenAPI schema is available in the Browserless API reference.
Basic Usage
The export API accepts a JSON payload with the target URL and configuration options.
JSON Payload Format
{
  "url": "https://example.com/",
  "headers": {
    "User-Agent": "Custom User Agent"
  },
  "gotoOptions": {
    "waitUntil": "networkidle0",
    "timeout": 30000
  },
  "waitForSelector": {
    "selector": "#main-content",
    "timeout": 5000
  },
  "waitForTimeout": 1000,
  "bestAttempt": false
}
Parameters
Required Parameters
url
(string) - The URL of the webpage to export
Optional Parameters
headers
(object) - Custom HTTP headers to send with the request
gotoOptions
(object) - Navigation options
  waitUntil
  (string) - When to consider navigation complete. Options: 'load', 'domcontentloaded', 'networkidle0', 'networkidle2'. Default: 'networkidle0'
  timeout
  (number) - Maximum navigation time in milliseconds
  referer
  (string) - Referer header value
waitForEvent
(object) - Wait for a specific event before proceeding
waitForFunction
(object) - Wait for a specific function to return true
waitForSelector
(object) - Wait for a specific selector to be present
  selector
  (string) - CSS selector to wait for
  timeout
  (number) - Maximum time to wait in milliseconds
waitForTimeout
(number) - Time in milliseconds to wait after page load
bestAttempt
(boolean) - Whether to continue on errors. Default: false
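The parameters above map directly onto a JSON object. As a minimal sketch, the helper below assembles a payload from the documented fields and leaves out anything that was not set. The function name `build_export_payload` and its argument names are hypothetical and are not part of any Browserless SDK.

```python
from typing import Any, Optional


def build_export_payload(
    url: str,
    wait_until: Optional[str] = None,
    goto_timeout_ms: Optional[int] = None,
    selector: Optional[str] = None,
    selector_timeout_ms: Optional[int] = None,
    wait_for_timeout_ms: Optional[int] = None,
    best_attempt: Optional[bool] = None,
) -> dict[str, Any]:
    """Assemble an /export payload, omitting options that were not set."""
    if not url:
        raise ValueError("url is required")
    payload: dict[str, Any] = {"url": url}

    # gotoOptions is only included when at least one navigation option is set.
    goto = {"waitUntil": wait_until, "timeout": goto_timeout_ms}
    goto = {key: value for key, value in goto.items() if value is not None}
    if goto:
        payload["gotoOptions"] = goto

    if selector is not None:
        wait_for_selector: dict[str, Any] = {"selector": selector}
        if selector_timeout_ms is not None:
            wait_for_selector["timeout"] = selector_timeout_ms
        payload["waitForSelector"] = wait_for_selector

    if wait_for_timeout_ms is not None:
        payload["waitForTimeout"] = wait_for_timeout_ms
    if best_attempt is not None:
        payload["bestAttempt"] = best_attempt
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, as in the Python examples further down.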
Response
The API returns the content of the page with appropriate content type headers. The response format depends on the content type of the page:
- For HTML content: Returns the HTML with Content-Type: text/html
- For PDF content: Returns the PDF with Content-Type: application/pdf
- For other content types: Returns the content with the appropriate content type and sets Content-Disposition: attachment
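Because the body is not always HTML, it may be safer to pick the file name from the Content-Type header and write raw bytes rather than text. The sketch below shows one way to do that. The helper names and the `.bin` fallback extension are assumptions made for illustration.

```python
from typing import Optional


def export_filename(content_type: Optional[str], stem: str = "page") -> str:
    """Pick a file name for an /export response from its Content-Type header."""
    # Strip parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    extensions = {
        "text/html": ".html",
        "application/pdf": ".pdf",
    }
    # Unknown types fall back to a generic extension (an arbitrary choice).
    return stem + extensions.get(media_type, ".bin")


def save_export(body: bytes, content_type: Optional[str], stem: str = "page") -> str:
    """Write the raw response bytes to disk and return the file name used."""
    name = export_filename(content_type, stem)
    # Binary mode keeps PDFs and other non-text payloads intact.
    with open(name, "wb") as fh:
        fh.write(body)
    return name
```

With the `requests` library, this would be called as `save_export(response.content, response.headers.get("Content-Type"))`.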
Error Handling
The API may return the following error responses:
400 Bad Request - Invalid parameters, missing URL, or no content received
404 Not Found - Page not found
408 Request Timeout - Page load timeout
500 Internal Server Error - Server-side error
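A client can turn these status codes into a single exception type so that callers handle them in one place. The following is a sketch only: the `ExportError` class and the decision to treat 408 and 500 as retryable are assumptions, not documented Browserless behavior.

```python
class ExportError(Exception):
    """Raised when the /export endpoint answers with a non-2xx status."""

    def __init__(self, status: int, reason: str, retryable: bool) -> None:
        super().__init__(f"{status}: {reason}")
        self.status = status
        self.reason = reason
        self.retryable = retryable


# Status codes listed above; the retryable flag is an assumption of this sketch.
KNOWN_ERRORS = {
    400: ("Invalid parameters, missing URL, or no content received", False),
    404: ("Page not found", False),
    408: ("Page load timeout", True),
    500: ("Server-side error", True),
}


def check_status(status: int) -> None:
    """Return quietly for 2xx responses and raise ExportError otherwise."""
    if 200 <= status < 300:
        return
    reason, retryable = KNOWN_ERRORS.get(status, ("Unexpected status", False))
    raise ExportError(status, reason, retryable)
```

With `requests`, `check_status(response.status_code)` would run before the body is saved.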
Examples
Basic Export Request
cURL
curl -X POST \
  'https://production-sfo.browserless.io/export?token=YOUR_API_TOKEN_HERE' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/"
  }'
JavaScript
import { writeFile } from 'fs/promises';

const TOKEN = "YOUR_API_TOKEN_HERE";
const url = `https://production-sfo.browserless.io/export?token=${TOKEN}`;
const headers = {
  'Content-Type': 'application/json'
};
const data = {
  url: "https://example.com/"
};

const exportPage = async () => {
  const response = await fetch(url, {
    method: 'POST',
    headers: headers,
    body: JSON.stringify(data)
  });
  const content = await response.text();
  await writeFile("page.html", content);
  console.log("Page saved as page.html");
};

exportPage();
Python
import requests

TOKEN = "YOUR_API_TOKEN_HERE"
url = f"https://production-sfo.browserless.io/export?token={TOKEN}"
headers = {
    'Content-Type': 'application/json'
}
data = {
    "url": "https://example.com/"
}

response = requests.post(url, headers=headers, json=data)
with open("page.html", "w") as file:
    file.write(response.text)
print("Page saved as page.html")
Java
import java.io.*;
import java.net.http.*;
import java.net.URI;

public class ExportPage {
    public static void main(String[] args) {
        String TOKEN = "YOUR_API_TOKEN_HERE";
        String url = "https://production-sfo.browserless.io/export?token=" + TOKEN;
        String jsonData = """
            {
                "url": "https://example.com/"
            }
            """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonData))
            .build();

        try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            try (FileWriter fileWriter = new FileWriter("page.html")) {
                fileWriter.write(response.body());
                System.out.println("Page saved as page.html");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
C#
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class Program {
    static async Task Main(string[] args) {
        string TOKEN = "YOUR_API_TOKEN_HERE";
        string url = $"https://production-sfo.browserless.io/export?token={TOKEN}";
        string jsonData = @"
        {
            ""url"": ""https://example.com/""
        }
        ";

        using var client = new HttpClient();
        var content = new StringContent(jsonData, Encoding.UTF8, "application/json");

        try {
            var response = await client.PostAsync(url, content);
            response.EnsureSuccessStatusCode();
            var pageContent = await response.Content.ReadAsStringAsync();
            await File.WriteAllTextAsync("page.html", pageContent);
            Console.WriteLine("Page saved as page.html");
        } catch (Exception ex) {
            Console.WriteLine("Error: " + ex.Message);
        }
    }
}
Export with Custom Navigation Options
cURL
curl -X POST \
  'https://production-sfo.browserless.io/export?token=YOUR_API_TOKEN_HERE' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "gotoOptions": {
      "waitUntil": "networkidle0",
      "timeout": 60000
    },
    "waitForSelector": {
      "selector": "#main-content",
      "timeout": 5000
    }
  }'
JavaScript
import { writeFile } from 'fs/promises';

const TOKEN = "YOUR_API_TOKEN_HERE";
const url = `https://production-sfo.browserless.io/export?token=${TOKEN}`;
const headers = {
  'Content-Type': 'application/json'
};
const data = {
  url: "https://example.com/",
  gotoOptions: {
    waitUntil: "networkidle0",
    timeout: 60000
  },
  waitForSelector: {
    selector: "#main-content",
    timeout: 5000
  }
};

const exportPage = async () => {
  const response = await fetch(url, {
    method: 'POST',
    headers: headers,
    body: JSON.stringify(data)
  });
  const content = await response.text();
  await writeFile("page.html", content);
  console.log("Page saved as page.html");
};

exportPage();
Python
import requests

TOKEN = "YOUR_API_TOKEN_HERE"
url = f"https://production-sfo.browserless.io/export?token={TOKEN}"
headers = {
    'Content-Type': 'application/json'
}
data = {
    "url": "https://example.com/",
    "gotoOptions": {
        "waitUntil": "networkidle0",
        "timeout": 60000
    },
    "waitForSelector": {
        "selector": "#main-content",
        "timeout": 5000
    }
}

response = requests.post(url, headers=headers, json=data)
with open("page.html", "w") as file:
    file.write(response.text)
print("Page saved as page.html")
Java
import java.io.*;
import java.net.http.*;
import java.net.URI;

public class ExportPageWithOptions {
    public static void main(String[] args) {
        String TOKEN = "YOUR_API_TOKEN_HERE";
        String url = "https://production-sfo.browserless.io/export?token=" + TOKEN;
        String jsonData = """
            {
                "url": "https://example.com/",
                "gotoOptions": {
                    "waitUntil": "networkidle0",
                    "timeout": 60000
                },
                "waitForSelector": {
                    "selector": "#main-content",
                    "timeout": 5000
                }
            }
            """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonData))
            .build();

        try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            try (FileWriter fileWriter = new FileWriter("page.html")) {
                fileWriter.write(response.body());
                System.out.println("Page saved as page.html");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
C#
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class Program {
    static async Task Main(string[] args) {
        string TOKEN = "YOUR_API_TOKEN_HERE";
        string url = $"https://production-sfo.browserless.io/export?token={TOKEN}";
        string jsonData = @"
        {
            ""url"": ""https://example.com/"",
            ""gotoOptions"": {
                ""waitUntil"": ""networkidle0"",
                ""timeout"": 60000
            },
            ""waitForSelector"": {
                ""selector"": ""#main-content"",
                ""timeout"": 5000
            }
        }
        ";

        using var client = new HttpClient();
        var content = new StringContent(jsonData, Encoding.UTF8, "application/json");

        try {
            var response = await client.PostAsync(url, content);
            response.EnsureSuccessStatusCode();
            var pageContent = await response.Content.ReadAsStringAsync();
            await File.WriteAllTextAsync("page.html", pageContent);
            Console.WriteLine("Page saved as page.html");
        } catch (Exception ex) {
            Console.WriteLine("Error: " + ex.Message);
        }
    }
}
Export with Custom Headers
cURL
curl -X POST \
  'https://production-sfo.browserless.io/export?token=YOUR_API_TOKEN_HERE' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "headers": {
      "User-Agent": "Custom User Agent",
      "Accept-Language": "en-US"
    }
  }'
JavaScript
import { writeFile } from 'fs/promises';

const TOKEN = "YOUR_API_TOKEN_HERE";
const url = `https://production-sfo.browserless.io/export?token=${TOKEN}`;
const headers = {
  'Content-Type': 'application/json'
};
const data = {
  url: "https://example.com/",
  headers: {
    "User-Agent": "Custom User Agent",
    "Accept-Language": "en-US"
  }
};

const exportPage = async () => {
  const response = await fetch(url, {
    method: 'POST',
    headers: headers,
    body: JSON.stringify(data)
  });
  const content = await response.text();
  await writeFile("page.html", content);
  console.log("Page saved as page.html");
};

exportPage();
Python
import requests

TOKEN = "YOUR_API_TOKEN_HERE"
url = f"https://production-sfo.browserless.io/export?token={TOKEN}"
headers = {
    'Content-Type': 'application/json'
}
data = {
    "url": "https://example.com/",
    "headers": {
        "User-Agent": "Custom User Agent",
        "Accept-Language": "en-US"
    }
}

response = requests.post(url, headers=headers, json=data)
with open("page.html", "w") as file:
    file.write(response.text)
print("Page saved as page.html")
Java
import java.io.*;
import java.net.http.*;
import java.net.URI;

public class ExportPageWithHeaders {
    public static void main(String[] args) {
        String TOKEN = "YOUR_API_TOKEN_HERE";
        String url = "https://production-sfo.browserless.io/export?token=" + TOKEN;
        String jsonData = """
            {
                "url": "https://example.com/",
                "headers": {
                    "User-Agent": "Custom User Agent",
                    "Accept-Language": "en-US"
                }
            }
            """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonData))
            .build();

        try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            try (FileWriter fileWriter = new FileWriter("page.html")) {
                fileWriter.write(response.body());
                System.out.println("Page saved as page.html");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
C#
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class Program {
    static async Task Main(string[] args) {
        string TOKEN = "YOUR_API_TOKEN_HERE";
        string url = $"https://production-sfo.browserless.io/export?token={TOKEN}";
        string jsonData = @"
        {
            ""url"": ""https://example.com/"",
            ""headers"": {
                ""User-Agent"": ""Custom User Agent"",
                ""Accept-Language"": ""en-US""
            }
        }
        ";

        using var client = new HttpClient();
        var content = new StringContent(jsonData, Encoding.UTF8, "application/json");

        try {
            var response = await client.PostAsync(url, content);
            response.EnsureSuccessStatusCode();
            var pageContent = await response.Content.ReadAsStringAsync();
            await File.WriteAllTextAsync("page.html", pageContent);
            Console.WriteLine("Page saved as page.html");
        } catch (Exception ex) {
            Console.WriteLine("Error: " + ex.Message);
        }
    }
}
Best Practices
Navigation and Timing
- Page Load Strategies
  - Use appropriate waitUntil options based on your needs:
    - load - Wait for the load event (good for static pages)
    - domcontentloaded - Wait for the DOMContentLoaded event (faster but may miss dynamic content)
    - networkidle0 - Wait until there are no network connections for at least 500ms (good for single-page applications)
    - networkidle2 - Wait until there are no more than 2 network connections for at least 500ms (good for pages with background activity)
- Timeout Management
  - Set reasonable timeout values based on your target page's complexity
  - Consider increasing timeouts for:
    - Pages with heavy JavaScript execution
    - Pages with large media files
    - Pages with complex animations
    - Pages with slow network conditions
- Content Waiting
  - Use waitForSelector when you need to ensure specific content is loaded
  - Combine with waitForTimeout for additional stability
  - Consider using multiple selectors for critical content
  - Use bestAttempt: true for more resilient scraping, but be aware it may return incomplete content
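The waitUntil guidance above can be captured in a small lookup so that callers choose a page category rather than remembering option names. This is only a sketch: the category names (`static`, `fast`, `spa`, `background`), the 30000 ms default, and the doubling rule for slow pages are all illustrative assumptions.

```python
from typing import Any

# Hypothetical page categories mapped to the waitUntil values described above.
WAIT_UNTIL_BY_KIND = {
    "static": "load",
    "fast": "domcontentloaded",
    "spa": "networkidle0",
    "background": "networkidle2",
}


def navigation_options(kind: str, timeout_ms: int = 30000, slow: bool = False) -> dict[str, Any]:
    """Return a gotoOptions object for the given page category."""
    if kind not in WAIT_UNTIL_BY_KIND:
        raise ValueError(f"unknown page kind: {kind!r}")
    # Doubling the timeout for slow pages is an arbitrary illustrative choice.
    if slow:
        timeout_ms *= 2
    return {"waitUntil": WAIT_UNTIL_BY_KIND[kind], "timeout": timeout_ms}
```

The result goes under the `gotoOptions` key of the request payload.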
Resource Management
- Asset Handling
  - Use includeAssets wisely to control export size
  - Consider excluding unnecessary resource types:
    - Images for text-only exports
    - Stylesheets for raw content
    - Scripts for static content
  - Use rejectResourceTypes to filter specific asset types
  - Implement size limits for large resources
- Network Optimization
  - Use rejectRequestPattern to exclude unnecessary requests
  - Consider implementing request throttling
  - Cache frequently accessed resources
  - Monitor and optimize network usage
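As a rough illustration of the asset options named above, the sketch below builds a payload that filters resources. The shapes of these fields are not given in this section, so the code assumes that includeAssets is a boolean, rejectResourceTypes is a list of resource type names, and rejectRequestPattern is a list of pattern strings. The type names `image`, `stylesheet` and `script` are likewise assumed, so check them against the OpenAPI schema before relying on them.

```python
from typing import Any, Optional, Sequence


def filtered_payload(
    url: str,
    skip_images: bool = False,
    skip_stylesheets: bool = False,
    skip_scripts: bool = False,
    reject_patterns: Sequence[str] = (),
    include_assets: Optional[bool] = None,
) -> dict[str, Any]:
    """Build an /export payload that skips selected resource types."""
    payload: dict[str, Any] = {"url": url}
    # Resource type names are assumed, not confirmed by this document.
    choices = (
        ("image", skip_images),
        ("stylesheet", skip_stylesheets),
        ("script", skip_scripts),
    )
    rejected = [name for name, skip in choices if skip]
    if rejected:
        payload["rejectResourceTypes"] = rejected
    if reject_patterns:
        payload["rejectRequestPattern"] = list(reject_patterns)
    if include_assets is not None:
        payload["includeAssets"] = include_assets
    return payload
```

A text-only export might then be requested with `filtered_payload(url, skip_images=True, skip_stylesheets=True)`.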
Error Handling and Reliability
- Robust Error Handling
  - Implement proper error handling for:
    - Network timeouts
    - Resource loading failures
    - Invalid URLs
    - Rate limiting
  - Use appropriate HTTP status codes
  - Implement retry mechanisms for transient failures
- Content Validation
  - Verify content completeness
  - Check for expected elements
  - Validate content structure
  - Implement checksums for critical content
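Two of the points above, retrying transient failures and validating content, can be sketched without tying the code to a particular HTTP client. Here `retry` uses exponential backoff and `validate_export` checks for expected markers and returns a SHA-256 checksum. The `TransientError` class, the default of three attempts and the 0.5 s base delay are assumptions chosen for illustration.

```python
import hashlib
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx)."""


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on TransientError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last failure.
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
    raise AssertionError("unreachable")


def validate_export(html: str, required_markers: Iterable[str]) -> str:
    """Check that expected markers are present and return a SHA-256 checksum."""
    missing = [marker for marker in required_markers if marker not in html]
    if missing:
        raise ValueError(f"export is missing expected content: {missing}")
    return hashlib.sha256(html.encode("utf-8")).hexdigest()
```

The `sleep` parameter exists so that tests can substitute a recorder for real waiting. Production code can leave it at the default.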
Security Considerations
- URL and Content Safety
  - Always use HTTPS URLs when possible
  - Validate URLs before making requests
  - Sanitize user-provided URLs
  - Implement content size limits
  - Be cautious when setting custom headers
- Authentication and Authorization
  - Use secure methods for API token storage
  - Implement proper access controls
  - Monitor and log access attempts
  - Rotate API tokens regularly
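One possible way to apply the URL and token advice is sketched below, using only the standard library. The environment variable name `BROWSERLESS_TOKEN` and the specific rejection rules (HTTPS only by default, no embedded credentials) are assumptions of this sketch, not requirements of the API.

```python
import os
from urllib.parse import urlsplit


def validate_target_url(raw: str, allow_http: bool = False) -> str:
    """Return the trimmed URL if it looks safe to export, else raise ValueError."""
    url = raw.strip()
    parts = urlsplit(url)
    allowed = {"https", "http"} if allow_http else {"https"}
    if parts.scheme.lower() not in allowed:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    # Credentials embedded in the URL could leak into logs.
    if parts.username or parts.password:
        raise ValueError("URL must not contain credentials")
    return url


def load_token(env_var: str = "BROWSERLESS_TOKEN") -> str:
    """Read the API token from the environment instead of hard-coding it."""
    token = os.environ.get(env_var, "")
    if not token:
        raise RuntimeError(f"{env_var} is not set")
    return token
```

Reading the token from the environment keeps it out of source control, which also makes regular rotation easier.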
Performance Optimization
- Export Size Management
  - Implement compression where appropriate
  - Use appropriate export formats
  - Consider splitting large exports
  - Implement cleanup mechanisms for temporary files
- Concurrent Operations
  - Implement proper rate limiting
  - Use appropriate concurrency levels
  - Monitor system resources
  - Implement queue management for high-volume operations
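A bounded worker pool is one simple way to cap concurrency for batch exports. In the sketch below, `export_one` is a hypothetical callable supplied by the caller (for example, a function that posts to /export and returns the response bytes), and the default of four workers is arbitrary. A suitable value depends on your plan's concurrency limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Union


def export_many(
    urls: Iterable[str],
    export_one: Callable[[str], bytes],
    max_workers: int = 4,
) -> dict[str, Union[bytes, Exception]]:
    """Export several URLs with at most max_workers requests in flight."""
    results: dict[str, Union[bytes, Exception]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The pool queues submitted work beyond max_workers automatically.
        futures = {url: pool.submit(export_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                # Keep the failure next to its URL instead of aborting the batch.
                results[url] = exc
    return results
```

Storing each exception in the result lets one failing page be reported without discarding the pages that succeeded.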
Monitoring and Maintenance
- Logging and Monitoring
  - Implement comprehensive logging
  - Monitor success/failure rates
  - Track export sizes and durations
  - Set up alerts for failures
  - Monitor rate limit usage
- Maintenance
  - Regularly review and update selectors
  - Monitor for changes in target sites
  - Update error handling as needed
  - Review and optimize timeout values
  - Maintain documentation of changes
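The monitoring points above could start from something as small as the in-memory tracker below. It is a sketch under stated assumptions: the `ExportStats` class, the logger name `export` and the 20% alert threshold are all hypothetical, and a real deployment would more likely feed a metrics system.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("export")


@dataclass
class ExportStats:
    """Running totals for export attempts, sizes and durations."""

    successes: int = 0
    failures: int = 0
    total_bytes: int = 0
    durations: list[float] = field(default_factory=list)

    def record(self, ok: bool, size_bytes: int, duration_s: float) -> None:
        """Log one export attempt and update the counters."""
        if ok:
            self.successes += 1
            self.total_bytes += size_bytes
        else:
            self.failures += 1
        self.durations.append(duration_s)
        logger.info("export ok=%s bytes=%d duration=%.2fs", ok, size_bytes, duration_s)

    @property
    def failure_rate(self) -> float:
        total = self.successes + self.failures
        return self.failures / total if total else 0.0

    def should_alert(self, threshold: float = 0.2) -> bool:
        """Report whether the failure rate exceeds the (assumed) threshold."""
        return self.failure_rate > threshold
```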
For additional support, please refer to the Browserless documentation or contact support.