/export API

The export API allows you to retrieve the content of any URL in its native format (HTML, PDF, images, etc.). The response format is determined by the content type of the page being accessed, with appropriate headers set to facilitate downloading or viewing the content.

You can check the full OpenAPI schema here.

Basic Usage

The export API accepts a JSON payload with the target URL and configuration options.

JSON Payload Format

{
  "url": "https://example.com/",
  "headers": {
    "User-Agent": "Custom User Agent"
  },
  "gotoOptions": {
    "waitUntil": "networkidle0",
    "timeout": 30000
  },
  "waitForSelector": {
    "selector": "#main-content",
    "timeout": 5000
  },
  "waitForTimeout": 1000,
  "bestAttempt": false
}

Parameters

Required Parameters

url (string) - The URL of the resource to export

Optional Parameters

headers (object) - Custom HTTP headers to send with the request
gotoOptions (object) - Navigation options
- waitUntil (string) - When to consider navigation succeeded. Options: 'load', 'domcontentloaded', 'networkidle', 'commit'. Default: 'networkidle0'
- timeout (number) - Maximum navigation time in milliseconds
- referer (string) - Referer header value
waitForEvent (object) - Wait for a specific event before proceeding
waitForFunction (object) - Wait for a specific function to return true
waitForSelector (object) - Wait for a specific selector to be present
- selector (string) - CSS selector to wait for
- timeout (number) - Maximum time to wait in milliseconds
waitForTimeout (number) - Time in milliseconds to wait after page load
bestAttempt (boolean) - Whether to continue on errors. Default: false
includeResources (boolean) - Whether to include all linked resources (images, CSS, JavaScript) in a zip file. Default: false
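The parameter list above can be turned into a small payload builder. The following is a minimal sketch, not part of the API itself: `build_export_payload` is a hypothetical helper name, and the validation it performs (rejecting an empty `url` and option names not listed above) is an assumption about what is convenient on the client side, not something the service requires.

```python
import json

# Option names taken from the "Optional Parameters" list above.
ALLOWED_OPTIONS = {
    "headers", "gotoOptions", "waitForEvent", "waitForFunction",
    "waitForSelector", "waitForTimeout", "bestAttempt", "includeResources",
}


def build_export_payload(url, **options):
    """Return the JSON body for an export request (hypothetical helper)."""
    if not isinstance(url, str) or not url:
        raise ValueError("url is required and must be a non-empty string")
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    payload = {"url": url}
    payload.update(options)
    return payload


body = build_export_payload(
    "https://example.com/",
    gotoOptions={"waitUntil": "networkidle0", "timeout": 30000},
    waitForSelector={"selector": "#main-content", "timeout": 5000},
)
print(json.dumps(body, indent=2))
```

Catching a misspelled option name locally is cheaper than waiting for a 400 Bad Request from the service.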

Response

The API returns a streaming response with the content of the requested URL. The behavior depends on the content type detected and the includeResources parameter:

When includeResources is false (default):
- HTML Content: Returns the HTML with Content-Type: text/html. No attachment header is set, allowing the content to be rendered in the browser.
- PDF Content: Returns a PDF buffer with Content-Type: application/pdf and sets a Content-Disposition: attachment header with an appropriate filename.
- Images and Other Binary Content: Returns the binary content with the appropriate MIME type (e.g., image/jpeg, image/png) and sets a Content-Disposition: attachment header with an appropriate filename.
When includeResources is true:
- Returns a zip file containing the HTML and all linked resources (images, CSS, JavaScript) with Content-Type: application/zip and Content-Disposition: attachment header with an appropriate filename.
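The cases above can be distinguished from the response headers alone. The sketch below assumes only the header values described in this section; `classify_export_response` is a hypothetical helper, and the labels it returns ("html", "pdf", "zip", "binary") are illustrative names rather than anything the API defines.

```python
def classify_export_response(content_type, content_disposition=""):
    """Map response headers to (kind, is_attachment). Hypothetical helper."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    is_attachment = (
        (content_disposition or "").strip().lower().startswith("attachment")
    )
    if media_type == "text/html":
        kind = "html"
    elif media_type == "application/pdf":
        kind = "pdf"
    elif media_type == "application/zip":
        kind = "zip"
    else:
        kind = "binary"  # images and any other MIME type
    return kind, is_attachment


print(classify_export_response("text/html; charset=utf-8"))
print(classify_export_response("application/zip", 'attachment; filename="page.zip"'))
```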

The streaming nature of the response means you should handle it accordingly in your code, using appropriate methods for reading streams rather than assuming all content can be processed as text.
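One way to respect the streaming nature of the response is to copy it to disk in fixed-size chunks instead of loading it into memory as text. The sketch below is an illustration under assumptions: `save_stream` is a hypothetical helper, the 64 KiB chunk size is arbitrary, and an in-memory `io.BytesIO` stands in for the real HTTP response so the example runs offline.

```python
import io
import os
import tempfile


def save_stream(stream, path, chunk_size=64 * 1024):
    """Copy a binary file-like object to path in chunks; return bytes written."""
    written = 0
    with open(path, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # an empty read means the stream is exhausted
                break
            out.write(chunk)
            written += len(chunk)
    return written


# Stand-in for a response body; a real call would pass the HTTP response object.
fake_body = io.BytesIO(b"%PDF-1.7 example bytes" * 1000)
target = os.path.join(tempfile.mkdtemp(), "export.bin")
size = save_stream(fake_body, target, chunk_size=4096)
print(f"wrote {size} bytes to {target}")
```

Any object with a binary `read(n)` method works here, including the response returned by `urllib.request.urlopen`.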

Handling Different Content Types

The export API can return various content types depending on the URL being accessed. Here's how to properly handle the different response types:

HTML Content

When accessing a standard web page, the API returns HTML content with Content-Type: text/html:

const response = await fetch(url, options);
if (response.headers.get('content-type')?.includes('text/html')) {
  const htmlContent = await response.text();
  // Process HTML content
}

PDF Content

When accessing PDF files or when the server returns PDF content, the API returns a PDF buffer with Content-Type: application/pdf:

const response = await fetch(url, options);
if (response.headers.get('content-type')?.includes('application/pdf')) {
  const arrayBuffer = await response.arrayBuffer();
  const pdfBuffer = Buffer.from(arrayBuffer);
  // Save or process PDF buffer
}

Binary Content (Images, etc.)

For other binary content like images, the API returns the appropriate content type and sets attachment headers:

const response = await fetch(url, options);
const contentType = response.headers.get('content-type');
if (contentType?.includes('image/') || !contentType?.includes('text/')) {
  const arrayBuffer = await response.arrayBuffer();
  const binaryBuffer = Buffer.from(arrayBuffer);
  // Save or process binary buffer
}
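Since PDF and other binary responses carry a Content-Disposition: attachment header with a filename, the target filename can be derived from the headers. The following Python sketch uses the standard library's `email.message.Message` to parse the header, which handles both quoted and unquoted filenames. `filename_from_headers` is a hypothetical helper, and the fallback names (`page.html`, `document.pdf`, `image.<ext>`) are illustrative defaults, not values returned by the API.

```python
import os
from email.message import Message


def filename_from_headers(content_type, content_disposition,
                          default="downloaded-content"):
    """Pick a local filename from response headers (hypothetical helper)."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()  # handles quoted and unquoted values
        if name:
            # Keep only the final path component of the server-supplied name.
            return os.path.basename(name) or default
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "page.html"
    if media_type == "application/pdf":
        return "document.pdf"
    if media_type.startswith("image/"):
        # "image/svg+xml" becomes "image.svg"
        return "image." + media_type.split("/", 1)[1].split("+", 1)[0]
    return default


print(filename_from_headers("application/pdf", 'attachment; filename="report.pdf"'))
print(filename_from_headers("text/html; charset=utf-8", None))
```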

Error Handling

The API may return the following error responses:

400 Bad Request - Invalid parameters, missing URL, or no content received
404 Not Found - Page not found
408 Request Timeout - Page load timeout
500 Internal Server Error - Server-side error
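A client can translate these status codes into a single exception type before touching the response body. This is a minimal sketch: `ExportError` and `raise_for_export_status` are hypothetical names, and the messages simply restate the table above.

```python
# Descriptions copied from the error list above.
EXPORT_ERRORS = {
    400: "Bad Request: invalid parameters, missing URL, or no content received",
    404: "Not Found: page not found",
    408: "Request Timeout: page load timeout",
    500: "Internal Server Error: server-side error",
}


class ExportError(Exception):
    """Raised when the export API answers with a non-2xx status (hypothetical)."""

    def __init__(self, status, message):
        super().__init__(f"{status} {message}")
        self.status = status


def raise_for_export_status(status):
    """Return None for 2xx statuses, otherwise raise ExportError."""
    if 200 <= status < 300:
        return
    raise ExportError(status, EXPORT_ERRORS.get(status, "Unexpected status"))


try:
    raise_for_export_status(408)
except ExportError as err:
    print(f"export failed: {err}")
```

Checking the status first avoids writing an error body to disk as if it were the exported page.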

Examples

Basic Export Request

This example demonstrates how to export a web page using the most basic configuration. It shows how to properly handle the streamed response by detecting the content type and saving the content with the appropriate file extension.

cURL
JavaScript
Python
Java
C#

curl -X POST \
  'https://production-sfo.browserless.io/export?token=YOUR_API_TOKEN_HERE' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/"
  }'

import { writeFile } from 'fs/promises';

const TOKEN = "YOUR_API_TOKEN_HERE";
const url = `https://production-sfo.browserless.io/export?token=${TOKEN}`;
const headers = {
  'Content-Type': 'application/json'
};

const data = {
  url: "https://example.com/"
};

const exportPage = async () => {
  const response = await fetch(url, {
    method: 'POST',
    headers: headers,
    body: JSON.stringify(data)
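  });

  // Fail early on error statuses (400, 404, 408, 500) instead of saving an error body
  if (!response.ok) {
    throw new Error(`Export failed: ${response.status} ${response.statusText}`);
  }
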
  // Get content type to determine how to handle the response
  const contentType = response.headers.get('content-type');
  
  // Get filename from Content-Disposition header or create a default one based on content type
  let filename = 'downloaded-content';
  const disposition = response.headers.get('content-disposition');
  if (disposition && disposition.includes('filename=')) {
    const filenameMatch = disposition.match(/filename="(.+?)"/);
    if (filenameMatch) filename = filenameMatch[1];
  } else if (contentType) {
    // Set appropriate extension based on content type
    if (contentType.includes('text/html')) filename = 'page.html';
    else if (contentType.includes('application/pdf')) filename = 'document.pdf';
    else if (contentType.includes('image/')) {
      const ext = contentType.split('/')[1];
      filename = `image.${ext}`;
    }
  }
  
  // Handle response based on content type
  if (contentType && contentType.includes('text/html')) {
    // Handle HTML content as text
    const content = await response.text();
    await writeFile(filename, content);
  } else {
    // Handle binary content (PDFs, images, etc.)
    const arrayBuffer = await response.arrayBuffer();
    const buffer = Buffer.from(arrayBuffer);
    await writeFile(filename, buffer);
  }
  
  console.log(`Content saved as ${filename}`);
};

exportPage();

import requests
import re

TOKEN = "YOUR_API_TOKEN_HERE"
url = f"https://production-sfo.browserless.io/export?token={TOKEN}"
headers = {
    'Content-Type': 'application/json'
}

data = {
    "url": "https://example.com/"
}

response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # Stop on 4xx/5xx instead of saving an error body

# Get content type and determine appropriate filename
content_type = response.headers.get('Content-Type', '')
disposition = response.headers.get('Content-Disposition', '')

# Try to get filename from Content-Disposition header
filename = 'downloaded-content'
if disposition and 'filename=' in disposition:
    match = re.search('filename="(.+?)"', disposition)
    if match:
        filename = match.group(1)
else:
    # Create filename based on content type
    if 'text/html' in content_type:
        filename = 'page.html'
    elif 'application/pdf' in content_type:
        filename = 'document.pdf'
    elif 'image/' in content_type:
        ext = content_type.split('/')[1]
        filename = f"image.{ext}"

# Write content to file using binary mode for all types to ensure proper handling
with open(filename, "wb") as file:
    file.write(response.content)

print(f"Content saved as {filename}")

import java.io.*;
import java.net.http.*;
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExportPage {
    public static void main(String[] args) {
        String TOKEN = "YOUR_API_TOKEN_HERE";
        String url = "https://production-sfo.browserless.io/export?token=" + TOKEN;

        String jsonData = """
        {
            "url": "https://example.com/"
        }
        """;

        HttpClient client = HttpClient.newHttpClient();

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonData))
            .build();

        try {
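            HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());

            // Stop on error statuses instead of saving an error body
            if (response.statusCode() >= 400) {
                throw new IOException("Export failed with status " + response.statusCode());
            }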

            // Get content type and determine appropriate filename
            String contentType = response.headers().firstValue("Content-Type").orElse("");
            String disposition = response.headers().firstValue("Content-Disposition").orElse("");

            // Try to get filename from Content-Disposition header
            String filename = "downloaded-content";
            if (disposition.contains("filename=")) {
                Pattern pattern = Pattern.compile("filename=\"(.+?)\"");
                Matcher matcher = pattern.matcher(disposition);
                if (matcher.find()) {
                    filename = matcher.group(1);
                }
            } else {
                // Create filename based on content type
                if (contentType.contains("text/html")) {
                    filename = "page.html";
                } else if (contentType.contains("application/pdf")) {
                    filename = "document.pdf";
                } else if (contentType.contains("image/")) {
                    String ext = contentType.split("/")[1];
                    filename = "image." + ext;
                }
            }

            // Write content to file
            try (FileOutputStream fileOutputStream = new FileOutputStream(filename)) {
                fileOutputStream.write(response.body());
                System.out.println("Content saved as " + filename);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

class Program {
    static async Task Main(string[] args) {
        string TOKEN = "YOUR_API_TOKEN_HERE";
        string url = $"https://production-sfo.browserless.io/export?token={TOKEN}";

        string jsonData = @"
        {
            ""url"": ""https://example.com/""
        }
        ";

        using var client = new HttpClient();
        var content = new StringContent(jsonData, Encoding.UTF8, "application/json");

        try {
            var response = await client.PostAsync(url, content);
            response.EnsureSuccessStatusCode();

            // Get content type and determine appropriate filename
            var contentType = response.Content.Headers.ContentType?.MediaType ?? "";
            var disposition = response.Content.Headers.ContentDisposition?.ToString() ?? "";

            // Try to get filename from Content-Disposition header
            string filename = "downloaded-content";
            if (disposition.Contains("filename=")) {
                var match = Regex.Match(disposition, "filename=\"(.+?)\"");
                if (match.Success) {
                    filename = match.Groups[1].Value;
                }
            } else {
                // Create filename based on content type
                if (contentType.Contains("text/html")) {
                    filename = "page.html";
                } else if (contentType.Contains("application/pdf")) {
                    filename = "document.pdf";
                } else if (contentType.Contains("image/")) {
                    var ext = contentType.Split('/')[1];
                    filename = $"image.{ext}";
                }
            }

            // Handle all content types as binary for consistency
            var bytes = await response.Content.ReadAsByteArrayAsync();
            await File.WriteAllBytesAsync(filename, bytes);

            Console.WriteLine($"Content saved as {filename}");
        } catch (Exception ex) {
            Console.WriteLine("Error: " + ex.Message);
        }
    }
}

Export with Navigation Options

This example demonstrates how to export a web page with custom navigation options, such as waiting for specific network events or DOM elements to load. These options help ensure the page is fully rendered before capturing the content.

cURL
JavaScript
Python
Java
C#

curl -X POST \
  'https://production-sfo.browserless.io/export?token=YOUR_API_TOKEN_HERE' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "gotoOptions": {
      "waitUntil": "networkidle0",
      "timeout": 60000
    },
    "waitForSelector": {
      "selector": "#main-content",
      "timeout": 5000
    }
  }'

import { writeFile } from 'fs/promises';

const TOKEN = "YOUR_API_TOKEN_HERE";
const url = `https://production-sfo.browserless.io/export?token=${TOKEN}`;
const headers = {
  'Content-Type': 'application/json'
};

const data = {
  url: "https://example.com/",
  gotoOptions: {
    waitUntil: "networkidle0",
    timeout: 60000
  },
  waitForSelector: {
    selector: "#main-content",
    timeout: 5000
  }
};

const exportPage = async () => {
  const response = await fetch(url, {
    method: 'POST',
    headers: headers,
    body: JSON.stringify(data)
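  });

  // Fail early on error statuses (400, 404, 408, 500) instead of saving an error body
  if (!response.ok) {
    throw new Error(`Export failed: ${response.status} ${response.statusText}`);
  }
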
  // Get content type to determine how to handle the response
  const contentType = response.headers.get('content-type');
  
  // Get filename from Content-Disposition header or create a default one based on content type
  let filename = 'downloaded-content';
  const disposition = response.headers.get('content-disposition');
  if (disposition && disposition.includes('filename=')) {
    const filenameMatch = disposition.match(/filename="(.+?)"/);
    if (filenameMatch) filename = filenameMatch[1];
  } else if (contentType) {
    // Set appropriate extension based on content type
    if (contentType.includes('text/html')) filename = 'page.html';
    else if (contentType.includes('application/pdf')) filename = 'document.pdf';
    else if (contentType.includes('image/')) {
      const ext = contentType.split('/')[1];
      filename = `image.${ext}`;
    }
  }
  
  // Handle response based on content type
  if (contentType && contentType.includes('text/html')) {
    // Handle HTML content as text
    const content = await response.text();
    await writeFile(filename, content);
  } else {
    // Handle binary content (PDFs, images, etc.)
    const arrayBuffer = await response.arrayBuffer();
    const buffer = Buffer.from(arrayBuffer);
    await writeFile(filename, buffer);
  }
  
  console.log(`Content saved as ${filename}`);
};

exportPage();

import requests
import re

TOKEN = "YOUR_API_TOKEN_HERE"
url = f"https://production-sfo.browserless.io/export?token={TOKEN}"
headers = {
    'Content-Type': 'application/json'
}

data = {
    "url": "https://example.com/",
    "gotoOptions": {
        "waitUntil": "networkidle0",
        "timeout": 60000
    },
    "waitForSelector": {
        "selector": "#main-content",
        "timeout": 5000
    }
}

response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # Stop on 4xx/5xx instead of saving an error body

# Get content type and determine appropriate filename
content_type = response.headers.get('Content-Type', '')
disposition = response.headers.get('Content-Disposition', '')

# Try to get filename from Content-Disposition header
filename = 'downloaded-content'
if disposition and 'filename=' in disposition:
    match = re.search('filename="(.+?)"', disposition)
    if match:
        filename = match.group(1)
else:
    # Create filename based on content type
    if 'text/html' in content_type:
        filename = 'page.html'
    elif 'application/pdf' in content_type:
        filename = 'document.pdf'
    elif 'image/' in content_type:
        ext = content_type.split('/')[1]
        filename = f"image.{ext}"

# Write content to file using binary mode for all types to ensure proper handling
with open(filename, "wb") as file:
    file.write(response.content)

print(f"Content saved as {filename}")

import java.io.*;
import java.net.http.*;
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExportPageWithOptions {
    public static void main(String[] args) {
        String TOKEN = "YOUR_API_TOKEN_HERE";
        String url = "https://production-sfo.browserless.io/export?token=" + TOKEN;

        String jsonData = """
        {
            "url": "https://example.com/",
            "gotoOptions": {
                "waitUntil": "networkidle0",
                "timeout": 60000
            },
            "waitForSelector": {
                "selector": "#main-content",
                "timeout": 5000
            }
        }
        """;

        HttpClient client = HttpClient.newHttpClient();

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonData))
            .build();

        try {
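            HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());

            // Stop on error statuses instead of saving an error body
            if (response.statusCode() >= 400) {
                throw new IOException("Export failed with status " + response.statusCode());
            }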

            // Get content type and determine appropriate filename
            String contentType = response.headers().firstValue("Content-Type").orElse("");
            String disposition = response.headers().firstValue("Content-Disposition").orElse("");

            // Try to get filename from Content-Disposition header
            String filename = "downloaded-content";
            if (disposition.contains("filename=")) {
                Pattern pattern = Pattern.compile("filename=\"(.+?)\"");
                Matcher matcher = pattern.matcher(disposition);
                if (matcher.find()) {
                    filename = matcher.group(1);
                }
            } else {
                // Create filename based on content type
                if (contentType.contains("text/html")) {
                    filename = "page.html";
                } else if (contentType.contains("application/pdf")) {
                    filename = "document.pdf";
                } else if (contentType.contains("image/")) {
                    String ext = contentType.split("/")[1];
                    filename = "image." + ext;
                }
            }

            // Write content to file
            try (FileOutputStream fileOutputStream = new FileOutputStream(filename)) {
                fileOutputStream.write(response.body());
                System.out.println("Content saved as " + filename);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

class Program {
    static async Task Main(string[] args) {
        string TOKEN = "YOUR_API_TOKEN_HERE";
        string url = $"https://production-sfo.browserless.io/export?token={TOKEN}";

        string jsonData = @"
        {
            ""url"": ""https://example.com/"",
            ""gotoOptions"": {
                ""waitUntil"": ""networkidle0"",
                ""timeout"": 60000
            },
            ""waitForSelector"": {
                ""selector"": ""#main-content"",
                ""timeout"": 5000
            }
        }
        ";

        using var client = new HttpClient();
        var content = new StringContent(jsonData, Encoding.UTF8, "application/json");

        try {
            var response = await client.PostAsync(url, content);
            response.EnsureSuccessStatusCode();

            // Get content type and determine appropriate filename
            var contentType = response.Content.Headers.ContentType?.MediaType ?? "";
            var disposition = response.Content.Headers.ContentDisposition?.ToString() ?? "";

            // Try to get filename from Content-Disposition header
            string filename = "downloaded-content";
            if (disposition.Contains("filename=")) {
                var match = Regex.Match(disposition, "filename=\"(.+?)\"");
                if (match.Success) {
                    filename = match.Groups[1].Value;
                }
            } else {
                // Create filename based on content type
                if (contentType.Contains("text/html")) {
                    filename = "page.html";
                } else if (contentType.Contains("application/pdf")) {
                    filename = "document.pdf";
                } else if (contentType.Contains("image/")) {
                    var ext = contentType.Split('/')[1];
                    filename = $"image.{ext}";
                }
            }

            // Handle all content types as binary for consistency
            var bytes = await response.Content.ReadAsByteArrayAsync();
            await File.WriteAllBytesAsync(filename, bytes);

            Console.WriteLine($"Content saved as {filename}");
        } catch (Exception ex) {
            Console.WriteLine("Error: " + ex.Message);
        }
    }
}

Export with Custom Headers

This example demonstrates how to export a web page with custom HTTP headers. Custom headers allow you to modify the browser's behavior when accessing the page, such as changing the User-Agent or setting language preferences.

cURL
JavaScript
Python
Java
C#

curl -X POST \
  'https://production-sfo.browserless.io/export?token=YOUR_API_TOKEN_HERE' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "headers": {
      "User-Agent": "Custom User Agent",
      "Accept-Language": "en-US"
    }
  }'

import { writeFile } from 'fs/promises';

const TOKEN = "YOUR_API_TOKEN_HERE";
const url = `https://production-sfo.browserless.io/export?token=${TOKEN}`;
const headers = {
  'Content-Type': 'application/json'
};

const data = {
  url: "https://example.com/",
  headers: {
    "User-Agent": "Custom User Agent",
    "Accept-Language": "en-US"
  }
};

const exportPage = async () => {
  const response = await fetch(url, {
    method: 'POST',
    headers: headers,
    body: JSON.stringify(data)
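  });

  // Fail early on error statuses (400, 404, 408, 500) instead of saving an error body
  if (!response.ok) {
    throw new Error(`Export failed: ${response.status} ${response.statusText}`);
  }
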
  // Get content type to determine how to handle the response
  const contentType = response.headers.get('content-type');
  
  // Get filename from Content-Disposition header or create a default one based on content type
  let filename = 'downloaded-content';
  const disposition = response.headers.get('content-disposition');
  if (disposition && disposition.includes('filename=')) {
    const filenameMatch = disposition.match(/filename="(.+?)"/);
    if (filenameMatch) filename = filenameMatch[1];
  } else if (contentType) {
    // Set appropriate extension based on content type
    if (contentType.includes('text/html')) filename = 'page.html';
    else if (contentType.includes('application/pdf')) filename = 'document.pdf';
    else if (contentType.includes('image/')) {
      const ext = contentType.split('/')[1];
      filename = `image.${ext}`;
    }
  }
  
  // Handle response based on content type
  if (contentType && contentType.includes('text/html')) {
    // Handle HTML content as text
    const content = await response.text();
    await writeFile(filename, content);
  } else {
    // Handle binary content (PDFs, images, etc.)
    const arrayBuffer = await response.arrayBuffer();
    const buffer = Buffer.from(arrayBuffer);
    await writeFile(filename, buffer);
  }
  
  console.log(`Content saved as ${filename}`);
};

exportPage();

import requests
import re

TOKEN = "YOUR_API_TOKEN_HERE"
url = f"https://production-sfo.browserless.io/export?token={TOKEN}"
headers = {
    'Content-Type': 'application/json'
}

data = {
    "url": "https://example.com/",
    "headers": {
        "User-Agent": "Custom User Agent",
        "Accept-Language": "en-US"
    }
}

response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # Stop on 4xx/5xx instead of saving an error body

# Get content type and determine appropriate filename
content_type = response.headers.get('Content-Type', '')
disposition = response.headers.get('Content-Disposition', '')

# Try to get filename from Content-Disposition header
filename = 'downloaded-content'
if disposition and 'filename=' in disposition:
    match = re.search('filename="(.+?)"', disposition)
    if match:
        filename = match.group(1)
else:
    # Create filename based on content type
    if 'text/html' in content_type:
        filename = 'page.html'
    elif 'application/pdf' in content_type:
        filename = 'document.pdf'
    elif 'image/' in content_type:
        ext = content_type.split('/')[1]
        filename = f"image.{ext}"

# Write content to file using binary mode for all types to ensure proper handling
with open(filename, "wb") as file:
    file.write(response.content)

print(f"Content saved as {filename}")

import java.io.*;
import java.net.http.*;
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExportPageWithHeaders {
    public static void main(String[] args) {
        String TOKEN = "YOUR_API_TOKEN_HERE";
        String url = "https://production-sfo.browserless.io/export?token=" + TOKEN;

        String jsonData = """
        {
            "url": "https://example.com/",
            "headers": {
                "User-Agent": "Custom User Agent",
                "Accept-Language": "en-US"
            }
        }
        """;

        HttpClient client = HttpClient.newHttpClient();

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonData))
            .build();

        try {
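            HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());

            // Stop on error statuses instead of saving an error body
            if (response.statusCode() >= 400) {
                throw new IOException("Export failed with status " + response.statusCode());
            }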

            // Get content type and determine appropriate filename
            String contentType = response.headers().firstValue("Content-Type").orElse("");
            String disposition = response.headers().firstValue("Content-Disposition").orElse("");

            // Try to get filename from Content-Disposition header
            String filename = "downloaded-content";
            if (disposition.contains("filename=")) {
                Pattern pattern = Pattern.compile("filename=\"(.+?)\"");
                Matcher matcher = pattern.matcher(disposition);
                if (matcher.find()) {
                    filename = matcher.group(1);
                }
            } else {
                // Create filename based on content type
                if (contentType.contains("text/html")) {
                    filename = "page.html";
                } else if (contentType.contains("application/pdf")) {
                    filename = "document.pdf";
                } else if (contentType.contains("image/")) {
                    String ext = contentType.split("/")[1];
                    filename = "image." + ext;
                }
            }

            // Write content to file
            try (FileOutputStream fileOutputStream = new FileOutputStream(filename)) {
                fileOutputStream.write(response.body());
                System.out.println("Content saved as " + filename);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

class Program {
    static async Task Main(string[] args) {
        string TOKEN = "YOUR_API_TOKEN_HERE";
        string url = $"https://production-sfo.browserless.io/export?token={TOKEN}";

        string jsonData = @"
        {
            ""url"": ""https://example.com/"",
            ""headers"": {
                ""User-Agent"": ""Custom User Agent"",
                ""Accept-Language"": ""en-US""
            }
        }
        ";

        using var client = new HttpClient();
        var content = new StringContent(jsonData, Encoding.UTF8, "application/json");

        try {
            var response = await client.PostAsync(url, content);
            response.EnsureSuccessStatusCode();

            // Get content type and determine appropriate filename
            var contentType = response.Content.Headers.ContentType?.MediaType ?? "";
            var disposition = response.Content.Headers.ContentDisposition?.ToString() ?? "";

            // Try to get filename from Content-Disposition header
            string filename = "downloaded-content";
            if (disposition.Contains("filename=")) {
                var match = Regex.Match(disposition, "filename=\"(.+?)\"");
                if (match.Success) {
                    filename = match.Groups[1].Value;
                }
            } else {
                // Create filename based on content type
                if (contentType.Contains("text/html")) {
                    filename = "page.html";
                } else if (contentType.Contains("application/pdf")) {
                    filename = "document.pdf";
                } else if (contentType.Contains("image/")) {
                    var ext = contentType.Split('/')[1];
                    filename = $"image.{ext}";
                }
            }

            // Handle all content types as binary for consistency
            var bytes = await response.Content.ReadAsByteArrayAsync();
            await File.WriteAllBytesAsync(filename, bytes);

            Console.WriteLine($"Content saved as {filename}");
        } catch (Exception ex) {
            Console.WriteLine("Error: " + ex.Message);
        }
    }
}

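Export with Resource Download

This example sets includeResources to true, so the API returns a zip file containing the HTML and all linked resources (images, CSS, JavaScript). Each snippet saves the response as webpage.zip.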

cURL
JavaScript
Python
Java
C#

curl -X POST \
  'https://production-sfo.browserless.io/export?token=YOUR_API_TOKEN_HERE' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "includeResources": true
  }' \
  --output "webpage.zip"

import { writeFile } from 'fs/promises';

const TOKEN = "YOUR_API_TOKEN_HERE";
const url = `https://production-sfo.browserless.io/export?token=${TOKEN}`;
const headers = {
  'Content-Type': 'application/json'
};

const data = {
  url: "https://example.com/",
  includeResources: true
};

const exportPage = async () => {
  const response = await fetch(url, {
    method: 'POST',
    headers: headers,
    body: JSON.stringify(data)
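  });

  // Fail early on error statuses instead of saving an error body as a zip file
  if (!response.ok) {
    throw new Error(`Export failed: ${response.status} ${response.statusText}`);
  }
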
  const buffer = await response.arrayBuffer();
  await writeFile("webpage.zip", Buffer.from(buffer));
  console.log("Page with resources saved as webpage.zip");
};

exportPage();

import requests

TOKEN = "YOUR_API_TOKEN_HERE"
url = f"https://production-sfo.browserless.io/export?token={TOKEN}"
headers = {
    'Content-Type': 'application/json'
}

data = {
    "url": "https://example.com/",
    "includeResources": True
}

response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # Stop on 4xx/5xx instead of saving an error body

with open("webpage.zip", "wb") as file:
    file.write(response.content)

print("Page with resources saved as webpage.zip")

import java.io.*;
import java.net.http.*;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ExportPageWithResources {
    public static void main(String[] args) {
        String TOKEN = "YOUR_API_TOKEN_HERE";
        String url = "https://production-sfo.browserless.io/export?token=" + TOKEN;

        String jsonData = """
        {
            "url": "https://example.com/",
            "includeResources": true
        }
        """;

        HttpClient client = HttpClient.newHttpClient();

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonData))
            .build();

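        try {
            HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());
            // Only write the archive when the request succeeded
            if (response.statusCode() != 200) {
                System.err.println("Export failed with status " + response.statusCode());
                return;
            }
            Files.write(Paths.get("webpage.zip"), response.body());
            System.out.println("Page with resources saved as webpage.zip");
        } catch (Exception e) {
            e.printStackTrace();
        }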
    }
}

using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class Program {
    static async Task Main(string[] args) {
        string TOKEN = "YOUR_API_TOKEN_HERE";
        string url = $"https://production-sfo.browserless.io/export?token={TOKEN}";

        string jsonData = @"
        {
            ""url"": ""https://example.com/"",
            ""includeResources"": true
        }
        ";

        using var client = new HttpClient();
        var content = new StringContent(jsonData, Encoding.UTF8, "application/json");

        try {
            var response = await client.PostAsync(url, content);
            response.EnsureSuccessStatusCode();

            var bytes = await response.Content.ReadAsByteArrayAsync();
            await File.WriteAllBytesAsync("webpage.zip", bytes);

            Console.WriteLine("Page with resources saved as webpage.zip");
        } catch (Exception ex) {
            Console.WriteLine("Error: " + ex.Message);
        }
    }
}
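
Before unpacking a downloaded archive, it can help to confirm that the response body really is a valid zip file and not an error message saved under a .zip name. The following is a minimal Python sketch using only the standard library. The helper name list_export_entries is hypothetical, and because the layout of files inside the archive is not specified here, the sketch only lists whatever entries are present.

```python
import io
import zipfile
from typing import List


def list_export_entries(zip_bytes: bytes) -> List[str]:
    """Return entry names from an export archive; raise ValueError if it is not a usable zip."""
    buffer = io.BytesIO(zip_bytes)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("response body is not a zip archive")
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        # testzip() returns the name of the first corrupt member, or None if all CRCs match
        corrupt = archive.testzip()
        if corrupt is not None:
            raise ValueError(f"corrupt archive member: {corrupt}")
        return archive.namelist()
```

In the Python example above, this could be called as list_export_entries(response.content) before writing the file to disk.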

Handling Different Content Types

The export API can return various content types depending on the URL being accessed. Here's how to properly handle the different response types:

HTML Content

When accessing a standard web page, the API returns HTML content with Content-Type: text/html:

const response = await fetch(url, options);
if (response.headers.get('content-type')?.includes('text/html')) {
  const htmlContent = await response.text();
  // Process HTML content
}

PDF Content

When accessing PDF files or when the server returns PDF content, the API returns a PDF buffer with Content-Type: application/pdf:

const response = await fetch(url, options);
if (response.headers.get('content-type')?.includes('application/pdf')) {
  const arrayBuffer = await response.arrayBuffer();
  const pdfBuffer = Buffer.from(arrayBuffer);
  // Save or process PDF buffer
}

Binary Content (Images, etc.)

For other binary content like images, the API returns the appropriate content type and sets attachment headers:

const response = await fetch(url, options);
const contentType = response.headers.get('content-type');
if (contentType?.includes('image/') || !contentType?.includes('text/')) {
  const arrayBuffer = await response.arrayBuffer();
  const binaryBuffer = Buffer.from(arrayBuffer);
  // Save or process binary buffer
}
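
The three checks above can be folded into a single dispatch step. The Python sketch below maps a Content-Type header value to a coarse category so the caller can decide whether to decode the body as text or save it as bytes. The function name classify_export is hypothetical, and treating application/zip as the content type of a resource archive is an assumption, not something this page confirms.

```python
from typing import Optional


def classify_export(content_type: Optional[str]) -> str:
    """Map a Content-Type header to 'html', 'pdf', 'zip', 'text', or 'binary'."""
    # Drop parameters such as "; charset=utf-8" and normalize case
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "html"
    if media_type == "application/pdf":
        return "pdf"
    if media_type == "application/zip":  # assumed type for includeResources archives
        return "zip"
    if media_type.startswith("text/"):
        return "text"
    # Images, unknown types, and missing headers are all handled as raw bytes
    return "binary"
```

With the requests library, the header value would come from response.headers.get("Content-Type").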

Best Practices

Navigation and Timing

Page Load Strategies
- Use appropriate waitUntil options based on your needs:
  - load - Wait for the load event (good for static pages)
  - domcontentloaded - Wait for the DOMContentLoaded event (faster but may miss dynamic content)
  - networkidle0 - Wait until there are no network connections for at least 500ms (good for single-page applications)
  - networkidle2 - Wait until there are no more than 2 network connections for at least 500ms (good for pages with background activity)
Timeout Management
- Set reasonable timeout values based on your target page's complexity
- Consider increasing timeouts for:
  - Pages with heavy JavaScript execution
  - Pages with large media files
  - Pages with complex animations
  - Pages with slow network conditions
Content Waiting
- Use waitForSelector when you need to ensure specific content is loaded
- Combine with waitForTimeout for additional stability
- Consider using multiple selectors for critical content
- Use bestAttempt: true for more resilient scraping, but be aware it may return incomplete content
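
The timing options above can be combined in one request body. The following Python sketch builds such a payload using only the field names shown in the JSON Payload Format section. The function name build_export_payload and its default values are illustrative assumptions, not recommended settings.

```python
import json


def build_export_payload(url, selector=None, wait_until="networkidle2",
                         nav_timeout_ms=30000, selector_timeout_ms=5000,
                         settle_ms=0, best_attempt=False):
    """Assemble an export request body from navigation and waiting options."""
    payload = {
        "url": url,
        "gotoOptions": {"waitUntil": wait_until, "timeout": nav_timeout_ms},
        "bestAttempt": best_attempt,
    }
    if selector:
        # Wait for specific content rather than relying on load events alone
        payload["waitForSelector"] = {"selector": selector, "timeout": selector_timeout_ms}
    if settle_ms > 0:
        # Optional extra pause after the page has loaded
        payload["waitForTimeout"] = settle_ms
    return payload


if __name__ == "__main__":
    print(json.dumps(build_export_payload("https://example.com/", "#main-content", settle_ms=1000), indent=2))
```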

Resource Management

Asset Handling
- Use includeResources wisely to control export size
- Consider excluding unnecessary resource types:
  - Images for text-only exports
  - Stylesheets for raw content
  - Scripts for static content
- Use rejectResourceTypes to filter specific asset types
- Implement size limits for large resources
Network Optimization
- Use rejectRequestPattern to exclude unnecessary requests
- Consider implementing request throttling
- Cache frequently accessed resources
- Monitor and optimize network usage
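
As a rough illustration of the filtering options above, the Python sketch below adds rejectResourceTypes and rejectRequestPattern to an existing payload. It assumes both parameters accept arrays of strings (resource type names and URL patterns, respectively), which this page does not confirm. The helper name with_resource_filters and the example values are hypothetical, so check the Open API schema before relying on them.

```python
def with_resource_filters(payload, skip_types=(), skip_patterns=()):
    """Return a copy of payload with optional resource and request filters added."""
    filtered = dict(payload)  # shallow copy so the caller's payload is not mutated
    if skip_types:
        # Assumed format: a list of resource type names such as "image"
        filtered["rejectResourceTypes"] = list(skip_types)
    if skip_patterns:
        # Assumed format: a list of URL patterns to block
        filtered["rejectRequestPattern"] = list(skip_patterns)
    return filtered
```

For a text-only export, this might be called as with_resource_filters({"url": "https://example.com/"}, skip_types=["image", "stylesheet"]).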

Error Handling and Reliability

Robust Error Handling
- Implement proper error handling for:
  - Network timeouts
  - Resource loading failures
  - Invalid URLs
  - Rate limiting
- Check the HTTP status code before processing the response body
- Implement retry mechanisms for transient failures
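
A retry mechanism for transient failures might look like the following Python sketch, which retries with exponential backoff. The names retry_export, TransientExportError, and is_transient_status are hypothetical, and the set of status codes treated as transient (408, 429, and 5xx) is an assumption that may need adjusting for your plan and workload.

```python
import time


class TransientExportError(Exception):
    """Raised by the caller's request function for failures worth retrying."""


def is_transient_status(status: int) -> bool:
    """Assumed policy: request timeout, rate limiting, and server errors are retryable."""
    return status in (408, 429) or 500 <= status < 600


def retry_export(send, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except TransientExportError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, then 2x, 4x, and so on
            sleep(base_delay * 2 ** (attempt - 1))
```

Here send is any zero-argument callable that performs the request and raises TransientExportError when is_transient_status returns True for the response; other errors, such as an invalid URL, propagate immediately.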
Content Validation
- Verify content completeness
- Check for expected elements
- Validate content structure
- Implement checksums for critical content
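
The validation steps above can be sketched in Python as follows. The function name validate_html_export is hypothetical, and the simple substring check stands in for whatever structural validation your content actually needs (an HTML parser would be more robust).

```python
import hashlib


def validate_html_export(body: bytes, required_markers=(), expected_sha256=None):
    """Return (sha256_hex, problems) for an exported HTML body."""
    problems = []
    if not body:
        problems.append("empty body")
    text = body.decode("utf-8", errors="replace")
    for marker in required_markers:
        # Substring check for expected elements, e.g. an id or a heading
        if marker not in text:
            problems.append(f"missing marker: {marker}")
    digest = hashlib.sha256(body).hexdigest()
    if expected_sha256 is not None and digest != expected_sha256:
        problems.append("checksum mismatch")
    return digest, problems
```

An empty problems list means the export passed every check; the returned digest can be stored and compared on later runs.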

Security Considerations

URL and Content Safety
- Always use HTTPS URLs when possible
- Validate URLs before making requests
- Sanitize user-provided URLs
- Implement content size limits
- Be cautious when setting custom headers
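
One way to validate user-provided URLs before sending them to the API is sketched below in Python. The function name validate_target_url and the specific rules (HTTPS only by default, no embedded credentials, no localhost or private IP literals) are assumptions about a reasonable policy, not requirements of the export API. The sketch does not resolve hostnames, so it will not catch a public name that points to a private address.

```python
import ipaddress
from urllib.parse import urlsplit


def validate_target_url(raw_url: str, allow_http: bool = False) -> str:
    """Return the URL if it passes basic safety checks, else raise ValueError."""
    parts = urlsplit(raw_url.strip())
    allowed = {"https", "http"} if allow_http else {"https"}
    if parts.scheme.lower() not in allowed:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    if parts.username or parts.password:
        raise ValueError("credentials in URL are not allowed")
    if host == "localhost":
        raise ValueError("localhost is not allowed")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # host is a name, not an IP literal
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        raise ValueError("private or local addresses are not allowed")
    return parts.geturl()
```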
Authentication and Authorization
- Use secure methods for API token storage
- Implement proper access controls
- Monitor and log access attempts
- Rotate API tokens regularly
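
A simple way to keep the token out of source code is to read it from the environment and to mask it before logging request URLs. The Python sketch below illustrates this; the environment variable name BROWSERLESS_TOKEN and both helper names are hypothetical.

```python
import os


def load_api_token(env_var="BROWSERLESS_TOKEN", environ=os.environ):
    """Read the API token from the environment instead of hard-coding it."""
    token = environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set the {env_var} environment variable")
    return token


def redact_token(url: str, token: str) -> str:
    """Mask the token so request URLs can be logged safely."""
    return url.replace(token, "***")
```

Because the token travels in the query string, any URL written to logs should pass through redact_token first.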

Performance Optimization

Export Size Management
- Implement compression where appropriate
- Use appropriate export formats
- Consider splitting large exports
- Implement cleanup mechanisms for temporary files
Concurrent Operations
- Implement proper rate limiting
- Use appropriate concurrency levels
- Monitor system resources
- Implement queue management for high-volume operations
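
Bounded concurrency can be sketched in Python with a fixed-size thread pool, which acts as a simple queue: at most max_workers exports run at once and the rest wait their turn. The function name export_many and the default of four workers are assumptions; the right level depends on your plan's concurrency limits.

```python
from concurrent.futures import ThreadPoolExecutor


def export_many(urls, export_one, max_workers=4):
    """Run export_one(url) for each URL with bounded concurrency.

    Returns a dict mapping url -> ("ok", result) or ("error", exception).
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submitting everything up front lets the pool queue the excess work
        futures = {url: pool.submit(export_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = ("ok", future.result())
            except Exception as exc:
                # One failed export should not abort the whole batch
                results[url] = ("error", exc)
    return results
```

Here export_one is any function that takes a URL and performs a single export request.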

Monitoring and Maintenance

Logging and Monitoring
- Implement comprehensive logging
- Monitor success/failure rates
- Track export sizes and durations
- Set up alerts for failures
- Monitor rate limit usage
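
As one possible starting point for the logging described above, the Python sketch below wraps a single export call and records its outcome, size, and duration with the standard logging module. The names timed_export and the "export" logger are hypothetical, and export_one stands for whatever function performs the request and returns the response bytes.

```python
import logging
import time

logger = logging.getLogger("export")


def timed_export(url, export_one, clock=time.monotonic):
    """Call export_one(url), logging success or failure with size and duration."""
    start = clock()
    try:
        body = export_one(url)
    except Exception:
        # logger.exception records the traceback along with the message
        logger.exception("export failed url=%s duration=%.2fs", url, clock() - start)
        raise
    duration = clock() - start
    logger.info("export ok url=%s bytes=%d duration=%.2fs", url, len(body), duration)
    return body, duration
```

The logged values can feed whatever alerting or dashboard tooling is already in use.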
Maintenance
- Regularly review and update selectors
- Monitor for changes in target sites
- Update error handling as needed
- Review and optimize timeout values
- Maintain documentation of changes

For additional support, please refer to the Browserless documentation or contact support.
