browserless docs

/function API

If you're not running Node.js in your infrastructure, you can still use browserless to run your headless browser work. We've got several endpoints for common tasks, such as /screenshot and /pdf, and a serverless-style endpoint called /function that gives you total control of the browser. Read more about each below.

The /function endpoint accepts a POST containing custom code and a context to run it with. Your code is called with an object containing several properties, including page, which is a puppeteer page object, and context, which is the context you provide in the JSON body.

Functions should return an object with two properties: data and type. data can be whatever you'd like (a Buffer, JSON, or plain text), and type is a string describing the content-type of data. Browserless reads both from your function's return value and resolves the request accordingly.
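As a minimal sketch of that return contract, the function below goes to a URL taken from context and returns a JSON payload rather than HTML. The url and title fields inside data are illustrative choices, not part of the API; only the data and type properties come from the description above.

```javascript
// Sketch: return JSON from a /function call.
// `context.url` is supplied by the caller in the POST body.
const handler = async ({ page, context }) => {
  const { url } = context;
  await page.goto(url);

  // page.title() is a standard puppeteer Page method
  const title = await page.title();

  return {
    // The shape of `data` is up to you; these field names are illustrative
    data: { url, title },
    // `type` describes `data`, so JSON data gets a JSON content-type
    type: 'application/json',
  };
};

module.exports = handler;
```

Called with a context of { "url": "https://example.com" }, this would be expected to resolve the request with a small JSON document instead of the page's markup.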

If you want to see more examples, check out how the other REST APIs are handled in our GitHub project.

Getting page content with /function

code

// Read the `url` from context, go to the page and return its content
module.exports = async ({ page, context }) => {
  const { url } = context;
  await page.goto(url);

  const data = await page.content();

  return {
    data,
    // Make sure `type` matches the data you return.
    // page.content() yields HTML, so 'text/html' fits here;
    // use 'application/json' if you return JSON instead
    type: 'text/html',
  };
};

context

{
  "url": "https://example.com"
}

JSON strings can't contain literal line breaks, so we've minified the code above with the online Babel REPL and used the result in the curl call below.

cURL (with an API token)

curl -X POST \
  'https://chrome.browserless.io/function?token=YOUR-API-TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
  "code": "module.exports=async({page:a,context:b})=>{const{url:c}=b;await a.goto(c);const d=await a.content();return{data:d,type:\"text/html\"}};",
  "context": {
    "url": "https://example.com/"
  }
}'
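As an alternative to minifying by hand, the request body can be generated programmatically, since JSON.stringify escapes line breaks and quotes inside the code string. The sketch below assumes a Node.js script; buildBody is a hypothetical helper name, not something browserless provides.

```javascript
// Sketch: build the /function request body without minifying by hand.
// JSON.stringify escapes the newlines and quotes inside the code string.
const code = `
module.exports = async ({ page, context }) => {
  const { url } = context;
  await page.goto(url);
  const data = await page.content();
  // page.content() returns HTML, so 'text/html' is used here
  return { data, type: 'text/html' };
};
`;

// buildBody is a hypothetical helper, not part of browserless
function buildBody(source, context) {
  return JSON.stringify({ code: source, context });
}

const body = buildBody(code, { url: 'https://example.com' });
console.log(body);
```

The printed string is a single line of valid JSON, so it can be used as the POST body in place of the hand-minified version.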

Detached functions

Functions that are "detached" resolve immediately with a JSON payload exposing the job's ID. These are useful if you don't want to hold the connection to browserless open while the function runs, and are instead relying on the side-effects of having browserless run it. You can also send any results to external services (like webhooks or your own APIs). Since detached functions are harder to track, browserless calls them with an additional parameter, id, which is the same id that the request immediately resolves with.

To detach a function, pass a third property, detached: true, in the JSON POST body alongside code and context.

You can currently require 'url', 'util', 'path', 'querystring', 'lodash', 'node-fetch', and 'request' in your functions. Please contact us if you need another module added.

code

const fetch = require('node-fetch');

// ID here is dynamic, and matches up with the immediate response from
// browserless but allows you to track it in third-party systems
module.exports = async ({ page, context, id }) => {
  const { url } = context;
  await page.goto(url);

  const data = await page.content();

  // POST the content to a third-party service
  return fetch('https://my-third-party-service.com/content', {
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      pageContent: data,
      sessionId: id,
    }),
    method: 'POST',
  });
};

context

{
  "url": "https://example.com"
}

To force a function to run detached, add detached: true to the POSTed body.

cURL (with an API token)

curl -X POST \
  'https://chrome.browserless.io/function?token=MY_API_TOKEN' \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json' \
  -d '{
    "code": "const fetch=require('\''node-fetch'\'');module.exports=async({page:a,context:b,id:c})=>{const{url:d}=b;await a.goto(d);const e=await a.content();return fetch('\''https://my-third-party-service.com/content'\'',{headers:{'\''Content-Type'\'':'\''application/json'\''},body:JSON.stringify({pageContent:e,sessionId:c}),method:'\''POST'\''})};",
    "context": {
        "url": "https://example.com/"
    },
    "detached": true
}'
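The same detached call can be made from a script. The following is a rough sketch for Node.js 18 or later, which ships a global fetch. It assumes the immediate response is a JSON object with an id field, as described above; the exact field name and the helper names buildDetachedRequest and runDetached are assumptions for illustration.

```javascript
// Sketch: call /function in detached mode from Node.js 18+ (global fetch).
// Assumptions: the `id` field name in the response and the helper names.
const ENDPOINT = 'https://chrome.browserless.io/function';

// buildDetachedRequest is a hypothetical helper, not part of browserless
function buildDetachedRequest(token, code, context) {
  return {
    url: `${ENDPOINT}?token=${encodeURIComponent(token)}`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // `detached: true` asks browserless to respond before the code finishes
      body: JSON.stringify({ code, context, detached: true }),
    },
  };
}

// Sends the request and returns the job id from the immediate response
async function runDetached(token, code, context) {
  const { url, options } = buildDetachedRequest(token, code, context);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`browserless responded with HTTP ${res.status}`);
  }
  // Assumed response shape: a JSON object carrying the job's id
  const { id } = await res.json();
  return id;
}
```

Storing the returned id lets you match it later against the sessionId that the detached function posts to your own service.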