Anthropic Computer Use Integration

Anthropic Computer Use lets Claude operate a browser the way a person would: taking screenshots, moving the cursor, clicking elements, and typing text. Connecting Computer Use to Browserless gives you cloud-hosted browsers, so you can run AI-powered browser automation at scale without managing local browsers, with stealth mode, residential proxies, and enterprise-grade reliability.

Prerequisites

Node.js 18+ (for TypeScript) or Python 3.9+ (for Python)
Browserless API token (available in your account dashboard)
Anthropic API key (from your Anthropic Settings)

How it works

This integration runs a screenshot-and-action loop, with Playwright driving the browser over CDP:

1. Connect: Playwright connects to Browserless over a CDP WebSocket.
2. Screenshot: capture the browser screen as an image.
3. Send to Claude: Claude analyzes the screenshot through the Computer Use API.
4. Execute action: Claude requests an action (click, type, scroll) and you execute it with Playwright.
5. Repeat: take a new screenshot and continue until the task is complete.

Anthropic calls this the "agent loop": Claude responds with a tool use request, and your application responds with the results. You implement the loop that coordinates between Claude and the browser.
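
One way to implement the "Repeat" step is to attach a fresh screenshot to every tool result, so Claude sees the effect of an action without spending a turn asking for a screenshot. The sketch below is a minimal illustration, not part of the official example: `build_tool_result` is a hypothetical helper name, and the dict shapes mirror the `tool_result` and `image` blocks used in the full example in step 3.

```python
from typing import Optional


def build_tool_result(
    tool_use_id: str, text: str, screenshot_b64: Optional[str] = None
) -> dict:
    """Build a tool_result block, optionally attaching a base64 PNG screenshot."""
    # Every result carries a short text description of what was done.
    content = [{"type": "text", "text": text}]
    if screenshot_b64 is not None:
        # Attach the post-action screenshot as an image block.
        content.append(
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": screenshot_b64,
                },
            }
        )
    return {"type": "tool_result", "tool_use_id": tool_use_id, "content": content}
```

In the loop, this would mean calling `capture_screenshot(page)` after each executed action and passing the result as `screenshot_b64`. The trade-off is more image tokens per turn in exchange for fewer round trips.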

Step-by-Step Setup

1. Get your API keys

Copy your API token from the Browserless account dashboard, and your API key from your Anthropic Settings.

Then set your environment variables:

.env file:

BROWSERLESS_API_KEY=your-browserless-token
ANTHROPIC_API_KEY=your-anthropic-key

Command line:

export BROWSERLESS_API_KEY="your-browserless-token"
export ANTHROPIC_API_KEY="your-anthropic-key"
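
A missing variable otherwise surfaces later as a failed WebSocket connection or an authentication error, so it can help to check both values before starting. The following is a small sketch using only the standard library; `require_env` is a hypothetical helper name, not part of either SDK.

```python
import os


def require_env(*names: str) -> dict:
    """Return the named environment variables, raising if any is missing or empty."""
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in names}
```

Call it as `require_env("BROWSERLESS_API_KEY", "ANTHROPIC_API_KEY")` at startup. If you load a .env file, call `load_dotenv()` first so those values are visible in `os.environ`.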

2. Install dependencies

TypeScript:

npm install playwright-core @anthropic-ai/sdk dotenv typescript ts-node @types/node

Python:

pip install playwright anthropic python-dotenv

3. Create the agent

Computer Use requires you to implement the agent loop yourself. Below is a complete working example that navigates to Hacker News and identifies the top post.

Model configuration

When using the Anthropic API directly, you must use the correct combination of model, tool version, and beta flag. Check the Anthropic Models Documentation and Computer Use Documentation for the latest values.

Current working values (as of this writing):

Model: claude-sonnet-4-6
Beta: computer-use-2025-11-24
Tool type: computer_20251124
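
Because the three values have to change together, it can help to keep them in one place rather than scattered through the request code. The sketch below assumes the values listed above and a 1024×768 display; `ComputerUseConfig` is a hypothetical name introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputerUseConfig:
    """Groups the model, beta flag, and tool version that must match."""

    model: str = "claude-sonnet-4-6"
    beta: str = "computer-use-2025-11-24"
    tool_type: str = "computer_20251124"
    display_width: int = 1024
    display_height: int = 768

    def tool(self) -> dict:
        # Tool definition in the shape the Computer Use tool expects.
        return {
            "type": self.tool_type,
            "name": "computer",
            "display_width_px": self.display_width,
            "display_height_px": self.display_height,
        }

    def request_kwargs(self) -> dict:
        # Keyword arguments shared by every request in the agent loop.
        return {"model": self.model, "betas": [self.beta], "tools": [self.tool()]}
```

A request then becomes `client.beta.messages.create(max_tokens=4096, messages=messages, **config.request_kwargs())`, and upgrading to a newer model means editing one object.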

TypeScript:

import { chromium, Page } from "playwright-core";
import Anthropic from "@anthropic-ai/sdk";
import * as dotenv from "dotenv";

dotenv.config();

const BROWSERLESS_URL = `wss://production-sfo.browserless.io?token=${process.env.BROWSERLESS_API_KEY}`;
const DISPLAY_WIDTH = 1024;
const DISPLAY_HEIGHT = 768;

async function captureScreenshot(page: Page): Promise<string> {
  const screenshot = await page.screenshot({ type: "png" });
  return screenshot.toString("base64");
}

async function executeComputerAction(
  page: Page,
  action: string,
  params: Record<string, unknown>
): Promise<string> {
  const coordinate = params.coordinate as [number, number] | undefined;

  switch (action) {
    case "screenshot":
      return "Screenshot captured";
    case "left_click":
      if (coordinate) {
        await page.mouse.click(coordinate[0], coordinate[1]);
        return `Clicked at (${coordinate[0]}, ${coordinate[1]})`;
      }
      return "Missing coordinate for left_click";
    case "type":
      await page.keyboard.type(params.text as string);
      return `Typed: "${params.text}"`;
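    case "key": {
      // Claude may send xdotool-style names (e.g. "Return", "ctrl+s");
      // map the common ones to the names Playwright expects.
      const aliases: Record<string, string> = {
        ctrl: "Control", control: "Control", alt: "Alt", shift: "Shift",
        super: "Meta", cmd: "Meta", return: "Enter", esc: "Escape",
        backspace: "Backspace", space: "Space",
        page_up: "PageUp", page_down: "PageDown",
        up: "ArrowUp", down: "ArrowDown", left: "ArrowLeft", right: "ArrowRight",
      };
      const key = (params.text as string)
        .split("+")
        .map((part) => aliases[part.trim().toLowerCase()] ?? part.trim())
        .join("+");
      await page.keyboard.press(key);
      return `Pressed key: ${key}`;
    }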
    case "scroll": {
      const direction = params.scroll_direction as string;
      const amount = (params.scroll_amount as number) ?? 3;
      const scrollCoord = coordinate ?? [DISPLAY_WIDTH / 2, DISPLAY_HEIGHT / 2];
      await page.mouse.move(scrollCoord[0], scrollCoord[1]);
      const deltaX = direction === "left" ? -amount * 100 : direction === "right" ? amount * 100 : 0;
      const deltaY = direction === "down" ? amount * 100 : direction === "up" ? -amount * 100 : 0;
      await page.mouse.wheel(deltaX, deltaY);
      return `Scrolled ${direction} by ${amount}`;
    }
    default:
      return `Unknown action: ${action}`;
  }
}

async function runComputerUseLoop(
  client: Anthropic,
  page: Page,
  task: string
): Promise<string> {
  const initialScreenshot = await captureScreenshot(page);

  const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "base64",
            media_type: "image/png",
            data: initialScreenshot,
          },
        },
        { type: "text", text: task },
      ],
    },
  ];

  const tools: Anthropic.Beta.Messages.BetaToolUnion[] = [
    {
      type: "computer_20251124",
      name: "computer",
      display_width_px: DISPLAY_WIDTH,
      display_height_px: DISPLAY_HEIGHT,
    },
  ];

  const MAX_ITERATIONS = 10;

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await client.beta.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      tools,
      messages,
      betas: ["computer-use-2025-11-24"],
    });

    const assistantContent: Anthropic.Beta.Messages.BetaContentBlock[] = [];
    const toolResults: Anthropic.Beta.Messages.BetaToolResultBlockParam[] = [];

    for (const block of response.content) {
      assistantContent.push(block);

      if (block.type === "tool_use") {
        const input = block.input as Record<string, unknown>;
        const action = input.action as string;

        if (action === "screenshot") {
          const screenshot = await captureScreenshot(page);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id,
            content: [
              {
                type: "image",
                source: {
                  type: "base64",
                  media_type: "image/png",
                  data: screenshot,
                },
              },
            ],
          });
        } else {
          const result = await executeComputerAction(page, action, input);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id,
            content: result,
          });
        }
      }
    }

    messages.push({ role: "assistant", content: assistantContent });

    if (toolResults.length === 0) {
      return response.content
        .filter(
          (b): b is Anthropic.Beta.Messages.BetaTextBlock => b.type === "text"
        )
        .map((b) => b.text)
        .join("\n");
    }

    messages.push({ role: "user", content: toolResults });
  }

  return "Max iterations reached";
}

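async function main() {
  // Connect to the remote Browserless browser over CDP.
  const browser = await chromium.connectOverCDP(BROWSERLESS_URL);
  try {
    const page = await browser.newPage();
    // Keep the viewport in sync with the display size reported to Claude.
    await page.setViewportSize({ width: DISPLAY_WIDTH, height: DISPLAY_HEIGHT });
    await page.goto("https://news.ycombinator.com", { waitUntil: "networkidle" });

    const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
    const result = await runComputerUseLoop(
      client,
      page,
      "What is the title of the top post on this page? Take a screenshot first to see the page."
    );

    console.log("Result:", result);
  } finally {
    // Always release the remote session, even if the loop throws.
    await browser.close();
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});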

Python:

import asyncio
import base64
import os
from dotenv import load_dotenv
from playwright.async_api import async_playwright
import anthropic

load_dotenv()

BROWSERLESS_URL = f"wss://production-sfo.browserless.io?token={os.environ['BROWSERLESS_API_KEY']}"
DISPLAY_WIDTH = 1024
DISPLAY_HEIGHT = 768


async def capture_screenshot(page) -> str:
    screenshot = await page.screenshot(type="png")
    return base64.b64encode(screenshot).decode()


async def execute_computer_action(page, action: str, params: dict) -> str:
    if action == "screenshot":
        return "Screenshot captured"
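    elif action == "left_click":
        coordinate = params.get("coordinate")
        if not coordinate:
            return "Missing coordinate for left_click"
        x, y = coordinate
        await page.mouse.click(x, y)
        return f"Clicked at ({x}, {y})"
    elif action == "type":
        await page.keyboard.type(params["text"])
        return f'Typed: "{params["text"]}"'
    elif action == "key":
        # Claude may send xdotool-style names (e.g. "Return", "ctrl+s");
        # map the common ones to the names Playwright expects.
        aliases = {
            "ctrl": "Control", "control": "Control", "alt": "Alt", "shift": "Shift",
            "super": "Meta", "cmd": "Meta", "return": "Enter", "esc": "Escape",
            "backspace": "Backspace", "space": "Space",
            "page_up": "PageUp", "page_down": "PageDown",
            "up": "ArrowUp", "down": "ArrowDown", "left": "ArrowLeft", "right": "ArrowRight",
        }
        parts = [part.strip() for part in params["text"].split("+")]
        key = "+".join(aliases.get(part.lower(), part) for part in parts)
        await page.keyboard.press(key)
        return f"Pressed key: {key}"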
    elif action == "scroll":
        direction = params.get("scroll_direction", "down")
        amount = params.get("scroll_amount", 3)
        coord = params.get("coordinate", [DISPLAY_WIDTH // 2, DISPLAY_HEIGHT // 2])
        await page.mouse.move(coord[0], coord[1])
        delta_x = -amount * 100 if direction == "left" else amount * 100 if direction == "right" else 0
        delta_y = amount * 100 if direction == "down" else -amount * 100 if direction == "up" else 0
        await page.mouse.wheel(delta_x, delta_y)
        return f"Scrolled {direction} by {amount}"
    return f"Unknown action: {action}"


async def run_computer_use_loop(client, page, task: str) -> str:
    initial_screenshot = await capture_screenshot(page)

    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": initial_screenshot,
                    },
                },
                {"type": "text", "text": task},
            ],
        }
    ]

    tools = [
        {
            "type": "computer_20251124",
            "name": "computer",
            "display_width_px": DISPLAY_WIDTH,
            "display_height_px": DISPLAY_HEIGHT,
        }
    ]

    max_iterations = 10

    for _ in range(max_iterations):
        response = await client.beta.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages,
            betas=["computer-use-2025-11-24"],
        )

        assistant_content = []
        tool_results = []

        for block in response.content:
            assistant_content.append(block)

            if block.type == "tool_use":
                action = block.input.get("action")

                if action == "screenshot":
                    screenshot = await capture_screenshot(page)
                    tool_results.append(
                        {
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": [
                                {
                                    "type": "image",
                                    "source": {
                                        "type": "base64",
                                        "media_type": "image/png",
                                        "data": screenshot,
                                    },
                                }
                            ],
                        }
                    )
                else:
                    result = await execute_computer_action(page, action, block.input)
                    tool_results.append(
                        {
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result,
                        }
                    )

        messages.append({"role": "assistant", "content": assistant_content})

        if not tool_results:
            return "\n".join(b.text for b in response.content if b.type == "text")

        messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached"


async def main():
    async with async_playwright() as p:
        # Connect to the remote Browserless browser over CDP.
        browser = await p.chromium.connect_over_cdp(BROWSERLESS_URL)
        try:
            page = await browser.new_page()
            # Keep the viewport in sync with the display size reported to Claude.
            await page.set_viewport_size({"width": DISPLAY_WIDTH, "height": DISPLAY_HEIGHT})
            await page.goto("https://news.ycombinator.com", wait_until="networkidle")

            client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment
            result = await run_computer_use_loop(
                client,
                page,
                "What is the title of the top post on this page? Take a screenshot first to see the page.",
            )

            print("Result:", result)
        finally:
            # Always release the remote session, even if the loop raises.
            await browser.close()


asyncio.run(main())

Available actions

Claude can request these actions through the Computer Use tool:

Action	Description
`screenshot`	Capture current screen state
`left_click`	Click at coordinates `[x, y]`
`right_click`	Right-click at coordinates
`double_click`	Double-click at coordinates
`type`	Type a text string
`key`	Press a key or combo via `text` param (e.g., `Enter`, `ctrl+s`)
`mouse_move`	Move cursor to coordinates
`scroll`	Scroll at coordinates with `scroll_direction` and `scroll_amount`
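
The example dispatcher in step 3 handles only `screenshot`, `left_click`, `type`, `key`, and `scroll`, so the remaining actions in this table fall through to "Unknown action". The sketch below shows one way to cover them with Playwright's mouse API. `execute_extra_mouse_action` is a hypothetical helper that returns `None` for actions it does not recognize, so it can be tried before or after the existing dispatcher.

```python
from typing import Optional


async def execute_extra_mouse_action(page, action: str, params: dict) -> Optional[str]:
    """Handle right_click, double_click, and mouse_move; return None otherwise."""
    if action not in ("right_click", "double_click", "mouse_move"):
        return None

    coordinate = params.get("coordinate")
    if not coordinate:
        return f"Missing coordinate for {action}"
    x, y = coordinate

    if action == "right_click":
        # Playwright selects the mouse button with the `button` keyword.
        await page.mouse.click(x, y, button="right")
        return f"Right-clicked at ({x}, {y})"
    if action == "double_click":
        await page.mouse.dblclick(x, y)
        return f"Double-clicked at ({x}, {y})"

    # Only mouse_move remains at this point.
    await page.mouse.move(x, y)
    return f"Moved mouse to ({x}, {y})"
```

Inside `execute_computer_action`, you could call this helper before the final "Unknown action" return and pass its result through when it is not `None`.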

Advanced Configuration

Stealth mode

For sites with bot detection, use the stealth route:

TypeScript:

const browser = await chromium.connectOverCDP(
  `wss://production-sfo.browserless.io/stealth?token=${process.env.BROWSERLESS_API_KEY}`
);

Python:

browser = await p.chromium.connect_over_cdp(
    f"wss://production-sfo.browserless.io/stealth?token={os.environ['BROWSERLESS_API_KEY']}"
)

Residential proxies

Route traffic through residential IPs for geo-targeting and rate-limit avoidance:

TypeScript:

const browser = await chromium.connectOverCDP(
  `wss://production-sfo.browserless.io?token=${process.env.BROWSERLESS_API_KEY}&proxy=residential&proxyCountry=us`
);

Python:

browser = await p.chromium.connect_over_cdp(
    f"wss://production-sfo.browserless.io?token={os.environ['BROWSERLESS_API_KEY']}&proxy=residential&proxyCountry=us"
)

Regional endpoints

Pick the region closest to you for lower latency:

Region	Endpoint
US West (San Francisco)	`wss://production-sfo.browserless.io`
Europe (London)	`wss://production-lon.browserless.io`
Europe (Amsterdam)	`wss://production-ams.browserless.io`
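
The region, stealth route, and proxy parameters all end up in the same connection URL, so a small builder can keep them consistent. This is a sketch under two assumptions: that the options compose the way the separate examples above suggest (the `/stealth` path plus the `proxy` and `proxyCountry` query parameters), and that short region keys like `"sfo"` are acceptable. `build_browserless_url` and `REGION_HOSTS` are hypothetical names, not a Browserless API.

```python
from typing import Optional
from urllib.parse import urlencode

# Hosts taken from the regional endpoints table above.
REGION_HOSTS = {
    "sfo": "production-sfo.browserless.io",
    "lon": "production-lon.browserless.io",
    "ams": "production-ams.browserless.io",
}


def build_browserless_url(
    token: str,
    region: str = "sfo",
    stealth: bool = False,
    proxy_country: Optional[str] = None,
) -> str:
    """Compose a Browserless WebSocket URL from region, stealth, and proxy options."""
    query = {"token": token}
    if proxy_country:
        # Residential proxy with a country code, as in the proxy example.
        query["proxy"] = "residential"
        query["proxyCountry"] = proxy_country
    path = "/stealth" if stealth else ""
    return f"wss://{REGION_HOSTS[region]}{path}?{urlencode(query)}"
```

For example, `build_browserless_url(os.environ["BROWSERLESS_API_KEY"], region="lon", stealth=True)` yields the London stealth endpoint, and the result can be passed directly to `connect_over_cdp`. An unknown region raises `KeyError` rather than silently falling back to a default.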

Why use Browserless with Anthropic Computer Use

Stealth mode: bypass bot detection by adding /stealth to the endpoint URL
Residential proxies: route traffic through real residential IPs to avoid blocks
Global regions: choose from US West (San Francisco), Europe (London), or Europe (Amsterdam) endpoints for lower latency
No infrastructure: skip managing Chrome installations, updates, or scaling
Parallel sessions: run multiple browser sessions simultaneously
Enterprise reliability: 99.9% uptime SLA with automatic failover
