Anthropic Computer Use Integration
Anthropic Computer Use lets Claude interact with a browser the way a person does: taking screenshots, moving the cursor, clicking elements, and typing text. Connecting Computer Use to Browserless gives you cloud-hosted browsers, so you can scale AI-driven browser automation without managing local browser installations, with stealth mode and residential proxies available when you need them.
How it works
This integration uses a screenshot + action loop with Playwright CDP:
- Connect: Playwright connects to Browserless over a CDP WebSocket
- Screenshot: capture the browser screen as an image
- Send to Claude: Claude analyzes the screenshot using the Computer Use API
- Execute action: Claude requests an action (click, type, scroll) and you execute it via Playwright
- Repeat: take a new screenshot and continue until the task is complete
Anthropic calls this the "agent loop": Claude responds with a tool use request, and your application responds with the results. You implement the loop that coordinates between Claude and the browser.
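The control flow of the loop can be exercised offline before any API is wired in. The sketch below is a simulation only, not the integration itself: `fake_model` and `fake_browser` are hypothetical stand-ins for the Claude call and the Playwright page, and the dictionaries only loosely mimic the shape of real tool-use messages.

```python
from typing import Callable


def fake_model(messages: list) -> dict:
    """Hypothetical stand-in for Claude: request one click, then answer."""
    last = messages[-1]["content"]
    if isinstance(last, dict) and last.get("type") == "tool_result":
        return {"type": "text", "text": "Task complete"}
    return {"type": "tool_use", "action": "left_click", "coordinate": [100, 200]}


def fake_browser(log: list) -> Callable[[dict], str]:
    """Hypothetical stand-in for the browser: record actions instead of running them."""
    def execute(request: dict) -> str:
        log.append(request["action"])
        return f"Executed {request['action']}"
    return execute


def run_agent_loop(model, execute, task: str, max_iterations: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_iterations):
        response = model(messages)
        messages.append({"role": "assistant", "content": response})
        if response["type"] != "tool_use":
            # No tool request means the model has produced its final answer.
            return response["text"]
        # Run the requested action and report the result back to the model.
        result = execute(response)
        messages.append(
            {"role": "user", "content": {"type": "tool_result", "text": result}}
        )
    return "Max iterations reached"


actions: list = []
answer = run_agent_loop(fake_model, fake_browser(actions), "Click the first link")
print(answer, actions)  # Task complete ['left_click']
```

The iteration cap matters in the real loop too: without it, a model that keeps requesting actions would run, and bill, indefinitely.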
Prerequisites
- Node.js 18+ (for TypeScript) or Python 3.9+ (for Python)
- Browserless API token (available in your account dashboard)
- Anthropic API key (from your Anthropic Settings)
Step-by-Step Setup
1. Get your API keys
Copy your API token from the Browserless account dashboard and your API key from your Anthropic settings.
Then set both as environment variables:
- .env file:

    BROWSERLESS_API_KEY=your-browserless-token
    ANTHROPIC_API_KEY=your-anthropic-key

- Command line:

    export BROWSERLESS_API_KEY="your-browserless-token"
    export ANTHROPIC_API_KEY="your-anthropic-key"
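A missing variable otherwise surfaces later as a connection or authentication error, so it can help to check both up front. The helper below is a small sketch; `missing_env_vars` is a hypothetical name, not part of either SDK.

```python
import os

REQUIRED_VARS = ("BROWSERLESS_API_KEY", "ANTHROPIC_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment:
print(missing_env_vars({"BROWSERLESS_API_KEY": "abc"}))  # ['ANTHROPIC_API_KEY']
```

Calling `missing_env_vars()` with no arguments checks the real environment; run it after `load_dotenv()` if you use a .env file.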
2. Install dependencies
- TypeScript:

    npm install playwright-core @anthropic-ai/sdk dotenv typescript ts-node @types/node

- Python:

    pip install playwright anthropic python-dotenv
3. Create the agent
Computer Use requires you to implement the agent loop yourself. Below is a complete working example that navigates to Hacker News and identifies the top post.
When using the Anthropic API directly, you must use the correct combination of model, tool version, and beta flag. Check the Anthropic Models Documentation and Computer Use Documentation for the latest values.
Current working values (as of this writing):
- Model: claude-sonnet-4-6
- Beta: computer-use-2025-11-24
- Tool type: computer_20251124
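Because these three values must change together, one option is to keep them in a single place. The sketch below assumes nothing beyond the values listed above and the tool fields used in the example; `ComputerUseConfig` is a hypothetical helper, not part of the Anthropic SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputerUseConfig:
    # Values copied from the list above; verify them against Anthropic's docs.
    model: str = "claude-sonnet-4-6"
    beta: str = "computer-use-2025-11-24"
    tool_type: str = "computer_20251124"
    display_width: int = 1024
    display_height: int = 768

    def tool_definition(self) -> dict:
        """Build the computer tool entry passed in the tools list."""
        return {
            "type": self.tool_type,
            "name": "computer",
            "display_width_px": self.display_width,
            "display_height_px": self.display_height,
        }

    def request_kwargs(self) -> dict:
        """Keyword arguments shared by every messages.create call in the loop."""
        return {
            "model": self.model,
            "max_tokens": 4096,
            "tools": [self.tool_definition()],
            "betas": [self.beta],
        }


config = ComputerUseConfig()
print(config.tool_definition())
```

With this in place, a request becomes `client.beta.messages.create(messages=messages, **config.request_kwargs())`, and updating to a newer tool version is a one-line change.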
The complete example follows, first in TypeScript and then in Python.
TypeScript:

import { chromium, Page } from "playwright-core";
import Anthropic from "@anthropic-ai/sdk";
import * as dotenv from "dotenv";

dotenv.config();

const BROWSERLESS_URL = `wss://production-sfo.browserless.io?token=${process.env.BROWSERLESS_API_KEY}`;
const DISPLAY_WIDTH = 1024;
const DISPLAY_HEIGHT = 768;

// Capture the current page as a base64-encoded PNG.
async function captureScreenshot(page: Page): Promise<string> {
  const screenshot = await page.screenshot({ type: "png" });
  return screenshot.toString("base64");
}

// Translate one Computer Use action into Playwright calls and
// return a short text description to send back to Claude.
async function executeComputerAction(
  page: Page,
  action: string,
  params: Record<string, unknown>
): Promise<string> {
  const coordinate = params.coordinate as [number, number] | undefined;
  switch (action) {
    case "screenshot":
      return "Screenshot captured";
    case "left_click":
      if (coordinate) {
        await page.mouse.click(coordinate[0], coordinate[1]);
        return `Clicked at (${coordinate[0]}, ${coordinate[1]})`;
      }
      return "Missing coordinate for left_click";
    case "type":
      await page.keyboard.type(params.text as string);
      return `Typed: "${params.text}"`;
    case "key":
      // Claude may send names such as ctrl+s; Playwright expects names
      // such as Control+s, so translate them if your tasks need shortcuts.
      await page.keyboard.press(params.text as string);
      return `Pressed key: ${params.text}`;
    case "scroll": {
      const direction = params.scroll_direction as string;
      const amount = (params.scroll_amount as number) ?? 3;
      const scrollCoord = coordinate ?? [DISPLAY_WIDTH / 2, DISPLAY_HEIGHT / 2];
      await page.mouse.move(scrollCoord[0], scrollCoord[1]);
      // One scroll unit is treated as 100 pixels of wheel movement.
      const deltaX = direction === "left" ? -amount * 100 : direction === "right" ? amount * 100 : 0;
      const deltaY = direction === "down" ? amount * 100 : direction === "up" ? -amount * 100 : 0;
      await page.mouse.wheel(deltaX, deltaY);
      return `Scrolled ${direction} by ${amount}`;
    }
    default:
      return `Unknown action: ${action}`;
  }
}

// The agent loop: send the conversation to Claude, execute any requested
// actions, return the results, and stop when Claude answers with text only.
async function runComputerUseLoop(
  client: Anthropic,
  page: Page,
  task: string
): Promise<string> {
  const initialScreenshot = await captureScreenshot(page);
  const messages: Anthropic.Beta.Messages.BetaMessageParam[] = [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "base64",
            media_type: "image/png",
            data: initialScreenshot,
          },
        },
        { type: "text", text: task },
      ],
    },
  ];
  const tools: Anthropic.Beta.Messages.BetaToolUnion[] = [
    {
      type: "computer_20251124",
      name: "computer",
      display_width_px: DISPLAY_WIDTH,
      display_height_px: DISPLAY_HEIGHT,
    },
  ];
  const MAX_ITERATIONS = 10;

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await client.beta.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      tools,
      messages,
      betas: ["computer-use-2025-11-24"],
    });

    const assistantContent: Anthropic.Beta.Messages.BetaContentBlock[] = [];
    const toolResults: Anthropic.Beta.Messages.BetaToolResultBlockParam[] = [];

    for (const block of response.content) {
      assistantContent.push(block);
      if (block.type === "tool_use") {
        const input = block.input as Record<string, unknown>;
        const action = input.action as string;
        if (action === "screenshot") {
          // Screenshot requests are answered with an image block.
          const screenshot = await captureScreenshot(page);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id,
            content: [
              {
                type: "image",
                source: {
                  type: "base64",
                  media_type: "image/png",
                  data: screenshot,
                },
              },
            ],
          });
        } else {
          // All other actions are answered with a text summary.
          const result = await executeComputerAction(page, action, input);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id,
            content: result,
          });
        }
      }
    }

    messages.push({ role: "assistant", content: assistantContent });

    // No tool requests means Claude has finished: return its text answer.
    if (toolResults.length === 0) {
      return response.content
        .filter(
          (b): b is Anthropic.Beta.Messages.BetaTextBlock => b.type === "text"
        )
        .map((b) => b.text)
        .join("\n");
    }

    messages.push({ role: "user", content: toolResults });
  }

  return "Max iterations reached";
}

async function main() {
  const browser = await chromium.connectOverCDP(BROWSERLESS_URL);
  try {
    const page = await browser.newPage();
    await page.setViewportSize({ width: DISPLAY_WIDTH, height: DISPLAY_HEIGHT });
    await page.goto("https://news.ycombinator.com", { waitUntil: "networkidle" });
    const client = new Anthropic();
    const result = await runComputerUseLoop(
      client,
      page,
      "What is the title of the top post on this page? Take a screenshot first to see the page."
    );
    console.log("Result:", result);
  } finally {
    // Always release the remote browser session, even if the loop throws.
    await browser.close();
  }
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});
Python:

import asyncio
import base64
import os

from dotenv import load_dotenv
from playwright.async_api import async_playwright
import anthropic

load_dotenv()

BROWSERLESS_URL = f"wss://production-sfo.browserless.io?token={os.environ['BROWSERLESS_API_KEY']}"
DISPLAY_WIDTH = 1024
DISPLAY_HEIGHT = 768


async def capture_screenshot(page) -> str:
    """Capture the current page as a base64-encoded PNG."""
    screenshot = await page.screenshot(type="png")
    return base64.b64encode(screenshot).decode()


async def execute_computer_action(page, action: str, params: dict) -> str:
    """Translate one Computer Use action into Playwright calls."""
    if action == "screenshot":
        return "Screenshot captured"
    elif action == "left_click":
        x, y = params["coordinate"]
        await page.mouse.click(x, y)
        return f"Clicked at ({x}, {y})"
    elif action == "type":
        await page.keyboard.type(params["text"])
        return f'Typed: "{params["text"]}"'
    elif action == "key":
        # Claude may send names such as ctrl+s; Playwright expects names
        # such as Control+s, so translate them if your tasks need shortcuts.
        await page.keyboard.press(params["text"])
        return f'Pressed key: {params["text"]}'
    elif action == "scroll":
        direction = params.get("scroll_direction", "down")
        amount = params.get("scroll_amount", 3)
        coord = params.get("coordinate", [DISPLAY_WIDTH // 2, DISPLAY_HEIGHT // 2])
        await page.mouse.move(coord[0], coord[1])
        # One scroll unit is treated as 100 pixels of wheel movement.
        delta_x = -amount * 100 if direction == "left" else amount * 100 if direction == "right" else 0
        delta_y = amount * 100 if direction == "down" else -amount * 100 if direction == "up" else 0
        await page.mouse.wheel(delta_x, delta_y)
        return f"Scrolled {direction} by {amount}"
    return f"Unknown action: {action}"


async def run_computer_use_loop(client, page, task: str) -> str:
    """Run the agent loop until Claude answers with text only."""
    initial_screenshot = await capture_screenshot(page)
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": initial_screenshot,
                    },
                },
                {"type": "text", "text": task},
            ],
        }
    ]
    tools = [
        {
            "type": "computer_20251124",
            "name": "computer",
            "display_width_px": DISPLAY_WIDTH,
            "display_height_px": DISPLAY_HEIGHT,
        }
    ]
    max_iterations = 10

    for _ in range(max_iterations):
        response = await client.beta.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages,
            betas=["computer-use-2025-11-24"],
        )

        assistant_content = []
        tool_results = []

        for block in response.content:
            assistant_content.append(block)
            if block.type == "tool_use":
                action = block.input.get("action")
                if action == "screenshot":
                    # Screenshot requests are answered with an image block.
                    screenshot = await capture_screenshot(page)
                    tool_results.append(
                        {
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": [
                                {
                                    "type": "image",
                                    "source": {
                                        "type": "base64",
                                        "media_type": "image/png",
                                        "data": screenshot,
                                    },
                                }
                            ],
                        }
                    )
                else:
                    # All other actions are answered with a text summary.
                    result = await execute_computer_action(page, action, block.input)
                    tool_results.append(
                        {
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result,
                        }
                    )

        messages.append({"role": "assistant", "content": assistant_content})

        # No tool requests means Claude has finished: return its text answer.
        if not tool_results:
            return "\n".join(b.text for b in response.content if b.type == "text")

        messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached"


async def main():
    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp(BROWSERLESS_URL)
        try:
            page = await browser.new_page()
            await page.set_viewport_size({"width": DISPLAY_WIDTH, "height": DISPLAY_HEIGHT})
            await page.goto("https://news.ycombinator.com", wait_until="networkidle")
            client = anthropic.AsyncAnthropic()
            result = await run_computer_use_loop(
                client,
                page,
                "What is the title of the top post on this page? Take a screenshot first to see the page.",
            )
            print("Result:", result)
        finally:
            # Always release the remote browser session, even if the loop raises.
            await browser.close()


asyncio.run(main())
Available actions
Claude can request these actions through the Computer Use tool:
| Action | Description |
|---|---|
| screenshot | Capture current screen state |
| left_click | Click at coordinates [x, y] |
| right_click | Right-click at coordinates |
| double_click | Double-click at coordinates |
| type | Type a text string |
| key | Press a key or combo via text param (e.g., Enter, ctrl+s) |
| mouse_move | Move cursor to coordinates |
| scroll | Scroll at coordinates with scroll_direction and scroll_amount |
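Key names deserve care: a combo such as ctrl+s from the table is not a name Playwright's keyboard.press recognizes, since Playwright expects names such as Control+s or Enter. The sketch below shows one way to translate them, assuming Claude sends lowercase or X11-style names joined by "+". The alias table is partial and illustrative, and `to_playwright_key` is a hypothetical helper.

```python
# Partial, illustrative alias table; extend it for the keys your tasks need.
KEY_ALIASES = {
    "ctrl": "Control", "control": "Control",
    "alt": "Alt", "shift": "Shift",
    "cmd": "Meta", "super": "Meta", "meta": "Meta",
    "return": "Enter", "enter": "Enter",
    "esc": "Escape", "escape": "Escape",
    "backspace": "Backspace", "delete": "Delete", "tab": "Tab",
    "page_up": "PageUp", "page_down": "PageDown",
    "up": "ArrowUp", "down": "ArrowDown",
    "left": "ArrowLeft", "right": "ArrowRight",
    "home": "Home", "end": "End",
}


def to_playwright_key(combo: str) -> str:
    """Map each '+'-separated part to a Playwright key name when an alias exists."""
    parts = [part.strip() for part in combo.split("+") if part.strip()]
    # Unknown parts (for example single characters) pass through unchanged.
    return "+".join(KEY_ALIASES.get(part.lower(), part) for part in parts)


print(to_playwright_key("ctrl+s"))     # Control+s
print(to_playwright_key("Page_Down"))  # PageDown
```

In the example agent, this would wrap the key handler's argument, for instance `await page.keyboard.press(to_playwright_key(params["text"]))`. The sketch does not handle a literal "+" key.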
Advanced Configuration
Stealth mode
For sites with bot detection, use the stealth route:
- TypeScript:

    const browser = await chromium.connectOverCDP(
      `wss://production-sfo.browserless.io/stealth?token=${process.env.BROWSERLESS_API_KEY}`
    );

- Python:

    browser = await p.chromium.connect_over_cdp(
        f"wss://production-sfo.browserless.io/stealth?token={os.environ['BROWSERLESS_API_KEY']}"
    )
Residential proxies
Route traffic through residential IPs for geo-targeting and rate-limit avoidance:
- TypeScript:

    const browser = await chromium.connectOverCDP(
      `wss://production-sfo.browserless.io?token=${process.env.BROWSERLESS_API_KEY}&proxy=residential&proxyCountry=us`
    );

- Python:

    browser = await p.chromium.connect_over_cdp(
        f"wss://production-sfo.browserless.io?token={os.environ['BROWSERLESS_API_KEY']}&proxy=residential&proxyCountry=us"
    )
Regional endpoints
Pick the region closest to you for lower latency:
| Region | Endpoint |
|---|---|
| US West (San Francisco) | wss://production-sfo.browserless.io |
| Europe (London) | wss://production-lon.browserless.io |
| Europe (Amsterdam) | wss://production-ams.browserless.io |
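The region, stealth, and proxy options all come down to variations of the connection URL, so a small builder can keep them consistent. This is a sketch: `build_browserless_url` and the short region codes are hypothetical names, and combining the stealth route with proxy parameters in one URL is an assumption, since the examples above show them separately.

```python
from urllib.parse import urlencode

# Hosts taken from the regional endpoints table above.
REGION_HOSTS = {
    "sfo": "production-sfo.browserless.io",
    "lon": "production-lon.browserless.io",
    "ams": "production-ams.browserless.io",
}


def build_browserless_url(token, region="sfo", stealth=False, proxy_country=None):
    """Compose a Browserless WebSocket URL from the options shown above."""
    host = REGION_HOSTS[region]  # raises KeyError for an unknown region
    path = "/stealth" if stealth else ""
    params = {"token": token}
    if proxy_country:
        params["proxy"] = "residential"
        params["proxyCountry"] = proxy_country
    return f"wss://{host}{path}?{urlencode(params)}"


print(build_browserless_url("abc", region="lon", stealth=True))
# wss://production-lon.browserless.io/stealth?token=abc
```

The result can be passed directly to `connect_over_cdp` in place of the hard-coded `BROWSERLESS_URL`.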
Why use Browserless with Anthropic Computer Use
- Stealth mode: bypass bot detection by adding /stealth to the endpoint URL
- Residential proxies: route traffic through real residential IPs to avoid blocks
- Global regions: choose from US West (San Francisco), Europe (London), or Europe (Amsterdam) endpoints for lower latency
- No infrastructure: skip managing Chrome installations, updates, or scaling
- Parallel sessions: run multiple browser sessions simultaneously
- Enterprise reliability: 99.9% uptime SLA with automatic failover