Scrape Indeed job listings
Search Indeed for jobs and extract titles, companies, locations, salaries, and descriptions from the results.
Prerequisites
- A Browserless API token from your account dashboard
Steps
Indeed renders job listings with JavaScript and employs bot-detection measures that block standard headless browsers. The examples below search for "data scientist" jobs and route through stealth mode with a residential proxy.
Indeed updates its markup frequently. If .job_seen_beacon or any of the nested selectors stop returning results, inspect the live page in browser DevTools to find the current class names and update the selectors to match.
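To search for a different role or location, the query string can be built programmatically instead of edited by hand. The following is a minimal sketch, assuming Indeed's `q` (keywords) and `l` (location) parameters behave as they do in the example URL; `build_search_url` is a hypothetical helper name, not part of any Browserless API.

```python
from urllib.parse import urlencode

INDEED_SEARCH = "https://www.indeed.com/jobs"


def build_search_url(query: str, location: str) -> str:
    """Return an Indeed search URL for the given keywords and location."""
    # urlencode percent-escapes reserved characters and turns spaces into "+",
    # which matches the q=data+scientist form used in this guide.
    return f"{INDEED_SEARCH}?{urlencode({'q': query, 'l': location})}"


print(build_search_url("data scientist", "Remote"))
# https://www.indeed.com/jobs?q=data+scientist&l=Remote
```

The resulting string can be dropped into the `goto(url: ...)` argument or the `page.goto(...)` call in any of the examples.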
- AI Agent
- REST API
- Frameworks
- BQL
Use the Browserless MCP server to scrape Indeed job listings from any MCP-compatible AI agent (Claude Desktop, Cursor, Windsurf, ChatGPT, etc.).
1. Connect the MCP server
Send this prompt to your AI agent to install the Browserless MCP server:
Go to https://github.com/browserless/browserless-mcp/blob/main/install.md
and follow the instructions to install the Browserless MCP server
for my client.
2. Scrape Indeed
Use browserless_smartscraper. It handles Indeed's dynamic content and bot protection automatically.
Use the browserless_smartscraper tool to scrape job listings
from https://www.indeed.com/jobs?q=data+scientist&l=Remote
and return the results as markdown
Send the BQL mutation over HTTP to the stealth endpoint. No browser library or BQL IDE required.
- cURL
- JavaScript
- Python
- Java
- C#
1. Send the request
curl -X POST \
"https://production-sfo.browserless.io/stealth/bql?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us" \
-H "Content-Type: application/json" \
-d '{
"query": "mutation ScrapeIndeedJobs { goto(url: \"https://www.indeed.com/jobs?q=data+scientist&l=Remote\", waitUntil: networkIdle) { status } waitForSelector(selector: \".job_seen_beacon\", timeout: 15000) { time } jobs: mapSelector(selector: \".job_seen_beacon\") { title: mapSelector(selector: \".jobTitle a span\") { innerText } company: mapSelector(selector: \".companyName\") { innerText } location: mapSelector(selector: \".companyLocation\") { innerText } salary: mapSelector(selector: \".salary-snippet-container\") { innerText } snippet: mapSelector(selector: \".job-snippet\") { innerText } link: mapSelector(selector: \".jobTitle a\") { href: attribute(name: \"href\") { value } } } }",
"variables": {}
}'
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2956 },
"jobs": [
{
"title": [{ "innerText": "Senior Data Scientist" }],
"company": [{ "innerText": "Meta" }],
"location": [{ "innerText": "Remote" }],
"salary": [{ "innerText": "$140,000 - $180,000 a year" }],
"snippet": [{ "innerText": "Build and deploy ML models for content recommendations..." }],
"link": [{ "href": { "value": "/rc/clk?jk=abc123" } }]
},
{
"title": [{ "innerText": "Data Scientist" }],
"company": [{ "innerText": "Spotify" }],
"location": [{ "innerText": "Remote" }],
"salary": [{ "innerText": "$120,000 - $155,000 a year" }],
"snippet": [{ "innerText": "Analyze user listening patterns and build recommendation engines..." }],
"link": [{ "href": { "value": "/rc/clk?jk=def456" } }]
}
]
}
}
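The salary field comes back as display text such as "$140,000 - $180,000 a year", which is awkward to sort or filter on. Below is a rough sketch of turning it into numbers. It assumes US-dollar strings shaped like the sample output above; `parse_salary` is a hypothetical helper, and other formats Indeed may show (for example abbreviated amounts) are not handled.

```python
import re
from typing import Optional, Tuple

# Dollar amounts such as "$140,000" or "$45.50".
AMOUNT = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)")
# Pay period such as "a year", "an hour", or "per month".
PERIOD = re.compile(r"\b(?:an?|per)\s+(hour|day|week|month|year)\b", re.IGNORECASE)


def parse_salary(text: str) -> Optional[Tuple[float, float, Optional[str]]]:
    """Return (minimum, maximum, period) or None when no dollar amount is found."""
    amounts = [float(raw.replace(",", "")) for raw in AMOUNT.findall(text)]
    if not amounts:
        return None
    period = PERIOD.search(text)
    # A single amount yields the same value for minimum and maximum.
    return min(amounts), max(amounts), period.group(1).lower() if period else None


print(parse_salary("$140,000 - $180,000 a year"))
# (140000.0, 180000.0, 'year')
```

Listings without a salary produce an empty `salary` array in the response, so check for that before calling the parser.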
1. Send the request
const query = `mutation ScrapeIndeedJobs {
goto(url: "https://www.indeed.com/jobs?q=data+scientist&l=Remote", waitUntil: networkIdle) {
status
}
waitForSelector(selector: ".job_seen_beacon", timeout: 15000) {
time
}
jobs: mapSelector(selector: ".job_seen_beacon") {
title: mapSelector(selector: ".jobTitle a span") { innerText }
company: mapSelector(selector: ".companyName") { innerText }
location: mapSelector(selector: ".companyLocation") { innerText }
salary: mapSelector(selector: ".salary-snippet-container") { innerText }
snippet: mapSelector(selector: ".job-snippet") { innerText }
link: mapSelector(selector: ".jobTitle a") {
href: attribute(name: "href") { value }
}
}
}`;
const response = await fetch(
'https://production-sfo.browserless.io/stealth/bql?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us',
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ query, variables: {} }),
}
);
const { data } = await response.json();
console.log(JSON.stringify(data.jobs, null, 2));
2. Check the output
[
{
"title": [{ "innerText": "Senior Data Scientist" }],
"company": [{ "innerText": "Meta" }],
"location": [{ "innerText": "Remote" }],
"salary": [{ "innerText": "$140,000 - $180,000 a year" }],
"snippet": [{ "innerText": "Build and deploy ML models for content recommendations..." }],
"link": [{ "href": { "value": "/rc/clk?jk=abc123" } }]
},
{
"title": [{ "innerText": "Data Scientist" }],
"company": [{ "innerText": "Spotify" }],
"location": [{ "innerText": "Remote" }],
"salary": [{ "innerText": "$120,000 - $155,000 a year" }],
"snippet": [{ "innerText": "Analyze user listening patterns and build recommendation engines..." }],
"link": [{ "href": { "value": "/rc/clk?jk=def456" } }]
}
]
1. Install dependencies
pip install requests
2. Send the request
import requests
query = """
mutation ScrapeIndeedJobs {
goto(url: "https://www.indeed.com/jobs?q=data+scientist&l=Remote", waitUntil: networkIdle) {
status
}
waitForSelector(selector: ".job_seen_beacon", timeout: 15000) {
time
}
jobs: mapSelector(selector: ".job_seen_beacon") {
title: mapSelector(selector: ".jobTitle a span") { innerText }
company: mapSelector(selector: ".companyName") { innerText }
location: mapSelector(selector: ".companyLocation") { innerText }
salary: mapSelector(selector: ".salary-snippet-container") { innerText }
snippet: mapSelector(selector: ".job-snippet") { innerText }
link: mapSelector(selector: ".jobTitle a") {
href: attribute(name: "href") { value }
}
}
}
"""
response = requests.post(
'https://production-sfo.browserless.io/stealth/bql',
params={
'token': 'YOUR_API_TOKEN_HERE',
'proxy': 'residential',
'proxyCountry': 'us',
},
json={'query': query, 'variables': {}},
)
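# Fail early on HTTP errors (for example, an invalid token) before parsing JSON.
response.raise_for_status()
data = response.json()['data']
# Each field is a list of matches; an empty list means the selector found nothing.
for job in data['jobs']:
    title = job['title'][0]['innerText']
    company = job['company'][0]['innerText']
    location = job['location'][0]['innerText']
    salary = job['salary'][0]['innerText'] if job['salary'] else 'Not listed'
    print(f'{title} at {company} ({location}) - {salary}')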
3. Check the output
Senior Data Scientist at Meta (Remote) - $140,000 - $180,000 a year
Data Scientist at Spotify (Remote) - $120,000 - $155,000 a year
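Because every field in the response is a list of matches, indexing with `[0]` raises an error whenever a card lacks that element. One possible way to guard against this is to flatten each job into a plain record with `None` for missing fields. This is a sketch only: `first_text` and `flatten_jobs` are hypothetical helpers, and the check for a top-level `errors` key relies on the general GraphQL response convention rather than anything specific to this guide.

```python
from typing import Any, Dict, List, Optional

FIELDS = ("title", "company", "location", "salary", "snippet")


def first_text(matches: Optional[List[Dict[str, Any]]]) -> Optional[str]:
    """Return the trimmed innerText of the first match, or None if there is none."""
    if not matches:
        return None
    text = matches[0].get("innerText")
    return text.strip() if text else None


def flatten_jobs(payload: Dict[str, Any]) -> List[Dict[str, Optional[str]]]:
    """Turn a BQL response body into a list of flat job records."""
    # GraphQL servers conventionally report failures under a top-level "errors" key.
    if payload.get("errors"):
        raise RuntimeError(f"BQL returned errors: {payload['errors']}")
    jobs = (payload.get("data") or {}).get("jobs") or []
    return [{field: first_text(job.get(field)) for field in FIELDS} for job in jobs]


# Sample shaped like the response in this guide, with the salary missing.
sample = {
    "data": {
        "jobs": [
            {
                "title": [{"innerText": "Senior Data Scientist"}],
                "company": [{"innerText": "Meta"}],
                "location": [{"innerText": "Remote"}],
                "salary": [],
                "snippet": [{"innerText": " Build and deploy ML models... "}],
            }
        ]
    }
}
for record in flatten_jobs(sample):
    print(record)
```

With the request code above, `flatten_jobs(response.json())` would replace the manual indexing in the loop.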
1. Send the request
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class ScrapeIndeedJobs {
    public static void main(String[] args) throws Exception {
        String token = "YOUR_API_TOKEN_HERE";
        String endpoint = "https://production-sfo.browserless.io/stealth/bql?token=" + token
            + "&proxy=residential&proxyCountry=us";
        // Quotes are written as \\\" so they reach the JSON payload as \" (escaped for JSON).
        String query = "mutation ScrapeIndeedJobs {"
            + " goto(url: \\\"https://www.indeed.com/jobs?q=data+scientist&l=Remote\\\", waitUntil: networkIdle) { status }"
            + " waitForSelector(selector: \\\".job_seen_beacon\\\", timeout: 15000) { time }"
            + " jobs: mapSelector(selector: \\\".job_seen_beacon\\\") {"
            + " title: mapSelector(selector: \\\".jobTitle a span\\\") { innerText }"
            + " company: mapSelector(selector: \\\".companyName\\\") { innerText }"
            + " location: mapSelector(selector: \\\".companyLocation\\\") { innerText }"
            + " salary: mapSelector(selector: \\\".salary-snippet-container\\\") { innerText }"
            + " snippet: mapSelector(selector: \\\".job-snippet\\\") { innerText }"
            + " }"
            + " }";
        // Wrap the mutation in the JSON body the BQL endpoint expects.
        String payload = "{\"query\": \"" + query + "\", \"variables\": {}}";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(payload))
            .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2956 },
"jobs": [
{
"title": [{ "innerText": "Senior Data Scientist" }],
"company": [{ "innerText": "Meta" }],
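"location": [{ "innerText": "Remote" }],
"salary": [{ "innerText": "$140,000 - $180,000 a year" }],
"snippet": [{ "innerText": "Build and deploy ML models for content recommendations..." }]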
}
]
}
}
1. Send the request
using System.Net.Http;
using System.Text;
using System.Text.Json;
string token = "YOUR_API_TOKEN_HERE";
string endpoint = $"https://production-sfo.browserless.io/stealth/bql?token={token}&proxy=residential&proxyCountry=us";
var payload = new
{
query = @"mutation ScrapeIndeedJobs {
goto(url: ""https://www.indeed.com/jobs?q=data+scientist&l=Remote"", waitUntil: networkIdle) { status }
waitForSelector(selector: "".job_seen_beacon"", timeout: 15000) { time }
jobs: mapSelector(selector: "".job_seen_beacon"") {
title: mapSelector(selector: "".jobTitle a span"") { innerText }
company: mapSelector(selector: "".companyName"") { innerText }
location: mapSelector(selector: "".companyLocation"") { innerText }
salary: mapSelector(selector: "".salary-snippet-container"") { innerText }
snippet: mapSelector(selector: "".job-snippet"") { innerText }
}
}",
variables = new { },
};
using (HttpClient httpClient = new HttpClient())
{
var content = new StringContent(
JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json");
var response = await httpClient.PostAsync(endpoint, content);
string body = await response.Content.ReadAsStringAsync();
Console.WriteLine(body);
}
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2956 },
"jobs": [
{
"title": [{ "innerText": "Senior Data Scientist" }],
"company": [{ "innerText": "Meta" }],
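"location": [{ "innerText": "Remote" }],
"salary": [{ "innerText": "$140,000 - $180,000 a year" }],
"snippet": [{ "innerText": "Build and deploy ML models for content recommendations..." }]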
}
]
}
}
Connect through stealth mode and a residential proxy so Indeed sees traffic from a real browser, then extract job data from the rendered results.
- Puppeteer
- Playwright
1. Install dependencies
npm install puppeteer-core
2. Connect and scrape
import puppeteer from 'puppeteer-core';
const browser = await puppeteer.connect({
browserWSEndpoint:
'wss://production-sfo.browserless.io/stealth?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us',
});
try {
const page = await browser.newPage();
await page.goto('https://www.indeed.com/jobs?q=data+scientist&l=Remote', {
waitUntil: 'networkidle2',
});
await page.waitForSelector('.job_seen_beacon');
const jobs = await page.evaluate(() =>
Array.from(document.querySelectorAll('.job_seen_beacon')).map((card) => ({
title: card.querySelector('.jobTitle a span')?.innerText?.trim() ?? '',
company: card.querySelector('.companyName')?.innerText?.trim() ?? '',
location: card.querySelector('.companyLocation')?.innerText?.trim() ?? '',
salary: card.querySelector('.salary-snippet-container')?.innerText?.trim() ?? '',
link: card.querySelector('.jobTitle a')?.href ?? '',
}))
);
console.log(JSON.stringify(jobs, null, 2));
} finally {
await browser.close();
}
3. Check the output
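Save the script as scrape-indeed-jobs.mjs and run it with node scrape-indeed-jobs.mjs. Each object has title, company, location, salary, and link fields; a field is an empty string when its selector matches nothing.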
[
{
"title": "Senior Data Scientist",
"company": "Meta",
"location": "Remote",
"salary": "$140,000 - $180,000 a year",
"link": "https://www.indeed.com/rc/clk?jk=abc123"
}
]
1. Install dependencies
npm install playwright-core
2. Connect and scrape
import { chromium } from 'playwright-core';
const browser = await chromium.connectOverCDP(
'wss://production-sfo.browserless.io?token=YOUR_API_TOKEN_HERE&stealth&proxy=residential&proxyCountry=us'
);
try {
const context = browser.contexts()[0];
const page = await context.newPage();
await page.goto('https://www.indeed.com/jobs?q=data+scientist&l=Remote', {
waitUntil: 'networkidle',
});
await page.waitForSelector('.job_seen_beacon');
const jobs = await page.evaluate(() =>
Array.from(document.querySelectorAll('.job_seen_beacon')).map((card) => ({
title: card.querySelector('.jobTitle a span')?.innerText?.trim() ?? '',
company: card.querySelector('.companyName')?.innerText?.trim() ?? '',
location: card.querySelector('.companyLocation')?.innerText?.trim() ?? '',
salary: card.querySelector('.salary-snippet-container')?.innerText?.trim() ?? '',
link: card.querySelector('.jobTitle a')?.href ?? '',
}))
);
console.log(JSON.stringify(jobs, null, 2));
} finally {
await browser.close();
}
3. Check the output
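Save the script as scrape-indeed-jobs.mjs and run it with node scrape-indeed-jobs.mjs. Each object has title, company, location, salary, and link fields; a field is an empty string when its selector matches nothing.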
[
{
"title": "Senior Data Scientist",
"company": "Meta",
"location": "Remote",
"salary": "$140,000 - $180,000 a year",
"link": "https://www.indeed.com/rc/clk?jk=abc123"
}
]
1. Write the mutation
Navigate to Indeed's job search results, wait for listings to render, then extract job details. We use /stealth/bql because Indeed's bot detection blocks standard headless browsers.
mutation ScrapeIndeedJobs {
goto(url: "https://www.indeed.com/jobs?q=data+scientist&l=Remote", waitUntil: networkIdle) {
status
}
waitForSelector(selector: ".job_seen_beacon", timeout: 15000) {
time
}
jobs: mapSelector(selector: ".job_seen_beacon") {
title: mapSelector(selector: ".jobTitle a span") { innerText }
company: mapSelector(selector: ".companyName") { innerText }
location: mapSelector(selector: ".companyLocation") { innerText }
salary: mapSelector(selector: ".salary-snippet-container") { innerText }
snippet: mapSelector(selector: ".job-snippet") { innerText }
link: mapSelector(selector: ".jobTitle a") {
href: attribute(name: "href") { value }
}
}
}
2. Run it
Paste into the BQL IDE and click Run.
3. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2956 },
"jobs": [
{
"title": [{ "innerText": "Senior Data Scientist" }],
"company": [{ "innerText": "Meta" }],
"location": [{ "innerText": "Remote" }],
"salary": [{ "innerText": "$140,000 - $180,000 a year" }],
"snippet": [{ "innerText": "Build and deploy ML models for content recommendations..." }],
"link": [{ "href": { "value": "/rc/clk?jk=abc123" } }]
},
{
"title": [{ "innerText": "Data Scientist" }],
"company": [{ "innerText": "Spotify" }],
"location": [{ "innerText": "Remote" }],
"salary": [{ "innerText": "$120,000 - $155,000 a year" }],
"snippet": [{ "innerText": "Analyze user listening patterns and build recommendation engines..." }],
"link": [{ "href": { "value": "/rc/clk?jk=def456" } }]
}
]
}
}
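The `link` values in the BQL output are relative paths (`/rc/clk?jk=abc123`), because `attribute(name: "href")` returns the raw attribute value. The Puppeteer and Playwright examples read the DOM `href` property instead, which is why their output shows full URLs. A small sketch of resolving the BQL links in Python follows; `absolute_link` is a hypothetical helper, and the base URL is assumed to be the same host used in the `goto` call.

```python
from typing import Any, Dict, Optional
from urllib.parse import urljoin

BASE_URL = "https://www.indeed.com"


def absolute_link(job: Dict[str, Any]) -> Optional[str]:
    """Resolve a job's relative href against the Indeed host, or return None."""
    links = job.get("link") or []
    if not links:
        return None
    href = (links[0].get("href") or {}).get("value")
    # urljoin leaves already-absolute URLs untouched and resolves relative ones.
    return urljoin(BASE_URL, href) if href else None


job = {"link": [{"href": {"value": "/rc/clk?jk=abc123"}}]}
print(absolute_link(job))
# https://www.indeed.com/rc/clk?jk=abc123
```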
Next steps
- Scrape LinkedIn Job Listings -- scrape another job board with stealth mode
- Scrape Glassdoor Job Listings -- stealth-mode scraping against aggressive bot detection
- Solving Cloudflare Challenges -- bypass Cloudflare's interstitial pages before scraping