Scrape Zillow Agent Listings
Pull agent names and profile URLs from a Zillow agent directory page as structured JSON.
Prerequisites
- A Browserless API token from your account dashboard
Steps
Zillow blocks most automated requests that do not use stealth mode and a residential proxy. The examples below use both to load the New York agent directory and extract agent data from the rendered page.
Zillow updates its markup regularly. If selectors stop returning results, inspect the current DOM in browser DevTools and update them.
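One way to notice a block or stale selectors early is to inspect the response before using it. The sketch below is a minimal illustration, not part of the Browserless API: the function name diagnose_response is hypothetical, it assumes the response shape shown in the sample output on this page (data.goto.status and data.agents), and the top-level errors key follows the general GraphQL response convention.

```python
def diagnose_response(body: dict) -> list[str]:
    """Return human-readable problems found in a parsed BQL response body."""
    problems = []

    # GraphQL servers conventionally report failures under a top-level "errors" key.
    for err in body.get("errors") or []:
        message = err.get("message", err) if isinstance(err, dict) else err
        problems.append(f"GraphQL error: {message}")

    data = body.get("data") or {}

    # A non-200 status from goto suggests the request was blocked or redirected.
    status = (data.get("goto") or {}).get("status")
    if status != 200:
        problems.append(f"Unexpected goto status: {status}")

    # A 200 status with no agents suggests the selectors no longer match the markup.
    if status == 200 and not data.get("agents"):
        problems.append("No agents matched; selectors may need updating")

    return problems


if __name__ == "__main__":
    sample = {"data": {"goto": {"status": 200}, "agents": []}}
    for problem in diagnose_response(sample):
        print(problem)
```

An empty list means the response looks usable. Anything else is a hint to check the proxy settings or re-inspect the DOM.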
- REST API
- Frameworks
- BQL
Send the BQL mutation over HTTP to the stealth endpoint. No browser library or BQL IDE required.
- cURL
- JavaScript
- Python
- Java
- C#
1. Send the request
curl -X POST \
"https://production-sfo.browserless.io/stealth/bql?token=YOUR_API_TOKEN_HERE&proxy=residential&proxySticky=true&proxyCountry=us&humanlike=true&blockAds=true" \
-H "Content-Type: application/json" \
-d '{
"query": "mutation ScrapeZillow { goto(url: \"https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/\", waitUntil: networkIdle) { status } waitForTimeout(time: 3000) { time } agents: mapSelector(selector: \"a[href^='\''\/profile\/'\'']\" ) { name: mapSelector(selector: \"span, h3\") { innerText } profileUrl: attribute(name: \"href\") { value } } }",
"variables": {},
"operationName": "ScrapeZillow"
}'
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForTimeout": { "time": 3000 },
"agents": [
{
"name": [{ "innerText": "Jane Smith" }],
"profileUrl": { "value": "/profile/janesmith/" }
},
{
"name": [{ "innerText": "John Doe" }],
"profileUrl": { "value": "/profile/johndoe/" }
}
]
}
}
1. Send the request
const query = `mutation ScrapeZillow {
goto(
url: "https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/"
waitUntil: networkIdle
) {
status
}
waitForTimeout(time: 3000) {
time
}
agents: mapSelector(selector: "a[href^='/profile/']") {
name: mapSelector(selector: "span, h3") {
innerText
}
profileUrl: attribute(name: "href") {
value
}
}
}`;
const response = await fetch(
'https://production-sfo.browserless.io/stealth/bql?token=YOUR_API_TOKEN_HERE&proxy=residential&proxySticky=true&proxyCountry=us&humanlike=true&blockAds=true',
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ query, variables: {}, operationName: 'ScrapeZillow' }),
}
);
const { data } = await response.json();
const agents = data.agents.map((a) => ({
name: a.name?.[0]?.innerText ?? '',
profileUrl: 'https://www.zillow.com' + (a.profileUrl?.value ?? ''),
}));
console.log(JSON.stringify(agents, null, 2));
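2. Check the output
The API returns the response below. The script then flattens data.agents into objects with a name string and an absolute profileUrl before printing them.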
{
"data": {
"goto": { "status": 200 },
"waitForTimeout": { "time": 3000 },
"agents": [
{
"name": [{ "innerText": "Jane Smith" }],
"profileUrl": { "value": "/profile/janesmith/" }
},
{
"name": [{ "innerText": "John Doe" }],
"profileUrl": { "value": "/profile/johndoe/" }
}
]
}
}
1. Install dependencies
pip install requests
2. Send the request
import requests
query = """
mutation ScrapeZillow {
goto(
url: "https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/"
waitUntil: networkIdle
) {
status
}
waitForTimeout(time: 3000) {
time
}
agents: mapSelector(selector: "a[href^='/profile/']") {
name: mapSelector(selector: "span, h3") {
innerText
}
profileUrl: attribute(name: "href") {
value
}
}
}
"""
response = requests.post(
'https://production-sfo.browserless.io/stealth/bql',
params={
'token': 'YOUR_API_TOKEN_HERE',
'proxy': 'residential',
'proxySticky': 'true',
'proxyCountry': 'us',
'humanlike': 'true',
'blockAds': 'true',
},
json={'query': query, 'variables': {}, 'operationName': 'ScrapeZillow'},
)
response.raise_for_status()  # fail early on a non-2xx HTTP response
data = response.json()['data']
for agent in data['agents']:
name = agent['name'][0]['innerText'] if agent['name'] else ''
profile_url = 'https://www.zillow.com' + (agent['profileUrl']['value'] if agent['profileUrl'] else '')
print(f'{name}: {profile_url}')
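3. Check the output
The API returns the response below. The script then prints one "name: profile URL" line per agent, with the URL made absolute.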
{
"data": {
"goto": { "status": 200 },
"waitForTimeout": { "time": 3000 },
"agents": [
{
"name": [{ "innerText": "Jane Smith" }],
"profileUrl": { "value": "/profile/janesmith/" }
},
{
"name": [{ "innerText": "John Doe" }],
"profileUrl": { "value": "/profile/johndoe/" }
}
]
}
}
1. Send the request
import java.net.URI;
import java.net.http.*;
String token = "YOUR_API_TOKEN_HERE";
String endpoint = "https://production-sfo.browserless.io/stealth/bql?token=" + token
+ "&proxy=residential&proxySticky=true&proxyCountry=us&humanlike=true&blockAds=true";
String query = """
mutation ScrapeZillow {
goto(url: "https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/", waitUntil: networkIdle) { status }
waitForTimeout(time: 3000) { time }
agents: mapSelector(selector: "a[href^='/profile/']") {
name: mapSelector(selector: "span, h3") { innerText }
profileUrl: attribute(name: "href") { value }
}
}
""";
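// JSON-encode the GraphQL query as a string literal (requires Gson on the classpath).
// JsonParser.parseString would try to parse the query as JSON and fail.
String payload = "{\"query\": " + new com.google.gson.Gson().toJson(query) + ", \"variables\": {}, \"operationName\": \"ScrapeZillow\"}";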
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
.uri(URI.create(endpoint))
.header("Content-Type", "application/json")
.POST(HttpRequest.BodyPublishers.ofString(payload))
.build();
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
System.out.println(response.body());
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForTimeout": { "time": 3000 },
"agents": [
{
"name": [{ "innerText": "Jane Smith" }],
"profileUrl": { "value": "/profile/janesmith/" }
},
{
"name": [{ "innerText": "John Doe" }],
"profileUrl": { "value": "/profile/johndoe/" }
}
]
}
}
1. Send the request
using System.Net.Http;
using System.Text;
using System.Text.Json;
string token = "YOUR_API_TOKEN_HERE";
string endpoint = $"https://production-sfo.browserless.io/stealth/bql?token={token}&proxy=residential&proxySticky=true&proxyCountry=us&humanlike=true&blockAds=true";
var payload = new
{
query = @"mutation ScrapeZillow {
goto(url: ""https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/"", waitUntil: networkIdle) { status }
waitForTimeout(time: 3000) { time }
agents: mapSelector(selector: ""a[href^='/profile/']"") {
name: mapSelector(selector: ""span, h3"") { innerText }
profileUrl: attribute(name: ""href"") { value }
}
}",
variables = new { },
operationName = "ScrapeZillow",
};
using (HttpClient httpClient = new HttpClient())
{
var content = new StringContent(
JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json");
var response = await httpClient.PostAsync(endpoint, content);
string body = await response.Content.ReadAsStringAsync();
Console.WriteLine(body);
}
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForTimeout": { "time": 3000 },
"agents": [
{
"name": [{ "innerText": "Jane Smith" }],
"profileUrl": { "value": "/profile/janesmith/" }
},
{
"name": [{ "innerText": "John Doe" }],
"profileUrl": { "value": "/profile/johndoe/" }
}
]
}
}
Connect through stealth mode and a residential proxy to get past Zillow's bot detection, then extract agent data from the rendered page.
- Puppeteer
- Playwright
1. Install dependencies
npm install puppeteer-core
2. Connect and scrape
import puppeteer from 'puppeteer-core';
const browser = await puppeteer.connect({
browserWSEndpoint:
'wss://production-sfo.browserless.io/stealth?token=YOUR_API_TOKEN_HERE&proxy=residential&proxySticky=true&proxyCountry=us',
});
try {
const page = await browser.newPage();
await page.goto(
'https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/',
{ waitUntil: 'networkidle2' }
);
const agents = await page.evaluate(() =>
Array.from(document.querySelectorAll('a[href^="/profile/"]')).map((link) => ({
name: link.innerText?.trim() ?? '',
profileUrl: 'https://www.zillow.com' + link.getAttribute('href'),
}))
);
console.log(JSON.stringify(agents, null, 2));
} finally {
await browser.close();
}
3. Check the output
Save the script as scrape-zillow.mjs and run it with node scrape-zillow.mjs. The output is a JSON array in which each object has a name string and an absolute profileUrl; the code prepends https://www.zillow.com to the relative href before returning it.
1. Install dependencies
npm install playwright-core
2. Connect and scrape
import { chromium } from 'playwright-core';
const browser = await chromium.connectOverCDP(
'wss://production-sfo.browserless.io/stealth?token=YOUR_API_TOKEN_HERE&proxy=residential&proxySticky=true&proxyCountry=us'
);
try {
const context = browser.contexts()[0];
const page = await context.newPage();
await page.goto(
'https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/',
{ waitUntil: 'networkidle' }
);
const agents = await page.evaluate(() =>
Array.from(document.querySelectorAll('a[href^="/profile/"]')).map((link) => ({
name: link.innerText?.trim() ?? '',
profileUrl: 'https://www.zillow.com' + link.getAttribute('href'),
}))
);
console.log(JSON.stringify(agents, null, 2));
} finally {
await browser.close();
}
3. Check the output
Save the script as scrape-zillow.mjs and run it with node scrape-zillow.mjs. The output is a JSON array in which each object has a name string and an absolute profileUrl; the code prepends https://www.zillow.com to the relative href before returning it.
1. Write the mutation
Navigate to the Zillow agent directory and use mapSelector to pull each agent's name and profile URL. Run the mutation against the stealth endpoint with humanlike and blockAds enabled. humanlike simulates human behavior (mouse movement, realistic timing) so the session looks less like a bot. blockAds blocks third-party trackers that can fingerprint automated sessions and trigger detection. waitForTimeout is included because the agent grid renders client-side; without it, the selectors may run before the DOM is populated.
mutation ScrapeZillow {
goto(
url: "https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/"
waitUntil: networkIdle
) {
status
}
waitForTimeout(time: 3000) {
time
}
agents: mapSelector(selector: "a[href^='/profile/']") {
name: mapSelector(selector: "span, h3") {
innerText
}
profileUrl: attribute(name: "href") {
value
}
}
}
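Because the grid renders client-side, a fixed 3000 ms wait may occasionally be too short. The sketch below shows one possible way to retry the same mutation with longer waits when no agents come back. It is an illustration only: build_query, scrape_with_retry, and the run_query callable (expected to POST the query and return the parsed JSON body) are hypothetical names, and the wait values are arbitrary.

```python
from typing import Callable, Sequence

ZILLOW_URL = "https://www.zillow.com/professionals/real-estate-agent-reviews/new-york-ny/ny/"


def build_query(wait_ms: int) -> str:
    """Return the ScrapeZillow mutation with a configurable waitForTimeout."""
    return f"""
mutation ScrapeZillow {{
  goto(url: "{ZILLOW_URL}", waitUntil: networkIdle) {{ status }}
  waitForTimeout(time: {wait_ms}) {{ time }}
  agents: mapSelector(selector: "a[href^='/profile/']") {{
    name: mapSelector(selector: "span, h3") {{ innerText }}
    profileUrl: attribute(name: "href") {{ value }}
  }}
}}
"""


def scrape_with_retry(
    run_query: Callable[[str], dict],
    waits: Sequence[int] = (3000, 6000, 9000),
) -> list:
    """Run the mutation with increasing waits until agents are returned."""
    for wait_ms in waits:
        body = run_query(build_query(wait_ms))
        agents = (body.get("data") or {}).get("agents") or []
        if agents:
            return agents
    # Every attempt came back empty; the selectors may be stale rather than slow.
    return []


if __name__ == "__main__":
    # Stand-in for a real HTTP call: only succeeds once the wait reaches 6000 ms.
    def fake_run_query(query: str) -> dict:
        found = "time: 6000" in query
        return {"data": {"agents": [{"name": [{"innerText": "Jane Smith"}]}] if found else []}}

    print(scrape_with_retry(fake_run_query))
```

Each attempt is a separate request, so a retry costs another full page load. If every attempt returns nothing, the selectors are the more likely culprit than the wait time.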
2. Run it
Paste into the BQL IDE and click Run.
3. Check the output
Each agent entry in data.agents has a name array (one entry per child element matching span or h3, each with its innerText) and a profileUrl object holding the relative path. Prepend https://www.zillow.com to get the full URL.
{
"data": {
"goto": { "status": 200 },
"waitForTimeout": { "time": 3000 },
"agents": [
{
"name": [{ "innerText": "Jane Smith" }],
"profileUrl": { "value": "/profile/janesmith/" }
},
{
"name": [{ "innerText": "John Doe" }],
"profileUrl": { "value": "/profile/johndoe/" }
}
]
}
}
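The nested response above can be flattened into simple records once it is parsed. The following sketch is one way to do that, assuming the response shape shown in the sample output. The function name flatten_agents is hypothetical, and the de-duplication step rests on the assumption that a page may link to the same profile more than once, which the source does not confirm.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.zillow.com"


def flatten_agents(data: dict) -> list[dict]:
    """Turn data.agents into flat records with a name and an absolute profileUrl."""
    records = []
    seen = set()
    for agent in data.get("agents") or []:
        # name is a list of matches; keep the first non-empty text fragment.
        fragments = [(n.get("innerText") or "").strip() for n in agent.get("name") or []]
        name = next((f for f in fragments if f), "")

        href = (agent.get("profileUrl") or {}).get("value") or ""
        if not href:
            continue  # nothing to link to

        # Resolve the relative path against the site root.
        url = urljoin(BASE_URL, href)

        # Skip repeated links to the same profile (assumed possible, not confirmed).
        if url in seen:
            continue
        seen.add(url)
        records.append({"name": name, "profileUrl": url})
    return records


if __name__ == "__main__":
    sample = {
        "agents": [
            {"name": [{"innerText": "Jane Smith"}], "profileUrl": {"value": "/profile/janesmith/"}},
            {"name": [{"innerText": "John Doe"}], "profileUrl": {"value": "/profile/johndoe/"}},
        ]
    }
    for record in flatten_agents(sample):
        print(record["name"], record["profileUrl"])
```

Using urljoin rather than string concatenation keeps the result correct if an href is ever already absolute.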
Next steps
- Scrape Glassdoor Job Listings — apply the same stealth pattern to job listings
- Scrape Walmart Product Listings — extract product data from another bot-protected site
- Change Your Browser's IP Address Using Proxies — control which IP address target sites see