Scrape Reddit Posts
Extract post titles, scores, comment counts, and links from any subreddit.
Prerequisites
- A Browserless API token from your account dashboard
Steps
Reddit renders its feed with JavaScript and detects automated requests, so a plain HTTP fetch of the page returns no useful data. If you only need basic post data, Reddit's JSON API (reddit.com/r/{subreddit}.json) is simpler and doesn't require a browser. Use the browser approach when you need data the API doesn't expose, want to avoid API rate limits, or are scraping user profiles and comment threads.
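For the JSON API route, a minimal Python sketch might look like the following. It assumes Reddit's public listing format, in which posts sit under `data.children` and each carries `title`, `score`, `num_comments`, and `permalink`. The helper names and the User-Agent string are illustrative, not part of any Browserless or Reddit client.

```python
import json
import urllib.request


def listing_url(subreddit: str) -> str:
    """Build the JSON listing URL for a subreddit."""
    return f"https://www.reddit.com/r/{subreddit}.json"


def fetch_listing(subreddit: str) -> dict:
    """Fetch the listing over plain HTTP (hypothetical helper; needs network access)."""
    request = urllib.request.Request(
        listing_url(subreddit),
        # Assumption: a descriptive User-Agent is less likely to be throttled.
        headers={"User-Agent": "example-script/0.1"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def extract_posts(listing: dict) -> list[dict]:
    """Pull the four fields this guide scrapes out of a listing payload."""
    posts = []
    for child in listing.get("data", {}).get("children", []):
        post = child.get("data", {})
        posts.append(
            {
                "title": post.get("title"),
                "score": post.get("score"),
                "commentCount": post.get("num_comments"),
                "permalink": post.get("permalink"),
            }
        )
    return posts
```

Calling `extract_posts(fetch_listing("programming"))` would return one dictionary per post. Keeping the parsing separate from the fetch makes it easy to test against a saved response.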
The examples below target r/programming and route through stealth mode and a residential proxy to bypass Reddit's fingerprinting and rate limiting.
Reddit updates its markup periodically. If shreddit-post stops returning results or attributes come back null, inspect the live page with browser DevTools to find the current element and attribute names.
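One way to notice markup changes early is to check each scrape for the two symptoms described above. The sketch below assumes the `posts` array has the shape shown in this guide's sample output, where every field is wrapped in a `value` object. `find_markup_drift` is a hypothetical helper name.

```python
REQUIRED_FIELDS = ("title", "score", "commentCount", "permalink")


def find_markup_drift(posts: list[dict]) -> list[str]:
    """Return warnings; an empty list means the scrape looks healthy."""
    if not posts:
        return ["no shreddit-post elements matched"]
    warnings = []
    for field in REQUIRED_FIELDS:
        # A field counts as missing when the wrapper or its value is null.
        missing = sum(
            1 for post in posts if (post.get(field) or {}).get("value") is None
        )
        if missing:
            warnings.append(f"{field}: null in {missing} of {len(posts)} posts")
    return warnings
```

Any non-empty result is a prompt to inspect the live page in DevTools before trusting the data.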
- REST API
- Frameworks
- BQL
Send the BQL mutation over HTTP to the stealth endpoint. No browser library or BQL IDE required.
- cURL
- JavaScript
- Python
- Java
- C#
1. Send the request
curl -X POST \
"https://production-sfo.browserless.io/stealth/bql?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us" \
-H "Content-Type: application/json" \
-d '{
"query": "mutation ScrapeRedditPosts { goto(url: \"https://www.reddit.com/r/programming/\", waitUntil: networkIdle) { status } waitForSelector(selector: \"shreddit-post\", timeout: 15000) { time } posts: mapSelector(selector: \"shreddit-post\") { title: attribute(name: \"post-title\") { value } score: attribute(name: \"score\") { value } commentCount: attribute(name: \"comment-count\") { value } permalink: attribute(name: \"permalink\") { value } } }",
"variables": {}
}'
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2103 },
"posts": [
{
"title": { "value": "Show HN: I built a static site generator in Go" },
"score": { "value": "1847" },
"commentCount": { "value": "143" },
"permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
},
{
"title": { "value": "Why I still use Vim in 2025" },
"score": { "value": "923" },
"commentCount": { "value": "312" },
"permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
}
]
}
}
1. Send the request
const query = `mutation ScrapeRedditPosts {
goto(url: "https://www.reddit.com/r/programming/", waitUntil: networkIdle) {
status
}
waitForSelector(selector: "shreddit-post", timeout: 15000) {
time
}
posts: mapSelector(selector: "shreddit-post") {
title: attribute(name: "post-title") { value }
score: attribute(name: "score") { value }
commentCount: attribute(name: "comment-count") { value }
permalink: attribute(name: "permalink") { value }
}
}`;
const response = await fetch(
'https://production-sfo.browserless.io/stealth/bql?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us',
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ query, variables: {} }),
}
);
const { data } = await response.json();
console.log(JSON.stringify(data.posts, null, 2));
2. Check the output
The script prints only the posts array. The full response body it reads from looks like this:
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2103 },
"posts": [
{
"title": { "value": "Show HN: I built a static site generator in Go" },
"score": { "value": "1847" },
"commentCount": { "value": "143" },
"permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
},
{
"title": { "value": "Why I still use Vim in 2025" },
"score": { "value": "923" },
"commentCount": { "value": "312" },
"permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
}
]
}
}
1. Install dependencies
pip install requests
2. Send the request
import requests
query = """
mutation ScrapeRedditPosts {
goto(url: "https://www.reddit.com/r/programming/", waitUntil: networkIdle) {
status
}
waitForSelector(selector: "shreddit-post", timeout: 15000) {
time
}
posts: mapSelector(selector: "shreddit-post") {
title: attribute(name: "post-title") { value }
score: attribute(name: "score") { value }
commentCount: attribute(name: "comment-count") { value }
permalink: attribute(name: "permalink") { value }
}
}
"""
response = requests.post(
    'https://production-sfo.browserless.io/stealth/bql',
    params={
        'token': 'YOUR_API_TOKEN_HERE',
        'proxy': 'residential',
        'proxyCountry': 'us',
    },
    json={'query': query, 'variables': {}},
)
data = response.json()['data']
# Print one summary line per post.
for post in data['posts']:
    title = post['title']['value']
    score = post['score']['value']
    comments = post['commentCount']['value']
    permalink = post['permalink']['value']
    print(f'{title} | {score} pts | {comments} comments | reddit.com{permalink}')
3. Check the output
The script prints one line per post. The response body it parses looks like this:
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2103 },
"posts": [
{
"title": { "value": "Show HN: I built a static site generator in Go" },
"score": { "value": "1847" },
"commentCount": { "value": "143" },
"permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
},
{
"title": { "value": "Why I still use Vim in 2025" },
"score": { "value": "923" },
"commentCount": { "value": "312" },
"permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
}
]
}
}
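Every attribute in the response arrives as a string wrapped in a `value` object, and permalinks are relative. If you need numbers and absolute URLs, a small normalization step could look like this sketch. `normalize_post` is a hypothetical helper, and its output field names are illustrative.

```python
def normalize_post(post: dict) -> dict:
    """Flatten one BQL post entry and convert numeric strings to integers."""

    def value(field: str):
        return (post.get(field) or {}).get("value")

    def to_int(text):
        # Attribute values arrive as strings; anything non-numeric becomes None.
        try:
            return int(text)
        except (TypeError, ValueError):
            return None

    permalink = value("permalink")
    return {
        "title": value("title"),
        "score": to_int(value("score")),
        "comments": to_int(value("commentCount")),
        "url": f"https://www.reddit.com{permalink}" if permalink else None,
    }
```

With the script above, `[normalize_post(p) for p in data['posts']]` would produce a flat list ready for sorting or export.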
1. Send the request
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ScrapeRedditPosts {
    public static void main(String[] args) throws Exception {
        String token = "YOUR_API_TOKEN_HERE";
        String endpoint = "https://production-sfo.browserless.io/stealth/bql?token=" + token
            + "&proxy=residential&proxyCountry=us";
        // Quotes inside the query are escaped twice: once for Java, once for the JSON payload.
        String query = "mutation ScrapeRedditPosts {"
            + " goto(url: \\\"https://www.reddit.com/r/programming/\\\", waitUntil: networkIdle) { status }"
            + " waitForSelector(selector: \\\"shreddit-post\\\", timeout: 15000) { time }"
            + " posts: mapSelector(selector: \\\"shreddit-post\\\") {"
            + " title: attribute(name: \\\"post-title\\\") { value }"
            + " score: attribute(name: \\\"score\\\") { value }"
            + " commentCount: attribute(name: \\\"comment-count\\\") { value }"
            + " permalink: attribute(name: \\\"permalink\\\") { value }"
            + " }"
            + " }";
        String payload = "{\"query\": \"" + query + "\", \"variables\": {}}";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(endpoint))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(payload))
            .build();
        // Send the request and print the raw JSON response body.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2103 },
"posts": [
{
"title": { "value": "Show HN: I built a static site generator in Go" },
"score": { "value": "1847" },
"commentCount": { "value": "143" },
"permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
},
{
"title": { "value": "Why I still use Vim in 2025" },
"score": { "value": "923" },
"commentCount": { "value": "312" },
"permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
}
]
}
}
1. Send the request
using System.Net.Http;
using System.Text;
using System.Text.Json;
string token = "YOUR_API_TOKEN_HERE";
string endpoint = $"https://production-sfo.browserless.io/stealth/bql?token={token}&proxy=residential&proxyCountry=us";
var payload = new
{
query = @"mutation ScrapeRedditPosts {
goto(url: ""https://www.reddit.com/r/programming/"", waitUntil: networkIdle) { status }
waitForSelector(selector: ""shreddit-post"", timeout: 15000) { time }
posts: mapSelector(selector: ""shreddit-post"") {
title: attribute(name: ""post-title"") { value }
score: attribute(name: ""score"") { value }
commentCount: attribute(name: ""comment-count"") { value }
permalink: attribute(name: ""permalink"") { value }
}
}",
variables = new { },
};
using (HttpClient httpClient = new HttpClient())
{
var content = new StringContent(
JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json");
var response = await httpClient.PostAsync(endpoint, content);
string body = await response.Content.ReadAsStringAsync();
Console.WriteLine(body);
}
2. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2103 },
"posts": [
{
"title": { "value": "Show HN: I built a static site generator in Go" },
"score": { "value": "1847" },
"commentCount": { "value": "143" },
"permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
},
{
"title": { "value": "Why I still use Vim in 2025" },
"score": { "value": "923" },
"commentCount": { "value": "312" },
"permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
}
]
}
}
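The REST examples above read the response directly and assume the request succeeded. A more defensive sketch checks the HTTP status and the body before touching `posts`. It assumes the endpoint follows the usual GraphQL convention of reporting failures in a top-level `errors` array. `ScrapeError` and `posts_from_response` are hypothetical names.

```python
class ScrapeError(RuntimeError):
    """Raised when the BQL response does not contain usable post data."""


def posts_from_response(status_code: int, body: dict) -> list[dict]:
    """Return the posts array or raise ScrapeError with a reason."""
    if status_code != 200:
        raise ScrapeError(f"HTTP {status_code} from the BQL endpoint")
    # Assumption: failures appear in a GraphQL-style top-level "errors" array.
    errors = body.get("errors")
    if errors:
        messages = "; ".join(
            str(e.get("message", e)) if isinstance(e, dict) else str(e)
            for e in errors
        )
        raise ScrapeError(f"BQL reported errors: {messages}")
    posts = (body.get("data") or {}).get("posts")
    if not posts:
        raise ScrapeError("response contained no posts")
    return posts
```

In the Python example this would be called as `posts_from_response(response.status_code, response.json())`.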
Connect through stealth mode and a residential proxy so Reddit sees traffic from what looks like a real browser, then read post data directly from the rendered feed.
- Puppeteer
- Playwright
1. Install dependencies
npm install puppeteer-core
2. Connect and scrape
import puppeteer from 'puppeteer-core';
const browser = await puppeteer.connect({
browserWSEndpoint:
'wss://production-sfo.browserless.io/stealth?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us',
});
try {
const page = await browser.newPage();
await page.goto('https://www.reddit.com/r/programming/', {
waitUntil: 'networkidle2',
});
// Wait for posts to render before reading attributes.
await page.waitForSelector('shreddit-post');
const posts = await page.evaluate(() =>
Array.from(document.querySelectorAll('shreddit-post')).map((post) => ({
title: post.getAttribute('post-title'),
score: post.getAttribute('score'),
commentCount: post.getAttribute('comment-count'),
permalink: post.getAttribute('permalink'),
}))
);
console.log(JSON.stringify(posts, null, 2));
} finally {
// Always close to release the session even on error.
await browser.close();
}
3. Check the output
Save the script as scrape-reddit.mjs and run it with node scrape-reddit.mjs. If waitForSelector times out or the array comes back empty, the posts didn't render in time. Increase the waitForSelector timeout or check whether a consent banner intercepted the page.
[
{
"title": "Show HN: I built a static site generator in Go",
"score": "1847",
"commentCount": "143",
"permalink": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator"
},
{
"title": "Why I still use Vim in 2025",
"score": "923",
"commentCount": "312",
"permalink": "/r/programming/comments/def456/why_i_still_use_vim_in_2025"
}
]
1. Install dependencies
npm install playwright-core
2. Connect and scrape
import { chromium } from 'playwright-core';
const browser = await chromium.connectOverCDP(
'wss://production-sfo.browserless.io/stealth?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us'
);
try {
const context = browser.contexts()[0];
const page = await context.newPage();
await page.goto('https://www.reddit.com/r/programming/', {
waitUntil: 'networkidle',
});
// Wait for posts to render before reading attributes.
await page.waitForSelector('shreddit-post');
const posts = await page.evaluate(() =>
Array.from(document.querySelectorAll('shreddit-post')).map((post) => ({
title: post.getAttribute('post-title'),
score: post.getAttribute('score'),
commentCount: post.getAttribute('comment-count'),
permalink: post.getAttribute('permalink'),
}))
);
console.log(JSON.stringify(posts, null, 2));
} finally {
// Always close to release the session even on error.
await browser.close();
}
3. Check the output
Save the script as scrape-reddit.mjs and run it with node scrape-reddit.mjs. If waitForSelector times out or the array comes back empty, the posts didn't render in time. Increase the waitForSelector timeout or check whether a consent banner intercepted the page.
[
{
"title": "Show HN: I built a static site generator in Go",
"score": "1847",
"commentCount": "143",
"permalink": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator"
},
{
"title": "Why I still use Vim in 2025",
"score": "923",
"commentCount": "312",
"permalink": "/r/programming/comments/def456/why_i_still_use_vim_in_2025"
}
]
1. Write the mutation
Navigate to the subreddit, wait for posts to appear, then read title, score, comment count, and permalink from each post element's attributes. We send this to /stealth/bql instead of the default /bql because Reddit's bot detection blocks plain browser sessions.
mutation ScrapeRedditPosts {
goto(url: "https://www.reddit.com/r/programming/", waitUntil: networkIdle) {
status
}
waitForSelector(selector: "shreddit-post", timeout: 15000) {
time
}
posts: mapSelector(selector: "shreddit-post") {
title: attribute(name: "post-title") { value }
score: attribute(name: "score") { value }
commentCount: attribute(name: "comment-count") { value }
permalink: attribute(name: "permalink") { value }
}
}
2. Run it
Paste into the BQL IDE and click Run.
3. Check the output
{
"data": {
"goto": { "status": 200 },
"waitForSelector": { "time": 2103 },
"posts": [
{
"title": { "value": "Show HN: I built a static site generator in Go" },
"score": { "value": "1847" },
"commentCount": { "value": "143" },
"permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
},
{
"title": { "value": "Why I still use Vim in 2025" },
"score": { "value": "923" },
"commentCount": { "value": "312" },
"permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
}
]
}
}
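The mutation hardcodes r/programming, but only the URL changes for other subreddits. If you generate the query from code, a sketch like the one below could build it per subreddit. `build_mutation` is a hypothetical helper, and the name pattern is a conservative assumption meant to keep stray quotes or slashes out of the query string.

```python
import re

# Assumption: a conservative pattern for subreddit names (letters, digits, underscores).
SUBREDDIT_PATTERN = re.compile(r"[A-Za-z0-9_]{1,21}")


def build_mutation(subreddit: str, timeout_ms: int = 15000) -> str:
    """Return the ScrapeRedditPosts mutation for the given subreddit."""
    if not SUBREDDIT_PATTERN.fullmatch(subreddit):
        raise ValueError(f"unexpected subreddit name: {subreddit!r}")
    return f"""mutation ScrapeRedditPosts {{
  goto(url: "https://www.reddit.com/r/{subreddit}/", waitUntil: networkIdle) {{
    status
  }}
  waitForSelector(selector: "shreddit-post", timeout: {int(timeout_ms)}) {{
    time
  }}
  posts: mapSelector(selector: "shreddit-post") {{
    title: attribute(name: "post-title") {{ value }}
    score: attribute(name: "score") {{ value }}
    commentCount: attribute(name: "comment-count") {{ value }}
    permalink: attribute(name: "permalink") {{ value }}
  }}
}}"""
```

The returned string can be pasted into the BQL IDE or sent as the `query` field of a request body.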
Next steps
- Scrape Glassdoor Job Listings: another stealth-mode scrape against aggressive bot detection
- Automate Google Search: pull search results using the same /stealth/bql endpoint
- Solving Cloudflare Challenges: bypass Cloudflare's interstitial pages before scraping