Scrape Reddit posts

Extract post titles, scores, comment counts, and links from any subreddit.

Prerequisites

A Browserless API token from your account dashboard

Steps

Reddit's feed is rendered with JavaScript, and Reddit detects automated requests, so a plain HTTP fetch of a subreddit page returns no useful data. If you only need basic post data, Reddit's JSON API (reddit.com/r/{subreddit}.json) is simpler and doesn't require a browser. Use the browser approach when you need data the API doesn't expose, want to avoid API rate limits, or are scraping user profiles and comment threads.
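
For comparison, here is a minimal sketch of the JSON API route using only the Python standard library. It assumes the listing payload nests posts under data.children[].data with the fields title, score, num_comments, and permalink, so check a live response before relying on those names. The User-Agent value and the function names are placeholders, not part of any Browserless or Reddit SDK.

```python
import json
import urllib.request


def fetch_listing(subreddit: str, limit: int = 25) -> dict:
    """Download a subreddit listing from Reddit's public JSON endpoint."""
    url = f"https://www.reddit.com/r/{subreddit}.json?limit={limit}"
    # Placeholder User-Agent (assumption): use one that identifies your script.
    request = urllib.request.Request(url, headers={"User-Agent": "example-script/0.1"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def extract_posts(listing: dict) -> list:
    """Pull the four fields this guide scrapes out of a listing payload."""
    posts = []
    for child in listing.get("data", {}).get("children", []):
        post = child.get("data", {})
        posts.append({
            "title": post.get("title"),
            "score": post.get("score"),
            "commentCount": post.get("num_comments"),
            "permalink": post.get("permalink"),
        })
    return posts
```

Calling extract_posts(fetch_listing("programming")) would return a list of dictionaries with the same four keys the browser examples produce. Requests to this endpoint can still be rate limited or blocked, which is when the browser approach becomes the better fit.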

The examples below target r/programming and route through stealth mode and a residential proxy to bypass Reddit's fingerprinting and rate limiting.
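
To point the same request at a different subreddit, one option is to build the endpoint URL and request body programmatically rather than editing strings by hand. The sketch below reuses the endpoint, query parameters, and mutation from this guide. The build_request helper and the subreddit name check are illustrative assumptions, not part of the Browserless API.

```python
import json
import re
from urllib.parse import urlencode

BQL_ENDPOINT = "https://production-sfo.browserless.io/stealth/bql"

# TARGET_URL is replaced with a quoted string literal for the chosen subreddit.
QUERY_TEMPLATE = """mutation ScrapeRedditPosts {
  goto(url: TARGET_URL, waitUntil: networkIdle) { status }
  waitForSelector(selector: "shreddit-post", timeout: 15000) { time }
  posts: mapSelector(selector: "shreddit-post") {
    title: attribute(name: "post-title") { value }
    score: attribute(name: "score") { value }
    commentCount: attribute(name: "comment-count") { value }
    permalink: attribute(name: "permalink") { value }
  }
}"""


def build_request(subreddit: str, token: str) -> tuple:
    """Return (url, json_body) for scraping the given subreddit."""
    # Assumption: subreddit names contain only letters, digits, and underscores.
    if not re.fullmatch(r"[A-Za-z0-9_]+", subreddit):
        raise ValueError(f"Unexpected subreddit name: {subreddit!r}")
    target = f"https://www.reddit.com/r/{subreddit}/"
    # json.dumps adds the surrounding quotes and escapes the URL safely.
    query = QUERY_TEMPLATE.replace("TARGET_URL", json.dumps(target))
    params = urlencode({"token": token, "proxy": "residential", "proxyCountry": "us"})
    body = json.dumps({"query": query, "variables": {}})
    return f"{BQL_ENDPOINT}?{params}", body
```

Letting json.dumps serialize the body avoids the manual quote escaping that the cURL and Java examples need.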

Selector stability

Reddit updates its markup periodically. If shreddit-post stops returning results or attributes come back null, inspect the live page with browser DevTools to find the current element and attribute names.
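
A small check like the following can flag selector drift early. It is a sketch that assumes the response shape used throughout this guide, where each post field is an object with a value key. The find_selector_drift name is hypothetical, and the code treats both a null wrapper and a null value as missing because the exact behavior for absent attributes is not confirmed here.

```python
EXPECTED_FIELDS = ("title", "score", "commentCount", "permalink")


def find_selector_drift(posts: list) -> dict:
    """Count, per field, how many posts came back without a value."""
    missing = {field: 0 for field in EXPECTED_FIELDS}
    for post in posts:
        for field in EXPECTED_FIELDS:
            # Handle both {"value": null} and a null wrapper (assumption).
            wrapper = post.get(field) or {}
            if wrapper.get("value") is None:
                missing[field] += 1
    # Report only the fields that actually had gaps.
    return {field: count for field, count in missing.items() if count}
```

If the returned dictionary is not empty for most posts, the attribute names have probably changed and are worth rechecking in DevTools.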

AI Agent
REST API
Frameworks
BQL

Use the Browserless MCP server to scrape Reddit posts from any MCP-compatible AI agent (Claude Desktop, Cursor, Windsurf, ChatGPT, etc.).

1. Connect the MCP server

Send this prompt to your AI agent to install the Browserless MCP server:

Go to https://github.com/browserless/browserless-mcp/blob/main/install.md
and follow the instructions to install the Browserless MCP server
for my client.

2. Scrape Reddit

Use browserless_smartscraper. It handles Reddit's dynamic content and bot protection automatically.

Use the browserless_smartscraper tool to scrape the top posts
from https://www.reddit.com/r/programming
and return the results as markdown

Send the BQL mutation over HTTP to the stealth endpoint. No browser library or BQL IDE required.

cURL
JavaScript
Python
Java
C#

cURL

1. Send the request

curl -X POST \
  "https://production-sfo.browserless.io/stealth/bql?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "mutation ScrapeRedditPosts { goto(url: \"https://www.reddit.com/r/programming/\", waitUntil: networkIdle) { status } waitForSelector(selector: \"shreddit-post\", timeout: 15000) { time } posts: mapSelector(selector: \"shreddit-post\") { title: attribute(name: \"post-title\") { value } score: attribute(name: \"score\") { value } commentCount: attribute(name: \"comment-count\") { value } permalink: attribute(name: \"permalink\") { value } } }",
    "variables": {}
  }'

2. Check the output

{
  "data": {
    "goto": { "status": 200 },
    "waitForSelector": { "time": 2103 },
    "posts": [
      {
        "title": { "value": "Show HN: I built a static site generator in Go" },
        "score": { "value": "1847" },
        "commentCount": { "value": "143" },
        "permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
      },
      {
        "title": { "value": "Why I still use Vim in 2025" },
        "score": { "value": "923" },
        "commentCount": { "value": "312" },
        "permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
      }
    ]
  }
}

JavaScript

1. Send the request

const query = `mutation ScrapeRedditPosts {
  goto(url: "https://www.reddit.com/r/programming/", waitUntil: networkIdle) {
    status
  }
  waitForSelector(selector: "shreddit-post", timeout: 15000) {
    time
  }
  posts: mapSelector(selector: "shreddit-post") {
    title: attribute(name: "post-title") { value }
    score: attribute(name: "score") { value }
    commentCount: attribute(name: "comment-count") { value }
    permalink: attribute(name: "permalink") { value }
  }
}`;

const response = await fetch(
  'https://production-sfo.browserless.io/stealth/bql?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables: {} }),
  }
);

const { data } = await response.json();
console.log(JSON.stringify(data.posts, null, 2));

2. Check the output

The script logs only the posts array. The full response body looks like this:

{
  "data": {
    "goto": { "status": 200 },
    "waitForSelector": { "time": 2103 },
    "posts": [
      {
        "title": { "value": "Show HN: I built a static site generator in Go" },
        "score": { "value": "1847" },
        "commentCount": { "value": "143" },
        "permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
      },
      {
        "title": { "value": "Why I still use Vim in 2025" },
        "score": { "value": "923" },
        "commentCount": { "value": "312" },
        "permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
      }
    ]
  }
}

Python

1. Install dependencies

pip install requests

2. Send the request

import requests

query = """
mutation ScrapeRedditPosts {
  goto(url: "https://www.reddit.com/r/programming/", waitUntil: networkIdle) {
    status
  }
  waitForSelector(selector: "shreddit-post", timeout: 15000) {
    time
  }
  posts: mapSelector(selector: "shreddit-post") {
    title: attribute(name: "post-title") { value }
    score: attribute(name: "score") { value }
    commentCount: attribute(name: "comment-count") { value }
    permalink: attribute(name: "permalink") { value }
  }
}
"""

response = requests.post(
    'https://production-sfo.browserless.io/stealth/bql',
    params={
        'token': 'YOUR_API_TOKEN_HERE',
        'proxy': 'residential',
        'proxyCountry': 'us',
    },
    json={'query': query, 'variables': {}},
)

data = response.json()['data']
for post in data['posts']:
    title = post['title']['value']
    score = post['score']['value']
    comments = post['commentCount']['value']
    permalink = post['permalink']['value']
    print(f'{title} | {score} pts | {comments} comments | reddit.com{permalink}')
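
The script above indexes straight into response.json()['data'], which raises an unhelpful KeyError or TypeError when the request fails. A hedged sketch of a validation step follows. It assumes BQL follows the usual GraphQL convention of reporting failures in a top-level errors array, and unwrap_posts is a hypothetical helper name.

```python
def unwrap_posts(status_code: int, body: dict) -> list:
    """Return the posts list or raise with the most specific reason available."""
    if status_code != 200:
        raise RuntimeError(f"Browserless returned HTTP {status_code}")
    # Assumption: failures are reported GraphQL-style in an "errors" array.
    errors = body.get("errors")
    if errors:
        messages = "; ".join(
            str(e.get("message", e)) if isinstance(e, dict) else str(e)
            for e in errors
        )
        raise RuntimeError(f"BQL reported errors: {messages}")
    posts = (body.get("data") or {}).get("posts")
    if not posts:
        raise RuntimeError("No posts returned; the selector may be stale or the page was blocked")
    return posts
```

With requests, this would be called as unwrap_posts(response.status_code, response.json()) before looping over the posts.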

3. Check the output

The script prints one summary line per post. The response it parses looks like this:

{
  "data": {
    "goto": { "status": 200 },
    "waitForSelector": { "time": 2103 },
    "posts": [
      {
        "title": { "value": "Show HN: I built a static site generator in Go" },
        "score": { "value": "1847" },
        "commentCount": { "value": "143" },
        "permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
      },
      {
        "title": { "value": "Why I still use Vim in 2025" },
        "score": { "value": "923" },
        "commentCount": { "value": "312" },
        "permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
      }
    ]
  }
}
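
Every field in this response is wrapped in a value object, scores and comment counts arrive as strings, and permalinks are relative. A minimal sketch of flattening the response into plain records is shown here. The normalize_posts name and the output keys are illustrative choices, and https://www.reddit.com is assumed as the base for permalinks.

```python
from urllib.parse import urljoin

REDDIT_BASE = "https://www.reddit.com"


def _value(post: dict, field: str):
    """Read post[field]["value"], tolerating missing or null wrappers."""
    return (post.get(field) or {}).get("value")


def _to_int(raw):
    """Convert a numeric string to int, or return None if that is not possible."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


def normalize_posts(data: dict) -> list:
    """Flatten the BQL 'data' object into a list of plain dictionaries."""
    records = []
    for post in data.get("posts") or []:
        permalink = _value(post, "permalink")
        records.append({
            "title": _value(post, "title"),
            "score": _to_int(_value(post, "score")),
            "comment_count": _to_int(_value(post, "commentCount")),
            # Build an absolute link from the relative permalink.
            "url": urljoin(REDDIT_BASE, permalink) if permalink else None,
        })
    return records
```

Passing response.json()['data'] to normalize_posts gives records that are easier to sort by score or write to CSV.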

Java

1. Send the request

import java.net.URI;
import java.net.http.*;

String token = "YOUR_API_TOKEN_HERE";
String endpoint = "https://production-sfo.browserless.io/stealth/bql?token=" + token
    + "&proxy=residential&proxyCountry=us";

String query = "mutation ScrapeRedditPosts {"
    + " goto(url: \\\"https://www.reddit.com/r/programming/\\\", waitUntil: networkIdle) { status }"
    + " waitForSelector(selector: \\\"shreddit-post\\\", timeout: 15000) { time }"
    + " posts: mapSelector(selector: \\\"shreddit-post\\\") {"
    + "   title: attribute(name: \\\"post-title\\\") { value }"
    + "   score: attribute(name: \\\"score\\\") { value }"
    + "   commentCount: attribute(name: \\\"comment-count\\\") { value }"
    + "   permalink: attribute(name: \\\"permalink\\\") { value }"
    + " }"
    + " }";

String payload = "{\"query\": \"" + query + "\", \"variables\": {}}";

HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create(endpoint))
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString(payload))
    .build();

HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
System.out.println(response.body());

2. Check the output

{
  "data": {
    "goto": { "status": 200 },
    "waitForSelector": { "time": 2103 },
    "posts": [
      {
        "title": { "value": "Show HN: I built a static site generator in Go" },
        "score": { "value": "1847" },
        "commentCount": { "value": "143" },
        "permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
      },
      {
        "title": { "value": "Why I still use Vim in 2025" },
        "score": { "value": "923" },
        "commentCount": { "value": "312" },
        "permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
      }
    ]
  }
}

C#

1. Send the request

using System.Net.Http;
using System.Text;
using System.Text.Json;

string token = "YOUR_API_TOKEN_HERE";
string endpoint = $"https://production-sfo.browserless.io/stealth/bql?token={token}&proxy=residential&proxyCountry=us";

var payload = new
{
    query = @"mutation ScrapeRedditPosts {
      goto(url: ""https://www.reddit.com/r/programming/"", waitUntil: networkIdle) { status }
      waitForSelector(selector: ""shreddit-post"", timeout: 15000) { time }
      posts: mapSelector(selector: ""shreddit-post"") {
        title: attribute(name: ""post-title"") { value }
        score: attribute(name: ""score"") { value }
        commentCount: attribute(name: ""comment-count"") { value }
        permalink: attribute(name: ""permalink"") { value }
      }
    }",
    variables = new { },
};

using (HttpClient httpClient = new HttpClient())
{
    var content = new StringContent(
        JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json");
    var response = await httpClient.PostAsync(endpoint, content);
    string body = await response.Content.ReadAsStringAsync();
    Console.WriteLine(body);
}

2. Check the output

{
  "data": {
    "goto": { "status": 200 },
    "waitForSelector": { "time": 2103 },
    "posts": [
      {
        "title": { "value": "Show HN: I built a static site generator in Go" },
        "score": { "value": "1847" },
        "commentCount": { "value": "143" },
        "permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
      },
      {
        "title": { "value": "Why I still use Vim in 2025" },
        "score": { "value": "923" },
        "commentCount": { "value": "312" },
        "permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
      }
    ]
  }
}

Connect through stealth mode and a residential proxy so Reddit sees traffic from what looks like a real browser, then read post data directly from the rendered feed.

Puppeteer
Playwright

Puppeteer

1. Install dependencies

npm install puppeteer-core

2. Connect and scrape

import puppeteer from 'puppeteer-core';

const browser = await puppeteer.connect({
  browserWSEndpoint:
    'wss://production-sfo.browserless.io/stealth?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us',
});

try {
  const page = await browser.newPage();
  await page.goto('https://www.reddit.com/r/programming/', {
    waitUntil: 'networkidle2',
  });

  // Wait for posts to render before reading attributes.
  await page.waitForSelector('shreddit-post');

  const posts = await page.evaluate(() =>
    Array.from(document.querySelectorAll('shreddit-post')).map((post) => ({
      title: post.getAttribute('post-title'),
      score: post.getAttribute('score'),
      commentCount: post.getAttribute('comment-count'),
      permalink: post.getAttribute('permalink'),
    }))
  );

  console.log(JSON.stringify(posts, null, 2));
} finally {
  // Always close to release the session even on error.
  await browser.close();
}

3. Check the output

Save the script as scrape-reddit.mjs and run it with node scrape-reddit.mjs. If waitForSelector times out or the array comes back empty, posts didn't render in time. Increase the waitForSelector timeout or check whether a consent banner intercepted the page.

[
  {
    "title": "Show HN: I built a static site generator in Go",
    "score": "1847",
    "commentCount": "143",
    "permalink": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator"
  },
  {
    "title": "Why I still use Vim in 2025",
    "score": "923",
    "commentCount": "312",
    "permalink": "/r/programming/comments/def456/why_i_still_use_vim_in_2025"
  }
]

Playwright

1. Install dependencies

npm install playwright-core

2. Connect and scrape

import { chromium } from 'playwright-core';

const browser = await chromium.connectOverCDP(
  'wss://production-sfo.browserless.io/stealth?token=YOUR_API_TOKEN_HERE&proxy=residential&proxyCountry=us'
);

try {
  const context = browser.contexts()[0];
  const page = await context.newPage();
  await page.goto('https://www.reddit.com/r/programming/', {
    waitUntil: 'networkidle',
  });

  // Wait for posts to render before reading attributes.
  await page.waitForSelector('shreddit-post');

  const posts = await page.evaluate(() =>
    Array.from(document.querySelectorAll('shreddit-post')).map((post) => ({
      title: post.getAttribute('post-title'),
      score: post.getAttribute('score'),
      commentCount: post.getAttribute('comment-count'),
      permalink: post.getAttribute('permalink'),
    }))
  );

  console.log(JSON.stringify(posts, null, 2));
} finally {
  // Always close to release the session even on error.
  await browser.close();
}

3. Check the output

Save the script as scrape-reddit.mjs and run it with node scrape-reddit.mjs. If waitForSelector times out or the array comes back empty, posts didn't render in time. Increase the waitForSelector timeout or check whether a consent banner intercepted the page.

[
  {
    "title": "Show HN: I built a static site generator in Go",
    "score": "1847",
    "commentCount": "143",
    "permalink": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator"
  },
  {
    "title": "Why I still use Vim in 2025",
    "score": "923",
    "commentCount": "312",
    "permalink": "/r/programming/comments/def456/why_i_still_use_vim_in_2025"
  }
]
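
When posts sometimes fail to render in time, one way to apply the timeout advice is to retry with progressively longer waits. The following is a language-neutral idea sketched in Python. The scrape callable is hypothetical: it stands for any function that accepts a timeout in milliseconds and returns a list of posts, such as a wrapper around the REST request from this guide. The timeout values are arbitrary starting points.

```python
from typing import Callable


def scrape_with_escalating_timeout(
    scrape: Callable[[int], list],
    timeouts_ms: tuple = (15000, 30000, 60000),
) -> list:
    """Call scrape() with longer timeouts until it returns at least one post."""
    last_error = None
    for timeout_ms in timeouts_ms:
        try:
            posts = scrape(timeout_ms)
        except Exception as error:
            # Remember the failure and try again with the next timeout.
            last_error = error
            continue
        if posts:
            return posts
    raise RuntimeError("No posts rendered within any of the timeouts") from last_error
```

If every attempt fails, a longer wait is unlikely to help, and a consent banner or a changed selector is the more probable cause.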

BQL

1. Write the mutation

Navigate to the subreddit, wait for posts to appear, then read title, score, comment count, and permalink from each post element's attributes. We send this to /stealth/bql instead of the default /bql because Reddit's bot detection blocks plain browser sessions.

mutation ScrapeRedditPosts {
  goto(url: "https://www.reddit.com/r/programming/", waitUntil: networkIdle) {
    status
  }
  waitForSelector(selector: "shreddit-post", timeout: 15000) {
    time
  }
  posts: mapSelector(selector: "shreddit-post") {
    title: attribute(name: "post-title") { value }
    score: attribute(name: "score") { value }
    commentCount: attribute(name: "comment-count") { value }
    permalink: attribute(name: "permalink") { value }
  }
}

2. Run it

Paste into the BQL IDE and click Run.

3. Check the output

{
  "data": {
    "goto": { "status": 200 },
    "waitForSelector": { "time": 2103 },
    "posts": [
      {
        "title": { "value": "Show HN: I built a static site generator in Go" },
        "score": { "value": "1847" },
        "commentCount": { "value": "143" },
        "permalink": { "value": "/r/programming/comments/abc123/show_hn_i_built_a_static_site_generator" }
      },
      {
        "title": { "value": "Why I still use Vim in 2025" },
        "score": { "value": "923" },
        "commentCount": { "value": "312" },
        "permalink": { "value": "/r/programming/comments/def456/why_i_still_use_vim_in_2025" }
      }
    ]
  }
}

Next steps

Scrape Glassdoor Job Listings: another stealth-mode scrape against aggressive bot detection

Automate Google Search: pull search results using the same /stealth/bql endpoint

Solving Cloudflare Challenges: bypass Cloudflare's interstitial pages before scraping
