
How to Scrape Trustpilot Data: Complete Guide for 2026

Learn how to scrape Trustpilot reviews using Python and AlterLab's API. Covers anti-bot handling, selectors, best practices, and scalable pipelines.

Herald Blog Service, June 26, 2026

This guide shows how to extract publicly available review data from Trustpilot using Python and AlterLab's scraping API. All examples target pages that do not require authentication.

Disclaimer: This guide covers extracting publicly accessible data. Always review a site's robots.txt and Terms of Service before scraping.

TL;DR

To scrape Trustpilot reviews, send a POST request to AlterLab's /v1/scrape endpoint with the target URL, parse the returned HTML with CSS selectors or XPath to extract review text, rating, and date, and handle pagination programmatically. Apply rate limiting and respect Trustpilot's robots.txt.

Why collect review data from Trustpilot?

Market research – Aggregate sentiment across competitors to identify product strengths and weaknesses.
Price monitoring – Correlate review spikes with pricing changes or promotional events.
Data analysis pipelines – Feed structured review datasets into NLP models for trend detection or recommendation systems.

Technical challenges

Trustpilot loads most review content via JavaScript, employs rate‑limiting per IP, and uses bot‑challenge pages (e.g., Cloudflare Turnstile) to filter automated traffic. Plain requests.get often returns a challenge page or empty HTML. AlterLab's Smart Rendering API runs a headless browser, rotates residential proxies, and automatically solves challenges, delivering the fully rendered public page.

99.2% Success Rate

1.2s Avg Response

Quick start with AlterLab API

First, install the Python SDK and review the Getting started guide for authentication details.

Python

import alterlab

client = alterlab.Client("YOUR_API_KEY")
response = client.scrape(
    url="https://www.trustpilot.com/review/example.com",
    params={"render": True, "wait_for": ".review-card"}
)
print(response.text[:500])

The equivalent cURL request:

Bash

curl -X POST https://api.alterlab.io/v1/scrape \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.trustpilot.com/review/example.com",
    "render": true,
    "wait_for": ".review-card"
  }'

Both examples return the fully rendered HTML of the public review page.

Extracting structured data

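After obtaining the HTML, use a parsing library such as BeautifulSoup or parsel to pull the needed fields. The class names below are the ones this guide targets on Trustpilot's public review cards; verify them against the live markup before each run, since front-end class names can change between deployments.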

Python

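from parsel import Selector

# `response` comes from the client.scrape(...) call in the quick start
selector = Selector(text=response.text)
reviews = []

for card in selector.css(".review-card"):
    # .get() returns None when nothing matches, so guard before strip()/int()
    rating = card.css(".star-rating-stroke::attr(data-rating)").get()
    reviews.append({
        "title": card.css(".review-title::text").get(default="").strip(),
        "rating": int(rating) if rating and rating.isdigit() else None,
        "text": card.css(".review-content__text::text").get(default="").strip(),
        "date": card.css(".review-date::attr(data-service-review-date)").get(),
    })

print(reviews[:2])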

Key selectors:

Review container: .review-card
Title: .review-title::text
Rating: .star-rating-stroke (data‑rating attribute)
Text: .review-content__text::text
Date: .review-date (data-service-review-date attribute)
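The extraction loop above covers a single page of reviews. Below is a minimal sketch of generating the URLs for further pages. It assumes later pages are reached through a `page` query parameter, which this guide does not state, so confirm it against the live site. `build_page_urls` is a hypothetical helper name, not part of any SDK.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_page_urls(base_url: str, max_pages: int) -> list[str]:
    """Return URLs for pages 1..max_pages of a review listing.

    Assumes pagination via a `page` query parameter (hypothetical;
    verify against the live site before relying on it).
    """
    parts = urlsplit(base_url)
    # Keep existing query parameters, dropping any stale `page` value
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "page"]
    urls = []
    for page in range(1, max_pages + 1):
        page_query = urlencode(query + [("page", str(page))])
        urls.append(urlunsplit(parts._replace(query=page_query)))
    return urls


page_urls = build_page_urls("https://www.trustpilot.com/review/example.com", 3)
```

Each URL can then be scraped and parsed the same way as the first page. One reasonable stopping rule is to end the loop when a page yields no review containers.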

For JSON‑LD structured data, you can also parse <script type="application/ld+json"> blocks that sometimes contain aggregated rating information.
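As a rough illustration of the JSON-LD route, the sketch below collects every `application/ld+json` block using only the Python standard library. The sample HTML and its field names are invented for the example. Real pages may nest the rating differently or omit it, so treat the final lookup as an assumption to check against actual output.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script bodies can arrive in several chunks, so accumulate them
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than abort the run


def extract_json_ld(html: str) -> list:
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Invented sample markup, for illustration only
sample_html = """
<html><head>
<script type="application/ld+json">
{"@type": "LocalBusiness", "aggregateRating": {"ratingValue": "4.5", "reviewCount": "120"}}
</script>
</head><body></body></html>
"""
docs = extract_json_ld(sample_html)
rating = docs[0]["aggregateRating"]["ratingValue"]
```

Reading JSON-LD, when it is present, tends to be less sensitive to class-name changes than CSS selectors, which is why it can serve as a cross-check on the scraped rating.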

Best practices

Rate limiting – Start with 1 request per second; increase gradually while monitoring HTTP 429 responses.
Robots.txt – Check https://www.trustpilot.com/robots.txt for disallowed paths; avoid scraping private user profiles.
Headers – Send a realistic User‑Agent header; AlterLab adds one by default, but you can override it if needed.
Error handling – Retry on 5xx or network errors with exponential backoff; treat 429 as a signal to pause.
Data storage – Write each batch to a newline‑delimited JSON file to enable resumable runs.
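The error-handling and storage points above can be sketched as follows. This is a minimal illustration, not AlterLab's own retry logic. `RetryableError`, `fetch_with_backoff`, and `append_ndjson` are hypothetical names, and the `fetch` argument stands in for whatever function performs the request and raises `RetryableError` on a 429, a 5xx, or a network error.

```python
import json
import random
import time


class RetryableError(Exception):
    """Raised by the fetch callable for 429/5xx responses or network errors."""


def fetch_with_backoff(fetch, url, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # 1s, 2s, 4s, ... plus up to 0.5s of jitter to avoid synchronized retries
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))


def append_ndjson(path, records):
    """Append records as newline-delimited JSON so an interrupted run can resume."""
    with open(path, "a", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

In a scraping loop, calling `time.sleep(1)` between URLs gives the one-request-per-second starting rate, and writing each page's reviews with `append_ndjson` as soon as they are parsed means a crash loses at most the page in progress.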

Scaling up

For large‑scale projects, schedule nightly jobs via cron or a workflow orchestrator (e.g., Airflow). Use AlterLab's batch endpoint to send up to 100 URLs per request, which reduces overhead and per‑call latency. Monitor costs as volume grows: see the pricing page for volume‑based rates; typical workloads of 100k reviews/month fall into the Growth tier.

Example batch request:

Python

urls = [
    f"https://www.trustpilot.com/review/site{i}.com"
    for i in range(1, 21)
]

batch_response = client.batch_scrape(
    urls=[{"url": u, "render": True} for u in urls],
    webhook_url="https://yourapi.example.com/webhook"
)
print(batch_response.id)  # use to fetch results later
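The batch endpoint is described above as accepting up to 100 URLs per request, so a longer URL list has to be split first. Here is a small sketch of that step; `chunked` is a hypothetical helper, not part of the AlterLab SDK, and the 250 site URLs are placeholders.

```python
from itertools import islice


def chunked(items, size=100):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    # islice pulls the next `size` items; an empty list ends the loop
    while batch := list(islice(iterator, size)):
        yield batch


urls = [f"https://www.trustpilot.com/review/site{i}.com" for i in range(1, 251)]
batches = list(chunked(urls, size=100))
```

Each list in `batches` can then be submitted as its own batch request, in the same shape as the example above.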

Combine the output with a scheduling service to refresh datasets daily, and store results in a data warehouse for downstream analytics.

Key takeaways

Trustpilot's public review pages are accessible via AlterLab's Smart Rendering API, which handles JavaScript and bot challenges.
Use CSS selectors (.review-card, .review-title, etc.) to extract review title, rating, text, and date.
Apply responsible scraping: respect robots.txt, limit request rates, and handle errors gracefully.
Scale with batch requests, scheduled jobs, and cost‑effective pricing tiers.
Always verify that the data you collect is publicly available and compliant with Trustpilot's terms.

Frequently Asked Questions

Is it legal to scrape Trustpilot?

Scraping publicly accessible data is generally permissible under rulings like hiQ v. LinkedIn, but you must review Trustpilot's robots.txt and Terms of Service, apply rate limiting, and avoid private or login‑protected information.

Why is Trustpilot difficult to scrape?

Trustpilot employs JavaScript rendering, rate limits, and bot detection mechanisms that block raw HTTP requests; AlterLab's Smart Rendering API handles headless browsers, proxies, and automatic retries to retrieve public pages reliably.

How much does it cost to scrape Trustpilot with AlterLab?

AlterLab charges per successful scrape; volume discounts lower the effective price. See the pricing page for tiered rates and estimate costs based on your request frequency and concurrency.
