
Trustpilot Data API: Extract Structured JSON in 2026

Learn how to extract structured Trustpilot review data via AlterLab's data API—get typed JSON output for product_name, rating, review_count and more with zero HTML parsing.

Herald Blog Service · June 25, 2026

TL;DR

Use AlterLab's Extract API to send a Trustpilot URL and a JSON schema describing the fields you need—such as product_name, rating, review_count, category, and verified_purchase. The API returns typed, validated JSON without any HTML parsing. This guide shows the exact Python and cURL calls, schema design, and scaling tips for production pipelines.

Disclaimer: This guide covers extracting publicly accessible data. Always review a site's robots.txt and Terms of Service before scraping.

Why use Trustpilot data?

Trustpilot hosts millions of public reviews that signal product quality, customer sentiment, and market trends. Engineering teams use this data to:

Train sentiment analysis models for product recommendation engines
Monitor competitor product launches and rating shifts in near real time
Enrich internal analytics pipelines with verified purchase signals and category tags

Because the data is publicly listed on product pages, it can be harvested responsibly to feed downstream AI or business intelligence workflows.

What data can you extract?

Each Trustpilot review page contains structured information that AlterLab can return as typed JSON. The most commonly requested fields are:

product_name – the item or service being reviewed (string)
rating – the star rating shown (string, e.g., "4.5")
review_count – total number of reviews for that product (string)
category – the Trustpilot category tree (string)
verified_purchase – flag indicating whether the reviewer confirmed purchase (string)

You are not limited to these fields; any visible text can be captured by adjusting the schema. The API validates each extracted value against the declared type, so downstream code can rely on a predictable shape.

The extraction approach

Traditional scraping requires sending raw HTTP requests, parsing HTML that changes without notice, handling pagination, and working around anti‑bot measures. This approach is fragile: a minor CSS change can break selectors, and Trustpilot's bot defenses may respond with CAPTCHAs or IP blocks.

AlterLab treats the web as a data API. You declare the shape of the data you want with a JSON schema; the platform handles retrieval, JavaScript rendering, and anti‑bot measures, then returns JSON that conforms to the schema. This shifts engineering effort from fragile parsing to defining the data contract.

Quick start with AlterLab Extract API

First install the Python client (or use cURL directly). The following example shows a synchronous call to extract a single Trustpilot product page.

Python

import alterlab

client = alterlab.Client("YOUR_API_KEY")

schema = {
  "type": "object",
  "properties": {
    "product_name": {
      "type": "string",
      "description": "The product name field"
    },
    "rating": {
      "type": "string",
      "description": "The rating field"
    },
    "review_count": {
      "type": "string",
      "description": "The review count field"
    },
    "category": {
      "type": "string",
      "description": "The category field"
    },
    "verified_purchase": {
      "type": "string",
      "description": "The verified purchase field"
    }
  }
}

result = client.extract(
    url="https://trustpilot.com/example-page",
    schema=schema,
)
print(result.data)

The same request expressed as cURL:

Bash

curl -X POST https://api.alterlab.io/v1/extract \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://trustpilot.com/example-page",
    "schema": {"type": "object", "properties": {
      "product_name": {"type": "string"}, "rating": {"type": "string"},
      "review_count": {"type": "string"}, "category": {"type": "string"},
      "verified_purchase": {"type": "string"}
    }}
  }'

Both snippets return a JSON object where each property matches the schema definition, with proper typing and no extra HTML fragments.

Example output

JSON

{
  "product_name": "Wireless Noise‑Cancelling Headphones",
  "rating": "4.7",
  "review_count": "1284",
  "category": "Electronics > Audio > Headphones",
  "verified_purchase": "true"
}
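
Because the schema above declares every field as a string, numeric and boolean values arrive as text. The following is a minimal sketch of converting them to native Python types before analysis. The normalize helper is hypothetical and not part of the AlterLab client, and the sample dict simply mirrors the example output.

```python
def normalize(record: dict) -> dict:
    """Convert string-typed fields from the example output into native types.

    Hypothetical helper for illustration; not part of the AlterLab client.
    """
    def to_bool(value: str) -> bool:
        return value.strip().lower() in {"true", "yes", "1"}

    return {
        "product_name": record["product_name"],
        "rating": float(record["rating"]),
        # Strip thousands separators in case the page shows "1,284".
        "review_count": int(record["review_count"].replace(",", "")),
        # Split the breadcrumb-style category into its levels.
        "category": [part.strip() for part in record["category"].split(">")],
        "verified_purchase": to_bool(record["verified_purchase"]),
    }


sample = {
    "product_name": "Wireless Noise-Cancelling Headphones",
    "rating": "4.7",
    "review_count": "1284",
    "category": "Electronics > Audio > Headphones",
    "verified_purchase": "true",
}

clean = normalize(sample)
print(clean["rating"], clean["review_count"], clean["category"])
```

Declaring rating as number and verified_purchase as boolean in the schema should move most of this conversion to the API side, since the schema accepts those types.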

Define your schema

The Extract API uses JSON Schema Draft‑07. You supply a top‑level object with a properties map. Each property can include:

type (string, number, boolean, array, object)
description (optional, for documentation)
default (optional, used if extraction fails)

AlterLab validates the model output against this schema. If a value cannot be coerced to the declared type, the field is omitted or set to null depending on your handling preferences. This guarantees that downstream consumers receive predictable data shapes.
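
To make the "omitted or null" behavior concrete, here is a small local sketch of the same idea: check each top-level field against its declared type and fall back to the schema's default (or None) when it does not match. This is an illustration only. It is not AlterLab's validator, it ignores nested schemas, and the check_record name is an assumption.

```python
# Map JSON Schema type names to Python types for a flat, top-level check.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_record(record: dict, schema: dict) -> dict:
    """Return a copy of record where missing or mistyped fields become the
    schema's default (or None). Illustrative only; nested schemas are ignored."""
    checked = {}
    for name, spec in schema.get("properties", {}).items():
        value = record.get(name)
        expected = JSON_TYPES.get(spec.get("type", "string"), object)
        # bool is a subclass of int in Python, so reject it for "number".
        is_bool_as_number = spec.get("type") == "number" and isinstance(value, bool)
        if isinstance(value, expected) and not is_bool_as_number:
            checked[name] = value
        else:
            checked[name] = spec.get("default")
    return checked


schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "rating": {"type": "number"},
        "category": {"type": "string", "default": "unknown"},
    },
}

record = {"product_name": "Example", "rating": "not a number"}
result = check_record(record, schema)
print(result)
```

Running a check like this on your side is a cheap safeguard when a schema changes, even though the API already validates its own output.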

For arrays (e.g., extracting multiple reviews from a listing page), define an array type with an inner object schema:

JSON

"reviews": {
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "rating": {"type": "string"},
      "title": {"type": "string"},
      "text": {"type": "string"}
    }
  }
}
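
Once a response follows the array schema above, downstream code can iterate over the reviews directly. The sketch below assumes a payload with a top-level reviews key and made-up sample values; it parses the string ratings and averages the ones that are present.

```python
from statistics import mean

# Sample payload shaped like the "reviews" array schema (made-up values).
payload = {
    "reviews": [
        {"rating": "5", "title": "Great", "text": "Works as described."},
        {"rating": "3", "title": "Okay", "text": "Average battery life."},
        {"rating": "", "title": "No score", "text": "Rating missing on page."},
    ]
}


def parse_rating(raw):
    """Return the rating as a float, or None when it is empty or malformed."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


ratings = [parse_rating(r.get("rating")) for r in payload["reviews"]]
valid = [r for r in ratings if r is not None]
average = mean(valid) if valid else None
print(f"{len(valid)} of {len(ratings)} reviews rated, average {average}")
```

Skipping unparseable ratings instead of treating them as zero keeps a missing value from dragging the average down.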

Handle pagination and scale

Trustpilot often paginates reviews across several URLs. To collect large volumes:

Discover page URLs via the site's listing structure or search endpoint.
Batch requests using asynchronous IO to stay within rate limits.
Use AlterLab's job endpoint for extremely high volume—submit a list of URLs and poll for completion.

The following Python snippet shows async batching with asyncio and the AlterLab client:

Python

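import asyncio

import alterlab

client = alterlab.Client("YOUR_API_KEY")

schema = {
  "type": "object",
  "properties": {
    "product_name": {"type": "string"},
    "rating": {"type": "string"},
    "review_count": {"type": "string"}
  }
}

# Cap on in-flight requests; tune this to your plan's rate limit.
MAX_CONCURRENCY = 3

async def extract_one(url, limiter):
    # The semaphore keeps at most MAX_CONCURRENCY requests running at once.
    async with limiter:
        try:
            resp = await client.extract_async(url=url, schema=schema)
            return {"url": url, "data": resp.data}
        except Exception as exc:
            # Return the failure instead of raising so one bad page
            # does not cancel the rest of the batch.
            return {"url": url, "error": str(exc)}

async def main():
    urls = [
        f"https://trustpilot.com/review/example?page={i}"
        for i in range(1, 6)
    ]
    limiter = asyncio.Semaphore(MAX_CONCURRENCY)
    results = await asyncio.gather(*(extract_one(u, limiter) for u in urls))
    for r in results:
        print(r)

if __name__ == "__main__":
    asyncio.run(main())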

AlterLab automatically rotates IPs, solves challenges, and retries transient failures, allowing you to focus on pagination logic rather than low‑level network handling.
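
The pagination logic itself can be as simple as walking page numbers until a page comes back empty. The sketch below uses a hypothetical fetch_reviews stub in place of a real Extract API call so that it runs offline, and it assumes that pages past the last one yield an empty list. Trustpilot's actual behavior may differ (for example, it could redirect), so verify this on a real listing first.

```python
# Stand-in for an Extract API call that returns the "reviews" array for one
# page. Hypothetical stub so the loop runs offline; swap in a real call.
FAKE_SITE = {
    1: [{"title": "Great"}, {"title": "Okay"}],
    2: [{"title": "Slow delivery"}],
}


def fetch_reviews(page: int) -> list:
    return FAKE_SITE.get(page, [])


def collect_all(max_pages: int = 50) -> list:
    """Walk page 1, 2, ... and stop at the first empty page or at max_pages."""
    collected = []
    for page in range(1, max_pages + 1):
        reviews = fetch_reviews(page)
        if not reviews:
            break  # an empty page is taken to mean we ran past the last one
        collected.extend(reviews)
    return collected


all_reviews = collect_all()
print(len(all_reviews))
```

The max_pages cap is a safety net: if the stop condition never triggers, the loop still ends instead of issuing billable requests indefinitely.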

When evaluating cost, consult the pricing page. Charges are per successful extraction request; there are no upfront commitments and unused balance carries forward indefinitely.

Key takeaways

Treat Trustpilot as a data source, not a scraping target: define a JSON schema and let AlterLab handle retrieval and validation.
The Extract API eliminates fragile HTML parsing, delivering typed JSON ready for model training or analytics.
Start with a single URL to verify your schema, then scale using async batching or the job endpoint for large‑scale pipelines.
Always verify that your collection complies with Trustpilot's robots.txt and Terms of Service; AlterLab provides the technical means, but responsibility remains with you.

99.2% Extraction Accuracy

1.4s Avg Response Time

100% Typed JSON Output

Frequently Asked Questions

Trustpilot offers limited partner APIs for business accounts, but they require approval and often lack granular review-level access. AlterLab provides a self‑service data API that extracts publicly available review data and returns validated JSON based on any schema you define.

You can extract publicly listed review fields such as product_name, rating, review_count, category, and verified_purchase. By defining a JSON schema you receive typed, validated output—no HTML parsing needed.

AlterLab charges per successful extraction request with a pay‑as‑you‑go model; there are no minimums and unused balance never expires. See the pricing page for current rates.
