JSON-LD

JSON-LD is a W3C standard for expressing linked data in JSON. On the web it is most often embedded in an HTML page inside a script tag, using schema.org vocabulary, to give machines a structured description of the page's content.

JSON-LD (JavaScript Object Notation for Linked Data) encodes semantic data as a `<script type='application/ld+json'>` block within the page's HTML. The data describes the page's content in a vocabulary defined by schema.org — product prices, review ratings, event dates, organisation details, and more. Search engines parse JSON-LD blocks to power rich results in search listings.
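To make the shape of such a block concrete, here is a minimal sketch that builds one in Python. The product name, price, and currency are made-up values for illustration; only the `@context`, `@type`, and property names come from schema.org.

```python
import json

# Hypothetical product data; every value here is made up for illustration.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD",
    },
}

# Wrap the serialized object in the script tag that marks it as JSON-LD.
script_block = (
    '<script type="application/ld+json">\n'
    + json.dumps(product, indent=2)
    + "\n</script>"
)
print(script_block)
```

A scraper sees the reverse of this: the script tag arrives in the page source and its text content is parsed back into a dictionary.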

For scrapers, JSON-LD is a high-quality extraction target because the data is clean and structured by the site operator — no fragile CSS selector or XPath is needed to extract product prices from rendered text. The schema.org vocabulary provides a predictable key set across different publishers using the same content type.
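The shared key set means one small accessor can often serve many sites. The sketch below assumes each publisher marks up a `Product` with an `offers` property, given either as a single object or as a list; `product_price`, `shop_a`, and `shop_b` are hypothetical names and sample data, not part of any library.

```python
def product_price(item):
    """Return (price, currency) from a schema.org Product dict, or (None, None)."""
    offers = item.get("offers")
    # `offers` may be a single Offer object or a list of them.
    if isinstance(offers, list):
        offers = offers[0] if offers else None
    if not isinstance(offers, dict):
        return None, None
    return offers.get("price"), offers.get("priceCurrency")


# Two hypothetical publishers describing a Product with the same keys.
shop_a = {"@type": "Product", "name": "Mug",
          "offers": {"@type": "Offer", "price": "8.50", "priceCurrency": "EUR"}}
shop_b = {"@type": "Product", "name": "Lamp",
          "offers": [{"@type": "Offer", "price": 42, "priceCurrency": "USD"}]}

for item in (shop_a, shop_b):
    print(item["name"], *product_price(item))
```

Publishers are not required to fill in every property, and prices may appear as strings or numbers, so the accessor returns `None` for anything missing and leaves type conversion to the caller.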

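JSON-LD blocks can be extracted with a simple regex or an HTML parser targeting the script tag, then parsed as standard JSON. Multiple JSON-LD blocks may appear on a single page for different schema types (e.g., a `Product` block alongside a `BreadcrumbList` block). A single block may also hold an array of objects, or an `@graph` list, rather than one object, so check the shape of the parsed value before reading keys from it.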

Examples

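from bs4 import BeautifulSoup
import json

soup = BeautifulSoup(html, "html.parser")
for tag in soup.find_all("script", type="application/ld+json"):
    try:
        data = json.loads(tag.string or "")
    except json.JSONDecodeError:
        continue  # skip empty or malformed blocks
    # The top level may be one object or a list of objects
    for item in data if isinstance(data, list) else [data]:
        if isinstance(item, dict) and item.get("@type") == "Product":
            offers = item.get("offers")
            price = offers.get("price") if isinstance(offers, dict) else None
            print(item.get("name"), price)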
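The regex route mentioned earlier can be sketched with the standard library alone. This is a rough illustration, not a hardened extractor: `jsonld_by_type` and the sample `page` are hypothetical, and the pattern assumes the `type` attribute is quoted, which real pages do not guarantee.

```python
import json
import re
from collections import defaultdict

# Matches JSON-LD script tags. Attribute quoting and order vary in practice,
# so an HTML parser is more robust than this pattern.
LD_JSON_RE = re.compile(
    r"<script[^>]*type=[\"']application/ld\+json[\"'][^>]*>(.*?)</script>",
    re.DOTALL | re.IGNORECASE,
)


def jsonld_by_type(html):
    """Group every JSON-LD object found in `html` by its @type."""
    found = defaultdict(list)
    for raw in LD_JSON_RE.findall(html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip empty or malformed blocks
        for item in data if isinstance(data, list) else [data]:
            if not isinstance(item, dict):
                continue
            # Some publishers nest several objects under "@graph".
            graph = item.get("@graph")
            nodes = graph if isinstance(graph, list) else [item]
            for node in nodes:
                if not isinstance(node, dict):
                    continue
                types = node.get("@type")
                # "@type" may be a single string or a list of strings.
                for t in types if isinstance(types, list) else [types]:
                    if isinstance(t, str):
                        found[t].append(node)
    return dict(found)


# Hypothetical page with two blocks of different schema types.
page = """
<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget"}</script>
<script type='application/ld+json'>
{"@context": "https://schema.org", "@type": "BreadcrumbList", "itemListElement": []}
</script>
</head></html>
"""

blocks = jsonld_by_type(page)
print(sorted(blocks))
```

Grouping by `@type` lets the caller pick out the `Product` entries without caring how many blocks the page contains or which one they arrived in.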
