format

JSON

JavaScript Object Notation: a widely used text format for structured data exchange between APIs and web services.

JSON (JavaScript Object Notation) is a lightweight, text-based data interchange format that represents structured data as objects (collections of key-value pairs) and arrays (ordered sequences of values), using a syntax derived from JavaScript object literals. Values can be strings, numbers, booleans, null, objects, or arrays. It is the dominant format for web API responses and is widely used for configuration files and data exchange between services.
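As a minimal sketch of how these structures map onto a programming language, the snippet below uses Python's standard `json` module. The sample document and its field names are made up for illustration.

```python
import json

# A JSON document: an object holding a string, a number, a boolean,
# a null, and an array of objects. All field names are illustrative.
text = (
    '{"name": "Widget", "price": 9.5, "in_stock": true, '
    '"discount": null, "tags": [{"id": 1}, {"id": 2}]}'
)

# Parsing: JSON objects become dicts, arrays become lists,
# true/false become True/False, and null becomes None.
record = json.loads(text)

# Serializing: turn the Python structure back into JSON text.
round_trip = json.dumps(record, sort_keys=True)
```

Parsing the serialized text again yields the same structure, which is what makes JSON convenient for passing data between services.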

In web scraping, JSON matters in two ways. First, many modern sites load products, prices, or search results from internal JSON APIs rather than (or in addition to) embedding the data in HTML. Capturing these XHR/fetch requests directly usually yields cleaner, more structured data than parsing HTML. The Network tab in browser DevTools or Playwright's request interception can capture these API calls.
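Once such an endpoint has been identified in the Network tab, calling it directly might look like the sketch below. It uses only the Python standard library. The URL and the response shape (`results`, `title`, `price`) are hypothetical, since real internal APIs differ from site to site and may require extra headers or cookies.

```python
import json
import urllib.request

# Hypothetical endpoint, for illustration only.
API_URL = "https://example.com/api/search?q=headphones"


def fetch_json(url, timeout=10.0):
    """Request a URL and decode the response body as JSON."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_products(payload):
    """Pull name/price pairs out of an assumed {"results": [...]} payload."""
    products = []
    for item in payload.get("results", []):
        products.append({"name": item.get("title"), "price": item.get("price")})
    return products


# Offline sample standing in for a captured response body.
sample_body = '{"results": [{"title": "Wireless Headphones Pro", "price": 149.99}]}'
products = extract_products(json.loads(sample_body))
```

In practice, `fetch_json(API_URL)` would replace the offline sample. Keeping the parsing step in its own function makes it easy to test against saved responses.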

Second, JSON is the standard output format for structured data extraction. AlterLab's API returns extracted data as JSON objects matching the schema provided in the request, eliminating the need for HTML parsing on the caller side. Schema.org structured data embedded in pages (`<script type="application/ld+json">`) is JSON-LD, a JSON-based format, and can provide structured product, article, or event data without parsing the page's visible markup.
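Embedded JSON-LD blocks can be collected with a few lines of standard-library Python. The sketch below is one possible approach, not a complete extractor: the class and function names are illustrative, and the sample HTML is invented for the example.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_json_ld(html):
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return parser.items


# Invented sample page for illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Wireless Headphones Pro",
 "offers": {"@type": "Offer", "price": "149.99", "priceCurrency": "USD"}}
</script>
</head><body><h1>Wireless Headphones Pro</h1></body></html>
"""

items = extract_json_ld(SAMPLE_HTML)
```

Each entry in `items` is an ordinary dict, so fields such as `items[0]["offers"]["price"]` can be read without touching the rendered markup.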

Examples

// JSON response from AlterLab structured extraction
{
  "url": "https://example.com/product/123",
  "status": 200,
  "data": {
    "name": "Wireless Headphones Pro",
    "price": 149.99,
    "currency": "USD",
    "in_stock": true,
    "rating": 4.7
  }
}
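A caller might consume a response like the one above as follows. This is a sketch under two assumptions: that a `status` of 200 signals a successful extraction, and that the `//` caption line is not part of the payload (JSON has no comment syntax, so it would have to be dropped before parsing). The function name is illustrative.

```python
import json

# The response body from the example, without the caption line.
response_text = """
{
  "url": "https://example.com/product/123",
  "status": 200,
  "data": {
    "name": "Wireless Headphones Pro",
    "price": 149.99,
    "currency": "USD",
    "in_stock": true,
    "rating": 4.7
  }
}
"""


def parse_extraction(text):
    """Return the "data" object, assuming status 200 means success."""
    payload = json.loads(text)
    if payload.get("status") != 200:
        raise ValueError(f"extraction failed with status {payload.get('status')}")
    return payload["data"]


product = parse_extraction(response_text)
summary = f'{product["name"]}: {product["price"]:.2f} {product["currency"]}'
```

Because the values arrive already typed (a number for `price`, a boolean for `in_stock`), no string cleanup or HTML parsing is needed on the caller side.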
