JSON (JavaScript Object Notation) is a lightweight, text-based data interchange format that represents structured data as key-value pairs (objects) and ordered sequences (arrays), using a syntax derived from JavaScript object literals. It is language-independent and is the dominant format for web API responses, configuration files, and data exchange between services.
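As a minimal sketch of how objects and arrays map onto native types, the snippet below decodes a JSON string with Python's standard `json` module and serializes it back. The product record is a made-up example, not data from any real API.

```python
import json

# Hypothetical product record, used only for illustration.
text = '{"name": "Widget", "price": 9.99, "tags": ["new", "sale"], "in_stock": true}'

# json.loads maps JSON objects to dicts, arrays to lists,
# true/false to True/False, and numbers to int or float.
record = json.loads(text)
price = record["price"]
first_tag = record["tags"][0]

# Serialize back to JSON text; sort_keys makes the key order deterministic.
round_trip = json.dumps(record, sort_keys=True)
```

Decoding the `round_trip` string again gives back an equal dictionary, which is what makes JSON convenient for passing data between services.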
In web scraping, JSON is significant in two ways. First, many modern sites load their data (products, prices, search results) from internal JSON APIs rather than, or in addition to, embedding it in HTML. Calling these XHR/fetch endpoints directly, or intercepting their responses, usually yields cleaner, more structured data than parsing HTML. The Network tab in browser DevTools or Playwright's request interception can be used to discover and capture these API calls.
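Once such a response has been captured, handling it is mostly a matter of checking that it really is JSON and decoding it. The following is a rough sketch using only the standard library; the helper `parse_api_response`, the `results` key, and the sample body are assumptions made for illustration, since each site defines its own payload shape.

```python
import json


def parse_api_response(content_type: str, body: str):
    """Return the decoded payload if the response looks like JSON, else None."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/json" and not media_type.endswith("+json"):
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # The header claimed JSON but the body did not parse.
        return None


# Hypothetical body, as it might be copied from the DevTools Network tab.
captured_body = (
    '{"results": [{"id": 1, "title": "Blue Mug", "price": 12.5},'
    ' {"id": 2, "title": "Red Mug", "price": 11.0}], "total": 2}'
)

payload = parse_api_response("application/json; charset=utf-8", captured_body)

# The fields are already structured, so no HTML selectors are needed.
prices = {item["title"]: item["price"] for item in payload["results"]}
```

Returning `None` for non-JSON responses lets the caller fall back to HTML parsing for pages that do not expose an API.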
Second, JSON is the standard output format for structured data extraction. AlterLab's API returns extracted data as JSON objects matching the schema provided in the request, eliminating the need for HTML parsing on the caller side. Schema.org structured data embedded in pages (`<script type="application/ld+json">`) is written in JSON-LD, which is itself valid JSON, and can provide structured product, article, or event data without scraping the rendered HTML.
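To illustrate the embedded structured data case, here is a small sketch that pulls `application/ld+json` blocks out of a page using Python's standard `html.parser` and `json` modules. The class name `JsonLdExtractor` and the sample page are hypothetical; real pages may contain several blocks, or a block holding a list rather than a single object.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the decoded contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


# Hypothetical page fragment, used only for illustration.
html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Blue Mug",
 "offers": {"@type": "Offer", "price": "12.50", "priceCurrency": "USD"}}
</script>
</head><body><h1>Blue Mug</h1></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(html)
product = extractor.items[0]
```

Note that the price arrives as the string `"12.50"` because that is how this sample encodes it; JSON-LD values should be validated and converted rather than assumed to have a particular type.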