Modern websites are often single-page applications (SPAs) that load data from backend APIs rather than serving it in the initial HTML. These APIs are usually not documented publicly, but they can be discovered by inspecting the browser's network traffic with browser DevTools or an intercepting proxy. Once discovered, an endpoint can often be called directly with the appropriate headers and authentication tokens, bypassing HTML parsing entirely and returning clean, structured JSON.
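Calling a discovered endpoint directly can be sketched with the standard library alone. The URL, query parameters, bearer-token scheme, and `User-Agent` string below are all assumptions for illustration; a real endpoint may expect cookies, a CSRF header, or a different auth scheme, which the captured request in DevTools will show.

```python
import json
import urllib.parse
import urllib.request


def build_api_request(base_url, params, token=None):
    """Build a GET request that mimics what the page's own JavaScript sends."""
    query = urllib.parse.urlencode(params)
    headers = {
        "Accept": "application/json",
        # Hypothetical client string; copy the real one from the captured request.
        "User-Agent": "Mozilla/5.0 (compatible; example-client)",
    }
    if token:
        # Assumes bearer-token auth; other sites use cookies or custom headers.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(f"{base_url}?{query}", headers=headers)


def fetch_json(request, timeout=10.0):
    """Send the request and decode the JSON body (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # example.com is a placeholder; substitute an endpoint you discovered.
    req = build_api_request("https://example.com/api/products", {"page": 1})
    print(req.full_url)
```

Keeping request construction separate from the network call makes it easy to compare the built request against the one captured in the browser before sending anything.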
API discovery is one of the highest-value techniques in scraping because API responses are structured, often versioned, and generally more stable than HTML, which can change with any design update. An API endpoint that powers a product listing page returns JSON that maps directly to data fields; scraping the equivalent HTML requires CSS selectors that are fragile and can break with a redesign.
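To illustrate how JSON maps directly to data fields, here is a minimal sketch that parses a product listing response into typed records. The payload shape, the `items` key, and the `Product` fields are made up for the example; a real endpoint's field names will differ.

```python
import json
from dataclasses import dataclass


@dataclass
class Product:
    id: int
    name: str
    price: float


# Made-up sample payload standing in for a real API response body.
SAMPLE_RESPONSE = (
    '{"items": [{"id": 1, "name": "Mug", "price": "4.50", "in_stock": true}]}'
)


def parse_products(body: str) -> list[Product]:
    """Turn a JSON response body into Product records, ignoring extra fields."""
    payload = json.loads(body)
    return [
        Product(id=int(item["id"]), name=item["name"], price=float(item["price"]))
        for item in payload.get("items", [])
    ]


if __name__ == "__main__":
    print(parse_products(SAMPLE_RESPONSE))
```

No selectors are involved: each field is read by key, so a change to the page's markup would not affect this code as long as the response keeps the same keys.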
Network interception tools (Playwright's `page.on('response')`, mitmproxy, Charles Proxy, browser DevTools Network tab) make API discovery accessible. Look for requests whose paths contain `/api/`, `/graphql`, or `/v1/`, and for responses with `Content-Type: application/json`, especially those triggered by page interactions such as scrolling, filtering, or pagination.
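The heuristics above can be sketched as a small filter plus a Playwright response listener. This assumes the Playwright Python package and a Chromium build are installed; the function names and the hint list are illustrative choices, not a complete detector.

```python
from urllib.parse import urlparse

# Path fragments that often indicate a backend API (heuristic, not exhaustive).
API_PATH_HINTS = ("/api/", "/graphql", "/v1/")


def looks_like_api_call(url: str, content_type: str = "") -> bool:
    """Return True if the URL path or Content-Type suggests an API response."""
    path = urlparse(url).path.lower()  # the query string is deliberately ignored
    if any(hint in path for hint in API_PATH_HINTS):
        return True
    return "application/json" in content_type.lower()


def discover_api_calls(page_url: str) -> list[str]:
    """Load a page in headless Chromium and collect candidate API requests."""
    # Imported lazily so the heuristic above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    found: list[str] = []

    def on_response(response) -> None:
        content_type = response.headers.get("content-type", "")
        if looks_like_api_call(response.url, content_type):
            found.append(f"{response.request.method} {response.url}")

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.on("response", on_response)
        # Waiting for network idle gives the page's initial XHR/fetch calls time to fire.
        page.goto(page_url, wait_until="networkidle")
        browser.close()
    return found
```

Calling `discover_api_calls("https://example.com/")` would return strings such as the method and URL of each matching response. Requests triggered only by clicks or scrolling will not appear unless the script performs those interactions before closing the browser.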