
API Discovery

API discovery is the process of identifying undocumented JSON or GraphQL endpoints used by a website's frontend that can be called directly for cleaner data than HTML scraping.

Modern websites are often single-page applications (SPAs) that load data from backend APIs rather than serving it in the initial HTML. These APIs, while not publicly documented, can be discovered by intercepting the browser's network traffic with browser DevTools or a proxy. Once discovered, the endpoints can be called directly with the appropriate headers and authentication tokens, bypassing HTML parsing entirely and returning clean, structured JSON.
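As a rough illustration of calling a discovered endpoint directly, the sketch below builds a GET request with query parameters and an optional token. The function names, the example URL, and the Bearer scheme are assumptions for illustration; a real site may use cookies, custom headers, or a different auth scheme, so copy the values from the request you actually captured.

```javascript
// Hypothetical helper: assembles the URL and headers for a discovered endpoint.
// Replace the header names and auth scheme with whatever the captured request used.
function buildApiRequest(baseUrl, params, token) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  const headers = { Accept: "application/json" };
  if (token) headers.Authorization = `Bearer ${token}`; // assumed auth scheme
  return { url: url.toString(), options: { method: "GET", headers } };
}

// Sends the request with the global fetch (Node 18+ or a browser).
async function fetchJson(baseUrl, params, token) {
  const { url, options } = buildApiRequest(baseUrl, params, token);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
  return response.json();
}
```

Keeping request construction separate from the network call makes it easy to compare the generated URL and headers against the request seen in DevTools before sending anything.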

API discovery is one of the highest-value techniques in scraping because API responses are structured and tend to change less often than HTML, which can shift with any design update. Undocumented endpoints carry no stability guarantee, though, so they can still change without notice. An API endpoint that powers a product listing page returns JSON that maps directly to data fields; the equivalent HTML scraping relies on fragile CSS selectors that can break with a redesign.
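To make the "maps directly to data fields" point concrete, here is a minimal sketch that flattens a product listing payload into rows. The payload shape and field names are invented for illustration; a real endpoint will have its own structure.

```javascript
// Hypothetical response shape; a real endpoint will use its own field names.
const samplePayload = {
  items: [
    { id: 101, title: "Trail Shoe", price: { amount: 89.5, currency: "USD" } },
    { id: 102, title: "Road Shoe", price: { amount: 120, currency: "USD" } },
  ],
};

// Turns the nested JSON into flat records, tolerating missing fields.
function toRows(payload) {
  return (payload.items ?? []).map((item) => ({
    id: item.id,
    name: item.title,
    price: item.price?.amount ?? null,
    currency: item.price?.currency ?? null,
  }));
}
```

No selectors are involved: each output field is read from a named key, so a visual redesign of the page would not affect this code as long as the JSON shape stays the same.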

Network interception tools (Playwright's `page.on('response')`, mitmproxy, Charles Proxy, browser DevTools Network tab) make API discovery accessible. Look for requests to `/api/`, `/graphql`, `/v1/`, or JSON responses (`Content-Type: application/json`) triggered by page interactions.
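The heuristics above can be sketched as a small filter applied to each captured request. The function name and hint list are illustrative assumptions, not a complete rule set; expect some false positives (analytics beacons, for example) and tune the list per site.

```javascript
// Path fragments that often indicate a backend API (taken from the text above).
const API_PATH_HINTS = ["/api/", "/graphql", "/v1/"];

// Returns true when the URL path or the Content-Type suggests an API call.
function looksLikeApiCall(url, contentType = "") {
  const path = new URL(url).pathname.toLowerCase();
  const pathMatches = API_PATH_HINTS.some((hint) => path.includes(hint));
  // Content-Type often carries a charset suffix, so match by substring.
  const isJson = contentType.toLowerCase().includes("application/json");
  return pathMatches || isJson;
}
```

A filter like this can be run over a HAR export or inside a response listener to shortlist the endpoints worth inspecting by hand.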

Examples

// Browser DevTools: capture API calls during page interaction
// Open DevTools → Network → filter by "Fetch/XHR" → interact with page
// Look for JSON responses — those are your API endpoints

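// Playwright: auto-discover APIs (`page` is an existing Playwright Page)
page.on("response", async (response) => {
  const contentType = response.headers()["content-type"] ?? "";
  if (!contentType.includes("application/json")) return;
  try {
    // response.json() throws if the body is empty or not valid JSON
    const body = await response.json();
    console.log("API endpoint found:", response.request().method(), response.url(), body);
  } catch (err) {
    console.warn("Could not parse JSON from", response.url(), err.message);
  }
});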

