web crawling apiwebsite crawler apicrawl api

Web Crawling API

Crawl entire websites with depth control, link discovery, and sitemap-aware traversal. AlterLab's crawling API handles anti-bot protection, JavaScript rendering, and proxy rotation automatically. Extract structured data from thousands of pages in a single job.

No credit card
SOC 2 aligned
99.9% uptime
Simple Pricing
$1
One dollar
=
5,000
Requests
Pay as you go
No subscriptions
Never expires
2,847,653+
Requests processed this week
Developer Experience

Simple API, Powerful Results

Get started in minutes with our intuitive API. One request gives you structured data, screenshots, PDFs, and more. No browser management, no infrastructure headaches.

Multi-Format Output

Markdown, JSON, HTML, text

Adaptive Rendering

JS, SPAs, shadow DOM

3 Lines to Integrate

Any language, any stack

Request
Response
200 OK·1.2s

Up to 5,000 free scrapes included. No credit card required.

How Web Crawling Works

Submit a seed URL — AlterLab handles link discovery, anti-bot protection, and structured data extraction automatically.

1

Seed URL Submission

Submit your starting URL along with depth limits, page count caps, and include/exclude patterns. AlterLab begins crawling immediately — no setup required. Sitemap.xml files are parsed automatically to discover all indexable URLs before the crawl begins.

2

Link Discovery & Queueing

Each scraped page is parsed for outbound links. Links matching your include patterns and within the depth limit are queued for crawling. AlterLab deduplicates URLs and respects robots.txt by default — override when needed for competitive intelligence use cases.

3

Automatic Website Compatibility Per Page

Each URL in the crawl queue is scraped through AlterLab's 5-tier pipeline. Basic pages use lightweight TLS fingerprinting at $0.0002/page. JavaScript-heavy pages automatically escalate to Playwright browser rendering. Anti-bot protection is handled automatically per page — no configuration needed.

4

Structured Results via Webhook

Results are delivered to your webhook when the crawl completes. Each page includes HTML, Markdown, extracted metadata, discovered links, and cost per page. Failed pages are retried automatically — you only pay for successful scrapes.

Built for Production Crawls

Enterprise-grade crawling with automatic anti-bot handling and structured output.

Depth Control

Set crawl depth from 1 to unlimited. Crawl single pages or full site hierarchies.

Sitemap-Aware

Automatically parses sitemap.xml for efficient, complete site coverage.

Automatic Website Compatibility

5-tier escalation handles complex access controls and website protections on every crawled page.

Include/Exclude Patterns

Glob patterns to scope crawls to specific sections — /blog/*, /products/*, etc.

Crawling Use Cases

From content indexing to competitive intelligence — web crawling at any scale.

Content Indexing

Crawl entire sites to build searchable indexes for internal search engines or AI knowledge bases

Competitive Monitoring

Track competitor sites for pricing changes, product launches, and content updates

Data Pipeline Feeds

Build automated crawl pipelines that feed structured data into databases and analytics platforms

SEO Auditing

Discover broken links, missing metadata, and crawl errors across entire site structures

Part of a Bigger Pipeline

Crawl is most powerful when combined with AlterLab's Search and Map APIs. Discover URLs first, then crawl only what you need.

/search
Search API
Find relevant URLs from SERP results
/map
Map API
Discover all URLs on a site cheaply
/crawl
Crawl API
Fetch full content with automatic website compatibility

Map → Crawl: Index an Entire Site

Use Map API ($0.001/call) to discover all URLs, then Crawl only the pages you need — no wasted scraping on irrelevant pages.

Python
import alterlab

client = alterlab.Client(api_key="YOUR_KEY")

# Step 1: Discover all URLs on the site ($0.001 flat)
map_result = client.map(
    "https://docs.example.com",
    max_urls=2000,
    include_patterns=["/docs/*"]
)
doc_urls = [u.url for u in map_result.urls if "/docs/" in u.url]
print(f"Found {len(doc_urls)} documentation pages")

# Step 2: Crawl them all for full content ($0.0002+/page)
crawl_result = client.crawl(
    url="https://docs.example.com",
    urls=doc_urls,           # target only what Map found
    formats=["markdown"],
    max_pages=500
)
for page in crawl_result.pages:
    print(page.url, len(page.markdown), "chars")

Crawl Cost Estimator

See how much a full-site crawl costs with AlterLab's map-first approach vs competitors' blind crawl pricing.

10K
1001K10K100K1M

AlterLab

1. Map the site

Discover all URLs first

$0.001

2. Crawl 10K pages

$0.0003/page (Standard)

$3.00
Total$3.00

Map once, crawl selectively. Only pay for pages you actually need.

Same crawl on competitors

Firecrawl

$0.0063/page flat rate

$63.00

ScrapingBee

$0.0033/page effective rate

$33.00

Save 95% vs Firecrawl

$60.00 saved on 10K pages

Crawl Pricing Comparison

Feature-by-feature comparison for full-site crawling. AlterLab's map-first approach and tier routing deliver the lowest effective cost.

AlterLab

Simple page crawl cost$0.0002/page
JS-rendered page crawl cost$0.004/page
Map-first recon$0.001/call
Selective crawlingIncluded
Smart tier routing5-tier auto
Anti-bot handling (crawl)Included
Pricing modelPay per page
Minimum spend$10 one-time
Failed pages billed?Never
Depth controlIncluded
Sitemap-aware crawlingIncluded

Firecrawl

Simple page crawl cost$0.0063/page
JS-rendered page crawl cost$0.0063/page
Map-first reconNot available
Selective crawlingURL filters only
Smart tier routingSingle tier
Anti-bot handling (crawl)Not available
Pricing model$19-99/month
Minimum spend$19/month
Failed pages billed?Counted as credits
Depth controlIncluded
Sitemap-aware crawlingIncluded

ScrapingBee

Simple page crawl cost$0.00066/page
JS-rendered page crawl cost$0.0033/page
Map-first reconNot available
Selective crawlingNot available
Smart tier routingManual
Anti-bot handling (crawl)Extra cost
Pricing model$49-249/month
Minimum spend$49/month
Failed pages billed?Counted as credits
Depth controlIncluded
Sitemap-aware crawlingNot available

Apify

Simple page crawl cost~$0.0025/page
JS-rendered page crawl cost~$0.005/page
Map-first reconNot available
Selective crawlingActor config
Smart tier routingManual
Anti-bot handling (crawl)Extra cost
Pricing model$49-499/month
Minimum spend$49/month
Failed pages billed?Compute time billed
Depth controlActor config
Sitemap-aware crawlingActor-dependent

Prices sourced from public pricing pages as of April 2026. Apify costs are approximate (compute-time-based billing varies by actor configuration).

Web Crawling API FAQ

Your first scrape.
Sixty seconds.

$1 free balance. No credit card. No SDK.Just a POST request.

terminal
curl -X POST https://api.alterlab.io/v1/scrape \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com", "formats": ["markdown"]}'

No credit card required · Up to 5,000 free scrapes · Balance never expire

    Web Crawling API — Crawl Any Site at Scale | AlterLab | AlterLab