How does AlterLab's web crawling API work?

Submit a seed URL and depth limit to the AlterLab crawl endpoint. The crawler discovers all links on the page, filters by domain and depth rules, then scrapes each discovered URL — handling anti-bot protection, JavaScript rendering, and proxy rotation automatically. Results are returned as structured data with HTML, metadata, and discovered links for each page.
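
The discover, filter, and scrape loop described above can be pictured as a breadth-first traversal. The following is a minimal sketch only, not AlterLab's implementation: `LINK_GRAPH` and `crawl_order` are hypothetical names, and an in-memory dictionary stands in for real page fetches. It assumes depth 1 means the seed page alone, matching the depth convention used elsewhere on this page.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical in-memory "site": each URL maps to the links found on that page.
LINK_GRAPH = {
    "https://example.com/": [
        "https://example.com/a",
        "https://example.com/b",
        "https://other.org/x",
    ],
    "https://example.com/a": ["https://example.com/c", "https://example.com/"],
    "https://example.com/b": [],
    "https://example.com/c": ["https://example.com/d"],
    "https://example.com/d": [],
}


def crawl_order(seed, max_depth, links=LINK_GRAPH):
    """Return (url, depth) pairs in breadth-first order, same domain only."""
    domain = urlparse(seed).netloc
    seen = {seed}
    queue = deque([(seed, 1)])  # the seed page counts as depth 1
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # do not follow links past the depth limit
        for link in links.get(url, []):
            # Domain filter plus dedup: skip off-site and already-queued URLs.
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


if __name__ == "__main__":
    for url, depth in crawl_order("https://example.com/", max_depth=2):
        print(depth, url)
```

Breadth-first order means pages closest to the seed are visited first, so a depth or page cap cuts off the deepest pages rather than arbitrary ones.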

How does AlterLab compare to Firecrawl?

AlterLab crawling starts at $0.0002 per page, up to 20x cheaper than Firecrawl at $0.004/page. AlterLab also includes 5-tier auto-escalation for complex and protected sites, where Firecrawl's simpler headless browser often fails. AlterLab has no monthly subscription requirement.

Can the crawler handle JavaScript-heavy sites?

Yes. AlterLab automatically uses JavaScript rendering (Playwright) for pages that require it — single-page apps, React/Vue/Angular sites, and pages with lazy-loaded content. The crawler detects when a page needs JS rendering and escalates to the appropriate tier without any configuration.

What depth limits are supported?

AlterLab supports crawl depths from 1 (single page) to unlimited (full site traversal). You can also control crawl scope with include/exclude patterns, domain restrictions, and sitemap-guided crawling that prioritizes important pages.

How do I handle large crawl jobs with thousands of pages?

Submit your crawl job via the batch API endpoint. AlterLab processes pages concurrently and delivers results via webhook or polling. Large crawl jobs are automatically parallelized across our proxy infrastructure with built-in rate limiting to avoid triggering anti-bot systems.
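
For the polling option, a client usually checks job status at growing intervals instead of in a tight loop. This is a sketch under stated assumptions: `wait_for_job`, the `get_status` callable, and the `state` values `"running"`, `"completed"`, and `"failed"` are all hypothetical and are not taken from AlterLab's API reference.

```python
import time


def wait_for_job(get_status, job_id, interval=1.0, max_interval=30.0,
                 timeout=600.0, sleep=time.sleep):
    """Poll get_status(job_id) until the job finishes, doubling the wait each time.

    get_status is any callable returning a dict with a "state" key
    (hypothetical shape). sleep is injectable so the loop can be tested
    without real delays.
    """
    waited = 0.0
    while True:
        status = get_status(job_id)
        if status["state"] in ("completed", "failed"):
            return status
        if waited >= timeout:
            raise TimeoutError(
                f"job {job_id} still {status['state']} after {waited:.0f}s"
            )
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)  # exponential backoff, capped
```

In real use, `get_status` would wrap whatever status call the API exposes. A webhook avoids polling entirely and is usually the better fit for very large jobs.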

Is sitemap.xml used during crawling?

Yes. AlterLab's crawler automatically fetches and parses sitemap.xml files when available, using them to discover all indexable URLs efficiently. Sitemap-guided crawling is faster and ensures complete coverage of important pages, including those not linked from the main navigation.
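
To illustrate what sitemap parsing involves, here is a minimal sketch using only the Python standard library. It reads the `<loc>` entries defined by the sitemaps.org 0.9 schema from a sitemap string. The function name `sitemap_urls` and the sample document are illustrative and do not describe AlterLab's internals.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url>
    <loc>https://example.com/docs/intro</loc>
    <lastmod>2024-01-01</lastmod>
  </url>
</urlset>"""


def sitemap_urls(xml_text):
    """Return every <loc> value from a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE_SITEMAP):
        print(url)
```

Because the same `<loc>` element appears in sitemap index files, the function also returns the child sitemap URLs when given an index, which a crawler would then fetch and parse in turn.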

Web Crawling API

Crawl entire websites with depth control, link discovery, and sitemap-aware traversal. AlterLab's crawling API handles anti-bot protection, JavaScript rendering, and proxy rotation automatically. Extract structured data from thousands of pages in a single job.

No credit card · SOC 2 aligned · 99.9% uptime

Simple Pricing

One dollar: 5,000 requests

Pay as you go · No subscriptions · Never expires

2,847,653+ requests processed this week

Developer Experience

Simple API, Powerful Results

Get started in minutes with our intuitive API. One request gives you structured data, screenshots, PDFs, and more. No browser management, no infrastructure headaches.

Multi-Format Output

Markdown, JSON, HTML, text

Adaptive Rendering

JS, SPAs, shadow DOM

3 Lines to Integrate

Any language, any stack

Up to 5,000 free scrapes included. No credit card required.

How Web Crawling Works

Submit a seed URL — AlterLab handles link discovery, anti-bot protection, and structured data extraction automatically.

Seed URL Submission

Submit your starting URL along with depth limits, page count caps, and include/exclude patterns. AlterLab begins crawling immediately — no setup required. Sitemap.xml files are parsed automatically to discover all indexable URLs before the crawl begins.

Link Discovery & Queueing

Each scraped page is parsed for outbound links. Links matching your include patterns and within the depth limit are queued for crawling. AlterLab deduplicates URLs and respects robots.txt by default — override when needed for competitive intelligence use cases.
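
The deduplication and robots.txt checks in this step can be sketched with the Python standard library. This is an illustration under assumptions, not AlterLab's queueing code: `normalize`, `build_queue`, the `respect_robots` flag, and the `example-crawler` user agent are hypothetical names, and the normalization rules shown (lowercase host, no fragment, no trailing slash) are one common choice among several.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def normalize(url):
    """Canonical form for dedup: lowercase scheme and host, no fragment, no trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def build_queue(links, robots_txt, user_agent="example-crawler", respect_robots=True):
    """Return unique, allowed URLs in first-seen order."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    seen, queue = set(), []
    for link in links:
        url = normalize(link)
        if url in seen:
            continue  # duplicate after normalization
        if respect_robots and not robots.can_fetch(user_agent, url):
            continue  # disallowed by robots.txt
        seen.add(url)
        queue.append(url)
    return queue


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/"
    found = [
        "https://Example.com/blog/",
        "https://example.com/blog#top",
        "https://example.com/private/x",
        "https://example.com/about",
    ]
    print(build_queue(found, rules))
```

Setting `respect_robots=False` corresponds to the override mentioned above. Whether an override is appropriate depends on the target site's terms and applicable law.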

Automatic Website Compatibility Per Page

Each URL in the crawl queue is scraped through AlterLab's 5-tier pipeline. Basic pages use lightweight TLS fingerprinting at $0.0002/page. JavaScript-heavy pages automatically escalate to Playwright browser rendering. Anti-bot protection is handled automatically per page — no configuration needed.
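
Per-page escalation can be pictured as trying the cheapest method first and moving up only when it fails. The sketch below is a simplified two-tier illustration, not the actual 5-tier pipeline. `Tier`, `scrape_with_escalation`, and the fake fetchers are hypothetical. The $0.0002 and $0.004 prices are the simple and JS-rendered rates quoted on this page, and charging only for the tier that succeeds is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tier:
    name: str
    cost: float                            # price per page in dollars
    fetch: Callable[[str], Optional[str]]  # returns HTML, or None if blocked


def scrape_with_escalation(url, tiers):
    """Try tiers from cheapest to most capable; return (tier name, cost, html)."""
    for tier in tiers:
        html = tier.fetch(url)
        if html is not None:
            return tier.name, tier.cost, html
    raise RuntimeError(f"all tiers failed for {url}")


def fetch_basic(url):
    # Stand-in for a lightweight HTTP fetch: pretend it fails on the JS app section.
    return None if "/app" in url else "<html>static</html>"


def fetch_browser(url):
    # Stand-in for a headless-browser render that always succeeds.
    return "<html>rendered</html>"


TIERS = [
    Tier("basic", 0.0002, fetch_basic),
    Tier("browser", 0.004, fetch_browser),
]

if __name__ == "__main__":
    for page in ("https://example.com/about", "https://example.com/app/dashboard"):
        name, cost, _ = scrape_with_escalation(page, TIERS)
        print(page, name, cost)
```

Ordering tiers by cost is what keeps the average price per page low: most static pages never reach the browser tier.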

Structured Results via Webhook

Results are delivered to your webhook when the crawl completes. Each page includes HTML, Markdown, extracted metadata, discovered links, and cost per page. Failed pages are retried automatically — you only pay for successful scrapes.
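
On the receiving side, a webhook handler typically parses the JSON body and totals what was delivered. The payload shape below (`pages`, `url`, `status`, `cost`) is an assumption made for illustration. Check the API documentation for the real field names before relying on it.

```python
import json

# Hypothetical webhook body; field names are illustrative only.
SAMPLE_BODY = json.dumps({
    "pages": [
        {"url": "https://example.com/", "status": "ok", "cost": 0.0002},
        {"url": "https://example.com/app", "status": "ok", "cost": 0.004},
        {"url": "https://example.com/gone", "status": "failed", "cost": 0.0},
    ]
})


def summarize_crawl(body):
    """Summarize a crawl webhook body: successful pages, failed URLs, total cost."""
    payload = json.loads(body)
    ok = [p for p in payload["pages"] if p.get("status") == "ok"]
    failed = [p["url"] for p in payload["pages"] if p.get("status") != "ok"]
    return {
        "pages": len(ok),
        "failed": failed,
        # Only successful pages contribute to the bill in this sketch.
        "total_cost": round(sum(p["cost"] for p in ok), 6),
    }


if __name__ == "__main__":
    print(summarize_crawl(SAMPLE_BODY))
```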

Built for Production Crawls

Enterprise-grade crawling with automatic anti-bot handling and structured output.

Depth Control

Set crawl depth from 1 to unlimited. Crawl single pages or full site hierarchies.

Sitemap-Aware

Automatically parses sitemap.xml for efficient, complete site coverage.

Automatic Website Compatibility

5-tier escalation handles complex access controls and website protections on every crawled page.

Include/Exclude Patterns

Glob patterns to scope crawls to specific sections — /blog/*, /products/*, etc.
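
To show how include and exclude globs narrow a crawl, here is a small sketch using Python's `fnmatch` module. The helper `in_scope` is hypothetical, and `fnmatch` semantics (where `*` also matches `/`) are an assumption that may differ from how AlterLab evaluates patterns.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def in_scope(url, include=("*",), exclude=()):
    """Return True if the URL path matches an include glob and no exclude glob."""
    path = urlsplit(url).path or "/"
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False  # exclude rules win over include rules
    return any(fnmatchcase(path, pattern) for pattern in include)


if __name__ == "__main__":
    urls = [
        "https://example.com/blog/2024/launch",
        "https://example.com/blog/drafts/wip",
        "https://example.com/products/42",
    ]
    for url in urls:
        print(url, in_scope(url, include=["/blog/*"], exclude=["/blog/drafts/*"]))
```

Checking exclude patterns first keeps the rule simple: anything excluded stays out, even when it also matches a broader include pattern.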

Crawling Use Cases

From content indexing to competitive intelligence — web crawling at any scale.

Content Indexing

Crawl entire sites to build searchable indexes for internal search engines or AI knowledge bases

Competitive Monitoring

Track competitor sites for pricing changes, product launches, and content updates

Data Pipeline Feeds

Build automated crawl pipelines that feed structured data into databases and analytics platforms

SEO Auditing

Discover broken links, missing metadata, and crawl errors across entire site structures

Part of a Bigger Pipeline

Crawl is most powerful when combined with AlterLab's Search and Map APIs. Discover URLs first, then crawl only what you need.

Search API (/search): Find relevant URLs from SERP results

Map API (/map): Discover all URLs on a site cheaply

Crawl API (/crawl): Fetch full content with automatic website compatibility

Map → Crawl: Index an Entire Site

Use Map API ($0.001/call) to discover all URLs, then Crawl only the pages you need — no wasted scraping on irrelevant pages.

Python

import alterlab

client = alterlab.Client(api_key="YOUR_KEY")

# Step 1: Discover all URLs on the site ($0.001 flat)
map_result = client.map(
    "https://docs.example.com",
    max_urls=2000,
    include_patterns=["/docs/*"]
)
doc_urls = [u.url for u in map_result.urls if "/docs/" in u.url]
print(f"Found {len(doc_urls)} documentation pages")

# Step 2: Crawl them all for full content ($0.0002+/page)
crawl_result = client.crawl(
    url="https://docs.example.com",
    urls=doc_urls,           # target only what Map found
    formats=["markdown"],
    max_pages=500
)
for page in crawl_result.pages:
    print(page.url, len(page.markdown), "chars")

Crawl Cost Estimator

See how much a full-site crawl costs with AlterLab's map-first approach vs competitors' blind crawl pricing.

Pages to crawl: 10K (range: 100 to 1M)

Site complexity: Standard

AlterLab

1. Map the site (discover all URLs first): $0.001

2. Crawl 10K pages at $0.0003/page (Standard): $3.00

Total: $3.00

Map once, crawl selectively. Only pay for pages you actually need.

Same crawl on competitors

Firecrawl: $0.0063/page flat rate, $63.00

ScrapingBee: $0.0033/page effective rate, $33.00

Save 95% vs Firecrawl: $60.00 saved on 10K pages
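
The estimator figures above follow from simple arithmetic, reproduced here as a short script. The per-page rates and the $0.001 Map call are the numbers quoted on this page. The helper name `crawl_cost` is illustrative.

```python
def crawl_cost(pages, per_page, fixed=0.0):
    """Total cost in dollars: an optional fixed fee plus a per-page rate."""
    return fixed + pages * per_page


PAGES = 10_000

alterlab = crawl_cost(PAGES, 0.0003, fixed=0.001)  # one Map call + Standard crawl
firecrawl = crawl_cost(PAGES, 0.0063)              # flat per-page rate
scrapingbee = crawl_cost(PAGES, 0.0033)            # effective per-page rate

saved = firecrawl - alterlab
saved_pct = saved / firecrawl * 100

if __name__ == "__main__":
    print(f"AlterLab:    ${alterlab:.2f}")
    print(f"Firecrawl:   ${firecrawl:.2f}")
    print(f"ScrapingBee: ${scrapingbee:.2f}")
    print(f"Saved vs Firecrawl: ${saved:.2f} ({saved_pct:.0f}%)")
```

The exact AlterLab total is $3.001. The Map call adds a tenth of a cent, which disappears when rounding to two decimals.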

Crawl Pricing Comparison

Feature-by-feature comparison for full-site crawling. AlterLab's map-first approach and tier routing deliver the lowest effective cost.

Feature	AlterLab	Firecrawl	ScrapingBee	Apify
Simple page crawl cost: HTML-only pages, static content	$0.0002/page	$0.0063/page	$0.00066/page	~$0.0025/page
JS-rendered page crawl cost: SPAs, dynamic content requiring browser	$0.004/page	$0.0063/page	$0.0033/page	~$0.005/page
Map-first recon: Discover all URLs before crawling	$0.001/call	Not available	Not available	Not available
Selective crawling: Crawl only matched URL patterns	Included	URL filters only	Not available	Actor config
Smart tier routing: Auto-selects cheapest tier per page	5-tier auto	Single tier	Manual	Manual
Anti-bot handling (crawl): All major anti-bot protections during crawl	Included	Not available	Extra cost	Extra cost
Pricing model: How you pay for crawls	Pay per page	$19-99/month	$49-249/month	$49-499/month
Minimum spend: Lowest entry point for crawling	$10 one-time	$19/month	$49/month	$49/month
Failed pages billed? Do you pay for pages that fail to scrape?	Never	Counted as credits	Counted as credits	Compute time billed
Depth control: Limit crawl by link depth	Included	Included	Included	Actor config
Sitemap-aware crawling: Uses sitemap.xml for URL discovery	Included	Included	Not available	Actor-dependent

Prices sourced from public pricing pages as of April 2026. Apify costs are approximate (compute-time-based billing varies by actor configuration).

Web Crawling API FAQ

Your first scrape.
Sixty seconds.

$1 free credit — up to 5,000 scrapes. No credit card.
Just a POST request.

terminal

curl -X POST https://api.alterlab.io/v1/scrape \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'

Start building free

No credit card required · $1 free credit, up to 5,000 scrapes · Balance never expires

Crawling & Scraping Resources

Batch Scraping API

Anti-Bot Handling API

JavaScript Rendering API
