The full data collection pipeline in one place.
Search finds pages. Map discovers site structure. Crawl extracts content at scale. Three APIs, one workflow — with anti-bot bypass built in at every step.
Three APIs. One Pipeline.
Each API is useful standalone. Together they cover the full data collection workflow — from discovery to extraction.
Search API
Find the right pages
Submit a query — get structured SERP results from Google, Bing, or DuckDuckGo. Each result includes title, URL, snippet, position, and domain as structured JSON.
Map API
Discover site structure
Submit a domain — get every URL on the site. AlterLab parses sitemaps and crawls link graphs to return a complete URL inventory. Filter by pattern or depth.
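The pattern and depth filters can also be reproduced client-side when post-processing a URL inventory. A minimal sketch of what "filter by pattern or depth" means in practice — the `url_depth` helper is illustrative, not part of the AlterLab SDK:

```python
from urllib.parse import urlparse

def url_depth(url: str) -> int:
    """Count path segments: https://site.com/blog/post-1 has depth 2."""
    path = urlparse(url).path.strip("/")
    return len(path.split("/")) if path else 0

inventory = [
    "https://site.com/",
    "https://site.com/blog/",
    "https://site.com/blog/post-1",
]

# Keep only shallow pages (depth <= 1), e.g. top-level sections.
shallow = [u for u in inventory if url_depth(u) <= 1]
```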
Crawl API
Extract content at scale
Submit a URL list — get structured content for every page. HTML, Markdown, metadata, screenshots. 5-tier anti-bot bypass on every page automatically.
How the Pipeline Works
Each stage feeds the next. The output of Search or Map is a URL list — exactly what Crawl takes as input.
Stage 1 — Search
$0.001 / search. Submit a query — get structured SERP results from Google, Bing, or DuckDuckGo. Each result includes title, URL, snippet, position, and domain as structured JSON.
Stage 2 — Map
$0.0005 / domain map. Submit a domain — get every URL on the site. AlterLab parses sitemaps and crawls link graphs to return a complete URL inventory. Filter by pattern or depth.
Stage 3 — Crawl
$0.0002 – $0.004 / page. Submit a URL list — get structured content for every page. HTML, Markdown, metadata, screenshots. 5-tier anti-bot bypass on every page automatically.
Pipeline output format: Search returns organic_results[].url. Map returns urls[]. Both are URL lists — pass them directly to Crawl's urls parameter. No transformation needed.
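Because both response shapes reduce to plain URL lists, the hand-off needs no glue code. A minimal sketch of the chaining using stand-in response data — the dict shapes mirror the fields named above, and no API calls are made:

```python
# Stand-in responses mirroring the documented fields.
search_response = {
    "organic_results": [
        {"url": "https://example.com/a", "title": "A", "position": 1},
        {"url": "https://example.com/b", "title": "B", "position": 2},
    ]
}
map_response = {"urls": ["https://example.com/docs/", "https://example.com/blog/"]}

# Both stages reduce to a URL list...
from_search = [r["url"] for r in search_response["organic_results"]]
from_map = map_response["urls"]

# ...which is exactly what Crawl's `urls` parameter takes.
urls_for_crawl = from_search + from_map
```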
Real-World Pipeline Examples
Three complete end-to-end workflows — copy and adapt for your use case.
Competitive Intelligence Pipeline
Map a competitor's site structure, crawl all their blog posts, and build a searchable knowledge base in minutes.
import alterlab
client = alterlab.Client(api_key="YOUR_KEY")
# Step 1: Map competitor site to discover all URLs
map_result = client.map("https://competitor.com", max_urls=2000)
blog_urls = [u for u in map_result.urls if "/blog/" in u]
# Step 2: Crawl all blog pages for full content
crawl = client.crawl(
    urls=blog_urls,
    formats=["markdown"],
    max_concurrent=10,
)
# Step 3: Search for competitor brand mentions
mentions = client.search(
    "competitor brand name site:reddit.com",
    num_results=20,
    scrape_results=True,
)
print(f"Crawled {len(crawl.results)} blog posts")
print(f"Found {len(mentions.organic_results)} brand mentions")
SEO Audit Pipeline
Discover all pages on a domain, search for ranking keywords, and cross-reference content to find gaps.
import alterlab
client = alterlab.Client(api_key="YOUR_KEY")
# Step 1: Map the full site structure
site_map = client.map("https://yoursite.com", max_urls=5000)
product_pages = [u for u in site_map.urls if "/products/" in u]
# Step 2: Search for keywords you want to rank for
serp = client.search(
    "best web scraping api 2026",
    engine="google",
    num_results=10,
    scrape_results=True,
)
# Step 3: Crawl your product pages to compare content
your_pages = client.crawl(
    urls=product_pages[:50],
    formats=["markdown", "metadata"],
)
# Analyze content gaps vs top-ranking pages
top_competitor_content = [r.content for r in serp.organic_results]
your_content = [p.markdown for p in your_pages.results]
AI Knowledge Base Builder
Search for authoritative sources, map their full site structure, and crawl all pages into a Markdown corpus for RAG or fine-tuning.
import alterlab
client = alterlab.Client(api_key="YOUR_KEY")
# Step 1: Find authoritative sources via search
sources = client.search(
    "machine learning documentation tutorials",
    num_results=5,
)
source_domains = list({r.domain for r in sources.organic_results})
# Step 2: Map each source domain for full URL inventory
all_urls = []
for domain in source_domains:
    result = client.map(f"https://{domain}", max_urls=500)
    doc_urls = [u for u in result.urls if "/docs/" in u or "/guide" in u]
    all_urls.extend(doc_urls)
# Step 3: Crawl everything into Markdown
corpus = client.crawl(
    urls=all_urls,
    formats=["markdown"],
    max_concurrent=20,
)
print(f"Knowledge base: {len(corpus.results)} documents")
print(f"Total size: {sum(len(p.markdown) for p in corpus.results):,} chars")
Pipeline Pricing
Pay only for what you use. No subscriptions, no monthly minimums. Balance never expires.
Typical competitive intelligence run
Search: $0.001 per query. SERP results from Google, Bing, or DuckDuckGo.
Map: $0.0005 per domain. Full URL inventory including sitemap parsing.
Crawl: from $0.0002 per page for basic HTML, up to $0.004 for JS-rendered pages.
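The per-stage prices above make a pipeline run easy to estimate up front. A small sketch using the listed rates — `pipeline_cost` is an illustrative helper, not an SDK function:

```python
SEARCH_PER_QUERY = 0.001        # Search API, per query
MAP_PER_DOMAIN = 0.0005         # Map API, per domain
CRAWL_PER_PAGE_BASIC = 0.0002   # Crawl API, basic HTML
CRAWL_PER_PAGE_JS = 0.004       # Crawl API, JS-rendered pages

def pipeline_cost(queries: int, domains: int, pages: int,
                  js_rendered: bool = False) -> float:
    """Estimate the cost of one Search + Map + Crawl run."""
    per_page = CRAWL_PER_PAGE_JS if js_rendered else CRAWL_PER_PAGE_BASIC
    return (queries * SEARCH_PER_QUERY
            + domains * MAP_PER_DOMAIN
            + pages * per_page)

# A run like the competitive intelligence example:
# 20 brand-mention searches, 1 domain map, 500 basic-HTML blog pages.
cost = pipeline_cost(queries=20, domains=1, pages=500)  # about $0.12
```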
Why One Platform
Most teams stitch together separate vendors for search, crawling, and proxies. AlterLab removes the integration overhead.
One API key
Search, Map, and Crawl all authenticate with the same key. One integration, one dashboard, one billing balance.
Consistent response format
Every API returns the same base envelope. The URL list from Map feeds directly into Crawl — no transformation or adapter code needed.
Anti-bot bypass at every stage
SERP bypass, sitemap fetching, and page crawling all run through the same 5-tier anti-bot pipeline. No per-step configuration.
Unified pricing
One balance covers all three APIs. No per-product subscriptions, no separate billing systems to reconcile.
Pay only for successes
Failed pages, blocked requests, and timeouts are retried automatically. You are only charged when content is successfully returned.
No monthly minimum
Run one pipeline a month or ten thousand. Your balance never expires. Scale up and down without contract changes.
Explore the Pipeline APIs
Search API
Structured SERP results from Google, Bing, and DuckDuckGo. $0.001/query. Pass scrape_results=true to fetch each result page in one call.
Web Crawling API
Full-site crawling with depth control, anti-bot bypass, and structured output. From $0.0002/page.
Web Scraping Pipelines for AI Agents
Build LLM-ready data pipelines that minimize token waste and extraction cost.
Pricing
Pay only for what you use across all three APIs. Balance never expires.
Your first scrape.
Sixty seconds.
$1 free balance. No credit card. No SDK.
Just a POST request.
No credit card required · Up to 5,000 free scrapes · Balance never expires