
    Search Guide

    Find relevant web pages when you know what you need but not where it lives. Search is the discovery layer that feeds into scraping and extraction.

    Prerequisite

    This guide covers patterns and workflows. For the full parameter reference, see the Search API Reference.

    When to Use Search

    The Search endpoint is for cases where you need to discover URLs before scraping them. Compare with direct scraping:

    Scenario                               Use                   Why
    You have the exact URL                 /v1/scrape            Direct scrape is cheaper and faster
    You need to find URLs by keyword       /v1/search            Search discovers, then scrape fetches
    You need all pages on a domain         /v1/crawl             Crawl follows links systematically
    You want specific pages on a domain    /v1/search + domain   Domain-scoped search finds relevant pages without crawling everything
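
    If you already hold the exact URL, skip search entirely and scrape it directly. A minimal sketch, assuming /v1/scrape takes the same header as the search examples and a url field in the body (the exact request shape lives in the Scraping reference, not this guide):

    Python
    import requests

    # Direct scrape of a known URL -- no discovery step needed.
    # The "url" body field is an assumption; check the Scraping reference.
    response = requests.post(
        "https://api.alterlab.io/api/v1/scrape",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={"url": "https://docs.github.com/en/apps/oauth-apps"}
    )
    print(response.status_code, len(response.text))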

    Domain-Scoped Search

    Use the domain parameter to restrict results to a specific website. This is faster and more targeted than crawling an entire site.

    Python
    import requests
    
    # Find documentation pages about authentication on GitHub Docs
    response = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "query": "OAuth2 authentication tokens",
            "domain": "docs.github.com",
            "num_results": 20
        }
    )
    
    data = response.json()
    print(f"Found {data['results_count']} pages about auth on GitHub Docs")
    
    for r in data["results"]:
        print(f"  {r['position']}. {r['title']}")
        print(f"     {r['url']}")

    How Domain Scoping Works

    The domain parameter adds a site: prefix to your query. You can pass just the domain (e.g., docs.github.com) without the protocol.
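
    Equivalently, the two requests below should return the same results. A sketch to make the rewrite concrete; it assumes the backend's site: syntax is the same one you could type into the query yourself:

    Python
    import requests

    headers = {"X-API-Key": "YOUR_API_KEY"}

    # Scoped via the domain parameter
    scoped = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers=headers,
        json={"query": "OAuth2 authentication tokens", "domain": "docs.github.com"}
    )

    # Scoped by writing the site: prefix into the query by hand
    manual = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers=headers,
        json={"query": "site:docs.github.com OAuth2 authentication tokens"}
    )

    # Both should report the same result set
    print(scoped.json()["results_count"], manual.json()["results_count"])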

    Search + Scrape Workflow

    The most powerful pattern: find pages and scrape them in a single API call. Set scrape_results: true to get full page content alongside search results.

    Python
    import requests
    import time
    
    # Search and scrape in one call
    response = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "query": "Python async best practices 2026",
            "num_results": 5,
            "scrape_results": True,
            "formats": ["text", "markdown"]
        }
    )
    
    data = response.json()
    
    # For <= 5 results, content may be available immediately
    # For > 5, poll using search_id
    if response.status_code == 202:
        search_id = data["search_id"]
        while True:
            status = requests.get(
                f"https://api.alterlab.io/api/v1/search/{search_id}",
                headers={"X-API-Key": "YOUR_API_KEY"}
            ).json()
            print(f"Progress: {status['completed']}/{status['results_count']}")
            if status["status"] == "completed":
                data = status
                break
            time.sleep(2)
    
    # All results now have full content
    for result in data["results"]:
        print(f"\n--- {result['title']} ---")
        if result.get("content"):
            text = result["content"].get("text", "")
            print(f"  {len(text)} characters of content")

    Time-Filtered Search

    Use time_range to find recent content. Useful for news monitoring, trend tracking, and finding up-to-date information.

    Python
    import requests

    # Find articles published in the last week
    response = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "query": "AI regulation Europe",
            "time_range": "week",
            "num_results": 10
        }
    )
    
    data = response.json()
    print(f"Found {data['results_count']} recent articles")
    
    # Available time ranges:
    # "hour"  — last hour
    # "day"   — last 24 hours
    # "week"  — last 7 days
    # "month" — last 30 days
    # "year"  — last 12 months

    Geo-Targeted Search

    Combine country and language parameters to get localized search results — essential for competitive analysis across markets.

    Python
    import requests

    # Search from a German perspective, in German
    response = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "query": "beste Webhosting Anbieter",
            "country": "DE",
            "language": "de",
            "num_results": 10
        }
    )
    
    # Compare with US results
    response_us = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "query": "best web hosting providers",
            "country": "US",
            "language": "en",
            "num_results": 10
        }
    )
    
    de_urls = {r["url"] for r in response.json()["results"]}
    us_urls = {r["url"] for r in response_us.json()["results"]}
    print(f"Overlap: {len(de_urls & us_urls)} URLs in common")

    Search + Extract Pipeline

    Combine search, scraping, and structured extraction in a single call. Pass an extraction_schema with scrape_results: true to get structured data from every result.

    Python
    import requests
    import time
    
    # Find and extract pricing from competitor pages
    response = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "query": "web scraping API pricing",
            "num_results": 5,
            "scrape_results": True,
            "formats": ["text"],
            "extraction_schema": {
                "type": "object",
                "properties": {
                    "company_name": {"type": "string"},
                    "plans": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "price": {"type": "string"},
                                "requests_per_month": {"type": "string"}
                            }
                        }
                    },
                    "free_tier": {"type": "boolean"}
                }
            }
        }
    )
    
    data = response.json()
    
    # Poll if async
    if response.status_code == 202:
        while True:
            status = requests.get(
                f"https://api.alterlab.io/api/v1/search/{data['search_id']}",
                headers={"X-API-Key": "YOUR_API_KEY"}
            ).json()
            if status["status"] == "completed":
                data = status
                break
            time.sleep(2)
    
    # Extracted pricing data from each competitor
    for result in data["results"]:
        ext = result.get("content", {})
        if ext and ext.get("extraction"):
            pricing = ext["extraction"]
            print(f"\n{pricing.get('company_name', result['title'])}:")
            print(f"  Free tier: {pricing.get('free_tier', 'Unknown')}")
            for plan in pricing.get("plans", []):
                print(f"  {plan['name']}: {plan['price']}")

    AI Agent Patterns

    Search is the discovery tool for AI agents. The typical flow is: search → scrape → extract → reason. Here is a minimal agent loop:

    Python
    import requests
    
    API_KEY = "YOUR_API_KEY"
    BASE = "https://api.alterlab.io/api/v1"
    
    def research(topic: str, num_sources: int = 5) -> list[dict]:
        """Search, scrape, and extract key facts about a topic."""
    
        # Step 1: Discover relevant pages
        search = requests.post(
            f"{BASE}/search",
            headers={"X-API-Key": API_KEY},
            json={
                "query": topic,
                "num_results": num_sources,
                "time_range": "month",       # Recent content only
                "scrape_results": True,       # Fetch full text
                "formats": ["text"],
                "extraction_schema": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "key_facts": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "3-5 key facts or findings"
                        },
                        "date_published": {"type": "string"}
                    }
                }
            }
        ).json()
    
        # Step 2: Collect extracted data
        sources = []
        for result in search.get("results", []):
            ext = (result.get("content") or {}).get("extraction")
            sources.append({
                "url": result["url"],
                "title": result["title"],
                "facts": ext.get("key_facts", []) if ext else [],
                "date": ext.get("date_published") if ext else None,
            })
    
        return sources
    
    # Use it
    sources = research("quantum computing breakthroughs 2026")
    for s in sources:
        print(f"\n{s['title']} ({s['url']})")
        for fact in s["facts"]:
            print(f"  - {fact}")

    Full Tutorial

    For a complete AI research agent with LangChain integration, see the AI Research Agent Tutorial.

    Best Practices

    1. Start with Search-Only

    Run a search-only call first (2 credits) to verify results are relevant before committing to scrape credits. Then pass the URLs you want to /v1/scrape or /v1/batch.
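
    A sketch of that two-step flow. The /v1/batch request shape (a urls list plus formats) is an assumption here; see the Batch Scraping docs for the real one:

    Python
    import requests

    headers = {"X-API-Key": "YOUR_API_KEY"}

    # Step 1: search only (2 credits) -- no scrape_results, so no scrape charges
    search = requests.post(
        "https://api.alterlab.io/api/v1/search",
        headers=headers,
        json={"query": "web scraping API pricing", "num_results": 10}
    ).json()

    # Step 2: keep only the hits that look relevant, then batch-scrape those
    urls = [r["url"] for r in search["results"] if "pricing" in r["url"].lower()]
    batch = requests.post(
        "https://api.alterlab.io/api/v1/batch",
        headers=headers,
        json={"urls": urls, "formats": ["markdown"]}  # body shape is an assumption
    )
    print(batch.status_code)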

    2. Use Domain Scoping for Site Search

    Instead of crawling an entire site, use domain to find the specific pages you need. This is faster and cheaper than a full crawl.

    3. Limit num_results When Scraping

    Each scraped result costs additional credits. Start with 5 results, verify quality, then scale up. Use num_results: 5 or fewer to get inline results without polling.

    4. Add Time Ranges for Freshness

    For news, trends, or rapidly changing topics, always set time_range. Without it, results may include outdated content.

    5. Use Extraction Schemas for Structured Output

    When building pipelines, pass an extraction_schema to get consistent, machine-readable data from every result page.

    Last updated: March 2026
