
    Caching

    Reduce costs and improve performance by caching scrape results. Pay once, reuse many times.

    Opt-In Caching

    Caching is disabled by default. You must explicitly enable it with cache: true. This ensures you always get fresh data unless you specifically want cached results.

    How Caching Works

    1. First Request

    When you make a request with cache: true, we scrape the page, store the result, and return it. You're charged normally.

    2. Subsequent Requests

    Within the TTL window, requests for the same URL return the cached result instantly. Cache hits are free.

    3. Cache Expiry

    After the TTL expires, the next request fetches fresh data and refreshes the cache.
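    The lifecycle above can be sketched as a minimal in-memory TTL cache. This is an illustration of the server-side behavior only, not the service's actual implementation:

```python
import time

class TTLCache:
    """Minimal in-memory TTL cache illustrating the scrape-cache lifecycle."""

    def __init__(self, ttl_seconds=900):
        self.ttl = ttl_seconds
        self.store = {}  # url -> (result, cached_at)

    def get(self, url, scrape):
        entry = self.store.get(url)
        if entry is not None:
            result, cached_at = entry
            if time.time() - cached_at < self.ttl:
                return result, True  # cache hit: instant and free
        # First request or expired entry: scrape, store, charge normally.
        result = scrape(url)
        self.store[url] = (result, time.time())
        return result, False

cache = TTLCache(ttl_seconds=60)
html, cached = cache.get("https://example.com", lambda u: "<html>...</html>")
print(cached)  # False: first request is a real scrape
html, cached = cache.get("https://example.com", lambda u: "<html>...</html>")
print(cached)  # True: served from cache within the TTL window
```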

    Cost Savings

    Cache hits are completely free. If you're scraping the same pages repeatedly (e.g., monitoring, testing), caching can reduce your costs by 90%+.
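    As a back-of-the-envelope example, a monitor that checks the same page every minute with the default 15-minute TTL only pays for one real scrape per TTL window:

```python
# Requests per hour vs. charged scrapes per hour with a 15-minute TTL.
requests_per_hour = 60                   # one check per minute
ttl_seconds = 900                        # 15-minute cache window
charged_per_hour = 3600 // ttl_seconds   # one real scrape per TTL window
savings = 1 - charged_per_hour / requests_per_hour
print(f"{savings:.0%} fewer charged scrapes")  # 93% fewer
```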

    Enabling Cache

    Add cache: true to your request:

    Bash
    curl -X POST https://api.alterlab.io/api/v1/scrape \
      -H "X-API-Key: YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "url": "https://example.com/products",
        "cache": true
      }'

    Cache TTL

    Control how long results are cached with cache_ttl (in seconds):

    TTL Value   Duration               Use Case
    60          1 minute               Real-time data, stock prices
    900         15 minutes (default)   General scraping
    3600        1 hour                 Product pages, articles
    86400       24 hours (max)         Static content, documentation
    Python
    import requests

    # Cache for 1 hour
    response = requests.post(
        "https://api.alterlab.io/api/v1/scrape",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "url": "https://example.com/blog/article",
            "cache": True,
            "cache_ttl": 3600  # 1 hour in seconds
        }
    )
    
    # Cache for 24 hours (maximum)
    response = requests.post(
        "https://api.alterlab.io/api/v1/scrape",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "url": "https://docs.example.com/api-reference",
            "cache": True,
            "cache_ttl": 86400  # 24 hours
        }
    )

    TTL Limits

    • Minimum: 60 seconds
    • Maximum: 86400 seconds (24 hours)
    • Default: 900 seconds (15 minutes) when not specified
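    When setting cache_ttl programmatically, it can help to clamp values to these limits client-side before sending a request. This is a hypothetical helper, not part of the API; the service enforces its own limits server-side:

```python
TTL_MIN, TTL_MAX, TTL_DEFAULT = 60, 86400, 900

def resolve_ttl(cache_ttl=None):
    """Clamp a requested TTL to the documented 60 s - 24 h range."""
    if cache_ttl is None:
        return TTL_DEFAULT  # 15 minutes when not specified
    return max(TTL_MIN, min(TTL_MAX, cache_ttl))

print(resolve_ttl())        # 900
print(resolve_ttl(10))      # 60
print(resolve_ttl(100000))  # 86400
```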

    Cache Key Strategy

    Cache keys are generated from the combination of your API key, the target URL, and scraping options. This means:

    Same URL + same options = cache hit (free)

    Same URL + different options (e.g., different extraction_profile) = separate cache entry

    Different API keys = separate cache entries (isolated per user)
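    Conceptually, such a key could be derived as a hash over the three inputs. This is an illustrative sketch only; the actual key derivation is internal to the service:

```python
import hashlib
import json

def cache_key(api_key, url, options):
    """Derive a deterministic key from API key, URL, and scrape options."""
    # sort_keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} produce one key.
    payload = json.dumps([api_key, url, options], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

k1 = cache_key("key-a", "https://example.com/p", {"extraction_profile": "article"})
k2 = cache_key("key-a", "https://example.com/p", {"extraction_profile": "article"})
k3 = cache_key("key-a", "https://example.com/p", {"extraction_profile": "full"})
k4 = cache_key("key-b", "https://example.com/p", {"extraction_profile": "article"})
print(k1 == k2)  # True: same URL + same options = cache hit
print(k1 == k3)  # False: different options = separate entry
print(k1 == k4)  # False: different API keys are isolated
```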

    Force Refresh

    Need fresh data but want to update the cache? Use force_refresh: true:

    Python
    import requests

    # Force a fresh scrape and update the cache
    response = requests.post(
        "https://api.alterlab.io/api/v1/scrape",
        headers={"X-API-Key": "YOUR_API_KEY"},
        json={
            "url": "https://example.com/products",
            "cache": True,
            "force_refresh": True  # Bypass cache, fetch fresh, update cache
        }
    )
    data = response.json()
    
    # This is charged normally (not a cache hit)
    print(f"Cost: {data['credits_used']}")  # Normal pricing

    When to Use Force Refresh:

    • You know the page content has changed
    • User explicitly requests fresh data
    • Debugging/testing cache behavior

    Cache Response Headers

    The response includes information about cache status:

    Response
    JSON
    {
      "content": "...",
      "cached": true,                          // true if served from cache
      "cached_at": "2026-03-24T10:30:00Z",    // when result was cached
      "expires_at": "2026-03-24T10:45:00Z",   // when cache entry expires
      "stale_cache": false,                    // true if stale cache served after scrape failure
      "response_time_ms": 12,                 // much faster for cache hits
      "billing": {
        "total_credits": 0,                   // $0 for cache hits
        "tier_used": "cache"
      }
    }
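    Client code can branch on these fields, for example to log cache hits or warn when a stale entry was served after a scrape failure. A small sketch against the response shape shown above:

```python
def describe_cache_status(body):
    """Summarize the cache-related fields of a scrape response body."""
    if body.get("stale_cache"):
        return "stale cache served after a scrape failure"
    if body.get("cached"):
        return f"cache hit (expires {body['expires_at']})"
    return "fresh scrape (charged normally)"

sample = {
    "cached": True,
    "cached_at": "2026-03-24T10:30:00Z",
    "expires_at": "2026-03-24T10:45:00Z",
    "stale_cache": False,
}
print(describe_cache_status(sample))  # cache hit (expires 2026-03-24T10:45:00Z)
```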

    Best Practices

    1. Match TTL to Content Freshness

    Set TTL based on how often the content changes. News sites need shorter TTLs than documentation.

    2. Use Consistent URLs

    Cache keys are based on the exact URL. example.com/page and example.com/page/ are different cache entries.

    3. Cache Static Resources Aggressively

    Documentation, help pages, and rarely-changing content can use 24-hour TTL.

    4. Don't Cache Dynamic Content

    Search results, personalized pages, and time-sensitive data should use cache: false.
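    The URL-consistency practice above can be enforced client-side by normalizing URLs before every request, so equivalent forms always map to one cache entry. This is a hypothetical helper; the key point is to pick one canonical form and use it everywhere:

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url):
    """Canonicalize a URL so equivalent forms share one cache entry."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"  # drop trailing slash (keep bare root)
    # Lowercase scheme and host; drop the fragment, which servers never see.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path,
                       parts.query, ""))

print(normalize_url("https://Example.com/page/"))  # https://example.com/page
print(normalize_url("https://example.com/page"))   # https://example.com/page
```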

    Cache Invalidation

    To invalidate a cached entry before TTL expires:

    Option 1: Force Refresh

    Use force_refresh: true to fetch fresh data and update the cache.

    Option 2: Disable Cache

    Set cache: false to bypass cache entirely (doesn't update cache).

    Option 3: Wait for TTL

    Cache automatically expires after TTL. No action needed.

    No Manual Purge

    Currently, there's no API to manually purge specific cache entries. Use force_refresh if you need to update a cached page.
    Last updated: March 2026
