Caching

Reduce costs and improve performance by caching scrape results. Pay once, reuse many times.

Opt-In Caching

Caching is disabled by default. You must explicitly enable it with cache: true. This ensures you always get fresh data unless you specifically want cached results.

How Caching Works

1. First Request

When you make a request with cache: true, we scrape the page, store the result, and return it. You're charged normally.

2. Subsequent Requests

Within the TTL window, requests for the same URL return the cached data immediately. Cache hits are free.

3. Cache Expiry

After the TTL expires, the next request fetches fresh data and refreshes the cache.
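The three steps above can be modelled with a small in-memory sketch. This is not AlterLab's implementation: `TTLCache`, `fake_fetch`, and the injected `clock` are hypothetical names used only to illustrate the miss, hit, and expiry behaviour described here.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Toy model of the lifecycle: miss -> store, hit -> reuse, expiry -> refetch."""

    def __init__(self, ttl: int = 900, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # url -> (stored_at, content)

    def get(self, url: str, fetch: Callable[[str], str]) -> Tuple[str, bool]:
        """Return (content, cached). `fetch` stands in for a real, billed scrape."""
        now = self.clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1], True            # step 2: served from cache
        content = fetch(url)                 # step 1 or 3: scrape and store
        self._entries[url] = (now, content)
        return content, False


# A simulated clock lets the example run instantly.
fake_now = [0.0]
scrapes = []


def fake_fetch(url: str) -> str:
    scrapes.append(url)
    return f"content #{len(scrapes)}"


cache = TTLCache(ttl=900, clock=lambda: fake_now[0])
first = cache.get("https://example.com/products", fake_fetch)    # miss
fake_now[0] = 300
second = cache.get("https://example.com/products", fake_fetch)   # hit, inside TTL
fake_now[0] = 901
third = cache.get("https://example.com/products", fake_fetch)    # expired, refetched
```

Only the first and third calls reach `fake_fetch`, which mirrors the billing pattern: one charge to populate the cache and one charge after expiry.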

Cost Savings

Cache hits are completely free. If you scrape the same pages repeatedly (for example, for monitoring or testing), caching can cut your costs by 90% or more.
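As a rough worked example of where a figure like 90% can come from, the sketch below estimates the share of free requests when one URL is polled at a fixed interval. It assumes every request inside the TTL is a free hit and each expiry triggers one billed scrape; `estimate_savings` is a hypothetical helper, not part of the API.

```python
import math


def estimate_savings(poll_interval_s: int, ttl_s: int, window_s: int) -> float:
    """Fraction of requests served free from cache over a time window.

    Requests are assumed to happen at t = 0, poll_interval_s, 2 * poll_interval_s,
    and so on, for as long as t < window_s. A request is billed whenever no
    unexpired cache entry exists.
    """
    total = math.ceil(window_s / poll_interval_s)
    billed = 0
    cached_at = None
    for i in range(total):
        t = i * poll_interval_s
        if cached_at is None or t - cached_at >= ttl_s:
            billed += 1          # cache miss or expired entry: billed scrape
            cached_at = t
    return 1 - billed / total


# Polling every 60 seconds for one hour with the default 15-minute TTL
savings = estimate_savings(poll_interval_s=60, ttl_s=900, window_s=3600)
```

Under these assumptions 4 of the 60 requests are billed, so about 93% of them are free. Longer TTLs or more frequent polling push the figure higher.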

Enabling Cache

Add cache: true to your request:

curl -X POST https://api.alterlab.io/api/v1/scrape \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/products",
    "cache": true
  }'

Cache TTL

Control how long results are cached with cache_ttl (in seconds):

TTL Value   Duration               Use Case
60          1 minute               Real-time data, stock prices
900         15 minutes (default)   General scraping
3600        1 hour                 Product pages, articles
86400       24 hours (max)         Static content, documentation

import requests

# Cache for 1 hour
response = requests.post(
    "https://api.alterlab.io/api/v1/scrape",
    headers={"X-API-Key": "YOUR_API_KEY"},
    json={
        "url": "https://example.com/blog/article",
        "cache": True,
        "cache_ttl": 3600  # 1 hour in seconds
    }
)

# Cache for 24 hours (maximum)
response = requests.post(
    "https://api.alterlab.io/api/v1/scrape",
    headers={"X-API-Key": "YOUR_API_KEY"},
    json={
        "url": "https://docs.example.com/api-reference",
        "cache": True,
        "cache_ttl": 86400  # 24 hours
    }
)

TTL Limits

  • Minimum: 60 seconds
  • Maximum: 86400 seconds (24 hours)
  • Default: 900 seconds (15 minutes) when not specified
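The guide does not say how the API responds to a cache_ttl outside these limits, so it may be safer to keep values in range on the client side. A minimal sketch, assuming clamping is acceptable for your use case; `resolve_ttl` and the constant names are hypothetical:

```python
from typing import Optional

MIN_TTL = 60        # seconds, documented minimum
MAX_TTL = 86400     # seconds, documented maximum (24 hours)
DEFAULT_TTL = 900   # seconds, documented default (15 minutes)


def resolve_ttl(requested: Optional[int] = None) -> int:
    """Return a cache_ttl value that falls within the documented limits."""
    if requested is None:
        return DEFAULT_TTL
    # Clamp instead of raising; this is a client-side choice, not API behaviour.
    return max(MIN_TTL, min(MAX_TTL, int(requested)))


payload = {
    "url": "https://example.com/products",
    "cache": True,
    "cache_ttl": resolve_ttl(7 * 24 * 3600),  # a week is capped at 24 hours
}
```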

Force Refresh

Need fresh data and want the cache updated as well? Use force_refresh: true:

# Force a fresh scrape and update the cache
response = requests.post(
    "https://api.alterlab.io/api/v1/scrape",
    headers={"X-API-Key": "YOUR_API_KEY"},
    json={
        "url": "https://example.com/products",
        "cache": True,
        "force_refresh": True  # Bypass cache, fetch fresh, update cache
    }
)

# This is charged normally (not a cache hit)
data = response.json()
print(f"Credits used: {data['credits_used']}")  # Normal pricing

When to Use Force Refresh:

  • You know the page content has changed
  • User explicitly requests fresh data
  • Debugging/testing cache behavior

Cache Response Headers

The JSON response includes fields that report cache status:

Response
{
  "success": true,
  "content": "...",
  "cached": true,           // true if served from cache
  "cache_age": 342,         // seconds since cached (if cached)
  "cache_ttl": 900,         // original TTL setting
  "credits_used": 0,        // $0 for cache hits
  "timing": {
    "total_ms": 12          // Much faster for cache hits
  }
}
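To show how these fields might be read in client code, here is a small sketch that turns a parsed response into a one-line summary. It assumes the remaining lifetime can be approximated as cache_ttl minus cache_age; `describe_cache_status` is a hypothetical helper.

```python
def describe_cache_status(data: dict) -> str:
    """Summarise the cache-related fields of a parsed scrape response."""
    if not data.get("cached"):
        return f"fresh scrape, {data.get('credits_used')} credits used"
    age = data.get("cache_age", 0)
    ttl = data.get("cache_ttl", 0)
    remaining = max(ttl - age, 0)  # assumption: entry expires when age reaches the TTL
    return f"cache hit, {age}s old, about {remaining}s until expiry"


# Sample values taken from the response shown above
sample = {
    "success": True,
    "cached": True,
    "cache_age": 342,
    "cache_ttl": 900,
    "credits_used": 0,
}
status = describe_cache_status(sample)
```

With a live request, you would pass `response.json()` instead of `sample`.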

Best Practices

1. Match TTL to Content Freshness

Set TTL based on how often the content changes. News sites need shorter TTLs than documentation.

2. Use Consistent URLs

Cache keys are based on the exact URL. example.com/page and example.com/page/ are different cache entries.
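One way to keep URLs consistent is to normalise them before every request. The sketch below uses only the standard library; `normalize_url` is a hypothetical helper, and it assumes the target site serves the same content with and without a trailing slash and that URL fragments do not affect the scraped page.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Canonicalise a URL so equivalent spellings map to one cache entry."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()            # host names are case-insensitive
    path = parts.path.rstrip("/") or "/"     # drop trailing slash, keep root as "/"
    # The query string is kept as-is; the fragment is dropped.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


a = normalize_url("https://Example.com/page/")
b = normalize_url("https://example.com/page")
```

Both calls produce the same string, so the two spellings would share a cache entry.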

3. Cache Static Resources Aggressively

Documentation, help pages, and other rarely changing content can use the maximum 24-hour TTL.

4. Don't Cache Dynamic Content

Search results, personalized pages, and time-sensitive data should use cache: false.
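The practices above can be folded into a small request builder that picks a TTL by content type. The categories and the `build_payload` helper are hypothetical; the TTL values are the ones listed in the Cache TTL table.

```python
from typing import Any, Dict, Optional

# Hypothetical content categories mapped to TTLs in seconds.
TTL_BY_KIND: Dict[str, Optional[int]] = {
    "realtime": 60,      # stock prices and similar
    "general": 900,      # default
    "article": 3600,     # product pages, articles
    "static": 86400,     # documentation, help pages
    "dynamic": None,     # search results, personalised pages: do not cache
}


def build_payload(url: str, kind: str = "general") -> Dict[str, Any]:
    """Build a scrape request body with cache settings suited to the content."""
    ttl = TTL_BY_KIND[kind]
    if ttl is None:
        return {"url": url, "cache": False}
    return {"url": url, "cache": True, "cache_ttl": ttl}


docs_payload = build_payload("https://docs.example.com/api-reference", "static")
search_payload = build_payload("https://example.com/search?q=shoes", "dynamic")
```

Either payload can then be passed as the `json` argument of `requests.post`, as in the earlier examples.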

Cache Invalidation

To get fresh data for a URL that already has a cached entry:

Option 1: Force Refresh

Use force_refresh: true to fetch fresh data and update the cache.

Option 2: Disable Cache

Set cache: false to bypass cache entirely (doesn't update cache).

Option 3: Wait for TTL

Cache automatically expires after TTL. No action needed.

No Manual Purge

Currently, there's no API to manually purge specific cache entries. Use force_refresh if you need to update a cached page.